************
Optimization
************

.. toctree::
    :maxdepth: 2
    :caption: Contents:


The core methodology of E2Clab supports reproducible experiments for understanding
application performance. Such applications typically must comply with many constraints
related to resource usage (*e.g.,* GPU, CPU, memory, storage, and bandwidth capacities),
energy consumption, QoS, security, and privacy. Enabling their optimized execution across
the Edge-to-Cloud Continuum is therefore challenging: the parameter settings of the
applications and of the underlying infrastructure result in a complex configuration
search space.

E2Clab optimization methodology
===============================

Our optimization methodology supports reproducible, parallel optimization of application
workflows on large-scale testbeds. It consists of three main phases, illustrated in
:ref:`e2clab_optimization`.

.. _e2clab_optimization:
.. figure:: ../figures/E2Clab-methodology-optim.png
    :width: 100%
    :align: center

    Figure 1: E2Clab optimization methodology


Phase I: Initialization
-----------------------
This phase, depicted at the top of :ref:`e2clab_optimization`, consists of defining the
optimization problem. The user must specify:

- The **optimization variables** that compose the search space to be explored (*e.g.,* GPUs
  used for processing, Fog nodes in the scenario, network bandwidth, *etc.*)
- The **objective**, such as minimizing end-to-end latency or maximizing Fog gateway
  throughput
- The **constraints**, such as the upper and lower bounds of optimization variables,
  budget, latency, *etc.*
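
The three ingredients above can be sketched in plain Python. This is only an
illustration: the ``Variable`` and ``Problem`` classes, their fields, and the variable
names ``gpus`` and ``fog_nodes`` are hypothetical and are not part of the E2Clab API.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass
    class Variable:
        """One optimization variable with inclusive bounds (a simple constraint)."""
        name: str
        lower: int
        upper: int

        def contains(self, value: int) -> bool:
            return self.lower <= value <= self.upper


    @dataclass
    class Problem:
        """Hypothetical container for variables, objective, and constraints."""
        variables: list
        metric: str  # name of the measured quantity, e.g. a latency
        mode: str    # "min" or "max"

        def is_feasible(self, config: dict) -> bool:
            # Feasible means every variable is set and lies within its bounds.
            return all(
                v.name in config and v.contains(config[v.name])
                for v in self.variables
            )


    problem = Problem(
        variables=[Variable("gpus", 1, 4), Variable("fog_nodes", 1, 10)],
        metric="end_to_end_latency",
        mode="min",
    )
    print(problem.is_feasible({"gpus": 2, "fog_nodes": 5}))
    print(problem.is_feasible({"gpus": 8, "fog_nodes": 5}))

In an actual E2Clab optimization, the search space and the objective are expressed
through Ray Tune, as described in the sections below.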


Phase II: Evaluation
--------------------
In this phase, the user defines the optimization method and algorithm used in the
**optimization cycle** (presented in the middle of :ref:`e2clab_optimization`) to explore
the search space.

The **optimization cycle** consists of:

#. **parallel deployments** of the application workflow in large-scale testbeds
#. **their simultaneous execution**
#. **asynchronous model training and optimization** with data obtained from the workflow execution
#. **reconfiguration** of the application workflow with the configuration suggested by the search algorithm

The cycle repeats until the model converges or until a user-defined number of
evaluations is reached.
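
The shape of this cycle can be sketched with the Python standard library alone. The
sketch makes several simplifying assumptions: ``evaluate()`` is a made-up stand-in for
deploying and running the workflow, ``suggest()`` uses random sampling instead of a
surrogate model, and the cycle stops after a fixed number of evaluations. It is not how
the Optimization Manager is implemented.

.. code-block:: python

    import random
    from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


    def evaluate(config):
        """Stand-in for one workflow deployment and execution; returns a metric."""
        return (config["workers"] - 6) ** 2 + 1.0


    def suggest(rng):
        """Stand-in search algorithm: random sampling within the bounds."""
        return {"workers": rng.randint(1, 10)}


    def optimize(max_evaluations=12, max_concurrent=3, seed=0):
        rng = random.Random(seed)
        history = []   # (config, metric) pairs observed so far
        running = {}   # future -> config currently being evaluated
        submitted = 0
        with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
            # Steps 1 and 2: start the first batch of parallel evaluations.
            while submitted < min(max_concurrent, max_evaluations):
                config = suggest(rng)
                running[pool.submit(evaluate, config)] = config
                submitted += 1
            while running:
                # Step 3: process results as soon as any evaluation finishes.
                done, _ = wait(running, return_when=FIRST_COMPLETED)
                for future in done:
                    history.append((running.pop(future), future.result()))
                    # Step 4: start the next suggested configuration.
                    if submitted < max_evaluations:
                        config = suggest(rng)
                        running[pool.submit(evaluate, config)] = config
                        submitted += 1
        best = min(history, key=lambda item: item[1])
        return best, history


    best, history = optimize()
    print("evaluations:", len(history))
    print("best configuration:", best[0], "metric:", best[1])

A model-based search algorithm would replace ``suggest()`` with a function that also
reads ``history`` before proposing the next configuration.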


Phase III: Finalization
-----------------------
For **reproducibility** purposes, this last phase, illustrated at the bottom of
:ref:`e2clab_optimization`, provides a summary of the computations. The summary contains
the definition of the optimization problem (optimization variables, objective, and
constraints); the sample selection method; the surrogate models or search algorithms,
with their hyperparameters, used to explore the search space of the optimization problem;
and the optimal application configuration found.
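
As an illustration of what such a summary could contain, the sketch below collects these
items into a single JSON document. The ``build_summary()`` helper, its key names, and
the sample values are hypothetical; E2Clab's actual output format may differ.

.. code-block:: python

    import json


    def build_summary(variables, metric, mode, sample_selection,
                      search_algorithm, hyperparameters, history):
        """Collect the information needed to reproduce an optimization run."""
        pick = min if mode == "min" else max
        best_config, best_value = pick(history, key=lambda item: item[1])
        return {
            "problem": {
                "variables": variables,
                "objective": {"metric": metric, "mode": mode},
            },
            "sample_selection": sample_selection,
            "search": {
                "algorithm": search_algorithm,
                "hyperparameters": hyperparameters,
            },
            "evaluations": len(history),
            "best": {"config": best_config, "value": best_value},
        }


    summary = build_summary(
        variables={"workers": [1, 10]},
        metric="latency",
        mode="min",
        sample_selection="random",
        search_algorithm="random search",
        hyperparameters={"seed": 0},
        history=[({"workers": 2}, 17.0), ({"workers": 6}, 1.0), ({"workers": 9}, 10.0)],
    )
    print(json.dumps(summary, indent=2))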


The optimization manager
========================
We enhanced the E2Clab framework by implementing our optimization methodology in it. A
new manager, the Optimization Manager, drives the optimization cycle.

The Optimization Manager uses `Ray Tune <https://docs.ray.io/en/latest/tune/index.html>`_
to parallelize the application optimization and gives access to state-of-the-art Bayesian
Optimization and search libraries such as Ax, BayesOpt, BOHB, Dragonfly, FLAML, HEBO,
Hyperopt, Nevergrad, Optuna, SigOpt, skopt, and ZOOpt. For guidance on selecting one, see
`Which search algorithm to choose?
<https://docs.ray.io/en/latest/tune/faq.html#which-search-algorithm-scheduler-should-i-choose>`_


.. _ray_tune:
.. figure:: ../figures/tune_overview.png
    :width: 50%
    :align: center

    Figure 2: Ray Tune


How to set up an optimization?
==============================
To set up an optimization, users define what we call a **User-defined optimization**.
E2Clab provides a class-based API for this purpose: users set up and manage the
optimization through the **run()** and **run_objective()** methods.


User-defined optimization
-------------------------
A **User-defined optimization** implements the two methods described below.

run()
^^^^^
In this method, users mainly have to:

- Define the optimization method (*e.g.,* Bayesian Optimization) and algorithm (*e.g.,*
  Extra Trees Regressor), for instance: **algo = SkOptSearch()**
- Define the parallelism level of the workflow deployments, for instance:
  **algo = ConcurrencyLimiter(algo, max_concurrent=3)**
- Define the optimization problem, for instance: **objective = tune.run(...)**

run_objective()
^^^^^^^^^^^^^^^
In this method, users mainly call:

- **prepare()** creates an *optimization directory*. Each application deployment evaluation
  has its own directory.

- **launch(optimization_config=_config)** deploys the application configurations (suggested
  by the search algorithm). It executes all the *E2Clab commands* for the deployment, such
  as *layers_services*, *network*, *workflow (prepare, launch, finalize)* and *finalize*.

- **finalize()** saves the optimization results in the *optimization directory*.
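
The order of these calls can be sketched with a stand-in class. ``StubOptimization`` is
hypothetical and only mimics the three calls described above: it is not E2Clab's class,
it deploys nothing, and the ``optimization_dir`` attribute and file names are
assumptions made for this illustration.

.. code-block:: python

    import json
    import tempfile
    from pathlib import Path


    class StubOptimization:
        """Stand-in that mimics prepare(), launch(), and finalize() locally."""

        def __init__(self):
            self.optimization_dir = None

        def prepare(self):
            # One fresh directory per evaluated configuration.
            self.optimization_dir = Path(tempfile.mkdtemp(prefix="optimization-"))

        def launch(self, optimization_config=None):
            # A real deployment would run here; the stub only records the config.
            config_file = self.optimization_dir / "config.json"
            config_file.write_text(json.dumps(optimization_config))

        def finalize(self):
            # A real run would save results here; the stub writes a marker file.
            (self.optimization_dir / "done").touch()

        def run_objective(self, _config):
            self.prepare()
            self.launch(optimization_config=_config)
            self.finalize()
            return self.optimization_dir


    first = StubOptimization().run_objective({"workers": 3})
    second = StubOptimization().run_objective({"workers": 7})
    print(first != second)  # each evaluation gets its own directory

In a real **run_objective()**, the metric produced by the application would also be read
and reported back to the search algorithm after **finalize()**.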


Accessing the optimization variables from your workflow.yaml file
-----------------------------------------------------------------

Users can use ``{{ optimization_config }}`` in their ``workflow.yaml`` file to access
the ``_config`` variables (the configuration suggested by the search algorithm, see
**line 33** ``def run_objective(self, _config)`` in the example below). For instance, to
pass the optimization variables to a Python application, users can write:

.. code-block:: yaml

    - shell: python my_application.py --config "{{ optimization_config }}"
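
On the application side, the value received through ``--config`` then has to be parsed.
The sketch below shows one possible way for ``my_application.py`` to do this. It assumes
that the rendered value is either a JSON object or a Python dictionary literal, which
should be checked against the string E2Clab actually passes. The ``parse_config()``
helper and the variable names are hypothetical.

.. code-block:: python

    import argparse
    import ast
    import json


    def parse_config(text):
        """Parse the --config value as JSON, falling back to a Python literal."""
        try:
            config = json.loads(text)
        except json.JSONDecodeError:
            # literal_eval only accepts literals, so no code is executed.
            config = ast.literal_eval(text)
        if not isinstance(config, dict):
            raise ValueError("expected a mapping of variable names to values")
        return config


    def main(argv=None):
        parser = argparse.ArgumentParser()
        parser.add_argument("--config", required=True)
        args = parser.parse_args(argv)
        return parse_config(args.config)


    # Simulated command line; a real script would call main() without arguments.
    config = main(["--config", "{'num_workers': 3, 'memory_per_worker': 2}"])
    print(config["num_workers"])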


Example
-------

Below we provide an example of the **User-defined optimization** file created by the user.

.. note::

  The **User-defined optimization** file must be in the **./e2clab/e2clab/optimizer/**
  directory.


.. literalinclude:: ../examples/application_optimization/UserDefinedOptimization.py
   :language: python
   :linenos:


E2Clab CLI
==========

Use the following command to execute the optimization.

.. code-block:: bash

    $ e2clab optimize /path/to/scenario/ /path/to/artifacts/

The command usage is as follows:

.. code-block:: text

    Usage: e2clab optimize [OPTIONS] SCENARIO_DIR ARTIFACTS_DIR

      Optimize application workflow.

    Options:
      --duration INTEGER  Duration of each experiment in seconds.
      --repeat INTEGER    Number of times to repeat the experiment.
      --help              Show this message and exit.

Try some examples
=================

We provide a `toy example here <../examples/application_optimization.html>`_.