************************
Application Optimization
************************

In this tutorial, we show how to optimize the performance of a toy application (but it could be a real-life application like `Pl@ntNet `_, see `our article `_). The optimization algorithm **aims to find the infrastructure parameters** (*e.g.,* the number of workers/machines to process the application) and the **application parameters** (*e.g.,* the number of cores per worker and the memory available) to **minimize the execution time**.

In this example **you will learn how to**:

- Define an optimization problem (mathematical definition) and then express it in E2Clab as a **User-Defined Optimization**
- Use the **Bayesian Optimization** method and the **Extra Trees Regressor** algorithm provided by `scikit-optimize `_ (users can use other libraries such as Ax, BayesOpt, BOHB, Dragonfly, *etc.*). `Which search algorithm to choose? `_
- Define the **parallelism level** of the application deployment on Grid'5000 (but it could be on FIT IoT LAB, Chameleon, or combine resources from various testbeds)
- Use a **User-Defined Optimization** to manage the optimization, for instance, to change the infrastructure (`layers_services.yaml`) and application (`my_application.py`) parameters
- Execute experiments and analyze the optimization results

The optimization problem
========================

**What is the infrastructure configuration and software configuration that minimizes the user response time?**

The optimization problem to be solved can be stated as follows (**Equation 1**):

| **Find** :math:`(num\_workers, cores\_per\_worker, memory\_per\_worker)`, **in order to**
| **Minimize** `UserResponseTime`
| **Subject to**
| :math:`1 \leq num\_workers \leq 10`
| :math:`20 \leq cores\_per\_worker \leq 50`
| :math:`1 \leq memory\_per\_worker \leq 3`

Experiment Artifacts
====================

.. code-block:: bash

    $ cd ~/git/
    $ git clone https://gitlab.inria.fr/E2Clab/examples/workflow_optimization

In this repository you will find:

- the **E2Clab configuration files**, such as ``layers_services.yaml``, ``network.yaml``, and ``workflow.yaml``, as well as the **UserDefinedOptimization.py**
- the toy application **my_application.py**

Defining the Experimental Environment
=====================================

Layers & Services Configuration
-------------------------------

This configuration file presents the **layers** and **services** that compose this example. We request resources from Grid'5000 ``environment: g5k``. We define the ``cloud`` layer and add a ``myapplication`` service to it. The service runs on a single machine ``quantity: 1``. In our optimization problem, ``num_workers`` will change ``quantity:`` to deploy the application on multiple machines (:math:`1 \leq num\_workers \leq 10`).

.. literalinclude:: ./application_optimization/layers_services.yaml
    :language: yaml
    :linenos:

The toy application
-------------------

All the **optimization variables**, that is :math:`1 \leq num\_workers \leq 10`, :math:`20 \leq cores\_per\_worker \leq 50`, and :math:`1 \leq memory\_per\_worker \leq 3`, are passed to the application as follows:

.. code-block:: bash

    $ python my_application.py --config "{{ optimization_config }}"

To emulate the application behavior based on the infrastructure configuration and software configuration, we defined the equation presented in **lines 20 to 23**. We set **workload_size = 100** and **communication_cost = 2** (the cost of communication between workers).
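As an illustration of this mechanism only (the real implementation is in the listing below), here is a minimal sketch of how such an application can parse ``--config`` and write its result; the variable names mirror the tutorial, but the cost model is a placeholder, not the actual equation from ``my_application.py``:

.. code-block:: python

    # Illustration only: parsing the configuration suggested by the search
    # algorithm and reporting a result. The cost model below is a placeholder,
    # NOT the equation from lines 20 to 23 of my_application.py.
    import argparse
    import ast

    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True,
                        help="configuration suggested by the search algorithm")
    args = parser.parse_args()

    # We assume {{ optimization_config }} renders as a dict literal such as
    # "{'num_workers': 6, 'cores_per_worker': 48, 'memory_per_worker': 2}".
    config = ast.literal_eval(args.config)

    workload_size = 100      # fixed workload, as in the tutorial
    communication_cost = 2   # cost of communication between workers

    # Placeholder model: compute time shrinks as resources grow, while the
    # communication term grows with the number of workers.
    user_response_time = (
        workload_size
        / (config["num_workers"] * config["cores_per_worker"] * config["memory_per_worker"])
        + communication_cost * config["num_workers"]
    )

    # workflow.yaml's finalize step copies this file back to the local machine.
    with open("result.txt", "w") as out:
        out.write(str(user_response_time))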
.. literalinclude:: ./application_optimization/my_application.py
    :language: python
    :linenos:

Network Configuration
---------------------

In this example, we do not have an **optimization variable** related to the network configuration, but we could define one, as we did in ``layers_services.yaml``. Hence, no changes are required in the ``network.yaml`` file.

.. literalinclude:: ./application_optimization/network.yaml
    :language: yaml
    :linenos:

Workflow Configuration
----------------------

This configuration file presents the application workflow configuration.

- **The Cloud application** ``cloud.*``: ``prepare`` copies the application from the local machine to the remote machine. ``launch`` executes the application using the configuration suggested by the search algorithm. ``finalize`` copies the result from the remote machine to the local machine after the experiment ends. The **result.txt** file contains the **user_response_time** (its value depends on the infrastructure and software configuration).

.. literalinclude:: ./application_optimization/workflow.yaml
    :language: yaml
    :linenos:

User-Defined Optimization
-------------------------

run() function:
---------------

- We use **Bayesian Optimization** as the optimization method and the **Extra Trees Regressor** algorithm, see line 12: **algo = SkOptSearch()**
- 3 is the parallelism level of the workflow deployments, see line 13: **algo = ConcurrencyLimiter(algo, max_concurrent=3)**
- We define the optimization problem (**Equation 1**) in lines 15 to 27, see **objective = tune.run(...)** (a sketch of this setup follows the next list)

run_objective()
---------------

- **prepare()** creates an optimization directory. Each application deployment evaluation has its own directory. In **lines 40 to 47** we update the ``layers_services.yaml`` file to set ``quantity:`` to ``_config["num_workers"]``.
- **launch(optimization_config=_config)** makes a new deployment with the infrastructure and application configurations suggested by the search algorithm. It executes all the *E2Clab commands* of the deployment, such as *layers_services*, *network*, *workflow (prepare, launch, finalize)*, and *finalize*. In this example, we have **3 parallel deployments** and the search algorithm is **trained asynchronously**.
- **finalize()** saves the optimization results in the ``optimization directory``.
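The listing below is a minimal sketch of this setup, assuming the Ray Tune 2.x API and scikit-optimize; ``run_objective`` here is a stub standing in for the real one in ``UserDefinedOptimization.py``, which deploys the application with E2Clab and measures the actual response time:

.. code-block:: python

    # A minimal sketch of the search setup, assuming Ray Tune 2.x and
    # scikit-optimize; not the tutorial's exact UserDefinedOptimization.py.
    import skopt
    from ray import tune
    from ray.tune.search import ConcurrencyLimiter
    from ray.tune.search.skopt import SkOptSearch


    def run_objective(config):
        # Stub: the real objective deploys the application with E2Clab
        # (prepare/launch/finalize) and reports the measured value.
        dummy = (100.0 / (config["num_workers"] * config["cores_per_worker"])
                 + 2.0 * config["num_workers"])
        tune.report(user_response_time=dummy)


    # Search space from Equation 1 (bounds are inclusive in scikit-optimize).
    dimensions = [(1, 10), (20, 50), (1, 3)]
    names = ["num_workers", "cores_per_worker", "memory_per_worker"]

    # Bayesian Optimization with the Extra Trees Regressor ("ET") surrogate.
    optimizer = skopt.Optimizer(dimensions, base_estimator="ET")
    algo = SkOptSearch(optimizer, names, metric="user_response_time", mode="min")

    # Parallelism level: at most 3 concurrent workflow deployments; the
    # surrogate model is trained asynchronously as results come in.
    algo = ConcurrencyLimiter(algo, max_concurrent=3)

    objective = tune.run(
        run_objective,
        search_alg=algo,
        num_samples=9,  # 9 evaluations of the search space
    )
    print("Hyperparameters found:",
          objective.get_best_config(metric="user_response_time", mode="min"))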
.. literalinclude:: ./application_optimization/UserDefinedOptimization.py
    :language: python
    :linenos:

Running & Verifying Experiment Execution
========================================

Find below the command to deploy this application and check its execution.

.. code-block:: bash

    $ e2clab optimize ~/git/workflow_optimization/ ~/git/workflow_optimization/

Deployment Validation & Experiment Results
==========================================

As we defined ``num_samples=9`` (see line 22), we have 9 evaluations of the search space (9 application deployments on G5K). The table below summarizes the results. The configuration found by the algorithm that minimizes the ``user_response_time`` consists of 6 machines (``'num_workers': 6``), each one with 48 cores (``'cores_per_worker': 48``) and 2 memory slots (``'memory_per_worker': 2``). This configuration gives a user response time of ``20.68 seconds``.

.. code-block:: text

    ╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
    │ Trial name              status      num_workers  cores_per_worker  memory_per_worker  iter  total time (s)  user_response_time │
    ├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
    │ run_objective_21bdbea8  TERMINATED  3            34                1                  1     76.8974         40.3137            │
    │ run_objective_d8b3048b  TERMINATED  7            29                1                  1     79.81           28.7783            │
    │ run_objective_f44d0fd6  TERMINATED  4            49                1                  1     129.91          33.5102            │
    │ run_objective_aa1e053e  TERMINATED  6            48                2                  1     66.3485         20.6806            │
    │ run_objective_9adcfe45  TERMINATED  9            38                1                  1     134.151         29.4035            │
    │ run_objective_45150746  TERMINATED  5            31                1                  1     190.87          30.6452            │
    │ run_objective_9eaf2742  TERMINATED  8            38                1                  1     529.561         28.8289            │
    │ run_objective_a118f595  TERMINATED  2            22                1                  1     127.343         56.2727            │
    │ run_objective_214fc572  TERMINATED  3            40                1                  1     316.947         40.1667            │
    ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

    Hyperparameters found:  {'num_workers': 6, 'cores_per_worker': 48, 'memory_per_worker': 2}

Find below the 9 directories generated from each deployment and experiment execution.

.. code-block:: text

    $ ls -la optimization/
    ... Jul 27 16:30 20230727-162917-3f998bb2a73846439fbb1e480a2cb22a
    ... Jul 27 16:30 20230727-162921-aab11d8263654143b6a10bed0d8fd14f
    ... Jul 27 16:31 20230727-162926-c6abfed0dfd84b13b8868d74f39666bc
    ... Jul 27 16:31 20230727-163034-cc4d8bbd9ade4206b7daee8b6be531b6
    ... Jul 27 16:32 20230727-163041-36d62067b97b47d7839be6c61f40ecdc
    ... Jul 27 16:34 20230727-163136-cc9f6537cade4771a8bba2fdbf269e93
    ... Jul 27 16:40 20230727-163140-3aa41ba4e95b4900ab6ff981beab3318
    ... Jul 27 16:35 20230727-163255-a81909fb7869450095f8b83053760542
    ... Jul 27 16:40 20230727-163447-c25333d7554b418b8a3f37f1a0ce6097

The generated files consist of:

.. code-block:: bash

    $ ls -la optimization/20230727-162917-3f998bb2a73846439fbb1e480a2cb22a/
    20230727-162917/       # validation files generated from each deployment
    optimization-results/  # the optimization results
    layers_services.yaml   # E2Clab config files
    network.yaml
    workflow.yaml

For each deployment, in ``20230727-162917/``, we have the **validation files**, such as ``layers_services-validate.yaml``, ``results/``, and ``workflow-validate.out``.

.. code-block:: text

    $ ls -la optimization/20230727-162917-3f998bb2a73846439fbb1e480a2cb22a/20230727-162917/
    layers_services-validate.yaml
    results/
    workflow-validate.out

In ``optimization-results/``, we have:

.. code-block:: bash

    $ ls -la optimization/20230727-162917-3f998bb2a73846439fbb1e480a2cb22a/optimization-results/
    params.json  # the parameters explored by the algorithm
    params.pkl   # contains state information of the algorithm (for checkpoint)

.. note::

    **Checkpoints:** users can `snapshot the training progress `_.
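These artifacts can also be inspected directly with the Python standard library. A minimal sketch, assuming one of the deployment directories above (the path is illustrative; adjust it to your own run):

.. code-block:: python

    # Quick inspection of the saved optimization artifacts; the path below is
    # illustrative, taken from the example listing above.
    import json
    import pickle

    base = ("optimization/20230727-162917-3f998bb2a73846439fbb1e480a2cb22a"
            "/optimization-results")

    # Parameters explored by the algorithm for this deployment.
    with open(f"{base}/params.json") as f:
        print(json.load(f))

    # State information of the algorithm (usable as a checkpoint).
    with open(f"{base}/params.pkl", "rb") as f:
        state = pickle.load(f)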