Optimization

The core methodology of E2Clab enables reproducible experiments for understanding application performance. Such applications typically must comply with many constraints related to resource usage (e.g., GPU, CPU, memory, storage, and bandwidth capacities), energy consumption, QoS, security, and privacy. Therefore, enabling their optimized execution across the Edge-to-Cloud Continuum is challenging: the parameter settings of the application and of the underlying infrastructure result in a complex configuration search space.

E2Clab optimization methodology

Next, we present our optimization methodology, which supports reproducible, parallel optimization of application workflows on large-scale testbeds. It consists of three main phases, illustrated in Figure 1: E2Clab optimization methodology.


Figure 1: E2Clab optimization methodology

Phase I: Initialization

This phase, depicted at the top of Figure 1, consists of defining the optimization problem. The user must specify the following (a minimal sketch is given after this list):

  • The optimization variables that compose the search space to be explored (e.g., GPUs used for processing, Fog nodes in the scenario, network bandwidth, etc.)

  • The objective, such as minimizing end-to-end latency or maximizing Fog gateway throughput, among others

  • The constraints, such as the upper and lower bounds of optimization variables, budget, latency, etc.
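
For illustration, the three elements above could be expressed with Ray Tune (the library used by the optimization manager described below) roughly as in this minimal sketch; the variable names and bounds are hypothetical:

from ray import tune

# Optimization variables: the search space to be explored
search_space = {
    "gpus": tune.randint(1, 5),                       # 1-4 GPUs used for processing
    "fog_nodes": tune.randint(1, 9),                  # 1-8 Fog nodes in the scenario
    "bandwidth_mbps": tune.choice([100, 500, 1000]),  # network bandwidth options
}

# The objective (e.g., minimize end-to-end latency) and the evaluation budget are
# passed when launching the optimization, while the constraints appear as the
# bounds of the variables above, e.g.:
# tune.run(objective_fn, config=search_space, metric="latency", mode="min", num_samples=20)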

Phase II: Evaluation

This phase defines the optimization method and algorithm used in the optimization cycle (presented in the middle of Figure 1) to explore the search space.

The optimization cycle consists in:

  1. parallel deployments of the application workflow in large-scale testbeds

  2. their simultaneous execution

  3. asynchronous model training and optimization with data obtained from the workflow execution

  4. reconfiguration of the application workflow with the configuration suggested by the search algorithm

This cycle repeats until the model converges or until a user-defined number of evaluations is reached.
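
To make the cycle concrete, here is a toy, self-contained sketch: random sampling stands in for a real search algorithm, and the hypothetical deploy_and_run function stands in for the (in practice parallel) deployment and execution of the workflow.

import random

def deploy_and_run(config):
    # Stand-in for deploying and executing the workflow on the testbed
    return config["num_workers"] * 0.5 + random.random()  # fake end-to-end latency

best = None
for evaluation in range(10):                        # user-defined number of evaluations
    config = {"num_workers": random.randint(1, 8)}  # configuration suggested for this round
    latency = deploy_and_run(config)                # deploy and execute the workflow
    if best is None or latency < best[1]:           # feed the result back to guide the search
        best = (config, latency)
print("best configuration found:", best[0])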

Phase III: Finalization

For reproducibility purposes, this last phase, illustrated at the bottom of Figure 1, provides a summary of the computations. It reports the definition of the optimization problem (optimization variables, objective, and constraints); the sample selection method; the surrogate models or search algorithms, with their hyperparameters, used to explore the search space; and, finally, the optimal application configuration found.
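
As a purely hypothetical illustration (the actual summary format produced by E2Clab may differ), the reported information could look like:

# Hypothetical structure, for illustration only; not the actual E2Clab output format
summary = {
    "optimization_problem": {
        "variables": {"num_workers": [1, 10], "cores_per_worker": [20, 50]},
        "objective": "minimize user_response_time",
        "constraints": {"max_evaluations": 9},
    },
    "sample_selection": "HyperOptSearch suggestions with AsyncHyperBandScheduler",
    "search_algorithm_hyperparameters": {"max_concurrent": 3},
    "optimal_configuration": {"num_workers": 4, "cores_per_worker": 32},
}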

The optimization manager

E2Clab now provides an API to run experiments programmatically, so that you can embed them in optimization loops. This API was designed with Ray Tune compatibility in mind.

Ray Tune allows users to parallelize the application optimization (i.e., run multiple experiments at the same time) and gives access to state-of-the-art Bayesian optimization libraries such as Ax, BayesOpt, BOHB, Dragonfly, FLAML, HEBO, Hyperopt, Nevergrad, Optuna, SigOpt, skopt, and ZOOpt. Not sure which search algorithm to choose? See the Ray Tune documentation.


Figure 2: Ray Tune
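
As a minimal, self-contained sketch of how these libraries plug in (assuming Ray 2.x with the optuna package installed; the toy objective below is a stand-in for a real E2Clab experiment), switching libraries typically only changes the search_alg argument:

from ray import tune, train
from ray.tune.search import ConcurrencyLimiter
from ray.tune.search.optuna import OptunaSearch  # or HyperOptSearch, SkOptSearch, ...

def toy_objective(config):
    # Stand-in for an E2Clab experiment run; reports a fake latency metric
    train.report({"latency": config["num_workers"] * 0.5})

analysis = tune.run(
    toy_objective,
    metric="latency",
    mode="min",
    search_alg=ConcurrencyLimiter(OptunaSearch(), max_concurrent=3),
    num_samples=6,
    config={"num_workers": tune.randint(1, 10)},
)
print(analysis.best_config)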

How to set up an optimization?

To set up an optimization, users define what we call a User-defined optimization. For that, E2Clab provides a class-based API that lets users set up and manage the optimization through the run() and launch() methods.

Install dependencies

We recommend that you use the ray[tune] Python module, which is an industry standard for hyperparameter tuning.

# From repository
pip install -e .[opt]
# From pip
pip install e2clab[opt]

Example

run()

In this method, users mainly have to:

  • Define the optimization method (e.g., Bayesian Optimization) and algorithm (e.g., Extra Trees Regressor), for instance: algo = SkOptSearch()

  • Define the parallelism level of the workflow deployments, for instance: algo = ConcurrencyLimiter(algo, max_concurrent=3)

  • Define the optimization problem, for instance: objective = tune.run(…)

run_objective()

  • prepare() creates an optimization directory. Each application deployment evaluation has its own directory.

  • launch(optimization_config=_config) deploys the application configurations (suggested by the search algorithm). It executes all the E2Clab commands for the deployment, such as layers_services, network, workflow (prepare, launch, finalize) and finalize.

  • finalize() saves the optimization results in the optimization directory.

Accessing the optimization variables from your workflow.yaml file

Users can use {{ optimization_config }} in their workflow.yaml file to access the _config variables (the configuration suggested by the search algorithm; see the run_objective(self, _config) method in the example below). For instance, to pass the optimization variables to a Python application, users could do as follows:

- shell: python my_application.py --config "{{ optimization_config }}"
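
For illustration, a hypothetical my_application.py could parse that argument as follows (assuming the rendered value is a Python-style dict literal; check how your E2Clab version serializes {{ optimization_config }}):

import argparse
import ast

parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True, help="rendered optimization_config")
args = parser.parse_args()

# Hypothetical parsing; adapt to the actual serialization of the template
config = ast.literal_eval(args.config)  # e.g. {'num_workers': 4, 'cores_per_worker': 32}
print("running with", config.get("num_workers"), "workers")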

Below we provide an example implementation of the User-defined optimization file created by the user.

import yaml
from pathlib import Path

from ray import tune, train

from ray.tune.schedulers import AsyncHyperBandScheduler
from ray.tune.search import ConcurrencyLimiter
from ray.tune.search.hyperopt import HyperOptSearch

from e2clab.optimizer import Optimizer


class UserDefinedOptimization(Optimizer):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    MAX_CONCURRENCY = 3
    NUM_SAMPLES = 9

    # 'run' abstract method to define
    def run(self):
        algo = HyperOptSearch()
        algo = ConcurrencyLimiter(algo, max_concurrent=self.MAX_CONCURRENCY)
        scheduler = AsyncHyperBandScheduler()
        objective = tune.run(
            self.run_objective,
            metric="user_response_time",
            mode="min",
            name="my_application",
            search_alg=algo,
            scheduler=scheduler,
            num_samples=self.NUM_SAMPLES,
            config={
                "num_workers": tune.randint(1, 10),
                "cores_per_worker": tune.randint(20, 50),
                "memory_per_worker": tune.randint(1, 3),
            },
            fail_fast=True,
        )

        print("Hyperparameters found: ", objective.best_config)

    # Function to optimize
    def run_objective(self, _config):
        # create an optimization directory using "self.prepare()"
        # accessible in 'self.optimization_dir'
        optimization_dir = self.prepare()

        # update the parameters of your configuration file(s)
        # (located in "self.optimization_dir") according to
        # "_config" (defined by the search algorithm)
        with open(f"{optimization_dir}/layers_services.yaml") as f:
            config_yaml = yaml.load(f, Loader=yaml.FullLoader)
        for layer in config_yaml["layers"]:
            for service in layer["services"]:
                if service["name"] in ["myapplication"]:
                    service["quantity"] = _config["num_workers"]
        with open(f"{optimization_dir}/layers_services.yaml", "w") as f:
            yaml.dump(config_yaml, f)

        # deploy the configurations using "self.launch()".
        # "self.launch()" runs:
        #   layers_services;
        #   network;
        #   workflow (prepare, launch, finalize);
        #   finalize;
        # returns the 'result_dir' (Path) where you can access
        # artifacts pulled from your experiment during the 'finalize' step
        result_dir = self.launch(optimization_config=_config)

        # read the metric value (user response time) from the experiment
        # results pulled into 'result_dir' during the 'finalize' step
        result_file = "results/results.txt"
        result_file = f"{result_dir}/{result_file}"
        with open(result_file) as file:
            line = file.readline()
            user_response_time = float(line.rstrip())

        # report the metric value to Ray Tune
        train.report({"user_response_time": user_response_time})

        # Free computing resources
        self.finalize()


if __name__ == "__main__":
    # Programmatically run optimization
    optimizer = UserDefinedOptimization(
        scenario_dir=Path(".").resolve(),
        artifacts_dir=Path("./artifacts/").resolve(),
        duration=0,
        repeat=0,
    )
    optimizer.run()

API

class e2clab.optimizer.Optimizer(scenario_dir: Path, artifacts_dir: Path, duration: int = 0, repeat: int = 0)

E2Clab class API for optimization loops. Inherit from this class to define your optimization.

Parameters:
  • scenario_dir (Path) – Path to your SCENARIO_DIR

  • artifacts_dir (Path) – Path to your ARTIFACTS_DIR

  • duration (int, optional) – Duration of your ‘deployment’, defaults to 0

  • repeat (int, optional) – Number of ‘deployment’ repeats, defaults to 0

Raises:

Exception – If an invalid E2Clab configuration is supplied

abstract run()

Set up your training: implement the logic of your optimization.

prepare() → Path

Creates a new directory, optimization_dir, for the optimization run. Copies the experiment definition files (YAML files) into optimization_dir. The optimization run will be launched from this optimization_dir.

Returns:

Optimization run’s SCENARIO_DIR

Return type:

optimization_dir (Path)

launch(optimization_config: dict | None = None) → Path

Deploys the configurations defined by the search algorithm. It runs the following E2Clab commands automatically:

  • layers_services

  • network

  • workflow (prepare, launch, finalize)

  • finalize

Parameters:

optimization_config (dict, optional) – “config” dictionary, defaults to None. Passed to your workflow as {{ optimization_config }}

Returns:

Output folder of your optimization experiment run.

Return type:

result_dir (Path)

finalize() → None

Destroys optimization run computing resources

Try some examples

We provide a toy example here.