******************************************************
Running COMPSs Applications on G5K (Docker deployment)
******************************************************

.. contents::
   :depth: 2

This section shows `how to execute COMPSs applications `_ on `Grid'5000 `_. The example uses the `official Docker image of COMPSs `_, which comes with COMPSs installed and ready to use.

In this example **you will learn how to** (see :ref:`compss-docker-deployment`):

- **Define the experimental environment**:

  - Layers and Services + Monitoring + Define the logic of your COMPSs Service;
  - Network constraints;
  - Workflow (tasks: prepare, launch, and finalize).

- **Deploy a Docker COMPSs Cluster: 1 Master + 3 Workers**
- **Run COMPSs applications**

.. _compss-docker-deployment:

.. figure:: compss/docker/compss-deployment.png
   :width: 100%
   :align: center

   Figure 1: COMPSs deployment

Experiment Artifacts
====================

.. code-block:: bash

   $ git clone https://gitlab.inria.fr/E2Clab/examples/compss
   $ cd compss/
   $ ls
   artifacts    # Python scripts to generate the COMPSs configuration files (resources.xml and project.xml)
   docker       # contains the COMPSs.py Service and layers_services.yaml, network.yaml, and workflow.yaml files

Defining the Experimental Environment
=====================================

Layers & Services Configuration
-------------------------------

This configuration file presents the **Layers** and **Services** that compose this example. The **COMPSs Service** (a cluster of four nodes, ``quantity: 4``) is composed of one *Master* and three *Workers* (see the COMPSs.py Service). The name of the **COMPSs Service** (``- name: COMPSs``) must match the name of the **COMPSs.py** file. Note that we also added a TIG (Telegraf, InfluxDB, and Grafana) monitoring stack, ``monitoring``. Adding ``roles: [monitoring]`` to the **COMPSs Service** requests monitoring of all its machines.

.. literalinclude:: compss/docker/layers_services.yaml
   :language: yaml
   :linenos:

Defining the logic of your COMPSs Service
-----------------------------------------

Next, we define the logic of our COMPSs Service. It mainly consists of:

- installing Docker and pulling the `official image of COMPSs `_;
- assigning machines for the **Master** and **Workers**;
- creating an `overlay network for standalone containers `_;
- adding information about the **Workers** to the **Master** (see ``extra=extra_compss_master`` in **COMPSs.py**) to generate the **resources.xml** and **project.xml** files;
- registering the **Master** and the **Workers** as subservices of the COMPSs Service.

Please read the comments in the code for more details.

.. literalinclude:: compss/docker/services/COMPSs.py
   :language: python
   :linenos:

Network Configuration
---------------------

The file below presents the network configuration between machines in the COMPSs cluster. In this example, we defined a constraint between the **Master** and **all Workers**.

.. literalinclude:: compss/docker/network.yaml
   :language: yaml
   :linenos:

Workflow Configuration
----------------------

This configuration file presents the application workflow:

- Regarding **just the COMPSs Master** ``cloud.compss.*.master.*``:

  - in ``prepare`` we copy the Python scripts that generate the **COMPSs configuration files** and then generate those files (``resources.xml`` and ``project.xml``). Note that we used ``--workers {{ _self.workers }}`` since we added this information in the **COMPSs.py Service**. Finally, we copy both files to the container.
  - in ``launch`` we **run the COMPSs application**.

- Regarding **all Workers in the COMPSs cluster** ``cloud.compss.*.worker.*``, in ``prepare`` we add the `COMPSs applications `_.

.. literalinclude:: compss/docker/workflow.yaml
   :language: yaml
   :linenos:

.. note::

   Besides ``prepare`` and ``launch``, you could also use ``finalize`` to back up some data (e.g., experiment results). E2Clab first runs the ``prepare`` tasks on all machines, then the ``launch`` tasks on all machines, and finally the ``finalize`` tasks. The ``hosts`` entries are processed from top to bottom, in the order defined by the user in the ``workflow.yaml`` file.

Running & Verifying Experiment Execution
========================================

Find below the command to **deploy the COMPSs cluster** on G5K and **run COMPSs applications**. Before starting:

- make sure that your **COMPSs.py** file is located in ``e2clab/e2clab/services/``.
- in the command below, ``compss/docker/`` is the **scenario directory** (where the files ``layers_services.yaml``, ``network.yaml``, and ``workflow.yaml`` must be placed and where the results will be saved).
- in the command below, ``compss/artifacts/`` is the **artifacts directory** (where the Python scripts to generate the COMPSs configuration files must be placed).

.. code-block:: bash

   $ e2clab deploy compss/docker/ compss/artifacts/

While the application is running, you may want to access the **Grafana** Web interface to visualize the monitoring data of all machines that compose the **COMPSs cluster** (check the ``compss/docker/layers_services-validate.yaml`` file for instructions on how to access Grafana).

After the application execution, you can check the log files as follows:

.. code-block:: bash

   $ docker exec -it compss_master bash
   $ cat /root/.COMPSs/simple.py_01/runtime.log
   $ cat /root/.COMPSs/simple.py_01/jobs/job1_NEW.out

Deployment Validation & Experiment Results
==========================================

Find below the files generated after the execution of the experiment. They consist of **validation files** (``layers_services-validate.yaml``, ``workflow-validate.out``, and ``network-validate/``) and **monitoring data** (``influxdb-data.tar.gz``).

.. code-block:: bash

   $ ls compss/docker/20220325-152207/
   layers_services-validate.yaml   # Mapping between layers and services with physical machines
   workflow-validate.out           # Commands used to deploy the application (prepare, launch, and finalize)
   network-validate/               # Network configuration for each physical machine
   influxdb-data.tar.gz            # Monitoring data

.. note::

   Providing a **systematic methodology to define the experimental environment** and providing **access to the methodology artifacts** (``layers_services.yaml``, ``network.yaml``, ``workflow.yaml``, and the ``COMPSs.py`` Service) improves the experiment's **Repeatability**, **Replicability**, and **Reproducibility**, see `ACM Digital Library Terminology `_.
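
To confirm that a run produced the validation files and monitoring data listed in this section, a small helper script can be used. The sketch below is not part of E2Clab: the names ``EXPECTED_OUTPUTS`` and ``missing_outputs`` are hypothetical, and the expected entries are taken only from the listing above, so other experiments may produce a different set of files.

```python
from pathlib import Path

# Entries copied from the results listing in this section.
# A trailing "/" marks an entry that is expected to be a directory.
EXPECTED_OUTPUTS = (
    "layers_services-validate.yaml",
    "workflow-validate.out",
    "network-validate/",
    "influxdb-data.tar.gz",
)


def missing_outputs(results_dir):
    """Return the expected entries that are absent from results_dir."""
    root = Path(results_dir)
    missing = []
    for name in EXPECTED_OUTPUTS:
        path = root / name.rstrip("/")
        # Directories and regular files need different existence checks.
        present = path.is_dir() if name.endswith("/") else path.is_file()
        if not present:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Example path: the timestamped results directory shown in the listing.
    absent = missing_outputs("compss/docker/20220325-152207/")
    print("all expected outputs present" if not absent else f"missing: {absent}")
```

The function returns an empty list when every expected entry is present, so it can be used as a quick check before archiving or sharing the results directory.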