Running COMPSs Applications on G5K (standalone deployment)

This section shows how to execute COMPSs applications on Grid’5000. The example assumes you have built a custom environment that includes all COMPSs requirements (see How to create a custom environment on G5K).

In this example you will learn how to (see Figure 1: COMPSs deployment):

  • Define the experimental environment:
    • Layers and Services, including the logic of your COMPSs Service;

    • Network;

    • Workflow (prepare, launch, finalize).

  • Deploy a standalone COMPSs Cluster: 1 Master + 3 Workers

  • Run COMPSs applications

Figure 1: COMPSs deployment

Experiment Artifacts

$ git clone https://gitlab.inria.fr/E2Clab/examples/compss
$ cd compss/
$ ls
artifacts     # Python scripts to generate the COMPSs configuration files (resources.xml and project.xml)
standalone    # contains the COMPSs.py Service and layers_services.yaml, network.yaml, and workflow.yaml files

Defining the Experimental Environment

Layers & Services Configuration

This configuration file presents the Layers and Services that compose this example. The COMPSs Service is a cluster of four nodes (quantity: 4): one Master and three Workers (see the COMPSs.py Service). The Service name (name: COMPSs) must match the name of the COMPSs.py file.

environment:
  job_name: compss
  walltime: "00:59:00"
  g5k:
    job_type: ["deploy"]
    env_name: https://api.grid5000.fr/sid/sites/rennes/public/drosendo/compss-image.yaml
    cluster: nova
layers:
- name: cloud
  services:
  - name: COMPSs
    quantity: 4

Defining the logic of your COMPSs Service

Next, we define the logic of our COMPSs Service. It mainly consists of:
  • assigning machines for the Master and Workers;

  • adding information about the Workers to the Master (see extra=extra_compss_master in COMPSs.py) to generate the resources.xml and project.xml files;

  • registering the Master and the Workers as sub-services of the COMPSs Service.

Please read the comments in the code for more details.

from e2clab.services import Service
from enoslib.api import populate_keys
from enoslib.objects import Roles


class COMPSs(Service):
    def deploy(self):
        # SSH keys for the root users must be generated and pushed to all nodes
        populate_keys(Roles({"compss": self.hosts}), ".", "id_rsa")

        # Assign machines to the COMPSs Master (first host) and Workers (remaining hosts)
        compss_master = "compss_master"
        compss_worker = "compss_worker"
        roles_compss_master = Roles({compss_master: [self.hosts[0]]})
        roles_compss_worker = Roles({compss_worker: self.hosts[1:]})

        # Users may add extra information to Services/sub-Services to access it in "workflow.yaml".
        # Here, the Master receives the comma-separated list of Worker aliases,
        # which is accessible as {{ _self.workers }} in "workflow.yaml".
        workers = [host.alias for host in roles_compss_worker[compss_worker]]
        extra_compss_master = [{'workers': ','.join(workers)}]  # COMPSs Master

        # Register the COMPSs Master sub-service
        self.register_service(_roles=roles_compss_master, sub_service="master", extra=extra_compss_master)
        # Register the COMPSs Worker sub-service
        self.register_service(_roles=roles_compss_worker, sub_service="worker")

        return self.service_extra_info, self.service_roles
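
The role-splitting logic of deploy can be tried without a Grid’5000 reservation. The following is a minimal sketch, assuming a hypothetical Host class that only models the alias attribute used in COMPSs.py, and hypothetical node names; it is not part of E2Clab or EnOSlib.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical stand-in for a reserved machine; only `alias` is modelled."""
    alias: str


def split_master_workers(hosts):
    """Return (master, workers): the first host is the Master, the rest are Workers."""
    if len(hosts) < 2:
        raise ValueError("a COMPSs cluster needs one Master and at least one Worker")
    return hosts[0], hosts[1:]


# Hypothetical aliases standing in for the four reserved nodes (quantity: 4).
hosts = [Host(f"node-{i}") for i in range(1, 5)]
master, workers = split_master_workers(hosts)

# Same comma-separated string that COMPSs.py stores under the "workers" key.
workers_arg = ",".join(host.alias for host in workers)
print(master.alias)  # node-1
print(workers_arg)   # node-2,node-3,node-4
```

With quantity: 4 this yields one Master and three Workers, which matches the deployment described in this example.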

Network Configuration

The file below presents the network configuration between machines in the COMPSs cluster. In this example, we define one constraint between the Master and all Workers: a delay of 2 ms, a rate of 10 Gbit, and a loss of 0.1.

networks:
- src: cloud.compss.1.master.1
  dst: cloud.compss.1.worker.*
  delay: "2ms"
  rate: "10gbit"
  loss: 0.1
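
To see which machine pairs a wildcard rule such as the one above would cover, the pattern can be expanded by hand. This is only an illustrative sketch: the Worker identifiers are hypothetical, and shell-style matching with Python's fnmatch is an assumption, since E2Clab's own matching logic is not shown in this section.

```python
from fnmatch import fnmatchcase

# Hypothetical identifiers for the 1 Master + 3 Workers deployment.
service_ids = [
    "cloud.compss.1.master.1",
    "cloud.compss.1.worker.1",
    "cloud.compss.1.worker.2",
    "cloud.compss.1.worker.3",
]

# The src/dst fields of the network rule above.
rule = {"src": "cloud.compss.1.master.1", "dst": "cloud.compss.1.worker.*"}


def expand(pattern, ids):
    """Return the identifiers matching a shell-style pattern (assumed semantics)."""
    return [identifier for identifier in ids if fnmatchcase(identifier, pattern)]


# Every (source, destination) pair the rule would apply to.
pairs = [
    (src, dst)
    for src in expand(rule["src"], service_ids)
    for dst in expand(rule["dst"], service_ids)
]
for src, dst in pairs:
    print(f"{src} -> {dst}")
```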

Workflow Configuration

This configuration file presents the application workflow:

  • On all machines in the COMPSs cluster (cloud.compss.*), in prepare we compile the COMPSs application code and generate the .jar file.

  • On the COMPSs Master only (cloud.compss.*.master.*), in prepare we copy the Python scripts that generate the COMPSs configuration files and then generate those files (resources.xml and project.xml). Note that we use --workers {{ _self.workers }}, since we added this information in the COMPSs.py Service. Then, in launch we run the COMPSs application.

- hosts: cloud.compss.*
  prepare:
    - debug:
        msg: "[{{ lookup('pipe','date +%Y-%m-%d-%H-%M-%S') }}] Preparing application on the COMPSs Master and Workers."
    - shell: export CLASSPATH=$CLASSPATH:/usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/compss-engine.jar && cd /root/tutorial_apps/java/hello/src/main/java/hello/ && javac *.java && cd .. && jar cf hello.jar hello
- hosts: cloud.compss.*.master.*
  prepare:
    - debug:
        msg: "[{{ lookup('pipe','date +%Y-%m-%d-%H-%M-%S') }}] Generating default_project.xml and default_resources.xml files. My workers are: {{ _self.workers }}"
    - copy:
        src: "{{ working_dir }}/generate_project_xml_file.py"
        dest: "/tmp/generate_project_xml_file.py"
    - copy:
        src: "{{ working_dir }}/generate_resources_xml_file.py"
        dest: "/tmp/generate_resources_xml_file.py"
    - shell: cd /tmp/ && python generate_project_xml_file.py --workers {{ _self.workers }} --install_dir /usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/ --working_dir /tmp/COMPSsWorker --user root --app_dir /root/ --path_to_new_file /usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/configuration/xml/projects/default_project.xml
    - shell: cd /tmp/ && python generate_resources_xml_file.py --workers {{ _self.workers }} --computing_units 24 --memory_size 125 --min_port_nio 43001 --max_port_nio 43002 --path_to_new_file /usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/configuration/xml/resources/default_resources.xml
  launch:
    - debug:
        msg: "Running COMPSs application"
    - shell: /usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/scripts/user/runcompss --project="/usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/configuration/xml/projects/default_project.xml" --resources="/usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/configuration/xml/resources/default_resources.xml" -d --classpath=/root/tutorial_apps/java/hello/src/main/java/hello.jar hello.Hello
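
To illustrate how the comma-separated --workers value can drive the generation of a project file, here is a hedged sketch. It is not the generate_project_xml_file.py script from compss/artifacts/: it reuses the flag names seen in the workflow above, but the XML element names and the worker host names are assumptions and should be checked against the real script and the COMPSs documentation.

```python
import argparse
import xml.etree.ElementTree as ET


def parse_args(argv):
    """Parse the flags that workflow.yaml passes to the generator script."""
    parser = argparse.ArgumentParser(description="Sketch of a project file generator")
    parser.add_argument("--workers", required=True, help="comma-separated Worker host names")
    parser.add_argument("--install_dir", required=True)
    parser.add_argument("--working_dir", required=True)
    parser.add_argument("--user", required=True)
    parser.add_argument("--app_dir", required=True)
    return parser.parse_args(argv)


def build_project(args):
    """Build one node entry per Worker (element names are assumed, not verified)."""
    project = ET.Element("Project")
    ET.SubElement(project, "MasterNode")
    for worker in args.workers.split(","):
        node = ET.SubElement(project, "ComputeNode", Name=worker)
        ET.SubElement(node, "InstallDir").text = args.install_dir
        ET.SubElement(node, "WorkingDir").text = args.working_dir
        ET.SubElement(node, "User").text = args.user
        app = ET.SubElement(node, "Application")
        ET.SubElement(app, "AppDir").text = args.app_dir
    return project


# Hypothetical worker names; in the workflow they come from {{ _self.workers }}.
args = parse_args([
    "--workers", "node-2,node-3,node-4",
    "--install_dir", "/usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/",
    "--working_dir", "/tmp/COMPSsWorker",
    "--user", "root",
    "--app_dir", "/root/",
])
project = build_project(args)
print(ET.tostring(project, encoding="unicode"))
```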

Note

Besides prepare and launch, you can also use finalize to back up data (e.g., experiment results). E2Clab first runs the prepare tasks on all machines, then the launch tasks, and finally the finalize tasks. Hosts are processed from top to bottom, in the order defined by the user in the workflow.yaml file.
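
The ordering described in this note can be sketched as two nested loops: stages on the outside, hosts entries on the inside. The snippet below is an illustration only, using a hypothetical in-memory summary of the workflow.yaml above; it assumes that hosts entries are visited in file order within each stage and does not reproduce E2Clab's implementation.

```python
STAGES = ("prepare", "launch", "finalize")

# Hypothetical summary of the workflow.yaml above: one entry per `hosts` block, in file order.
workflow = [
    {"hosts": "cloud.compss.*", "prepare": ["compile hello.jar"]},
    {
        "hosts": "cloud.compss.*.master.*",
        "prepare": ["generate project.xml and resources.xml"],
        "launch": ["run hello.Hello with runcompss"],
    },
]


def execution_order(entries):
    """List (stage, hosts, task) tuples: all prepare tasks, then launch, then finalize."""
    order = []
    for stage in STAGES:
        for entry in entries:  # top to bottom, as written in the file
            for task in entry.get(stage, []):
                order.append((stage, entry["hosts"], task))
    return order


order = execution_order(workflow)
for stage, hosts, task in order:
    print(f"{stage:8} {hosts:26} {task}")
```

Under these assumptions, the compilation on all machines runs first, then the configuration files are generated on the Master, and only then does the launch stage start.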

Running & Verifying Experiment Execution

Find below the command to deploy the COMPSs cluster on G5K and run COMPSs applications.

Before starting:

  • make sure that your COMPSs.py file is located in: e2clab/e2clab/services/.

  • in the command below, compss/standalone/ is the scenario directory (where the files layers_services.yaml, network.yaml, and workflow.yaml must be placed and where the results will be saved).

  • in the command below, compss/artifacts/ is the artifacts directory (where the Python scripts to generate the COMPSs configuration files must be placed).

$ e2clab deploy compss/standalone/ compss/artifacts/

Next, you can check the log files after the application execution.

$ cat /root/.COMPSs/hello.Hello_01/runtime.log
$ cat /root/.COMPSs/hello.Hello_01/jobs/job1_NEW.out

Deployment Validation & Experiment Results

Find below the files generated after the execution of the experiment: the validation files layers_services-validate.yaml and workflow-validate.out.

$ ls compss/standalone/20220325-163703/

layers_services-validate.yaml   # Mapping between layers and services with physical machines
workflow-validate.out           # Commands used to deploy the application (prepare, launch, and finalize)
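
When several experiments have been run, a small helper can locate the most recent results directory. This is a minimal sketch, assuming that results directories are named with a timestamp of the form YYYYMMDD-HHMMSS (as 20220325-163703 suggests); the latest_run function is hypothetical and the demo uses a temporary directory instead of a real scenario directory.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def latest_run(scenario_dir):
    """Return the most recent timestamp-named subdirectory, or None if there is none."""
    runs = []
    for child in Path(scenario_dir).iterdir():
        if not child.is_dir():
            continue
        try:
            stamp = datetime.strptime(child.name, "%Y%m%d-%H%M%S")
        except ValueError:
            continue  # not a results directory (assumed naming scheme)
        runs.append((stamp, child))
    if not runs:
        return None
    return max(runs, key=lambda item: item[0])[1]


# Demo on a temporary directory standing in for compss/standalone/.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("20220325-163703", "20220324-101500", "notes"):
        (Path(tmp) / name).mkdir()
    newest_name = latest_run(tmp).name
    print(newest_name)  # 20220325-163703
```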

Note

Providing a systematic methodology to define the experimental environment, together with access to the methodology artifacts (layers_services.yaml, network.yaml, workflow.yaml, and the COMPSs.py Service), supports experiment Repeatability, Replicability, and Reproducibility (see the ACM Digital Library Terminology).