Running COMPSs Applications on G5K (standalone deployment)
This section is intended to show how to execute COMPSs applications on Grid’5000. This example assumes you have built a custom environment including all COMPSs requirements (please see How to create a custom environment on G5K).
In this example you will learn how to (see Figure 1: COMPSs deployment):
- Define the experimental environment:
  - Layers and Services (including the logic of your COMPSs Service);
  - Network;
  - Workflow (prepare, launch, finalize).
- Deploy a standalone COMPSs cluster: 1 Master + 3 Workers.
- Run COMPSs applications.
Experiment Artifacts
$ git clone https://gitlab.inria.fr/E2Clab/examples/compss
$ cd compss/
$ ls
artifacts # Python scripts to generate the COMPSs configuration files (resources.xml and project.xml)
standalone # contains the COMPSs.py Service and layers_services.yaml, network.yaml, and workflow.yaml files
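As a rough illustration of what the generator scripts in `artifacts/` do, a minimal sketch of a `project.xml` generator is shown below. This is a hypothetical simplification: the actual `generate_project_xml_file.py` accepts more options (e.g., `--install_dir`, `--user`) and follows the full COMPSs project schema.

```python
# Hypothetical, simplified sketch of a project.xml generator; the real
# artifacts/generate_project_xml_file.py accepts more options and emits
# the complete COMPSs project schema.
import xml.etree.ElementTree as ET


def build_project_xml(workers, working_dir="/tmp/COMPSsWorker",
                      install_dir="/usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/"):
    """Build a minimal project.xml declaring the master and each worker node."""
    root = ET.Element("Project")
    ET.SubElement(root, "MasterNode")  # the node running the application
    for host in workers:
        node = ET.SubElement(root, "ComputeNode", Name=host)
        ET.SubElement(node, "InstallDir").text = install_dir
        ET.SubElement(node, "WorkingDir").text = working_dir
    return ET.tostring(root, encoding="unicode")


print(build_project_xml(["worker-1", "worker-2", "worker-3"]))
```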
Defining the Experimental Environment
Layers & Services Configuration
This configuration file presents the Layers and Services that compose this example. The COMPSs Service (a cluster of four nodes, `quantity: 4`) is composed of one Master and three Workers (please see the COMPSs.py Service). The name of the COMPSs Service (`- name: COMPSs`) must match the name of the COMPSs.py file.
environment:
  job_name: compss
  walltime: "00:59:00"
  g5k:
    job_type: ["deploy"]
    env_name: https://api.grid5000.fr/sid/sites/rennes/public/drosendo/compss-image.yaml
    cluster: nova
layers:
- name: cloud
  services:
  - name: COMPSs
    quantity: 4
Defining the logic of your COMPSs Service
Next, we define the logic of our COMPSs Service. It mainly consists of:
- assigning machines to the Master and Workers;
- adding information about the Workers to the Master (see `extra=extra_compss_master` in COMPSs.py), used to generate the resources.xml and project.xml files;
- registering the Master and Workers as sub-services of the COMPSs Service.

Please read the comments in the code for more details.
from e2clab.services import Service
from enoslib.api import populate_keys
from enoslib.objects import Roles


class COMPSs(Service):
    def deploy(self):
        # SSH keys for the root user must be generated and pushed to all nodes
        populate_keys(Roles({"compss": self.hosts}), ".", "id_rsa")

        # Assign machines to the COMPSs Master and Workers
        compss_master = "compss_master"
        compss_worker = "compss_worker"
        roles_compss_master = Roles({compss_master: [self.hosts[0]]})
        roles_compss_worker = Roles({compss_worker: self.hosts[1:]})

        # Users may add extra information to Services/sub-Services to access them in "workflow.yaml".
        # e.g., to access the worker list as {{ _self.workers }} in "workflow.yaml", you can do as follows:
        workers = [host.alias for host in roles_compss_worker[compss_worker]]
        extra_compss_master = [{'workers': ','.join(workers)}]  # COMPSs Master

        # Register the Services:
        # register the COMPSs Master Service
        self.register_service(_roles=roles_compss_master, sub_service="master", extra=extra_compss_master)
        # register the COMPSs Worker Service
        self.register_service(_roles=roles_compss_worker, sub_service="worker")

        return self.service_extra_info, self.service_roles
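The master/worker split and the `extra` payload above can be illustrated standalone with plain Python (hypothetical hostnames instead of enoslib Host objects):

```python
# Standalone illustration of the host-splitting logic in COMPSs.py,
# using plain hostname strings instead of enoslib Host objects.
hosts = ["node-1", "node-2", "node-3", "node-4"]  # quantity: 4

master = hosts[0]    # the first host becomes the COMPSs Master
workers = hosts[1:]  # the remaining hosts become Workers

# This is the value later exposed as {{ _self.workers }} in workflow.yaml
extra_compss_master = [{"workers": ",".join(workers)}]

print(master)               # node-1
print(extra_compss_master)  # [{'workers': 'node-2,node-3,node-4'}]
```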
Network Configuration
The file below presents the network configuration between machines in the COMPSs cluster. In this example, we define a constraint (delay, rate, and loss) between the Master and all Workers.
networks:
- src: cloud.compss.1.master.1
  dst: cloud.compss.1.worker.*
  delay: "2ms"
  rate: "10gbit"
  loss: 0.1
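To sanity-check the emulated delay, one could measure the round-trip time from the Master to a Worker. Below is a small helper for that, assuming a Linux `ping` whose summary line has the usual `rtt min/avg/max/mdev` format (the hostname is hypothetical):

```python
import subprocess


def parse_avg_rtt_ms(ping_output):
    """Extract the average RTT (ms) from the summary line of `ping` output."""
    # The summary line looks like: rtt min/avg/max/mdev = 2.010/2.052/2.113/0.041 ms
    summary = ping_output.strip().splitlines()[-1]
    return float(summary.split("=")[1].split("/")[1])


def avg_rtt_ms(host, count=5):
    """Ping `host` and return the average round-trip time in milliseconds."""
    out = subprocess.run(["ping", "-c", str(count), host],
                         capture_output=True, text=True, check=True).stdout
    return parse_avg_rtt_ms(out)

# With a 2ms delay emulated on master->worker traffic, avg_rtt_ms("worker-host")
# run on the Master should report roughly the emulated delay on top of the
# physical latency.
```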
Workflow Configuration
This configuration file presents the application workflow configuration:
- For all machines in the COMPSs cluster (`cloud.compss.*`), in `prepare` we compile the COMPSs application code and generate the .jar file.
- For the COMPSs Master only (`cloud.compss.*.master.*`), in `prepare` we copy the Python scripts that generate the COMPSs configuration files, and then generate those files (resources.xml and project.xml). Note that we use `--workers {{ _self.workers }}` since we added this information in the COMPSs.py Service. Then, in `launch`, we run the COMPSs application.
- hosts: cloud.compss.*
  prepare:
    - debug:
        msg: "[{{ lookup('pipe','date +%Y-%m-%d-%H-%M-%S') }}] Preparing application on the COMPSs Master and Workers."
    - shell: export CLASSPATH=$CLASSPATH:/usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/compss-engine.jar && cd /root/tutorial_apps/java/hello/src/main/java/hello/ && javac *.java && cd .. && jar cf hello.jar hello
- hosts: cloud.compss.*.master.*
  prepare:
    - debug:
        msg: "[{{ lookup('pipe','date +%Y-%m-%d-%H-%M-%S') }}] Generating default_project.xml and default_resources.xml files. My workers are: {{ _self.workers }}"
    - copy:
        src: "{{ working_dir }}/generate_project_xml_file.py"
        dest: "/tmp/generate_project_xml_file.py"
    - copy:
        src: "{{ working_dir }}/generate_resources_xml_file.py"
        dest: "/tmp/generate_resources_xml_file.py"
    - shell: cd /tmp/ && python generate_project_xml_file.py --workers {{ _self.workers }} --install_dir /usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/ --working_dir /tmp/COMPSsWorker --user root --app_dir /root/ --path_to_new_file /usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/configuration/xml/projects/default_project.xml
    - shell: cd /tmp/ && python generate_resources_xml_file.py --workers {{ _self.workers }} --computing_units 24 --memory_size 125 --min_port_nio 43001 --max_port_nio 43002 --path_to_new_file /usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/configuration/xml/resources/default_resources.xml
  launch:
    - debug:
        msg: "Running COMPSs application"
    - shell: /usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/scripts/user/runcompss --project="/usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/configuration/xml/projects/default_project.xml" --resources="/usr/local/lib/python3.5/dist-packages/pycompss/COMPSs/Runtime/configuration/xml/resources/default_resources.xml" -d --classpath=/root/tutorial_apps/java/hello/src/main/java/hello.jar hello.Hello
Note
Besides `prepare` and `launch`, you could also use `finalize` to back up some data (e.g., experiment results).
E2Clab first runs the `prepare` tasks on all machines, then the `launch` tasks on all machines, and finally the `finalize` tasks. The `hosts` entries are processed top to bottom, in the order defined by the user in the workflow.yaml file.
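For instance, a hypothetical `finalize` section for the Master could fetch the COMPSs runtime log back to the local machine. This is a sketch only: the `fetch` module is standard Ansible, but the paths below are illustrative and assume the log layout of the hello example.

```yaml
- hosts: cloud.compss.*.master.*
  finalize:
    - fetch:
        src: "/root/.COMPSs/hello.Hello_01/runtime.log"
        dest: "{{ working_dir }}/results/"
```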
Running & Verifying Experiment Execution
Find below the command to deploy the COMPSs cluster on G5K and run COMPSs applications.
Before starting:
- make sure that your COMPSs.py file is located in `e2clab/e2clab/services/`;
- in the command below, `compss/standalone/` is the scenario directory (where the `layers_services.yaml`, `network.yaml`, and `workflow.yaml` files must be placed and where the results will be saved);
- in the command below, `compss/artifacts/` is the artifacts directory (where the Python scripts to generate the COMPSs configuration files must be placed).
$ e2clab deploy compss/standalone/ compss/artifacts/
Next, you can check the log files after the application execution.
$ cat /root/.COMPSs/hello.Hello_01/runtime.log
$ cat /root/.COMPSs/hello.Hello_01/jobs/job1_NEW.out
Deployment Validation & Experiment Results
Find below the files generated after the execution of the experiment. They consist of the validation files `layers_services-validate.yaml` and `workflow-validate.out`.
$ ls compss/standalone/20220325-163703/
layers_services-validate.yaml # Mapping between layers and services with physical machines
workflow-validate.out # Commands used to deploy the application (prepare, launch, and finalize)
Note
Providing a systematic methodology to define the experimental environment, and providing access to the methodology artifacts (`layers_services.yaml`, `network.yaml`, `workflow.yaml`, and the `COMPSs.py` Service), leverages experiment Repeatability, Replicability, and Reproducibility; see the ACM Digital Library Terminology.