- Java 8+
- OpenStack API access
- git & maven 3.3+ to build BiBiGrid from sources
Build from sources¶
- Clone GitHub repository:
git clone https://github.com/BiBiServ/bibigrid.git
- Change into project dir:
cd bibigrid
- Build Java executable for OpenStack:
mvn -P openstack clean package
- Run the following to get help messages and check that the executable works:
java -jar bibigrid-main/target/bibigrid-openstack-<version>.jar -h
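Because the jar name contains the version, it can be convenient to look the file up instead of typing it. The following is a minimal sketch, not part of BiBiGrid: `find_bibigrid_jar` is a hypothetical helper name, and it assumes the build output lands in bibigrid-main/target as in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with BiBiGrid): print the path of the
# most recently modified bibigrid-openstack-*.jar in a build directory.
# The default directory is the one used in the build step above.
find_bibigrid_jar() {
    local target_dir="${1:-bibigrid-main/target}"
    local jar newest=""
    for jar in "$target_dir"/bibigrid-openstack-*.jar; do
        [ -e "$jar" ] || continue            # the glob matched nothing
        if [ -z "$newest" ] || [ "$jar" -nt "$newest" ]; then
            newest="$jar"
        fi
    done
    [ -n "$newest" ] || return 1             # no jar found
    printf '%s\n' "$newest"
}

# Possible usage after a successful build:
#   java -jar "$(find_bibigrid_jar)" -h
```

The function returns a non-zero status when no jar is present, so a calling script can stop before trying to run Java.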
If you don't want to build the client from sources, you can download a prebuilt binary. It is recommended to choose the latest one.
Check the GitHub repository for detailed information about BiBiGrid.
BiBiGrid is an open source tool hosted on GitHub for easy cluster setup inside a cloud environment. BiBiGrid is independent of operating system and cloud provider; backend implementations currently exist for Amazon (AWS), Google (Google Compute), Microsoft (Azure) and OpenStack. It provides an HPC-like environment with a shared filesystem between all nodes and a Grid Batch Scheduler.
BiBiGrid configures a classic master / slaves cluster.¶
- One master and one (or more) slave nodes. The images used can be blank Ubuntu 16.04 images or images with pre-installed software. BiBiGrid uses Ansible to install and configure all instances.
- All instances run in the same security group with default ssh access. Additional ports can easily be configured.
- Local disk space of the master node is provided as a shared spool-disk space (/vol/spool) between master and all slaves. Local storage of the slaves is configured as temporary scratch space.
- Volumes provided by Cinder can be mounted to the master node and optionally distributed to all slaves (as NFS shares).
- Object Store is available to all nodes.
The goal of this session is to set up a small HPC cluster consisting of 4 nodes (1 master, 3 slaves) using BiBiGrid. The template below does the job for the de.NBI cloud site in Bielefeld; you have to replace all XXX's with values from your project environment. When running BiBiGrid on cloud sites other than Bielefeld, you may also have to adjust the region, availabilityZone, instance type(s) and image id(s).
#use openstack
mode: openstack
credentialsFile: /path/to/your/credentials.yml

#Access
sshPrivateKeyFile: path/to/private/key
sshPublicKeyFile: path/to/public/key
sshUser: ubuntu
keypair: XXX
region: Bielefeld
availabilityZone: default

#Network
subnet: XXX

#BiBiGrid-Master
masterInstance:
  type: de.NBI.small+ephemeral
  #Ubuntu 16.04 LTS (2019-01-11)
  image: f33f5e06-95bb-4378-97ce-25e61b2fce03

#BiBiGrid-Slave
slaveInstances:
  - type: de.NBI.default
    count: 3
    #Ubuntu 16.04 LTS (2019-01-11)
    image: f33f5e06-95bb-4378-97ce-25e61b2fce03

#Firewall/Security Group
ports:
  - type: TCP
    number: 80

# -----------------------
# services
# ----------------------
useMasterAsCompute: yes
nfs: yes

# Grid batch scheduler
# GridEngine is deprecated and only supported on Ubuntu 16.04
oge: yes
# Slurm is supported on Ubuntu 16.04 and 18.04
slurm: no

# Monitoring
# Ganglia is deprecated and only supported on Ubuntu 16.04
ganglia: yes
# Zabbix is supported on Ubuntu 16.04 and 18.04
zabbix: no

# Web IDE
cloud9: yes
tenantName: XXX
username: ELIXIRID@elixir-europe.org
password: PASSWORD
endpoint: https://openstack.cebitec.uni-bielefeld.de:5000/v3/
domain: elixir
tenantDomain: elixir
The OpenStack credentials require the name, not the ID.
You can check your configuration using:
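Since the credentials file holds a plain-text password, it is worth creating it with restrictive permissions. The sketch below is one possible way to do that; `write_credentials` is a hypothetical helper name, and the endpoint and domain values are simply the Bielefeld ones from the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BiBiGrid): write a credentials file that
# only its owner can read or write.
# Arguments: output path, tenant name, username, password.
# Assumption: endpoint and domains are the Bielefeld values shown above.
write_credentials() {
    local out="$1" tenant="$2" user="$3" pass="$4"
    (
        umask 077    # a newly created file gets no group/other permissions
        cat > "$out" <<EOF
tenantName: $tenant
username: $user
password: $pass
endpoint: https://openstack.cebitec.uni-bielefeld.de:5000/v3/
domain: elixir
tenantDomain: elixir
EOF
    )
    chmod 600 "$out"    # also tighten a file that already existed
}

# Possible usage:
#   write_credentials /path/to/your/credentials.yml XXX ELIXIRID@elixir-europe.org PASSWORD
```

The values are written unquoted, so a password containing YAML special characters (for example `#` or a leading `*`) would need to be quoted by hand.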
java -jar bibigrid-openstack-<version>.jar -ch -o bibigrid.yml
Using an existing volume¶
To mount an existing volume on the master and share it with the slave nodes, add the following lines to your configuration file:
- Mount volume on master:
masterMounts:
  - source: *volume-id*
    target: /vol/xxx
- Share volume between nodes:
nfsShares:
  - /vol/xxx
Replace the volume-id and use a name for the volume instead of xxx.
Start the Cluster¶
java -jar bibigrid-openstack-<version>.jar -c -o bibigrid.yml
or more verbose:
java -jar bibigrid-openstack-<version>.jar -c -v -o bibigrid.yml
Starting a cluster from blank Ubuntu 16.04 images can take up to 20 minutes, depending on the instance performance and the BiBiGrid configuration.
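Given the long startup time, it can be useful to run the configuration check first and only create the cluster if the check passes. The following is a sketch under two assumptions that the text above does not confirm: that the variable `BIBIGRID` (a hypothetical name) holds the command used to invoke BiBiGrid, and that the check (`-ch`) exits with a non-zero status when it fails.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of BiBiGrid).
# Assumption: BIBIGRID holds the invocation, for example
#   BIBIGRID="java -jar bibigrid-openstack-<version>.jar"
# Assumption: "-ch" returns a non-zero exit status if the check fails.
start_cluster() {
    local config="${1:-bibigrid.yml}"
    if [ ! -r "$config" ]; then
        echo "Configuration file not readable: $config" >&2
        return 2
    fi
    # $BIBIGRID is left unquoted on purpose so that "java -jar file.jar"
    # is split into separate words.
    if ! $BIBIGRID -ch -o "$config"; then
        echo "Configuration check failed, not starting a cluster." >&2
        return 1
    fi
    $BIBIGRID -c -v -o "$config"
}

# Possible usage:
#   start_cluster bibigrid.yml
```

If the check turns out not to signal failure through its exit status, the wrapper would still start the cluster, so its output should be read before relying on it.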
Good to know¶
/vol/spool -> shared filesystem between all nodes.
/vol/scratch -> local diskspace (ephemeral disk, if provided)
Login into the Cluster¶
After a successful setup ...
SUCCESS: Master instance has been configured.
Ok :
 You might want to set the following environment variable:

 export BIBIGRID_MASTER=XXX.XXX.XXX.XXX

 You can then log on the master node with:

 ssh -i /path/to/private/ssh-key ubuntu@$BIBIGRID_MASTER

 The cluster id of your started cluster is: vnqtbvufr3uovci

 You can easily terminate the cluster at any time with:
 ./bibigrid -t XXX
Log in to the master node and run qhost to check whether 4 execution nodes are available.
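If the startup output was saved to a file, the master address can be pulled out of it rather than copied by hand. This is a minimal sketch: `master_ip_from_log` is a hypothetical helper name, and it assumes the output contains an "export BIBIGRID_MASTER=..." line in the form shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BiBiGrid): read saved startup output on
# stdin and print the value of the first "export BIBIGRID_MASTER=..." line.
# Prints nothing if no such line exists.
master_ip_from_log() {
    sed -n 's/^.*export BIBIGRID_MASTER=\([^[:space:]]*\).*$/\1/p' | head -n 1
}

# Possible usage, assuming the output was saved to cluster.log (hypothetical name):
#   export BIBIGRID_MASTER="$(master_ip_from_log < cluster.log)"
#   ssh -i /path/to/private/ssh-key ubuntu@$BIBIGRID_MASTER qhost
```

Passing qhost as the remote command runs the check in one step instead of opening an interactive session first.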
List running Cluster¶
Because you can start more than one cluster at once, BiBiGrid can list all running clusters:
java -jar bibigrid-openstack-<version>.jar -l
The command returns an informative list about all your running clusters.
Cloud9 is a Web-IDE that allows a more comfortable way to work with your cloud instances. Although cloud9 is in an alpha state, it is stable enough to be used for an environment like ours. Let's see how this works together with BiBiGrid.
If the cloud9 option is enabled in the configuration, cloud9 runs as a systemd service on localhost. For security reasons, cloud9 does not bind to a standard network device. A safe connection would require a valid certificate and some kind of authentication, which is not easy to provide in a dynamic cloud environment.
However, BiBiGrid can open an SSH tunnel from the local machine to the cluster's master instance and then open a browser running the cloud9 web IDE.
java -jar bibigrid-openstack-<version>.jar --cloud9 <clusterid>
Hello World, Hello BiBiGrid!¶
After successfully starting a cluster in the cloud, begin with a typical example: Hello World!
- Log in to your master node and change to the spool directory:
cd /vol/spool
- Create a new shell script helloworld.sh containing a "hello world":
#!/bin/bash
echo Hello from $(hostname) !
sleep 10
- Submit this script to each node:
qsub -cwd -t 1-4 -pe multislot 2 helloworld.sh
- See the status of our cluster:
- See the output:
Attaching a volume to a running cluster¶
If you have successfully run a cluster, you may want to attach a volume to an instance. See Using Cinder Volumes to get information about how to work with a volume.
Share a volume between all nodes¶
After attaching a volume to the master, you might want to share it with all slave nodes. One way of sharing data between master and slaves in BiBiGrid is the spool directory. Alternatively, you can share the attached volume itself using Ansible, which lets you automatically execute commands on several nodes in your cluster.
After you have attached a volume to the master node (or any other node), the other nodes in your cluster still cannot access it; on those nodes the volume does not exist at all.
- Create mount points
- Mount NFS shares
Instead of letting Ansible execute every single command, you can simply create a playbook.
- Create shareVolume.yml
- Copy and paste the following lines into the file, replacing XXX as in the tutorial above:
- hosts: slaves
  become: yes
  tasks:
    - name: Create mount points
      file:
        path: "/vol/XXX"
        state: directory
        owner: root
        group: root
        mode: 0777
    - name: Mount shares
      mount:
        path: "/vol/XXX"
        src: "internal.master.ip:/vol/XXX"
        fstype: nfs4
        state: mounted
- Save the changes
Run the playbook:
ansible-playbook -i ansible_hosts shareVolume.yml
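Instead of editing the placeholders by hand, the playbook could also be generated from a small shell function. The sketch below assumes the same playbook content as above; `render_share_playbook` is a hypothetical helper name, and its two arguments stand in for XXX and internal.master.ip.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BiBiGrid): print the shareVolume playbook
# with the volume name and the master's internal address filled in.
# Arguments: volume name (replaces XXX), master address (replaces internal.master.ip).
render_share_playbook() {
    local name="$1" master_ip="$2"
    cat <<EOF
- hosts: slaves
  become: yes
  tasks:
    - name: Create mount points
      file:
        path: "/vol/$name"
        state: directory
        owner: root
        group: root
        mode: 0777
    - name: Mount shares
      mount:
        path: "/vol/$name"
        src: "$master_ip:/vol/$name"
        fstype: nfs4
        state: mounted
EOF
}

# Possible usage (both argument values are placeholders):
#   render_share_playbook XXX internal.master.ip > shareVolume.yml
#   ansible-playbook -i ansible_hosts shareVolume.yml
```

Generating the file keeps the mount path and the NFS source path consistent, since both come from the same argument.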
To share a volume (or a file), one has to configure the /etc/exports file and add a line as follows:
Terminate a cluster¶
Terminating a running cluster is quite simple:
java -jar bibigrid-openstack-<version>.jar -t <clusterid>