
BiBiGrid tutorial

Prerequisites

  • Java 8+
  • OpenStack API access
  • Git & Maven 3.3+ to build BiBiGrid from sources
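
You can quickly verify that these prerequisites are available on your machine by checking the installed versions:

java -version
git --version
mvn -version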

Build from sources

  • Clone the GitHub repository: git clone https://github.com/BiBiServ/bibigrid.git
  • Change into the project directory: cd bibigrid
  • Build the Java executable for OpenStack: mvn -P openstack clean package
  • Call java -jar bibigrid-main/target/bibigrid-openstack-<version>.jar -h to get the help message and check that the executable works
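
Taken together, the build steps above amount to the following command sequence (the jar path assumes the standard Maven module layout shown above):

git clone https://github.com/BiBiServ/bibigrid.git
cd bibigrid
mvn -P openstack clean package
java -jar bibigrid-main/target/bibigrid-openstack-<version>.jar -h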

Download Binary

If you don't want to build the client from sources, you can download a prebuilt binary. It is recommended to choose the latest one.
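
As a sketch, and assuming the release asset follows the bibigrid-openstack-<version>.jar naming scheme used above (check the releases page for the actual file name), downloading could look like this:

# hypothetical example - replace <version> with the latest release tag
curl -LO https://github.com/BiBiServ/bibigrid/releases/download/<version>/bibigrid-openstack-<version>.jar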

Getting started

Check the GitHub repository for detailed information about BiBiGrid.

BiBiGrid is an open-source tool, hosted on GitHub, for easily setting up a cluster inside a cloud environment. BiBiGrid is operating-system and cloud-provider independent; backend implementations currently exist for Amazon (AWS), Google (Google Compute), Microsoft (Azure) and OpenStack. It provides an HPC-like environment with a shared filesystem between all nodes and a grid batch scheduler.

BiBiGrid configures a classic master/slave cluster.

(BiBiGrid overview diagram)

  1. One master and one (or more) slave nodes. The images used can be blank Ubuntu 16.04 images or images with pre-installed software. BiBiGrid uses Ansible to install and configure all instances.
  2. All instances run in the same security group with default SSH access. Additional ports can easily be configured.
  3. Local disk space of the master node is provided as a shared spool space (/vol/spool) between the master and all slaves. Local storage of the slaves is configured as temporary scratch space.
  4. Volumes provided by Cinder can be mounted to the master node and optionally distributed to all slaves (as NFS shares).
  5. Object Store is available to all nodes.

Configuration

The goal of this session is to set up a small HPC cluster consisting of 4 nodes (1 master, 3 slaves) using BiBiGrid. The template below does the job for the de.NBI cloud site in Bielefeld - you have to replace all XXX's with values from your project environment. When running BiBiGrid on a cloud site other than Bielefeld, you may also have to adjust the region, availabilityZone, instance type(s) and image id(s).

Templates

bibigrid.yml:

#use openstack
mode: openstack

credentialsFile: /path/to/your/credentials.yml

#Access
sshPrivateKeyFile: path/to/private/key
sshPublicKeyFile: path/to/public/key
sshUser: ubuntu
keypair: XXX
region: Bielefeld
availabilityZone: default

#Network
subnet: XXX

#BiBiGrid-Master
masterInstance:
  type: de.NBI.small+ephemeral
  #Ubuntu 16.04 LTS (2019-01-11)
  image: f33f5e06-95bb-4378-97ce-25e61b2fce03

#BiBiGrid-Slave
slaveInstances:
  - type: de.NBI.default
    count: 3
    #Ubuntu 16.04 LTS (2019-01-11)
    image: f33f5e06-95bb-4378-97ce-25e61b2fce03

#Firewall/Security Group
ports:
  - type: TCP
    number: 80

# -----------------------
# services
# ----------------------
useMasterAsCompute: yes
nfs: yes

# Grid batch scheduler
# GridEngine is deprecated and only supported on Ubuntu 16.04
oge: yes
# Slurm is supported on Ubuntu 16.04 and 18.04 
slurm: no

# Monitoring
# Ganglia is deprecated and only supported on Ubuntu 16.04
ganglia: yes
# Zabbix is supported on Ubuntu 16.04 and 18.04
zabbix: no

# Web IDE
cloud9: yes

credentials.yml:

tenantName: XXX
username: ELIXIRID@elixir-europe.org
password: PASSWORD
endpoint: https://openstack.cebitec.uni-bielefeld.de:5000/v3/
domain: elixir
tenantDomain: elixir

The OpenStack credentials require names, not IDs.
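
If you are unsure which values to fill in for the XXX placeholders, you can look them up with the OpenStack command-line client (assuming it is installed and your OpenStack RC file is sourced):

openstack keypair list   # name to use for 'keypair'
openstack subnet list    # name to use for 'subnet'
openstack image list     # id to use for 'image'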

You can check your configuration using:

java -jar bibigrid-openstack-<version>.jar -ch -o bibigrid.yml

Using an existing volume

To mount an available volume on the master and share it between the slave nodes, add the following lines to bibigrid.yml:

  • Mount the volume on the master:

masterMounts:
  - source: *volume-id*
    target: /vol/xxx

  • Share the volume between the nodes:

nfsShares:
  - /vol/xxx

Replace *volume-id* with the ID of your volume and use a descriptive name for the volume instead of xxx.
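
To find the volume ID, you can list your volumes with the OpenStack CLI (assuming it is available):

openstack volume list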

Start the Cluster

java -jar bibigrid-openstack-<version>.jar -c -o bibigrid.yml

or more verbose:

java -jar bibigrid-openstack-<version>.jar -c -v -o bibigrid.yml

Starting from blank Ubuntu 16.04 images, the setup takes up to 20 minutes to finish, depending on instance performance and the BiBiGrid configuration.

Good to know

  • /vol/spool -> shared filesystem between all nodes.
  • /vol/scratch -> local diskspace (ephemeral disk, if provided)
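
Once you are logged in to the master node (see the next section), a quick sanity check of this storage layout is to look at the mounted filesystems, for example:

df -h /vol/spool /vol/scratch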

Log in to the Cluster

After a successful setup, BiBiGrid prints a message similar to the following:

SUCCESS: Master instance has been configured. 
Ok : 
 You might want to set the following environment variable:

export BIBIGRID_MASTER=XXX.XXX.XXX.XXX

You can then log on the master node with:

ssh -i /path/to/private/ssh-key ubuntu@$BIBIGRID_MASTER

The cluster id of your started cluster is: vnqtbvufr3uovci

You can easily terminate the cluster at any time with:
./bibigrid -t XXX 
You can now log in to the master node. Run qhost to check whether 4 execution hosts are available.

List running Clusters

Since it is possible to run more than one cluster at a time, you can list all running clusters:

java -jar bibigrid-openstack-<version>.jar -l

The command returns an informative list of all your running clusters.

Cloud9

Cloud9 is a web IDE that allows a more comfortable way to work with your cloud instances. Although Cloud9 is in an alpha state, it is stable enough for an environment like ours. Let's see how it works together with BiBiGrid.


If the cloud9 option is enabled in the configuration, Cloud9 runs as a systemd service bound to localhost. For security reasons, Cloud9 does not bind to a public network interface: a valid certificate and some kind of authentication would be needed to create a safe connection, which is not easy in a dynamic cloud environment.

However, BiBiGrid can open an SSH tunnel from your local machine to the master instance and open a browser showing the Cloud9 web IDE:

java -jar bibigrid-openstack-<version>.jar --cloud9 <clusterid>
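
If you prefer to set up the tunnel yourself, plain SSH port forwarding works as well. The port numbers below are placeholders, not values defined by BiBiGrid; check which port Cloud9 listens on on the master:

# forward a local port to the (placeholder) Cloud9 port on the master
ssh -i /path/to/private/ssh-key -L 9090:localhost:<cloud9-port> ubuntu@$BIBIGRID_MASTER
# then open http://localhost:9090 in your browser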

Hello World, Hello BiBiGrid!

After successfully starting a cluster in the cloud, begin with a typical example: Hello World!

  • Log in to your master node and change to the spool directory: cd /vol/spool
  • Create a new shell script helloworld.sh containing a "hello world":
#!/bin/bash
echo Hello from $(hostname) !
sleep 10
  • Submit this script as an array job with 4 tasks: qsub -cwd -t 1-4 -pe multislot 2 helloworld.sh
  • See the status of the cluster: qhost
  • See the output: cat helloworld.sh.o*
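
If you configured Slurm instead of GridEngine (slurm: yes, oge: no), a roughly equivalent run could look like the following sketch; the option values are illustrative:

cd /vol/spool
sbatch --array=1-4 --cpus-per-task=2 helloworld.sh   # submit 4 array tasks
squeue                                               # show job status
sinfo                                                # show node status
cat slurm-*.out                                      # inspect the output files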

Attaching a volume to a running cluster

If you have successfully started a cluster, you may want to attach a volume to an instance. See Using Cinder Volumes for information on how to work with volumes.
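
As a sketch using the OpenStack CLI (volume name and server name are placeholders), creating a volume and attaching it to the master instance could look like this:

openstack volume create --size 50 myvolume
openstack server add volume <master-instance-name-or-id> myvolume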

Share a volume between all nodes

After attaching a volume to the master, you might want to share it between all slave nodes. One way of sharing data between master and slaves in a BiBiGrid cluster is the spool directory. Alternatively, you can share the previously created volume with the help of Ansible. Ansible lets you automatically execute commands on several nodes in your cluster.

Once you have attached a volume to the master node (or any other node), you will notice that the other nodes in your cluster cannot access it; from their point of view the volume does not exist at all. To make it available everywhere, you have to:

  • Create mount points
  • Mount NFS shares

Instead of letting Ansible execute every single command, you can simply create a playbook.

  • Create shareVolume.yml
  • Copy & paste the following lines into the file - XXX has to be changed as in the tutorial above:
    - hosts: slaves
      become: yes
      tasks:      
        - name: Create mount points
          file:
            path: "/vol/XXX"
            state: directory
            owner: root
            group: root
            mode: 0777
    
        - name: Mount shares
          mount:
            path: "/vol/XXX"
            src: "internal.master.ip:/vol/XXX"
            fstype: nfs4
            state: mounted
    
  • Save the changes

Run the playbook: ansible-playbook -i ansible_hosts shareVolume.yml
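
You can use Ansible ad-hoc commands to check that all slaves are reachable and, after the playbook run, that the share is actually mounted; the inventory file and group name are the ones used above:

ansible slaves -i ansible_hosts -m ping                       # check connectivity
ansible slaves -i ansible_hosts --become -a "df -h /vol/XXX"  # verify the mount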

To share a volume (or a directory) via NFS, one has to configure the /etc/exports file on the master node and add a line as follows:

/vol/XXX CIDR.of.subnet(rw,nohide,insecure,no_subtree_check,async)

E.g.: /vol/test 192.168.0.0/24(rw,nohide,insecure,no_subtree_check,async)
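
After editing /etc/exports, make the NFS server re-read the export table:

sudo exportfs -ra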

Terminate a cluster

Terminating a running cluster is quite simple:

java -jar bibigrid-openstack-<version>.jar -t <clusterid>