The university compute center in Tübingen provides a cloud solution for bioinformaticians. It comprises computing infrastructure, workflow solutions for constructing pipelines, and tools, with a particular focus on the analysis of high-throughput data (genomics, proteomics, metabolomics, etc.). The cloud environment offers the following services:
- Infrastructure as a Service (IaaS): The cloud infrastructure can be accessed via various virtualization technologies such as virtual machines, Docker containers or Singularity containers. Users will be able to access computing resources via the UNICORE middleware to run resource-intensive data analysis and simulation algorithms. Furthermore, users can set up web services in their virtualization environment and use the available resources for their applications.
- Platform as a Service (PaaS): We provide direct access to established pipelines for high-throughput data analysis as well as to workflow environments (Galaxy, KNIME, UNICORE) for executing these workflows in a cloud environment. A particular focus of the Tübingen site is on workflows for the analysis of mass spectrometric data and for multi-omics data analysis. Users will be able to use the existing frameworks and workflows and customize them further on their own to achieve complete and reproducible data analysis workflows.
- Software as a Service (SaaS): We provide a broad range of standard bioinformatics tools that can be applied without any further development. The preinstalled software will be provided as virtual machine images, Docker containers, Singularity containers, and UNICORE or Galaxy workflows. The wealth of tools developed within the Center for Integrative Bioinformatics (CIBI) is fully available and supported on the Tübingen site. These tools cover high-throughput data (genomics, metagenomics, proteomics, metaproteomics, transcriptomics, metabolomics; for example SeqAn, OpenMS, MetFrag) as well as image analysis (FIJI). Among other things, the Tübingen cloud site focuses on the reproducibility of research data and the associated virtual research environments, as Tübingen is a partner in the CiTAR project.
- Data as a Service (DaaS): Access to data pools of important large-scale datasets is a highly valuable resource for bioinformatics users of all expertise levels. These data pools serve for development, testing and benchmarking of bioinformatics methods and of the cloud environment. Depending on their needs, de.NBI cloud users are granted authorized access to big data sets, such as public reference data sets. The Tübingen site focuses on proteomics, metabolomics, and multi-omics data (e.g., PRIDE, CPTAC, TCGA, ICGC). Beyond that, we offer several storage solutions for a variety of use cases. If you want to store sensitive patient data, or share parts of a dataset with different people who hold different privileges, we can provide a solution for that. Another use case is running applications on fast SSD storage and subsequently moving the data to hard disk drives (tiering), which is also supported.
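The tiering idea mentioned under DaaS can be illustrated with a minimal user-level sketch in Python: a job writes its results to a fast working directory, and the results are then moved to a capacity tier. This is only an illustration under stated assumptions and not a description of how the Tübingen site implements tiering. The directory names (`ssd_scratch`, `hdd_archive`) and the functions `run_with_tiering` and `demo_analysis` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_tiering(fast_dir: Path, archive_dir: Path, analysis) -> list[Path]:
    """Run `analysis` in fast_dir, then move every result to archive_dir.

    Both directories are hypothetical stand-ins for an SSD-backed scratch
    area and an HDD-backed capacity tier.
    """
    fast_dir.mkdir(parents=True, exist_ok=True)
    archive_dir.mkdir(parents=True, exist_ok=True)

    # The job writes its output onto the fast tier.
    analysis(fast_dir)

    # Afterwards, move all results to the slower, larger tier.
    moved = []
    for item in sorted(fast_dir.iterdir()):
        target = archive_dir / item.name
        shutil.move(str(item), str(target))
        moved.append(target)
    return moved


def demo_analysis(workdir: Path) -> None:
    """Placeholder job; a real analysis tool would run here."""
    (workdir / "result.txt").write_text("placeholder result\n")


if __name__ == "__main__":
    # Temporary directories keep the demo self-contained.
    with tempfile.TemporaryDirectory() as root:
        moved = run_with_tiering(
            Path(root) / "ssd_scratch",
            Path(root) / "hdd_archive",
            demo_analysis,
        )
        print([p.name for p in moved])
```

In practice the two paths would point at mount points on different storage classes, and the move step might instead be handled by the storage system itself.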
One major aim of the de.NBI cloud site in Tübingen is to provide software covering different scientific fields of research, such as mass spectrometry analysis and NGS analysis pipelines, but also molecular docking via BALL integrated into Galaxy workflows (ballaxy).
The de.NBI cloud infrastructure in Tübingen comprises more than 1650 compute cores, 15 TByte RAM, 250 TByte SSD storage and 13 PByte storage capacity.