Plant Breeding Driven by Big Data Analysis

Plant breeding is a big-numbers game. Breeding progress for complex traits such as yield and stress tolerance in crop plants requires integrated analyses of multi-omics and digitized phenomics data from large plant populations grown in multiple field studies. The de.NBI Cloud is a great resource that helps us extract meaning from large, noisy and heterogeneous datasets in order to predict yield and quality traits from the genotypes of crop plants.

Our research focuses on quantitative and molecular genetics, genomics and breeding of oilseed rape, wheat, sorghum, barley and faba bean. Our datasets are usually very large because they combine population-wide genomic data with multi-location phenotypic data. The tools and pipelines we apply require access to large amounts of memory.

We use the de.NBI resources mainly for complex crop genome assemblies, long-read analysis of structural variants and machine learning predictions of plant performance from large-scale genome/phenome datasets. With the de.NBI Cloud we can analyze these types of data easily and efficiently. The de.NBI Cloud makes it simple to set up resources and adapt them to the bioinformatics analysis pipelines we require. We have used different de.NBI Cloud project types, including SimpleVM and OpenStack, in combination with high-memory and GPU nodes for bioinformatics projects of different sizes and complexity. SimpleVM projects have been especially useful for getting established pipelines running quickly. de.NBI also provides a great learning environment for PhD students, with many attractive training and education resources. The wide range of resources provided by de.NBI allows a variety of tailored applications and solutions. For example, we use the high-memory VMs to assemble large, complex plant genomes such as faba bean, while the GPU VMs allow us to use deep learning to help understand the rules governing molecular phenotypes, including gene expression.

Its ease of use and scalability make the de.NBI Cloud an excellent tool for big-data-driven life science research.

Prof. Rod Snowdon, Dr. Agnieszka Golicz, Dr. Christian Obermeier from the Plant Breeding Department, Justus Liebig University Giessen