3rd round table

March 23, 2012

Genome assembly - working with next-generation sequencing data
Matthias Haimel, European Bioinformatics Institute (EBI-EMBL)

Bioinformatics in translational clinical research
Bernd Mayer, emergentec biodevelopment GmbH

 

Genome assembly - working with next-generation sequencing data
Matthias Haimel, European Bioinformatics Institute (EBI-EMBL)

High-throughput sequencing (HTSeq) technology has brought a drop in sequencing costs and an explosion of data that traditional methods cannot handle. With more third-generation sequencing technologies arriving, the range of possible applications and of strategies for using the data keeps growing. New programs and methods have been developed to enable the handling, processing, evaluation and interpretation of the data. Recent developments in exome sequencing technology have highlighted the opportunities for clinical sequencing and personalized medicine. The increased accessibility and affordability of sequencing machines have also made it possible to explore genomic sequences and variations beyond reference genomes and vertebrates (e.g. metagenomes).

In this talk I will present de novo genome assembly of HTSeq data using the de Bruijn graph representation for memory efficiency and speed. After an introduction to the different HTSeq technologies and machines currently available on the market, I will explain the de Bruijn graph for de novo assembly, algorithms for error correction and graph manipulation, and the current limitations of de Bruijn graph implementations. Other strategies for HTSeq and its various fields of use will be discussed at the end.

  

Bioinformatics in translational clinical research
Bernd Mayer, emergentec biodevelopment GmbH

From a goal-centric view (i.e. in the realm of clinical needs and of bringing benefit to the patient), bioinformatics in the clinical context is straightforward: (support the provision of) biomarkers and targets that work in diagnosis, prognosis and therapy. A number of disciplines that rely heavily on bioinformatics have emerged over the last years, including Systems Biology/Medicine and Stratification/Personalization. Experimental protocols have advanced significantly (most notably through the Omics revolution), in tight connection with conceptual and methodological improvements in data management, integration and analysis. Even so, the path towards clinical use remains less clear, and positive examples are sparse.

Nevertheless, computational approaches already provide essential elements for the field and will continue to do so. Here we see disease phenotype-specific ‘data graphs’ as a promising conceptual framework: starting with the handling and integration of the molecular data space (bridging from the genome to the metabolome level, and also drawing on cell line and animal models), continuing with a tight link to individual patient clinical data profiles, and (resting on this broad data space) leading to the provision of testable hypotheses, followed by their validation (with validation results fed back iteratively into hypothesis generation).

My presentation will review such data graph concepts and address key computational (bioinformatics) elements involved.

Key reference: Data graphs for linking clinical phenotype and molecular feature space. International Journal of Systems Biology and Biomedical Technologies, 1(1), 11-25 (2012).