About this course
Across the life sciences, scientists utilize omics data to study biological phenomena in humans, plants, animals and microbes. This results in large and heterogeneous data sets that can be analyzed using a variety of algorithms and statistical methods. Making sense of the data, extracting biological knowledge out of the results of these analyses and formulating new hypotheses and research questions based on them is not trivial. However, when basic data science skills are combined with domain knowledge, either from literature or from databases accessed in a high-throughput manner, omics data constitute a goldmine for data-driven discovery of novel insights and hypotheses that can be tested in follow-up experiments.
This course will train students in linking domain knowledge to data using data science techniques and skills, in order to design omics experiments, evaluate the quality of the resulting data, interpret them in the light of literature and domain databases, and mine them to make discoveries and compose new research questions and hypotheses. Domain-specific case studies will allow students to directly apply their skills to data relevant to their specialization.
After successful completion of this course, students are expected to be able to:
- describe the advantages and limitations of different types of omics data;
- access, in a high-throughput manner, databases commonly used in the life sciences for the interpretation of omics data;
- design effective omics experiments with appropriate replicates and controls, accounting for batch effects, etc.;
- evaluate the quality and limitations of omics data (quality of the raw data, technical/biological variation, etc.) based on the outcomes of statistical analyses;
- interpret (processed) omics analysis results using domain knowledge, data mining (commonly used biological databases) and literature mining;
- extract knowledge from the data, synthesize it into a (possible) biological story, and compose new, data-driven research questions and hypotheses based on it.
Assumed prior knowledge: statistics and data analysis as applied to omics data, as treated in BIF-51306 Data Analysis and Visualization or SSB-30306 Molecular Systems Biology and in MAT-32806 Statistics for Data Science. Basic coding skills in R and/or Python are considered very useful. The course is considered most useful for life science students who have recently acquired basic data science skills (e.g., through the data science track) and want to learn to apply these, or for bioinformatics students with a limited background in the life sciences who want to improve their data interpretation skills.
- Credits: 6 ECTS