EduXchange.nl: Data Analysis for Plant and Animal Breeding

About this course

Data analysis is central to both plant and animal breeding, and the size and complexity of phenotypic and genomic data sets continue to increase. Thus, the ability to analyze and interpret such large data sets is an essential skill for breeders, both in science and industry. In this course you will become familiar with state-of-the-art methods and skills for quantitative genetic analysis of breeding data, both for animals and plants.

This is a hands-on course in which you develop the skills to analyze real-life data and handle real-life problems in genetic analysis. In addition to genetic analysis, this includes developing the skills to competently curate data sets in the R software environment. At the same time, you will develop an understanding of the statistical methods at an applied, practically relevant and intuitive level. This includes being able to choose an appropriate analysis based on the research question and the data at hand, understanding the statistical model and its assumptions, interpreting the results and becoming aware of common pitfalls. You will achieve this by working on illustrative real-life data sets that link to modern animal and plant breeding. The course covers the most important categories of statistical models and the associated methods for genetic and genomic data analysis and for model validation. During the course, you will gradually build up the required R skills.

The course uses plenary lectures and computer tutorials focused on application to real-life data, and you will also work on two case studies. In each case study, you will analyze an actual data set and write a short report on the analysis. In the tutorials, you will learn how to use the R software for data handling, editing, filtering and quantitative genetic analyses. You will also become familiar with more advanced methods for genetic analysis of complex pedigreed data and large genomic data, using dedicated software.

The course consists of six one-week modules. In the first week, you will become familiar with data handling, visualization and editing, and with model building and model validation using linear models. In the following weeks, you will become familiar with more advanced statistical models and tools, with a major focus on Linear Mixed Models but also covering Generalized Linear Models and Maximum Likelihood, and with the use of these tools for quantitative genetic analysis of breeding data. In the final weeks, you will become familiar with more advanced analysis of genetic, genomic and phenotypic (big) data in animals and plants. This includes the estimation of genetic parameters such as heritability, QTL mapping, genomic prediction, and genome-wide association studies.

Note: This course cannot be combined in an individual program with PBR-34803 Experimental Design and Data Analysis of Breeding Trials and/or PBR-32803 Markers in Genetics and Plant Breeding.

Learning outcomes

After successful completion of this course students are expected to be able to:

Apply data handling skills necessary to competently curate data sets in the R software environment
Choose a model category, build a model for quantitative genetic analysis of a given data set and research question, and execute the analysis
Interpret and explain the results of the data analysis
Perform model validation by evaluating model assumptions and/or by cross-validation (for genomic prediction), using illustrative plots
Explain the differences between a linear model (LM), linear mixed model (LMM) and a generalized linear model (GLM) in terms of model assumptions and purpose of the analysis
Explain the principles of maximum likelihood and restricted maximum likelihood
Explain the difference between fixed and random effects
Explain how the heritability of a trait can be estimated in pedigreed or genotyped populations in animals or plants
Design an experiment for estimating heritabilities, for QTL mapping and for genome-wide association studies (GWAS)
Explain how genome-wide association studies or QTL-detection can be used to detect genomic regions of interest in outbred populations or in line crosses
Explain how genomic prediction can be performed, and propose statistical models for that purpose

Prior knowledge

Assumed Knowledge:
Students who take this course are assumed to have a basic understanding of statistics and some understanding of genetics. It is recommended to take the courses MAT15303 + MAT15403 and MAT20306 Advanced Statistics before taking this course. Some experience with the R software is helpful, but not mandatory.

Additional information

Contact a coordinator
SK Schnabel
P Duenk
MPL Calus
G Gort
P Bijma
CA Maliepaard

Credits: 6 ECTS
Level: bachelor
Selection course: No


Offering(s)

- Start date: 10 March 2025
- End date: 2 May 2025
- Term: Period 5
- Location: Wageningen
- Instruction language: English
- Time info: Monday, Tuesday, Thursday and Friday, 14:00 - 18:00

Course is currently running.

Guest registration for this course is handled by Wageningen University.
