ASCI 896-004 Statistical Genomics
Spring 2016
Instructor
- Name: Gota Morota
- Office: A218f Animal Science Building
- Email: morota@unl.edu
- Web: http://morotalab.org/
- Office Hours: By appointment
Time and Location
- Tues./Thurs. 9:30-10:45am
- Animal Science Building, Room A228
Prerequisites
- STAT 801 or equivalent
- STAT 970 recommended
- Knowledge of the statistical programming language R
- Searle, S.R. (1982) Matrix Algebra Useful for Statistics. Wiley, New York. [Amazon]
Course Description
This course will cover quantitative genetic analysis of complex traits, with emphasis on the use of molecular markers spanning the entire genome. We will discuss statistical methodologies for connecting phenotypes with high-dimensional genomic information to better understand polygenic traits from both prediction and inference perspectives. Topics will include genomic relatedness, linkage disequilibrium, population stratification, genomic heritability, genome-enabled prediction of complex traits, and statistical learning. We will use examples from the animal, plant, and human genetics literature. Additional topics, including sequence data, gene expression, epigenetics, and bioinformatics, will be touched upon briefly. Homework assignments involve hands-on analysis of simulated and real genomic data available from public repositories. The course will use R/Bioconductor for statistical computing.
Learning Objectives
After taking this course, the student will be able to:
- understand the statistical theory behind commonly used quantitative methods in genomics
- apply statistical methods to high-dimensional genomic data and analyze them using statistical computing tools
- critically review current literature in statistical and quantitative genetics
Texts and Reading Materials
Lecture slides will be provided on the class website. There will be no required textbook.
Syllabus
Schedule
Lectures will be delivered using a whiteboard and presentation slides.
- 1/12 (T): Ordinary least-squares and the curse of dimensionality - prediction vs. inference [HTML]
- 1/14 (R): Linkage disequilibrium [HTML]
- 1/19 (T): One locus to the infinitesimal model - 1 [HTML]
- 1/21 (R): One locus to the infinitesimal model - 2 [HTML][HW1]
- 1/26 (T): Whole-genome regression - ridge regression 1 [HTML][R]
- 1/28 (R): Whole-genome regression - ridge regression 2 [HTML][R]
- 2/2 (T): UNL classes cancelled due to inclement weather
- 2/4 (R): Overview of likelihood-based inference [HW2]
- 2/9 (T): Prediction of random effects - best prediction (BP)
- 2/11 (R): Prediction of random effects - best linear prediction (BLP) and best linear unbiased prediction (BLUP)
- 2/16 (T): Prediction of random effects - mixed model equations (MME)
- 2/18 (R): Iterative methods for solving MME
- 2/23 (T): Relatedness due to genetic markers - additive genomic relationship [HTML][R][R]
- 2/25 (R): Relatedness due to genetic markers - dominance genomic relationship 1
- 3/1 (T): Relatedness due to genetic markers - dominance genomic relationship 2 [HTML][HW3]
- 3/3 (R): Whole-genome regression - Genomic BLUP (GBLUP) and ridge regression BLUP (RR-BLUP) [R]
- 3/8 (T): Whole-genome regression - single-step GBLUP and cross-validation
- 3/10 (R): Genetic variance estimation with restricted maximum likelihood (REML)
- 3/15 (T): Whole-genome regression - Bayesian penalized regression models - Bayesian ridge regression
- 3/17 (R): Whole-genome regression - Bayesian penalized regression models - Bayesian LASSO / Bayesian alphabet - Bayes A, B, C, and Cpi [R]
- 3/22 (T): Spring break
- 3/24 (R): Spring break
- 3/29 (T): UNL Plant Breeding Symposium 2016 [WWW][HTML]
- 3/31 (R): Whole-genome regression - Semi-parametric regression - Reproducing kernel Hilbert spaces regression 1 [HW4]
- 4/5 (T): Whole-genome regression - Semi-parametric regression - Reproducing kernel Hilbert spaces regression 2 [R]
- 4/7 (R): Deterministic equations for genome-enabled prediction
- 4/12 (T): Population stratification - Mixed-linear-model association
- 4/14 (R): Multi-trait model and pleiotropy [HW5]
- 4/19 (T): Liability threshold model
- 4/21 (R): Student presentations 1
- 4/26 (T): Student presentations 2
- 4/28 (R): Student presentations 3