ASCI 944 / STAT 844 Quantitative Methods for Genomics of Complex Traits

Due date

Tuesday, April 3, 5pm

Mice data

Load the mouse SNP data available in the BGLR R package.

library(BGLR)
data(mice)
?mice
?mice.X
?mice.pheno

Question 1

Compute the first genomic relationship matrix (\(\mathbf{G}_1\)) of VanRaden (2008) using all markers and all individuals. Report the median of the lower triangular part of \(\mathbf{G}_1\).
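As a rough illustration on toy data (not the mice data), the first VanRaden matrix can be sketched as \(\mathbf{G}_1 = \mathbf{Z}\mathbf{Z}' / (2 \sum_j p_j (1 - p_j))\), where \(\mathbf{Z}\) is the 0/1/2 genotype matrix with each marker column centered by \(2p_j\). The matrix `W` and all object names below are hypothetical, and the sketch assumes that "lower triangular part" excludes the diagonal.

```r
# Toy genotypes (hypothetical): 5 individuals x 4 markers, coded 0/1/2
W <- matrix(c(0, 1, 2, 1,
              1, 1, 0, 2,
              2, 0, 1, 1,
              1, 2, 1, 0,
              0, 1, 1, 2), nrow = 5, byrow = TRUE)

p  <- colMeans(W) / 2                 # frequency of the counted allele
Z  <- sweep(W, 2, 2 * p)              # subtract 2p from each marker column
G1 <- tcrossprod(Z) / (2 * sum(p * (1 - p)))

# Strictly lower triangle; pass diag = TRUE to lower.tri() to include the diagonal
med_G1 <- median(G1[lower.tri(G1)])
med_G1
```

With the real data, `W` would be replaced by `mice.X`, assuming it is coded as allele counts.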

Question 2

Compute the second genomic relationship matrix (\(\mathbf{G}_2\)) of VanRaden (2008) using all markers and all individuals. Report the median of the lower triangular part of \(\mathbf{G}_2\). What is the correlation between the lower triangular parts of \(\mathbf{G}_1\) and \(\mathbf{G}_2\)?
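A minimal sketch of the second VanRaden matrix on toy data follows. It assumes \(\mathbf{G}_2 = \mathbf{Z}\mathbf{D}\mathbf{Z}'\) with \(D_{jj} = 1 / (m \cdot 2 p_j (1 - p_j))\) for \(m\) markers, which is the same as scaling each centered marker by its own standard deviation and averaging over markers. The toy matrix `W` is hypothetical. Monomorphic markers would cause division by zero, so they would need to be removed first.

```r
# Toy genotypes (hypothetical): 5 individuals x 4 markers, coded 0/1/2
W <- matrix(c(0, 1, 2, 1,
              1, 1, 0, 2,
              2, 0, 1, 1,
              1, 2, 1, 0,
              0, 1, 1, 2), nrow = 5, byrow = TRUE)

p <- colMeans(W) / 2
Z <- sweep(W, 2, 2 * p)                        # centered genotypes

G1 <- tcrossprod(Z) / (2 * sum(p * (1 - p)))   # first method, for comparison

Zs <- sweep(Z, 2, sqrt(2 * p * (1 - p)), "/")  # scale each marker column
G2 <- tcrossprod(Zs) / ncol(W)                 # average over markers

low    <- lower.tri(G1)                        # strictly lower triangle
med_G2 <- median(G2[low])
r_G    <- cor(G1[low], G2[low])
c(median = med_G2, correlation = r_G)
```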

Question 3

Compute the dominance genomic relationship matrix (\(\mathbf{D}_1\)) of Su et al. (2012) using all markers and all individuals. Report the median of the lower triangular part of \(\mathbf{D}_1\).
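One way to sketch the Su et al. (2012) construction on toy data: code heterozygotes as 1 and homozygotes as 0, subtract \(2 p_j q_j\) from each marker column to obtain \(\mathbf{H}\), and form \(\mathbf{D}_1 = \mathbf{H}\mathbf{H}' / \sum_j 2 p_j q_j (1 - 2 p_j q_j)\). The matrix `W` and the object names are hypothetical, and the diagonal is again assumed to be excluded from the lower triangle.

```r
# Toy genotypes (hypothetical): 5 individuals x 4 markers, coded 0/1/2
W <- matrix(c(0, 1, 2, 1,
              1, 1, 0, 2,
              2, 0, 1, 1,
              1, 2, 1, 0,
              0, 1, 1, 2), nrow = 5, byrow = TRUE)

p <- colMeans(W) / 2
q <- 1 - p

H  <- (W == 1) * 1                    # heterozygote indicator (1 if genotype is 1)
H  <- sweep(H, 2, 2 * p * q)          # subtract expected heterozygosity 2pq
D1 <- tcrossprod(H) / sum(2 * p * q * (1 - 2 * p * q))

med_D1 <- median(D1[lower.tri(D1)])
med_D1
```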

Question 4

Compute the dominance genomic relationship matrix (\(\mathbf{D}_2\)) of Vitezica et al. (2013) using all markers and all individuals. Report the median of the lower triangular part of \(\mathbf{D}_2\). What is the correlation between the lower triangular parts of \(\mathbf{D}_1\) and \(\mathbf{D}_2\)?
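The Vitezica et al. (2013) coding can be sketched on toy data as follows, assuming genotypes count the allele with frequency \(p_j\): entries of \(\mathbf{W}_d\) are \(-2 q_j^2\) for genotype 2, \(2 p_j q_j\) for genotype 1, and \(-2 p_j^2\) for genotype 0, and \(\mathbf{D}_2 = \mathbf{W}_d \mathbf{W}_d' / \sum_j (2 p_j q_j)^2\). The Su et al. matrix is recomputed here only so the correlation can be shown; `W` and all object names are hypothetical.

```r
# Toy genotypes (hypothetical): 5 individuals x 4 markers, coded 0/1/2
W <- matrix(c(0, 1, 2, 1,
              1, 1, 0, 2,
              2, 0, 1, 1,
              1, 2, 1, 0,
              0, 1, 1, 2), nrow = 5, byrow = TRUE)

p <- colMeans(W) / 2
q <- 1 - p
P <- matrix(p, nrow(W), ncol(W), byrow = TRUE)   # allele frequency per cell
Q <- 1 - P

# Vitezica et al. (2013) dominance coding
Wd <- ifelse(W == 2, -2 * Q^2, ifelse(W == 1, 2 * P * Q, -2 * P^2))
D2 <- tcrossprod(Wd) / sum((2 * p * q)^2)

# Su et al. (2012) matrix, for comparison
H  <- sweep((W == 1) * 1, 2, 2 * p * q)
D1 <- tcrossprod(H) / sum(2 * p * q * (1 - 2 * p * q))

low    <- lower.tri(D2)                          # strictly lower triangle
med_D2 <- median(D2[low])
r_D    <- cor(D1[low], D2[low])
c(median = med_D2, correlation = r_D)
```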

Question 5

Perform a single-marker OLS-based GWA analysis by fitting \(y = \mathbf{X}\beta + \mathbf{W}_{ac} a + \mathbf{W}_d d + e\), where \(y\) is the BMI phenotype, \(\mathbf{X}\) includes the systematic effects of intercept, sex, litter size, and cage density, \(\mathbf{W}_{ac}\) is the centered additive marker genotype matrix, \(\mathbf{W}_{d}\) is the dominance marker genotype matrix defined in Su et al. (2012), \(a\) is the additive marker effect, and \(d\) is the dominance marker effect. Report (1) the ID of the SNP with the smallest p-value for the additive marker effect and (2) the ID of the SNP with the smallest p-value for the dominance marker effect. If the additive and dominance marker effects of a SNP are not simultaneously estimable, set both of its p-values to NA. Ignore multiple testing corrections for simplicity. Use the lm() function.
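The single-marker scan might be organized as in the sketch below, which uses simulated data rather than the mice data. All variable names (`sex`, `litter`, `cage`, `snp1`, and so on) are hypothetical stand-ins for the corresponding columns. The sketch relies on the fact that `summary()` of an `lm` fit omits aliased (inestimable) coefficients from its coefficient table, so a missing row signals that both p-values should be NA.

```r
set.seed(944)
n      <- 80
sex    <- factor(rep(c("F", "M"), length.out = n))
litter <- sample(2:8, n, replace = TRUE)
cage   <- sample(2:6, n, replace = TRUE)

# Toy genotypes (hypothetical); snp3 has only two genotype classes,
# so its additive and dominance codes are perfectly collinear
W <- cbind(snp1 = rep(0:2, length.out = n),
           snp2 = rep(c(0, 1, 1, 2, 0, 2, 1), length.out = n),
           snp3 = rep(c(0, 1, 1, 0), length.out = n))
y <- 20 + 0.8 * W[, "snp1"] + rnorm(n)

gwa_one <- function(w) {
  p <- mean(w) / 2
  a <- w - 2 * p                        # centered additive code
  d <- (w == 1) - 2 * p * (1 - p)       # Su et al. (2012) dominance code
  fit <- lm(y ~ sex + litter + cage + a + d)
  tab <- coef(summary(fit))             # aliased terms are dropped from this table
  if (all(c("a", "d") %in% rownames(tab))) {
    tab[c("a", "d"), "Pr(>|t|)"]
  } else {
    c(a = NA_real_, d = NA_real_)       # not simultaneously estimable
  }
}

pvals <- t(apply(W, 2, gwa_one))        # one row per SNP
colnames(pvals) <- c("additive", "dominance")

top_add <- rownames(pvals)[which.min(pvals[, "additive"])]
top_dom <- rownames(pvals)[which.min(pvals[, "dominance"])]
pvals
```

Since `which.min()` skips NA values, SNPs flagged as inestimable do not affect the reported IDs.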

Question 6

Repeat Question 5, defining \(\mathbf{W}_{d}\) as the dominance marker genotype matrix proposed by Vitezica et al. (2013).

Gota Morota

March 13, 2018