## ASCI 944 / STAT 844 Quantitative Methods for Genomics of Complex Traits

### Due date

Tuesday, April 3, 5pm

## Mice data

Load the mouse SNP data available in the BGLR R package.

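```r
library(BGLR)   # provides the mice data set
data(mice)      # load the data objects into the workspace

# Open the help pages describing the data
?mice
?mice.X
?mice.pheno
```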

## Question 1

Compute the first genomic relationship matrix ($$\mathbf{G}_1$$) of VanRaden (2008) (doi) using all markers and all individuals. Report the median of the lower triangular part of $$\mathbf{G}_1$$.
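
As a starting point, VanRaden's first method can be written as $$\mathbf{G}_1 = \mathbf{W}_c \mathbf{W}_c' / \left(2 \sum_j p_j (1 - p_j)\right)$$, where $$\mathbf{W}_c$$ is the 0/1/2 genotype matrix with each marker centered by $$2p_j$$. The sketch below applies this to a small simulated matrix rather than to the mice data. The helper name `vanraden_g1` and the toy matrix `W` are hypothetical, and the code assumes genotypes are coded 0/1/2 with individuals in rows.

```r
# Toy genotype matrix (individuals x markers), coded 0/1/2.
# It only stands in for the real marker matrix.
set.seed(1)
n <- 20; m <- 50
W <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)

# Hypothetical helper: VanRaden (2008) method 1
vanraden_g1 <- function(W) {
  p  <- colMeans(W) / 2                     # frequency of the counted allele
  Wc <- sweep(W, 2, 2 * p)                  # center each marker by 2p
  tcrossprod(Wc) / (2 * sum(p * (1 - p)))   # Wc Wc' / (2 * sum(pq))
}

G1 <- vanraden_g1(W)
median(G1[lower.tri(G1)])   # lower.tri() excludes the diagonal by default
```

Whether the diagonal belongs to the "lower triangular part" is a choice to state explicitly; `lower.tri(G1, diag = TRUE)` would include it.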

## Question 2

Compute the second genomic relationship matrix ($$\mathbf{G}_2$$) of VanRaden (2008) using all markers and all individuals. Report the median of the lower triangular part of $$\mathbf{G}_2$$. What is the correlation between the lower triangular parts of $$\mathbf{G}_1$$ and $$\mathbf{G}_2$$?
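
VanRaden's second method weights each marker by the reciprocal of its expected variance, which amounts to standardizing each centered marker by $$\sqrt{2 p_j (1 - p_j)}$$ and dividing the cross-product by the number of markers $$m$$. The sketch below illustrates this on simulated genotypes and compares the result with $$\mathbf{G}_1$$. The toy matrix `W` and all object names are assumptions made for illustration, and every marker must be polymorphic for the division to be defined.

```r
# Toy genotype matrix (individuals x markers), coded 0/1/2
set.seed(1)
n <- 20; m <- 50
W <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)

p <- colMeans(W) / 2
stopifnot(all(p > 0 & p < 1))   # G2 divides by 2pq, so no monomorphic markers
Wc <- sweep(W, 2, 2 * p)        # markers centered by 2p

# Method 1, for comparison
G1 <- tcrossprod(Wc) / (2 * sum(p * (1 - p)))

# Method 2: scale each centered marker by sqrt(2pq), then average over markers
Ws <- sweep(Wc, 2, sqrt(2 * p * (1 - p)), FUN = "/")
G2 <- tcrossprod(Ws) / m

lt <- lower.tri(G1)             # off-diagonal lower triangle
median(G2[lt])
cor(G1[lt], G2[lt])
```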

## Question 3

Compute the dominance genomic relationship matrix ($$\mathbf{D}_1$$) of Su et al. (2012) (doi) using all markers and all individuals. Report the median of the lower triangular part of $$\mathbf{D}_1$$.
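
In the Su et al. (2012) parameterization, the dominance covariate is a heterozygosity indicator centered by its expectation $$2 p_j q_j$$, and the cross-product is scaled by $$\sum_j 2 p_j q_j (1 - 2 p_j q_j)$$. A minimal sketch on simulated genotypes follows. The helper name `su_d1` and the toy matrix `W` are hypothetical, and the 0/1/2 coding with heterozygotes coded 1 is an assumption.

```r
# Toy genotype matrix (individuals x markers), coded 0/1/2
set.seed(1)
n <- 20; m <- 50
W <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)

# Hypothetical helper: dominance relationship matrix with Su et al. coding
su_d1 <- function(W) {
  p   <- colMeans(W) / 2
  het <- 2 * p * (1 - p)              # expected heterozygosity 2pq per marker
  H   <- sweep((W == 1) * 1, 2, het)  # 1 - 2pq for heterozygotes, -2pq otherwise
  tcrossprod(H) / sum(het * (1 - het))
}

D1 <- su_d1(W)
median(D1[lower.tri(D1)])
```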

## Question 4

Compute the dominance genomic relationship matrix ($$\mathbf{D}_2$$) of Vitezica et al. (2013) (doi) using all markers and all individuals. Report the median of the lower triangular part of $$\mathbf{D}_2$$. What is the correlation between the lower triangular parts of $$\mathbf{D}_1$$ and $$\mathbf{D}_2$$?
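
Vitezica et al. (2013) code the dominance covariate as $$-2q_j^2$$, $$2p_jq_j$$, and $$-2p_j^2$$ for the genotypes carrying two, one, and zero copies of the counted allele (frequency $$p_j$$), and scale the cross-product by $$\sum_j (2 p_j q_j)^2$$. The sketch below builds this matrix on simulated genotypes and correlates it with a $$\mathbf{D}_1$$ computed from the same toy data. All object names are illustrative assumptions, and the code assumes the 0/1/2 coding counts the allele with frequency $$p_j$$.

```r
# Toy genotype matrix (individuals x markers), coded 0/1/2
set.seed(1)
n <- 20; m <- 50
W <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)

p <- colMeans(W) / 2
q <- 1 - p
P <- matrix(p, n, m, byrow = TRUE)   # allele frequencies expanded to W's shape
Q <- 1 - P

# Vitezica et al. coding: 2 -> -2q^2, 1 -> 2pq, 0 -> -2p^2
Wd <- ifelse(W == 2, -2 * Q^2, ifelse(W == 1, 2 * P * Q, -2 * P^2))
D2 <- tcrossprod(Wd) / sum((2 * p * q)^2)

# Su et al. coding on the same data, for comparison
het <- 2 * p * q
H   <- sweep((W == 1) * 1, 2, het)
D1  <- tcrossprod(H) / sum(het * (1 - het))

lt <- lower.tri(D2)
median(D2[lt])
cor(D1[lt], D2[lt])
```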

## Question 5

Perform a single-marker OLS-based GWA analysis by fitting $$y = \mathbf{X}\beta + \mathbf{W}_{ac} a + \mathbf{W}_d d + e$$, where $$y$$ is the BMI phenotype, $$\mathbf{X}$$ includes the systematic effects of intercept, sex, litter size, and cage density, $$\mathbf{W}_{ac}$$ is the centered additive marker genotype matrix, $$\mathbf{W}_{d}$$ is the dominance marker genotype matrix defined in Su et al. (2012), $$a$$ is the additive marker effect, and $$d$$ is the dominance marker effect. Markers are fitted one at a time. Report (1) the SNP ID with the smallest $$p$$-value for the additive marker effect and (2) the SNP ID with the smallest $$p$$-value for the dominance marker effect. If the additive and dominance marker effects are not simultaneously estimable for a marker, set both $$p$$-values to `NA`. Ignore multiple testing corrections for simplicity. Use the `lm()` function.
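
One way to organize the scan is to loop over markers, fit `lm()` once per marker, and record `NA` whenever `lm()` cannot estimate both effects (it returns an `NA` coefficient for an aliased term). The sketch below uses simulated data only. The data frame `dat`, its column names (`y`, `sex`, `litter`, `cage`), and the marker names `snp1` to `snp5` are hypothetical stand-ins, not the names used in the mice data.

```r
# Toy data: 200 individuals, 5 markers coded 0/1/2, random phenotype
set.seed(1)
n <- 200; m <- 5
W <- matrix(rbinom(n * m, size = 2, prob = 0.4), n, m,
            dimnames = list(NULL, paste0("snp", 1:m)))
W[, 5] <- ifelse(W[, 5] == 1, 2, W[, 5])   # no heterozygotes: d is not estimable
dat <- data.frame(y      = rnorm(n),
                  sex    = factor(sample(c("F", "M"), n, replace = TRUE)),
                  litter = sample(2:8, n, replace = TRUE),
                  cage   = sample(2:6, n, replace = TRUE))

p <- colMeans(W) / 2
pvals <- t(sapply(seq_len(m), function(j) {
  dat$a <- W[, j] - 2 * p[j]                        # centered additive covariate
  dat$d <- (W[, j] == 1) - 2 * p[j] * (1 - p[j])    # Su et al. dominance covariate
  fit <- lm(y ~ sex + litter + cage + a + d, data = dat)
  # An NA coefficient means the effect is aliased, so report NA for both
  if (anyNA(coef(fit)[c("a", "d")])) return(c(a = NA_real_, d = NA_real_))
  summary(fit)$coefficients[c("a", "d"), "Pr(>|t|)"]
}))
rownames(pvals) <- colnames(W)

names(which.min(pvals[, "a"]))   # marker with the smallest additive p-value
names(which.min(pvals[, "d"]))   # marker with the smallest dominance p-value
```

Checking `coef(fit)` before indexing the summary table matters because `summary()` drops aliased terms from its coefficient matrix, so indexing a missing row would raise an error. `which.min()` skips `NA` entries.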

## Question 6

Repeat Question 5, this time defining $$\mathbf{W}_{d}$$ as the dominance marker genotype matrix proposed in Vitezica et al. (2013).
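
Only the dominance covariate changes relative to the previous question. The sketch below shows the Vitezica et al. coding for a single simulated marker and one `lm()` fit. The helper name `vitezica_code` is hypothetical, the systematic effects are left out for brevity, and the code assumes the marker is coded 0/1/2 as the count of the allele with frequency $$p$$.

```r
# One toy marker coded 0/1/2 and a random phenotype
set.seed(2)
n <- 200
w <- rbinom(n, size = 2, prob = 0.4)
y <- rnorm(n)

# Hypothetical helper: Vitezica et al. (2013) dominance coding for one marker
vitezica_code <- function(w) {
  p <- mean(w) / 2
  q <- 1 - p
  c(-2 * p^2, 2 * p * q, -2 * q^2)[w + 1]   # lookup for genotypes 0, 1, 2
}

a <- w - mean(w)        # centered additive covariate (mean(w) equals 2p)
d <- vitezica_code(w)   # dominance covariate
fit <- lm(y ~ a + d)
summary(fit)$coefficients[c("a", "d"), "Pr(>|t|)"]
```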

March 13, 2018