APSC 5984 Complex Trait Genomics

Due date

Thursday, April 27, 5pm

Question 1

Replace missing marker genotypes with mean values. Then store the marker genotypes in a matrix object W.

Question 2

Perform a quality control by removing markers with MAF < 0.05. How many markers are removed? Save the filtered genotype matrix in W2.

Question 3

Scale the genotype matrix (W2) to have a mean of zero and variance of one. Save this matrix as Ws.

Question 4

Compute the second genomic relationship matrix of VanRaden (2008) G using the entire markers. Then add a very small positive constant (e.g., 0.001) to the diagonal elements so that G matrix is invertible.

Question 5

Fit GBLUP by using the mixed.solve function in the rrBLUP R package. Report the estimates of intercept, additive genetic values, and genomic heritability.

Question 6

Repeat Question 5 by setting up mixed model equations (MME) with the model \(\mathbf{y = \mathbf{1}\mu + Zg + e}\), where \(\mu\) is the intercept, \(\mathbf{Z}\) is the incident matrix of individuals, \(\mathbf{g}\) is the additive genetic values, and \(\mathbf{e}\) is the residual. Use the ratio of variance components \(\lambda = \frac{\sigma^2_e}{\sigma^2_g}\) obtained from Question 5. Directly take the inverse of LHS to obtain the solutions for GBLUP. Report the estimates of intercept and additive genetic values. Do they agree with the estimates in Question 5?

Question 7

Fit RR-BLUP by using the mixed.solve function in the rrBLUP R package. Report the estimates of intercept, marker additive genetic effects, and marker additive genetic variance.

Question 8

Repeat Question 7 by setting up mixed model equations (MME) with the model \(\mathbf{y = \mathbf{1}\mu + Wa + e}\), where \(\mu\) is the intercept, \(\mathbf{W}\) is the standardized marker genotypes (Ws), \(\mathbf{a}\) is the additive marker genetic effects, and \(\mathbf{e}\) is the residual. Use the ratio of variance components \(\lambda = \frac{\sigma^2_e}{\sigma^2_a}\) obtained from Question 7. Directly take the inverse of LHS to obtain the solutions for marker-based GBLUP (RR-BLUP). Report the estimates of intercept and marker additive genetic effects. Do they agree with the estimates in Question 7?

Question 9

Repeat Questions 5 or 6 (GBLUP) but perform cross-validation by treating the first 600 individuals as a training set and predict the additive genetic values of the remaining individuals in the testing set. What is the predictive correlation in the testing set?

Question 10

Repeat Question 7 or 8 (RR-BLUP) but perform cross-validation by treating the first 600 individuals as a training set and predict the additive genetic values of the remaining individuals in the testing set. What is the predictive correlation in the testing set? Also, compare this predictive correlation to the one from Question 9. If computed correctly, these two values should be exactly the same or very similar. Briefly explain why this is the case.

Gota Morota

April 20, 2020