ASCI 944 / STAT 844 Quantitative Methods for Genomics of Complex Traits
Homework assignment 1
Due date
Thursday, February 1, 5pm
Data
For this assignment, we are going to use the cattle data included in the synbreedData package. Learn more about the Synbreed project and synbreed R packages.
library(synbreed)
library(synbreedData)
help(package="synbreedData")
data(cattle)
?cattle
dim(cattle$geno)
set.seed(100)
cattleC <- codeGeno(cattle,impute=TRUE,impute.type="random", reference.allele = "minor")
Select only the SNP markers on chromosome 1. Answer the six questions below using the W variable, which is the resulting SNP genotype matrix.
# Keep only markers whose chromosome (the chr column of the map) equals 1
W <- cattleC$geno[, which(cattleC$map$chr == 1)]
dim(W)
Question 1
Compute the allele frequency of each SNP marker. Recall that the expectation of the genotype, \(E(W)\), is given by \(2p\), where \(p\) is the frequency of the reference allele. Verify that \(2p\) is equal to the mean of each genotype column obtained from the colMeans() function.
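The relationship \(E(W) = 2p\) can be illustrated on a small made-up example. The sketch below is not the cattle data: W_toy and its values are hypothetical, and the matrix is assumed to hold the number of copies of the reference allele (0, 1, or 2). The allele frequency is obtained by counting alleles directly, and twice that frequency is then compared with the column means.

```r
# Hypothetical toy genotype matrix: 6 individuals (rows) x 3 SNPs (columns),
# coded as the number of copies of the reference allele (0, 1, or 2).
W_toy <- matrix(c(0, 1, 2, 1, 0, 1,
                  2, 2, 1, 0, 1, 2,
                  0, 0, 1, 0, 0, 1),
                nrow = 6, ncol = 3)

n     <- nrow(W_toy)           # number of individuals
n_het <- colSums(W_toy == 1)   # heterozygotes carry one reference allele
n_hom <- colSums(W_toy == 2)   # homozygotes carry two reference alleles

# Allele frequency by direct counting: reference alleles / total alleles
p_toy <- (2 * n_hom + n_het) / (2 * n)

# 2p should match the column means of the genotype matrix
all.equal(2 * p_toy, colMeans(W_toy))
```

Counting alleles and halving the column mean are the same calculation written two ways, which is why the comparison holds exactly rather than approximately.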
Question 2
Recall that, under Hardy-Weinberg equilibrium, the variance of the genotype, \(Var(W)\), is given by \(2p(1-p)\). Verify that \(2p(1-p)\) is close to the variance of each genotype column obtained from the var() function.
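As a rough illustration of why the two quantities are only "close", the sketch below simulates genotypes rather than using the cattle data. The names W_sim and p_true, the sample size, and the chosen frequencies are all assumptions made for the example. Genotypes are drawn as Binomial(2, p), which corresponds to Hardy-Weinberg proportions. Note that var() applied to a whole matrix returns a covariance matrix, so the per-column variances are taken with apply() here.

```r
set.seed(1)

# Hypothetical simulation settings (not taken from the cattle data)
n      <- 2000                  # number of individuals
p_true <- c(0.1, 0.3, 0.5)      # reference allele frequencies for 3 SNPs

# Each column is one SNP; genotypes are counts of the reference allele
W_sim <- sapply(p_true, function(p) rbinom(n, size = 2, prob = p))

p_hat        <- colMeans(W_sim) / 2       # estimated allele frequencies
expected_var <- 2 * p_hat * (1 - p_hat)   # variance implied by 2p(1 - p)
observed_var <- apply(W_sim, 2, var)      # sample variance of each column

# Side-by-side comparison; the two columns should be similar, not identical
round(cbind(expected_var, observed_var), 3)
```

With real genotypes the gap between the two columns also reflects departures from Hardy-Weinberg proportions, not just sampling noise.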
Question 3
Create a new marker matrix X from W and recode the markers so that the three genotypes \(AA\), \(Aa\), and \(aa\) are coded as 1, 0, and -1, respectively.
Recall that the expectation of the genotype, \(E(X)\), is given by \(2p-1\), where \(p\) is the frequency of the reference allele. Verify that \(2p-1\) is equal to the mean of each genotype column obtained from the colMeans() function.
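A minimal sketch of this recoding on made-up data follows. It assumes that \(AA\) corresponds to two copies of the reference allele (code 2 in the original matrix), so that subtracting 1 maps 2, 1, 0 to 1, 0, -1. W_toy and X_toy are hypothetical names and values, not the cattle genotypes.

```r
# Hypothetical toy genotype matrix coded 0/1/2 (copies of the reference allele)
W_toy <- matrix(c(0, 1, 2, 1, 0, 1,
                  2, 2, 1, 0, 1, 2,
                  0, 0, 1, 0, 0, 1),
                nrow = 6, ncol = 3)

# Recode: 2 -> 1 (AA), 1 -> 0 (Aa), 0 -> -1 (aa), assuming AA is coded as 2
X_toy <- W_toy - 1

# Allele frequency estimated from the 0/1/2 coding
p_toy <- colMeans(W_toy) / 2

# The mean of the -1/0/1 coding should equal 2p - 1
all.equal(colMeans(X_toy), 2 * p_toy - 1)
```

Because the recoding only shifts every entry by a constant, the column means shift by that same constant.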
Question 4
Recall that the variance of the genotype, \(Var(X)\), remains the same and is given by \(2p(1-p)\). Verify that \(2p(1-p)\) is close to the variance of each genotype column obtained from the var() function.
Question 5
Verify that no matter how we code markers, centered marker codes, \(W - E(W)\) and \(X - E(X)\), remain the same.
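One way to see this on a small made-up example is to center each coding by its own column means and compare the results. The matrices below are hypothetical, and the -1/0/1 coding is assumed to be the 0/1/2 coding shifted by 1. sweep() is used here to subtract the column means; it is one of several ways to center a matrix in R.

```r
# Hypothetical toy genotype matrix coded 0/1/2, and its -1/0/1 recoding
W_toy <- matrix(c(0, 1, 2, 1, 0, 1,
                  2, 2, 1, 0, 1, 2,
                  0, 0, 1, 0, 0, 1),
                nrow = 6, ncol = 3)
X_toy <- W_toy - 1

# Center each matrix by subtracting its own column means
W_centered <- sweep(W_toy, 2, colMeans(W_toy))
X_centered <- sweep(X_toy, 2, colMeans(X_toy))

# The constant shift cancels out, so the centered matrices agree
all.equal(W_centered, X_centered)
```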
Question 6
We will recode the SNP genotypes so that the major allele is now treated as the reference allele. Store the new coding in the W2 variable.
W2 <- W
W2[W2==0] <- 3
W2[W2==2] <- 0
W2[W2==3] <- 2
Compute the allele frequency of the SNP markers using W2. Compare your result with the allele frequency obtained from W.
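The effect of swapping the reference allele can be previewed on made-up data. In the sketch below, W_toy and W2_toy are hypothetical stand-ins for W and W2, using the same three-step swap of the 0 and 2 codes. Swapping the codes is equivalent to computing 2 minus the original genotype, so the two allele frequencies would be expected to sum to 1.

```r
# Hypothetical toy genotype matrix coded 0/1/2 (copies of the reference allele)
W_toy <- matrix(c(0, 1, 2, 1, 0, 1,
                  2, 2, 1, 0, 1, 2,
                  0, 0, 1, 0, 0, 1),
                nrow = 6, ncol = 3)

# Swap the homozygote codes, using 3 as a temporary placeholder
W2_toy <- W_toy
W2_toy[W2_toy == 0] <- 3
W2_toy[W2_toy == 2] <- 0
W2_toy[W2_toy == 3] <- 2

p_toy  <- colMeans(W_toy) / 2    # frequency of the original reference allele
p2_toy <- colMeans(W2_toy) / 2   # frequency of the other allele

all.equal(W2_toy, 2 - W_toy)     # the swap is the same as 2 - W
all.equal(p_toy + p2_toy, rep(1, ncol(W_toy)))  # frequencies are complementary
```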