For this assignment, we are going to use the `cattle` data included in the `synbreedData` package. Learn more about the Synbreed project and synbreed R packages.

```
library(synbreed)
library(synbreedData)
help(package="synbreedData")
data(cattle)
?cattle
pheno <- as.matrix(cattle$pheno[,1,1])
dim(cattle$geno)
```

`## [1] 500 7250`

`cattleC <- codeGeno(cattle,impute=TRUE,impute.type="random")`

```
##
## Summary of imputation
## total number of missing values : 10000
## number of random imputations : 10000
```

Select only the SNP markers on chromosome 1. Answer the six questions below by using the following two R objects: `pheno`, which contains the phenotype, and `W`, which is the genotype matrix.

```
# Keep only the markers mapped to chromosome 1 (map has columns chr and pos)
W <- cattleC$geno[, which(cattleC$map$chr == 1)]
dim(W)
```

`## [1] 500 250`

Compute the allele frequency of the SNP markers. Recall that the expectation of a genotype, \(E(W)\), is given by \(2p\), where \(p\) is the frequency of the reference allele. Verify that \(2p\) is equal to the mean of each genotype obtained from the `colMeans()` function.
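As a minimal sketch of this check, the snippet below uses simulated genotypes rather than the cattle data. The names `W_sim`, `p_true`, and `p_hat` are hypothetical, and the genotypes are assumed to be coded 0/1/2 as counts of the counted allele.

```r
set.seed(1)
n <- 500  # individuals (assumed, matches the number of rows in W)
m <- 5    # a handful of simulated markers

# Hypothetical true allele frequencies, one per marker
p_true <- runif(m, min = 0.1, max = 0.9)

# Simulated genotype matrix: each column holds allele counts in {0, 1, 2}
W_sim <- sapply(p_true, function(p) rbinom(n, size = 2, prob = p))

# Allele frequency = number of counted alleles / (2 * number of individuals)
p_hat <- colSums(W_sim) / (2 * nrow(W_sim))

# 2p should match the column means of the genotype matrix
all.equal(2 * p_hat, colMeans(W_sim))
```

The same two lines (`colSums()` divided by `2 * nrow()`, then the comparison with `colMeans()`) should apply unchanged to `W`, provided it contains no missing values after imputation.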

Recall that the variance of a genotype, \(Var(W)\), is given by \(2p(1-p)\). Verify that \(2p(1-p)\) is close to the variance of each genotype obtained from the `var()` function.
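A sketch of this comparison on simulated data (not the cattle data) might look as follows. `W_sim`, `var_expected`, and `var_observed` are hypothetical names, and the simulation draws each genotype as a binomial count, so the two quantities should agree only approximately.

```r
set.seed(2)
n <- 500
m <- 5

# Simulated 0/1/2 genotypes with hypothetical allele frequencies
p_true <- runif(m, min = 0.1, max = 0.9)
W_sim <- sapply(p_true, function(p) rbinom(n, size = 2, prob = p))

# Estimated allele frequencies and the expected variance 2p(1-p)
p_hat <- colMeans(W_sim) / 2
var_expected <- 2 * p_hat * (1 - p_hat)

# Sample variance of each marker; var() on a whole matrix would instead
# return the full covariance matrix, whose diagonal holds these values
var_observed <- apply(W_sim, 2, var)

# Side-by-side comparison; the values are close but not identical
round(cbind(var_expected, var_observed), 3)
```

Note that `diag(var(W_sim))` gives the same per-marker variances as `apply(W_sim, 2, var)`.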

Fit a single-marker regression for each SNP using ordinary least squares (OLS) and estimate the SNP marker effects. Create a scatter plot of SNP IDs vs. marker effect sizes.
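One possible way to organize the single-marker scan is sketched below on simulated data. The phenotype `y_sim`, the genotype matrix `W_sim`, and the effect placed on marker 3 are all assumptions made for illustration; the loop fits one `lm()` per marker and keeps the slope.

```r
set.seed(3)
n <- 200
m <- 20

# Simulated 0/1/2 genotypes and a phenotype driven by marker 3 (hypothetical)
W_sim <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)
y_sim <- 0.5 * W_sim[, 3] + rnorm(n)

# Fit y = mu + w_j * b_j + e separately for each marker j and keep the slope
beta_hat <- sapply(seq_len(ncol(W_sim)), function(j) {
  fit <- lm(y_sim ~ W_sim[, j])
  coef(fit)[2]
})

# Scatter plot of marker index against estimated effect
plot(seq_along(beta_hat), beta_hat,
     xlab = "SNP index", ylab = "Estimated marker effect")
```

For a single predictor, the OLS slope equals `cov(w, y) / var(w)`, which offers a quick cross-check of the loop without refitting any model.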

The dimension of the genotype matrix `W` is `500 x 250`. Can you use OLS for the 250 markers? If not, explain why this is not possible.
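One way to examine whether a joint OLS fit is identifiable is to compare the rank of the design matrix with its number of columns. The sketch below does this on a small simulated matrix in which one marker is deliberately duplicated; `W_sim` and `X` are hypothetical names, and the result says nothing about the cattle data itself.

```r
set.seed(4)
n <- 50
m <- 6

# Simulated 0/1/2 genotypes
W_sim <- matrix(rbinom(n * m, size = 2, prob = 0.4), nrow = n, ncol = m)

# Make marker 6 an exact copy of marker 1 (two markers in complete LD)
W_sim[, 6] <- W_sim[, 1]

# Design matrix with an intercept column
X <- cbind(1, W_sim)

# If the rank is smaller than the number of columns, X'X is singular
# and the joint OLS solution is not unique
rank_X <- qr(X)$rank
c(rank = rank_X, columns = ncol(X))
```

Applying `qr()` to `cbind(1, W)` gives the corresponding check for the chromosome 1 markers.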

Compute the multi-locus additive genetic variance under the linkage equilibrium (LE) assumption. Compare this value with the additive genetic variance that accounts for the covariance between genotypes. Apply the expression based on the correlation between genotypes by using the `cor()` function. Report the estimates of genomic heritability.
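The two variance expressions can be sketched on simulated data as below. Everything here is an assumption for illustration: the marker effects `a` are drawn at random rather than estimated, `W_sim` and `y_sim` are simulated, and heritability is taken as the additive genetic variance divided by the phenotypic variance. Under LE the variance is \(\sum_j 2p_j(1-p_j)a_j^2\); with covariances it is \(\sum_j \sum_k a_j a_k r_{jk} \sigma_j \sigma_k\), where \(r_{jk}\) is the genotype correlation and \(\sigma_j\) the genotype standard deviation.

```r
set.seed(5)
n <- 500
m <- 10

# Simulated 0/1/2 genotypes; marker 2 copies marker 1 to create LD
W_sim <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)
W_sim[, 2] <- W_sim[, 1]

# Hypothetical marker effects and a simulated phenotype
a <- rnorm(m, sd = 0.2)
y_sim <- as.vector(W_sim %*% a) + rnorm(n)

# Additive genetic variance assuming linkage equilibrium
p_hat <- colMeans(W_sim) / 2
Va_LE <- sum(2 * p_hat * (1 - p_hat) * a^2)

# Additive genetic variance using genotype correlations from cor()
s <- apply(W_sim, 2, sd)   # genotype standard deviations
R <- cor(W_sim)            # genotype correlation matrix
Va_LD <- as.numeric(t(a * s) %*% R %*% (a * s))

# Genomic heritability under each definition
h2_LE <- Va_LE / var(y_sim)
h2_LD <- Va_LD / var(y_sim)
c(Va_LE = Va_LE, Va_LD = Va_LD, h2_LE = h2_LE, h2_LD = h2_LD)
```

The correlation-based value coincides with `var(W_sim %*% a)`, the sample variance of the genetic values, which gives a simple consistency check.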

What is the net contribution of the first locus to the total additive genetic variance?