# ASCI 896 Statistical Genomics

# Homework assignment 2

## Due date

Tuesday, February 14, 5pm

## Data

We will continue analyzing the `cattle` data included in the synbreedData package. Learn more about the Synbreed project and synbreed R packages.

```
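library(synbreed)
library(synbreedData)
help(package = "synbreedData")
data(cattle)
?cattle
# First trait of the phenotype array, standardized to mean 0 and variance 1
pheno <- as.matrix(cattle$pheno[, 1, 1])
pheno <- scale(pheno)
dim(cattle$geno)
# Recode genotypes as allele counts and randomly impute missing calls
set.seed(100)
cattleC <- codeGeno(cattle, impute = TRUE, impute.type = "random")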
```

Select only the SNP markers on chromosome 1. Answer the five questions below using the following two R objects: `pheno`, which contains the phenotype, and `W`, which is the genotype matrix.

```
# Keep the markers whose chromosome label in the map is 1
W <- cattleC$geno[, which(cattleC$map$chr == 1)]
dim(W)
```

## Question 1

Create a new variable `W2` by subsetting the first 10 markers. The dimension of `W2` is equal to \(500 \times 10\). Verify that the covariance between allelic counts satisfies \(\mathrm{Cov}(W2[,i], W2[,j]) \approx 2D\), where \(D\) is the estimate of linkage disequilibrium. Use the `W2` object and the `LD()` function from the genetics package to obtain \(D\).
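As a rough illustration of why the relation holds, the sketch below simulates two loci under random mating and compares the covariance of the allelic counts with \(2D\). It does not use the cattle data or the `LD()` function. The haplotype frequencies, the sample size, and all object names (`hap_freq`, `g1`, `g2`, `D_hat`) are made up for the example.

```r
# Toy check that Cov(allelic counts) is close to 2D under random mating.
# Everything here is simulated; the frequencies are hypothetical.
set.seed(1)
n <- 5000                                    # hypothetical number of individuals
hap_freq <- c(AB = 0.4, Ab = 0.1, aB = 0.1, ab = 0.4)

# Draw two independent haplotypes per individual (random mating assumption).
h1 <- sample(names(hap_freq), n, replace = TRUE, prob = hap_freq)
h2 <- sample(names(hap_freq), n, replace = TRUE, prob = hap_freq)

# Allelic counts (0/1/2) of allele A at locus 1 and allele B at locus 2.
g1 <- (substr(h1, 1, 1) == "A") + (substr(h2, 1, 1) == "A")
g2 <- (substr(h1, 2, 2) == "B") + (substr(h2, 2, 2) == "B")

# D estimated from the pooled haplotypes: D = p_AB - p_A * p_B.
haps  <- c(h1, h2)
p_AB  <- mean(haps == "AB")
p_A   <- mean(substr(haps, 1, 1) == "A")
p_B   <- mean(substr(haps, 2, 2) == "B")
D_hat <- p_AB - p_A * p_B

# Covariance between the two allelic-count vectors.
cov_g <- cov(g1, g2)
print(c(cov = cov_g, twoD = 2 * D_hat))
```

With real genotype data the haplotype phase is unknown, so \(D\) has to be estimated from the genotypes and the agreement is only approximate.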

## Question 2

Fit a single-marker regression using ordinary least squares (OLS) and estimate the SNP marker effects. Create a scatter plot of SNP IDs vs. marker effect sizes. Use the objects `W` and `pheno`. Save the vector of marker effects into `a`.
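One way to organize a single-marker scan is sketched below on simulated data. Each marker is regressed on the phenotype separately with `lm()`, and the slope is kept as the marker effect. The objects `W_toy`, `y_toy`, and `a_toy` are hypothetical stand-ins, not the assignment's `W`, `pheno`, and `a`.

```r
# Single-marker OLS on simulated data (hypothetical sizes and effects).
set.seed(2)
n <- 200
m <- 5
W_toy <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)
y_toy <- as.numeric(scale(0.5 * W_toy[, 2] + rnorm(n)))

# One regression per marker; keep the slope (second coefficient).
a_toy <- apply(W_toy, 2, function(w) coef(lm(y_toy ~ w))[2])

# With an intercept and one covariate, the OLS slope is cov(w, y) / var(w).
a_check <- apply(W_toy, 2, function(w) cov(w, y_toy) / var(w))

# Scatter plot of marker index against estimated effect.
plot(seq_len(m), a_toy, pch = 19,
     xlab = "SNP index", ylab = "Estimated marker effect")
```

The closed form in `a_check` avoids fitting thousands of models. A monomorphic marker has zero variance, so it would need to be dropped or handled separately.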

## Question 3

Compute the allele frequency of the reference allele for each SNP marker. Report the estimate of the multi-locus additive genetic variance under the linkage equilibrium (LE) assumption.
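The following sketch shows the arithmetic on a small made-up example. It assumes genotypes are coded as 0/1/2 copies of the counted allele, so the allele frequency is half the column mean, and it assumes Hardy-Weinberg proportions, so each locus contributes \(2 p_j (1 - p_j) a_j^2\). The matrix `W_toy` and the effects `a_toy` are hypothetical.

```r
# Toy genotype matrix coded 0/1/2 (4 individuals, 2 markers); values are made up.
W_toy <- matrix(c(0, 1, 2, 1,
                  2, 2, 1, 1), ncol = 2)
a_toy <- c(0.2, -0.1)            # hypothetical marker effects

# Frequency of the counted allele: mean allelic count divided by 2.
p <- colMeans(W_toy) / 2

# Additive variance under LE: sum over loci of 2 * p * (1 - p) * a^2.
va_le <- sum(2 * p * (1 - p) * a_toy^2)
print(p)
print(va_le)
```

Under LE the covariances between loci are taken to be zero, so only the per-locus terms appear in the sum.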

## Question 4

Compute the multi-locus additive genetic variance that accounts for linkage disequilibrium (LD). Apply the expression based on the correlation between genotypes by using the `cor()` function. Report the estimate of the additive genetic variance.
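A minimal sketch of the correlation-based calculation on simulated genotypes follows. It rebuilds the genotype covariance matrix from `cor()` and the marker standard deviations, then evaluates the quadratic form in the marker effects. It uses sample standard deviations rather than \(\sqrt{2p(1-p)}\), which is an assumption of this example. All objects (`W_toy`, `a_toy`, `va_ld`) are hypothetical.

```r
# Simulated genotypes in which markers 1 and 2 are correlated (in LD).
set.seed(4)
n  <- 300
w1 <- rbinom(n, 2, 0.4)
w2 <- ifelse(runif(n) < 0.7, w1, rbinom(n, 2, 0.4))  # copies w1 most of the time
w3 <- rbinom(n, 2, 0.2)
W_toy <- cbind(w1, w2, w3)
a_toy <- c(0.3, 0.2, -0.1)       # hypothetical marker effects

s <- apply(W_toy, 2, sd)         # per-marker standard deviations
R <- cor(W_toy)                  # genotype correlation matrix
V <- R * tcrossprod(s)           # element (j, k) is r_jk * s_j * s_k

# Variance accounting for LD: sum over all pairs (j, k) of a_j * a_k * V[j, k].
va_ld <- drop(t(a_toy) %*% V %*% a_toy)

# Diagonal terms only, i.e. ignoring the covariances between loci.
va_diag <- sum(a_toy^2 * s^2)

# Direct check: sample variance of the genetic values W a.
va_direct <- var(drop(W_toy %*% a_toy))
print(c(with_LD = va_ld, diagonal_only = va_diag, direct = va_direct))
```

The gap between `va_ld` and `va_diag` is the part of the variance that comes from the covariances between loci.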

## Question 5

What is the net contribution of the first locus to the total additive genetic variance?

## Visualization of LD in \(r^2\)

```
?pairwiseLD
cattleLD <- pairwiseLD(cattleC, chr = 1, type = "data.frame")
LDDist(cattleLD, type = "p", pch = 19, colD = hsv(alpha = 0.1, v = 0))
```