The basis of genetic likeness

Overview

We will learn how to compute an additive relationship matrix from pedigree using the tabular method.

Reading file

Use the function read.table to read the phenotype file (1000withEffects.redangus) in a data frame format.

dat <- read.table(file = file.choose(), header = TRUE, stringsAsFactors = FALSE)
head(dat)

Unknown pedigree

From the dat, let’s create a new variable ped that only has three columns: animal ID, sire ID, and dam ID.

ped <- dat[, c("REG", "SIRE", "DAM")]
head(ped)
dim(ped)

Phenotyped animals are the ones with animal IDs. If there are any animals that appear only in the SIRE or DAM column, they need to be listed in the REG column as well. For instance, consider the the following example from Legarra et al. (2009) JDS. \[\begin{array}{ccc} \text{REG} & \text{SIRE} & \text{DAM} \\ 9 & 1 & 2 \\ 10 & 3 & 4 \\ 11 & 5 & 6 \\ 12 & 7 & 8 \\ 13 & 9 & 10 \\ 14 & 11 & 12 \\ 15 & 4 & 11 \\ 16 & 13 & 15 \\ 17 & 13 & 14 \end{array}\] Note that animal IDs 1 to 8 have no phenotypic records so they only appear in the SIRE or DAM column. The final pedigree data frame should look like the following where animals 1 to 8 also appear in the REG column with the SIRE and DAM IDs set to unknown = 0. \[\begin{array}{ccc} \text{REG} & \text{SIRE} & \text{DAM} \\ 1 & 0 & 0 \\ 2 & 0 & 0 \\ 3 & 0 & 0 \\ 4 & 0 & 0 \\ 5 & 0 & 0 \\ 6 & 0 & 0 \\ 7 & 0 & 0 \\ 8 & 0 & 0 \\ 9 & 1 & 2 \\ 10 & 3 & 4 \\ 11 & 5 & 6 \\ 12 & 7 & 8 \\ 13 & 9 & 10 \\ 14 & 11 & 12 \\ 15 & 4 & 11 \\ 16 & 13 & 15 \\ 17 & 13 & 14 \end{array}\]

Now let’s find out how many animals appear only as sires or dams.

# Sire
table(!ped$SIRE %in% ped$REG)
unknown.sire <- unique(ped$SIRE[!ped$SIRE %in% ped$REG])  # only appear as sire
head(unknown.sire)
length(unknown.sire)

# Dam
table(!ped$DAM %in% ped$REG)
unknown.dam <- unique(ped$DAM[!ped$DAM %in% ped$REG])  # only appear as dam
length(unknown.dam)

We will create a new variable unknown to store animals with unknown sires and dams.

# Total
unknown <- c(unknown.sire, unknown.dam)
length(unknown)

Before we create a new data frame for the unknown, we sort animals with unknown sires and dams in ascending order.

unknown <- unknown[order(unknown)]  # sort unknown animals in ascending order
head(unknown)

Next, we will create a pedigree data frame for the unknown. We set the SIRE and DAM columns equal to zero.

unknown.ped <- data.frame(REG = unknown, SIRE = rep(0, length(unknown)), DAM = rep(0, 
    length(unknown)))
head(unknown.ped)
dim(unknown.ped)

Finally, we combine the ped and unknown.ped objects into a single data frame all.ped.

all.ped <- rbind(unknown.ped, ped)
dim(all.ped)

synbreed package

The synbreed package includes the function kin for computing a relationship matrix. Learn more about the Synbreed project. The kin function requires the object of class gpData for inputs.

library(synbreed)
gp.ped <- create.pedigree(ID = all.ped$REG, Par1 = all.ped$SIRE, Par2 = all.ped$DAM, 
    unknown = "0")
gpData <- create.gpData(pedigree = gp.ped)
class(gpData)

We can obtain an additive relationship matrix by setting the argument ret equals to add.

A <- kin(gpData, ret = "add")
dim(A)
table(colnames(A) == rownames(A))

One thing we need to be cautious of is that the kin function does not retrain the original order.

head(colnames(A))
head(all.ped)

This brings up a problem when we want to link phenotypic values to the A matrix because animals appear in a different order. So let’s keep the original order of animal IDs in the ped object and remove all animals that do not have records in the A matrix.

index <- match(ped$REG, rownames(A))
A2 <- A[index, index]
dim(A2)

The A2 object is a reordered A matrix where animals appear in the same order as that of the ped object.

head(ped$REG)
head(rownames(A2))
table(ped$REG == rownames(A2))
table(ped$REG == colnames(A2))

Exercise 1

Compute the inbreeding coefficients of animals and report the average of inbreeding coefficients. Use the function diag.

kinship2 package

Alternatively, we can use the kinship2 package to compute a relationship matrix. The kinship function returns the coefficient of kinship.

library(kinship2)
A3 <- kinship(all.ped$REG, all.ped$SIRE, all.ped$DAM)
dim(A3)
table(colnames(A3) == rownames(A3))

Again, let’s make sure that animals in the A matrix appear in the same order as that of the ped object. The A4 object is a reordered A matrix.

index2 <- match(ped$REG, rownames(A3))
A4 <- A3[index2, index2]
dim(A4)
head(ped$REG)
head(rownames(A4))
table(ped$REG == rownames(A4))
table(ped$REG == colnames(A4))

Exercise 2

Recall that the additive genetic relationship between a pair of individuals is equal to twice their coefficient of kinship. Verify this by comparing the A2 and A4 objects.

Save R object

Save the object A2 so that we can reuse in the next class.

save(A2, file = "A2.Rda")

Gota Morota

January 31, 2017