The basis of genetic likeness
Overview
We will learn how to compute an additive relationship matrix from pedigree using the tabular method.
Reading file
Use the function read.table
to read the phenotype file (1000withEffects.redangus) in a data frame format.
dat <- read.table(file = file.choose(), header = TRUE, stringsAsFactors = FALSE)
head(dat)
Unknown pedigree
From the dat
, let’s create a new variable ped
that only has three columns: animal ID, sire ID, and dam ID.
ped <- dat[, c("REG", "SIRE", "DAM")]
head(ped)
dim(ped)
Phenotyped animals are the ones with animal IDs. If there are any animals that appear only in the SIRE
or DAM
column, they need to be listed in the REG
column as well. For instance, consider the the following example from Legarra et al. (2009) JDS. \[\begin{array}{ccc}
\text{REG} & \text{SIRE} & \text{DAM} \\
9 & 1 & 2 \\
10 & 3 & 4 \\
11 & 5 & 6 \\
12 & 7 & 8 \\
13 & 9 & 10 \\
14 & 11 & 12 \\
15 & 4 & 11 \\
16 & 13 & 15 \\
17 & 13 & 14
\end{array}\] Note that animal IDs 1 to 8 have no phenotypic records so they only appear in the SIRE
or DAM
column. The final pedigree data frame should look like the following where animals 1 to 8 also appear in the REG
column with the SIRE
and DAM
IDs set to unknown = 0. \[\begin{array}{ccc}
\text{REG} & \text{SIRE} & \text{DAM} \\
1 & 0 & 0 \\
2 & 0 & 0 \\
3 & 0 & 0 \\
4 & 0 & 0 \\
5 & 0 & 0 \\
6 & 0 & 0 \\
7 & 0 & 0 \\
8 & 0 & 0 \\
9 & 1 & 2 \\
10 & 3 & 4 \\
11 & 5 & 6 \\
12 & 7 & 8 \\
13 & 9 & 10 \\
14 & 11 & 12 \\
15 & 4 & 11 \\
16 & 13 & 15 \\
17 & 13 & 14
\end{array}\]
Now let’s find out how many animals appear only as sires or dams.
# Sire
table(!ped$SIRE %in% ped$REG)
unknown.sire <- unique(ped$SIRE[!ped$SIRE %in% ped$REG]) # only appear as sire
head(unknown.sire)
length(unknown.sire)
# Dam
table(!ped$DAM %in% ped$REG)
unknown.dam <- unique(ped$DAM[!ped$DAM %in% ped$REG]) # only appear as dam
length(unknown.dam)
We will create a new variable unknown
to store animals with unknown sires and dams.
# Total
unknown <- c(unknown.sire, unknown.dam)
length(unknown)
Before we create a new data frame for the unknown
, we sort animals with unknown sires and dams in ascending order.
unknown <- unknown[order(unknown)] # sort unknown animals in ascending order
head(unknown)
Next, we will create a pedigree data frame for the unknown
. We set the SIRE
and DAM
columns equal to zero.
unknown.ped <- data.frame(REG = unknown, SIRE = rep(0, length(unknown)), DAM = rep(0,
length(unknown)))
head(unknown.ped)
dim(unknown.ped)
Finally, we combine the ped
and unknown.ped
objects into a single data frame all.ped
.
all.ped <- rbind(unknown.ped, ped)
dim(all.ped)
synbreed package
The synbreed package includes the function kin
for computing a relationship matrix. Learn more about the Synbreed project. The kin
function requires the object of class gpData for inputs.
library(synbreed)
gp.ped <- create.pedigree(ID = all.ped$REG, Par1 = all.ped$SIRE, Par2 = all.ped$DAM,
unknown = "0")
gpData <- create.gpData(pedigree = gp.ped)
class(gpData)
We can obtain an additive relationship matrix by setting the argument ret
equals to add
.
A <- kin(gpData, ret = "add")
dim(A)
table(colnames(A) == rownames(A))
One thing we need to be cautious of is that the kin
function does not retrain the original order.
head(colnames(A))
head(all.ped)
This brings up a problem when we want to link phenotypic values to the A matrix because animals appear in a different order. So let’s keep the original order of animal IDs in the ped
object and remove all animals that do not have records in the A matrix.
index <- match(ped$REG, rownames(A))
A2 <- A[index, index]
dim(A2)
The A2
object is a reordered A matrix where animals appear in the same order as that of the ped
object.
head(ped$REG)
head(rownames(A2))
table(ped$REG == rownames(A2))
table(ped$REG == colnames(A2))
Exercise 1
Compute the inbreeding coefficients of animals and report the average of inbreeding coefficients. Use the function diag
.
kinship2 package
Alternatively, we can use the kinship2
package to compute a relationship matrix. The kinship
function returns the coefficient of kinship.
library(kinship2)
A3 <- kinship(all.ped$REG, all.ped$SIRE, all.ped$DAM)
dim(A3)
table(colnames(A3) == rownames(A3))
Again, let’s make sure that animals in the A matrix appear in the same order as that of the ped
object. The A4
object is a reordered A matrix.
index2 <- match(ped$REG, rownames(A3))
A4 <- A3[index2, index2]
dim(A4)
head(ped$REG)
head(rownames(A4))
table(ped$REG == rownames(A4))
table(ped$REG == colnames(A4))
Exercise 2
Recall that the additive genetic relationship between a pair of individuals is equal to twice their coefficient of kinship. Verify this by comparing the A2
and A4
objects.
Save R object
Save the object A2
so that we can reuse in the next class.
save(A2, file = "A2.Rda")