Single-marker CMLM GWAS using GAPIT

Haipeng Yu & Gota Morota

April 6, 2020


This example illustrates how to fit a single-marker compressed mixed linear model (CMLM) GWAS using GAPIT. The main idea behind CMLM is to reduce computation time by constructing a genomic relationship matrix among clusters of individuals instead of among the individuals themselves. In brief,

  1. Use the dist() function to create a distance matrix from a genomic relationship matrix among individuals
  2. Use the function hclust() to create clusters
  3. Use the function cutree() to assign individuals to each cluster
  4. Construct a reduced genomic relationship matrix by averaging the relationships within clusters and across clusters
  5. Set up the incidence matrix \(\mathbf{Z}\) accordingly
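
The five steps above can be sketched in base R on simulated data. This is only an illustration of the idea, not GAPIT's internal code: the toy marker matrix, the VanRaden-type relationship matrix, the average-linkage clustering, and the number of clusters k = 4 are all assumptions made for this sketch.

```r
set.seed(1)

# Simulated marker matrix: 12 individuals x 50 markers coded 0/1/2 (toy data)
n <- 12
m <- 50
M <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)
rownames(M) <- paste0("ind", seq_len(n))

# Genomic relationship matrix among individuals (VanRaden-type, assumed here)
p <- colMeans(M) / 2
W <- sweep(M, 2, 2 * p)
G <- tcrossprod(W) / (2 * sum(p * (1 - p)))

# Step 1: distance matrix computed from the relationship matrix
D <- dist(G)

# Step 2: hierarchical clustering (average linkage is an assumption)
hc <- hclust(D, method = "average")

# Step 3: assign each individual to one of k clusters
k <- 4
grp <- cutree(hc, k = k)

# Step 5 (needed for step 4): incidence matrix Z, n x k,
# with Z[i, j] = 1 if individual i belongs to cluster j
Z <- diag(k)[grp, , drop = FALSE]
rownames(Z) <- rownames(M)

# Step 4: reduced relationship matrix; entry (a, b) is the mean of
# G over all pairs with one individual in cluster a and one in cluster b
sizes <- as.vector(table(grp))
G_red <- crossprod(Z, G %*% Z) / tcrossprod(sizes)

dim(G_red)
```

In the mixed model, the random effect is then defined at the cluster level with covariance proportional to G_red, and Z maps each individual to its cluster.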

Load packages

# clear working environment

# install and load support packages

# load GAPIT function

Maize data

The maize data set used here comes from a maize association panel of 281 diverse lines genotyped with 3,093 markers. Three phenotypes are available: ear height (EarHT), days to pollination (dpoll), and ear diameter (EarDia). In this example, we use only EarHT.

# phenotypes
myY <- read.table(file = "", header = TRUE)

# marker matrix
myGD <- read.table(file = "", header = TRUE)

# map information of markers 
myGM <- read.table(file = "", header = TRUE)
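
Before running the analysis, it can help to confirm that the three objects line up. The helper below is hypothetical (it is not part of GAPIT) and assumes the usual numeric-format layout: the first column of the phenotype and marker tables holds the line names, the remaining columns of the marker table are markers, and the first column of the map table holds the marker names in the same order. It is shown on toy data frames rather than the maize files.

```r
# Hypothetical helper: basic consistency checks on GAPIT-style inputs
check_gapit_inputs <- function(Y, GD, GM) {
  list(
    # lines present in both the phenotype and the marker table
    n_common_taxa = length(intersect(as.character(Y[[1]]),
                                     as.character(GD[[1]]))),
    # marker columns in GD (everything except the line-name column)
    n_markers     = ncol(GD) - 1,
    # marker names in GD match the map table, in the same order
    markers_match = identical(colnames(GD)[-1], as.character(GM[[1]]))
  )
}

# Toy inputs standing in for myY, myGD, and myGM
Y  <- data.frame(Taxa = c("A", "B", "C"), EarHT = c(59.5, 65.5, 81.1))
GD <- data.frame(taxa = c("A", "B", "C"), m1 = c(0, 2, 1), m2 = c(2, 2, 0))
GM <- data.frame(SNP = c("m1", "m2"), Chromosome = c(1, 1),
                 Position = c(100, 200))

res <- check_gapit_inputs(Y, GD, GM)
str(res)
```

With the maize data, one would expect 281 common lines and 3,093 markers, as described above.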

Compressed mixed linear model (CMLM) GWAS

In the GAPIT() function, set model = 'CMLM'.

# GWAS with CMLM
gwasCMLMfit <- GAPIT(Y = myY[, c(1:2)], GD = myGD, GM = myGM, model = 'CMLM', 
            = 3, SNP.P3D = TRUE, SNP.MAF = 0.05)


Use the read.csv() function to read the output file GAPIT.CMLM.EarHT.GWAS.Results.csv. Columns 4, 9, and 10 contain the p-value, the FDR-adjusted p-value, and the marker effect, respectively.

gwasCMLM <- read.csv('GAPIT.CMLM.EarHT.GWAS.Results.csv', header = TRUE)
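
Once the results are loaded, the markers of interest can be pulled out by column position. The sketch below uses a small made-up table in place of gwasCMLM so that it runs on its own; the column names, the values, and the 0.05 FDR cutoff are assumptions for illustration, and only the positions of columns 4, 9, and 10 follow the description above. The FDR column is filled with p.adjust(method = "BH") purely to populate the toy table.

```r
# Toy stand-in for the GWAS results table (values are invented)
pval <- c(1e-6, 0.2, 0.04, 0.5, 1e-4)
toy <- data.frame(
  SNP         = paste0("m", 1:5),
  Chromosome  = c(1, 1, 2, 2, 3),
  Position    = c(100, 500, 200, 900, 300),
  P.value     = pval,                             # column 4
  maf         = c(0.10, 0.25, 0.40, 0.15, 0.30),
  nobs        = 281,
  Rsq.without = 0.30,
  Rsq.with    = c(0.36, 0.30, 0.31, 0.30, 0.34),
  FDR         = p.adjust(pval, method = "BH"),    # column 9
  effect      = c(2.1, -0.3, 0.8, 0.1, -1.7)      # column 10
)

# Keep marker name, p-value, FDR-adjusted p-value, and effect
# for markers passing the (assumed) FDR threshold
sig <- toy[toy[[9]] < 0.05, c(1, 4, 9, 10)]

# Sort by p-value, smallest first
sig <- sig[order(sig[[2]]), ]
sig
```

Replacing toy with gwasCMLM applies the same filter to the actual output.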

Manhattan plot

Open GAPIT.CMLM.EarHT.Manhattan.Plot.Genomewise.pdf to view the Manhattan plot.
File: GAPIT.CMLM.EarHT.Manhattan.Plot.Genomewise.pdf
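
A comparable plot can also be drawn directly from a results table with base R. The following is a minimal sketch on simulated p-values, not a reproduction of GAPIT's figure: the column names, the alternating colors, and the Bonferroni line at 0.05 divided by the number of markers are assumptions.

```r
set.seed(2)

# Simulated results: 3 chromosomes with 20 markers each (toy data)
res <- data.frame(
  Chromosome = rep(1:3, each = 20),
  Position   = rep(seq(1e5, 2e6, length.out = 20), times = 3),
  P.value    = runif(60)
)
res <- res[order(res$Chromosome, res$Position), ]

# Offset each chromosome by the total length of the preceding ones
chr_len <- tapply(res$Position, res$Chromosome, max)
offset  <- c(0, cumsum(chr_len)[-length(chr_len)])
names(offset) <- names(chr_len)
res$cum_pos <- res$Position + offset[as.character(res$Chromosome)]

# -log10 transformed p-values and a Bonferroni threshold (assumed choice)
res$logp <- -log10(res$P.value)
bonf <- -log10(0.05 / nrow(res))

plot(res$cum_pos, res$logp, pch = 16,
     col  = c("steelblue", "grey40")[res$Chromosome %% 2 + 1],
     xlab = "Cumulative position (bp)",
     ylab = expression(-log[10](italic(p))))
abline(h = bonf, lty = 2)
```

To plot the actual results, the same steps can be applied to the chromosome, position, and p-value columns of gwasCMLM.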