# Review of basic statistics

## Introduction

We will learn how to compute basic statistics in R using a subset of a beef cattle data set.

## Read a file

### Read the phenotypic data

We will first set the path to the phenotypic data set. Insert the path to your file between the double quotation marks `" "`. Your path may differ from the one below if you stored the file in a different directory.

`FILE <- "../data/sub/1000withEffects.redangus"`

The function `read.table` reads a file into a data frame. Calling `help` with a function name opens its documentation page.

`help(read.table)`

The `read.table` function has many arguments, but we will use only three in this exercise.

`dat <- read.table(file = FILE, header = TRUE, stringsAsFactors = FALSE)`
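To see what the three arguments do without depending on the Red Angus file, the following minimal sketch writes a small toy table to a temporary file and reads it back. The animal IDs and values are made up for illustration, and the assumption is that the real file is whitespace-delimited with a header row, as the call above implies.

```r
# Toy data only: the IDs and values below are made up for illustration.
toy <- data.frame(ID  = c("A1", "A2", "A3"),
                  BWT = c(80.5, 76.0, 91.2),
                  CE  = c(1, 2, 1))

# Write the toy data to a temporary whitespace-delimited file with a header row.
tmp <- tempfile(fileext = ".txt")
write.table(toy, file = tmp, quote = FALSE, row.names = FALSE)

# Read it back with the same three arguments used above.
toy_in <- read.table(file = tmp, header = TRUE, stringsAsFactors = FALSE)

# header = TRUE takes the column names from the first line;
# stringsAsFactors = FALSE keeps ID as character rather than factor.
str(toy_in)

# Remove the temporary file.
unlink(tmp)
```

Running `str` on the result is a quick way to confirm that each column was read with the type we expect before computing any statistics.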

The `head` function returns the first six rows of a data frame by default. The number of rows returned can be changed with the argument `n`.

```
head(dat)
head(dat, n = 10)
```

The `dim` function returns the dimensions of a data frame: the number of rows followed by the number of columns.

`dim(dat)`

We will use two phenotypes to compute basic statistics. Each column of a data frame can be accessed with the `$` operator followed by the column name.

```
dat$BWT
dat$CE
```

The `length` function returns the length of a vector.

```
length(dat$BWT)
length(dat$CE)
```

## Mean

The `mean` function computes the mean of a vector.

`mean(dat$BWT)`
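As a check on what `mean` does, here is a small sketch on a made-up vector (the values are not taken from the data set). It also shows how `mean` behaves when a value is missing; whether the phenotype file contains missing values is not stated here, so treat that part as a precaution.

```r
# Toy body weights (made-up values, not from the Red Angus file).
x <- c(80.5, 76.0, 91.2, 84.3)

# The mean is the sum of the values divided by the number of values.
manual_mean <- sum(x) / length(x)
all.equal(manual_mean, mean(x))

# With a missing value, mean() returns NA unless na.rm = TRUE is set.
x_na <- c(x, NA)
mean(x_na)
mean(x_na, na.rm = TRUE)
```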

### Exercise 1

Verify that \(E(aX) = aE(X)\). Set \(a = 10\). Here \(X\) is a vector of body weight. Use the multiplication operator `*`.

## Variance

The `var` function computes the sample variance of a vector.

`var(dat$BWT)`

Alternatively, we can use the equation \(Var(X) = \frac{1}{N-1}\sum(X - \bar{X})^2\), where \(N\) is the number of observations and \(\bar{X}\) is the sample mean.

`sum((dat$BWT - mean(dat$BWT))^2)/(length(dat$BWT) - 1)`

### Exercise 2

Verify that \(Var(aX) = a^2Var(X)\). Set \(a = 10\).
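The numerical check can be compared against a short derivation from the sample variance equation above. Since the mean of \(aX\) is \(a\bar{X}\), the constant factors out of every deviation and is squared:

\[
Var(aX) = \frac{1}{N-1}\sum(aX - a\bar{X})^2 = \frac{a^2}{N-1}\sum(X - \bar{X})^2 = a^2Var(X)
\]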

## Covariance

The `cov` function computes the sample covariance of two vectors.

`cov(dat$BWT, dat$CE)`

Alternatively, we can use the equation \(Cov(X, Y) = \frac{1}{N-1}\sum(X - \bar{X})(Y - \bar{Y})\)

```
sum((dat$BWT - mean(dat$BWT)) * (dat$CE - mean(dat$CE)))/(length(dat$BWT) - 1)
```
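The same comparison can be sketched on two short made-up vectors, which makes it easy to inspect every intermediate value. The sketch also checks two properties that follow directly from the equation: covariance is symmetric, and the covariance of a vector with itself is its variance.

```r
# Toy vectors (made-up values, not from the Red Angus file).
x <- c(80.5, 76.0, 91.2, 84.3)   # stand-in for body weight
y <- c(1, 1, 3, 2)               # stand-in for calving ease

# Sample covariance from the equation: sum of cross-products of deviations / (N - 1).
n <- length(x)
manual_cov <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)
all.equal(manual_cov, cov(x, y))

# Covariance is symmetric in its two arguments.
all.equal(cov(x, y), cov(y, x))

# The covariance of a vector with itself is its variance.
all.equal(cov(x, x), var(x))
```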

### Exercise 3

Verify that \(Cov(aX, Y) = aCov(X, Y)\) by setting \(a = 10\). Here \(X\) is a vector of body weight and \(Y\) is a vector of calving ease.

### Exercise 4

Verify that \(Cov(aX, bY) = abCov(X, Y)\) by setting \(a = 10\) and \(b = 5\).

## Correlation

The `cor` function computes the correlation of two vectors. By default it returns the Pearson correlation.

`cor(dat$BWT, dat$CE)`
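The Pearson correlation is the covariance divided by the product of the two standard deviations. The sketch below checks this on made-up vectors (not values from the data set) and also shows that, unlike the covariance, the correlation does not change when one vector is multiplied by a positive constant.

```r
# Toy vectors (made-up values, not from the Red Angus file).
x <- c(80.5, 76.0, 91.2, 84.3)   # stand-in for body weight
y <- c(1, 1, 3, 2)               # stand-in for calving ease

# Correlation as covariance scaled by the two standard deviations.
manual_cor <- cov(x, y) / (sd(x) * sd(y))
all.equal(manual_cor, cor(x, y))

# Multiplying x by a positive constant rescales the covariance
# but leaves the correlation unchanged.
a <- 10
all.equal(cov(a * x, y), a * cov(x, y))
all.equal(cor(a * x, y), cor(x, y))
```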

## Regression

The regression coefficient of \(Y\) on \(X\) is given by \(b_{Y,X} = \frac{Cov(X,Y)}{Var(X)}\).

`cov(dat$BWT, dat$CE)/var(dat$BWT)`
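This ratio should match the slope of a simple linear regression fitted by least squares. As a minimal sketch on made-up vectors, we can compare it with the slope returned by `lm`; using `lm` here is our choice for the cross-check, not something the exercise requires.

```r
# Toy vectors (made-up values, not from the Red Angus file).
x <- c(80.5, 76.0, 91.2, 84.3)   # stand-in for body weight
y <- c(1, 1, 3, 2)               # stand-in for calving ease

# Regression coefficient of y on x from the covariance and variance.
b_manual <- cov(x, y) / var(x)

# Slope of the least-squares fit of y on x.
fit <- lm(y ~ x)
b_lm <- unname(coef(fit)["x"])

all.equal(b_manual, b_lm)
```

Note that the order matters: `lm(y ~ x)` regresses `y` on `x`, so the variance in the denominator is that of `x`.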

### Exercise 5

Verify that the regression coefficient of \(Y\) on \(aX\) is \(b_{Y, aX} = \frac{Cov(aX,Y)}{Var(aX)} = \frac{Cov(X,Y)}{aVar(X)}\). Set \(a = 10\).

## Additional reading

- Fienberg (2014) What is Statistics? Annual Review of Statistics and Its Application.