Statistical Genomics Lecture 5: Linear Algebra Homework Questions

Statistical Genomics

Zhiwu Zhang

Washington State University

Lecture 5: Linear Algebra



Homework1, due next Wednesday, Feb 1, 3:10PM

Administration



Example of first question on homework1



Expectation and Variance of random variable



Expectation and Variance of function of random variable



Covariance



Matrix and manipulations



Special matrices: Identity, symmetric, diagonal, singular, and orthogonal



Rank

Outline



Starting from random variables with a standard normal distribution, define your own random variable as a function of those normally distributed variables. Name the random variable after your last name and develop an R function to generate it. The inputs of your R function should include n, the number of values to be generated, and the parameters of the distribution you defined. Note: try not to reproduce a known distribution such as Chi-square, F, or t.

Question 1 in Homework1

Example of Chi-square distribution

#There is a function  in R

x=rchisq(n=10000,df=5)

#Expectation is df and var=2df

par(mfrow=c(2,2),mar = c(3,4,1,1))

plot(x)

hist(x)

plot(density(x))

plot(ecdf(x))

mean(x)

var(x)

Self-defined function of Chi-square

rZhang=function(n=10,df=2){
  #Each value is the sum of df squared standard normal draws
  y=replicate(n,{
    x=rnorm(df,0,1)
    sum(x^2)
  })
  return(y)
}

x1=rchisq(n=10000,df=5)

x2=rZhang(n=10000,df=5)

plot(density(x1),col="blue")

lines(density(x2),col="red")
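Besides overlaying the two densities, the first two moments can be compared numerically. The sketch below is only an illustration: `r_sum_sq_normal` is a hypothetical stand-in for the slide's `rZhang`, and the seed is arbitrary. Both samples should show a mean near df and a variance near 2*df.

```r
# Hypothetical helper mirroring the slide's rZhang(): each value is the
# sum of df squared standard normal draws.
r_sum_sq_normal <- function(n = 10, df = 2) {
  replicate(n, sum(rnorm(df, mean = 0, sd = 1)^2))
}

set.seed(1)  # arbitrary seed, only for reproducibility
df <- 5
x_builtin <- rchisq(n = 10000, df = df)
x_custom  <- r_sum_sq_normal(n = 10000, df = df)

# Sample mean and variance of each generator, side by side
moments <- rbind(
  builtin = c(mean = mean(x_builtin), var = var(x_builtin)),
  custom  = c(mean = mean(x_custom),  var = var(x_custom))
)
print(moments)
```

The same comparison can be reused for a self-defined variable whose theoretical mean and variance are known.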

Expectation=Mean when sample size goes to infinity

par(mfrow=c(3,1),mar = c(3,4,1,1))

x=rchisq(n=10,df=5)

hist(x)

abline(v=mean(x), col = "red")

x=rchisq(n=100,df=5)

hist(x)

abline(v=mean(x), col = "red")

x=rchisq(n=10000,df=5)

hist(x)

abline(v=mean(x), col = "red")
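The histograms show the sample mean settling near df = 5 as n grows. A numeric version of the same idea is sketched below; the sample sizes and seed are arbitrary choices, and in a single run the error is not guaranteed to shrink at every step, only on average.

```r
set.seed(2)  # arbitrary seed, only for reproducibility
df <- 5
sample_sizes <- c(10, 100, 10000, 1000000)

# Absolute distance between the sample mean and the expectation (df)
abs_error <- sapply(sample_sizes, function(n) abs(mean(rchisq(n, df)) - df))

print(data.frame(n = sample_sizes, abs_error = abs_error))
```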



Range



Average deviation from mean, but it is always zero



Average squared deviation from mean: Variance



Square root of variance = standard deviation

Variance

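n=100
x=rnorm(n,100,5)
#Range
c(min(x),max(x))
#Average deviation from the mean (zero up to rounding error)
sum(x-mean(x))/(n-1)
#Variance: average squared deviation from the mean
sum((x-mean(x))^2)/(n-1)
#Standard deviation: square root of the variance
sqrt(sum((x-mean(x))^2)/(n-1))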
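As a cross-check, the hand-written formulas can be compared with R's built-in `var()` and `sd()`, which also divide by n-1. This is a small sketch with an arbitrary seed; the names `manual_var`, `manual_sd`, and `mean_dev` are introduced here for illustration.

```r
set.seed(3)  # arbitrary seed, only for reproducibility
n <- 100
x <- rnorm(n, mean = 100, sd = 5)

mean_dev   <- sum(x - mean(x)) / n              # zero up to rounding error
manual_var <- sum((x - mean(x))^2) / (n - 1)    # sample variance
manual_sd  <- sqrt(manual_var)                  # sample standard deviation

print(c(mean_dev = mean_dev, manual_var = manual_var, manual_sd = manual_sd))
print(all.equal(manual_var, var(x)))
print(all.equal(manual_sd, sd(x)))
```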



y=ax, E(y)=aE(x), Var(y)=a^2*Var(x)



y=x+a, E(y)=E(x)+a, Var(y)=Var(x)

Expectation and variance of linear function of random variables
n=10000
df=10
x=rchisq(n,df)
mean(x)
var(x)
y=5*x
mean(y)
var(y)
z=5+x
mean(z)
var(z)
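The rules E(ax)=aE(x), Var(ax)=a^2*Var(x), E(x+a)=E(x)+a, and Var(x+a)=Var(x) also hold exactly for the sample mean and sample variance, so they can be verified up to floating-point error. A minimal sketch, with an arbitrary seed and a = 5 as on the slide:

```r
set.seed(4)  # arbitrary seed, only for reproducibility
x <- rchisq(10000, df = 10)
a <- 5

y <- a * x   # scaling
z <- a + x   # shifting

# Each comparison should print TRUE
print(all.equal(mean(y), a * mean(x)))
print(all.equal(var(y), a^2 * var(x)))
print(all.equal(mean(z), mean(x) + a))
print(all.equal(var(z), var(x)))
```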

Covariance
n=10000
x=rpois(n, 100)
y=rchisq(n,5)
z=rt(n,100)
par(mfrow=c(3,1),mar = c(3,4,1,1))
plot(x,y)
plot(x,z)
plot(y,z)
var(x)
var(y)
var(z)
cov(x,y)
cov(x,z)
cov(y,z)

Covariance
n=10000
a=rnorm(n,100,5)
x=a+rpois(n, 100)
y=a+rchisq(n,5)
z=a+rt(n,100)
par(mfrow=c(3,1),mar = c(3,4,1,1))
plot(x,y)
plot(x,z)
plot(y,z)
var(x)
var(y)
var(z)
cov(x,y)
cov(x,z)
cov(y,z)
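When two variables share a common component a and their remaining parts are independent, Cov(a+p, a+q) = Var(a). With a drawn from a normal distribution with standard deviation 5, each pairwise covariance should therefore be near 25. The sketch below assumes a larger n than the slide to reduce sampling noise; the seed is arbitrary.

```r
set.seed(5)  # arbitrary seed, only for reproducibility
n <- 100000
a <- rnorm(n, mean = 100, sd = 5)   # shared component, Var(a) = 25

x <- a + rpois(n, 100)
y <- a + rchisq(n, 5)
z <- a + rt(n, 100)

# Pairwise sample covariances; each should be close to var(a)
shared_cov <- c(xy = cov(x, y), xz = cov(x, z), yz = cov(y, z))
print(shared_cov)
print(var(a))
```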



Cov(x,y)= sum(  (x- mean(x)) * (y- mean(y))    )/(n-1)

Formula of covariance

sum((x-mean(x))*(y-mean(y)))/(n-1)
sum((x-mean(x))*(z-mean(z)))/(n-1)
sum((y-mean(y))*(z-mean(z)))/(n-1)
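The covariance formula can be wrapped in a small function and checked against R's `cov()`. `sample_cov` is a hypothetical name used only for this sketch; the data and seed are arbitrary.

```r
# Sample covariance: sum of cross-products of deviations divided by n-1
sample_cov <- function(u, v) {
  stopifnot(length(u) == length(v), length(u) > 1)
  sum((u - mean(u)) * (v - mean(v))) / (length(u) - 1)
}

set.seed(6)  # arbitrary seed, only for reproducibility
x <- rpois(1000, 100)
y <- rchisq(1000, 5)

print(all.equal(sample_cov(x, y), cov(x, y)))
# The covariance of a variable with itself is its variance
print(all.equal(sample_cov(x, x), var(x)))
```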

Calculation in R

W=cbind(x,y,z)

dim(W)

cov(W)

var(W)
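For a matrix whose columns are variables, `cov(W)` and `var(W)` return the same covariance matrix, with the variances on the diagonal. The same matrix can be obtained with matrix algebra by centering each column and taking the cross-product divided by n-1. A sketch with freshly simulated data (arbitrary seed; `W_centered` and `cov_manual` are illustrative names):

```r
set.seed(7)  # arbitrary seed, only for reproducibility
n <- 1000
x <- rpois(n, 100)
y <- rchisq(n, 5)
z <- rt(n, 100)
W <- cbind(x, y, z)

# Subtract each column's mean, then form t(W_centered) %*% W_centered / (n-1)
W_centered <- sweep(W, 2, colMeans(W))
cov_manual <- crossprod(W_centered) / (n - 1)

print(all.equal(cov_manual, cov(W)))
print(all.equal(cov(W), var(W)))
print(isSymmetric(cov_manual))
```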



Addition/subtraction
Element-wise (dot) product
Element-wise (dot) division
Element-wise Matrix manipulations
a=matrix(seq(10,60,10),2,3)
b=matrix(seq(1,6),2,3)
a
b
a+b
a-b
a*b
a/b

Multiplication
c=matrix(c(1,1,1,4,30,50),2,3)
b=matrix(c(20000,10000,1000,1000,300,20),3,2)
t=c%*%b
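Matrix multiplication requires the number of columns of the left matrix to equal the number of rows of the right one; each entry of the result is a row-by-column sum of products. The sketch below repeats the slide's numbers with labels attached. The names `X`, `effects`, `pred`, and the row labels `person1`/`person2` are assumptions made for readability, not part of the original code. The Salary entries come out as 60000 and 110000, and the SQF entries as 1900 and 3200.

```r
# Two individuals described by an intercept ("Mean"), education code, and age
X <- matrix(c(1, 1, 1, 4, 30, 50), nrow = 2, ncol = 3,
            dimnames = list(c("person1", "person2"), c("Mean", "Edu", "Age")))

# Effects of each predictor on the two outcomes
effects <- matrix(c(20000, 10000, 1000, 1000, 300, 20), nrow = 3, ncol = 2,
                  dimnames = list(c("Mean", "Edu", "Age"), c("Salary", "SQF")))

# Dimensions must conform: (2 x 3) %*% (3 x 2) gives a 2 x 2 result
stopifnot(ncol(X) == nrow(effects))
pred <- X %*% effects
print(pred)
```

Unlike `a*b`, which multiplies matching entries, `%*%` combines whole rows with whole columns, so the two operands need not have the same shape.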

Inverse

If AB = I (the identity matrix), then B is the inverse of A, and vice versa.
The inverse is defined for square matrices only.

Inverse in R: solve()

ti=solve(t)

ti

ti %*% t

t%*%ti
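Both products of a matrix and its inverse should equal the identity up to rounding error, which `all.equal()` tolerates. The sketch below uses the 2 x 2 product from the multiplication example under the illustrative name `M`, and also shows `solve(M, rhs)`, which solves a linear system without forming the inverse explicitly; `rhs` is an arbitrary vector.

```r
M <- matrix(c(60000, 110000, 1900, 3200), nrow = 2, ncol = 2)
M_inv <- solve(M)   # inverse of a square, non-singular matrix
I2 <- diag(2)       # 2 x 2 identity

print(all.equal(M_inv %*% M, I2))
print(all.equal(M %*% M_inv, I2))

# Solve M %*% v = rhs directly
rhs <- c(1, 2)
v <- solve(M, rhs)
print(all.equal(as.vector(M %*% v), rhs))
```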

Transpose
c=matrix(c(1,1,1,4,30,50),2,3)
c
t(c)



(A^T)^T = A
(A+B)^T = A^T + B^T
(AB)^T = B^T A^T
(cB)^T = c B^T, where c is a scalar
Properties of transpose
A=matrix(c(1,1,1,4,30,50),2,3)
B=matrix(c(1000,300,20,20000,10000,1000),3,2)
t(A%*%B)
t(B)%*%t(A)
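All four transpose properties can be checked numerically in one pass. In this sketch `A` and `B` are the slide's matrices, while `C` (same shape as `A`) and the scalar `k` are hypothetical values added for the sum and scalar rules.

```r
A <- matrix(c(1, 1, 1, 4, 30, 50), 2, 3)
B <- matrix(c(1000, 300, 20, 20000, 10000, 1000), 3, 2)
C <- matrix(1:6, 2, 3)   # hypothetical matrix with the same shape as A
k <- 3                   # hypothetical scalar

checks <- c(
  double_transpose = isTRUE(all.equal(t(t(A)), A)),
  sum_rule         = isTRUE(all.equal(t(A + C), t(A) + t(C))),
  product_rule     = isTRUE(all.equal(t(A %*% B), t(B) %*% t(A))),
  scalar_rule      = isTRUE(all.equal(t(k * B), k * t(B)))
)
print(checks)
```

Note the reversed order in the product rule: `t(A) %*% t(B)` would multiply a 3 x 2 by a 2 x 3 matrix and give a 3 x 3 result, not the transpose of `A %*% B`.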



Symmetric: A = transpose(A)

Diagonal matrix: all off-diagonal elements are 0

Identity: diagonal elements = 1 and the rest = 0

Orthogonal: A multiplied by transpose(A) = Identity

Singular: a square matrix that does not have an inverse

Special matrix
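Each kind of special matrix can be built and tested in a few lines. The matrices below are illustrative choices, not taken from the lecture: a rotation matrix serves as the orthogonal example, and a matrix whose second column is twice the first serves as the singular example.

```r
I3 <- diag(3)                        # identity
D  <- diag(c(2, 5, 7))               # diagonal
S  <- matrix(c(2, 1, 1, 3), 2, 2)    # symmetric
theta <- pi / 6
Q  <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)  # orthogonal (rotation)
P  <- matrix(c(1, 2, 2, 4), 2, 2)    # singular: column 2 = 2 * column 1

print(isSymmetric(S))                    # S equals its transpose
print(all.equal(Q %*% t(Q), diag(2)))    # Q times its transpose is the identity
print(det(P))                            # determinant of a singular matrix is 0

# solve() fails on a singular matrix; capture the error instead of stopping
singular_error <- tryCatch({ solve(P); FALSE }, error = function(e) TRUE)
print(singular_error)
```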



The size of the largest non-singular square submatrix

Full rank matrix: rank = dimension (the smaller of the number of rows and columns)

Rank
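In R, one common way to obtain the rank numerically is through the QR decomposition, `qr(A)$rank`. The sketch below uses the lecture's 2 x 3 matrix (full rank, rank 2) and a hypothetical 3 x 3 matrix whose second column is twice the first (rank 2, so not full rank).

```r
A_full <- matrix(c(1, 1, 1, 4, 30, 50), 2, 3)          # 2 x 3, rows are independent
A_def  <- matrix(c(1, 2, 3, 2, 4, 6, 1, 0, 1), 3, 3)   # column 2 = 2 * column 1

rank_full <- qr(A_full)$rank
rank_def  <- qr(A_def)$rank

print(c(rank_full = rank_full, rank_def = rank_def))
print(rank_full == min(dim(A_full)))   # TRUE: full rank
print(rank_def == min(dim(A_def)))     # FALSE: rank deficient, hence singular
```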



Example of first question on homework1



Expectation and Variance of random variable



Expectation and Variance of function of random variable



Covariance



Matrix and manipulations



Special matrices: Identity, symmetric, diagonal, singular, and orthogonal



Rank

Highlight


Explore the concepts of random variables, covariance matrix, special matrices, and self-defined functions in statistical genomics through a series of homework questions. Gain insights into linear algebra and statistical genomics while working on Homework 1, analyzing the expectation and variance of random variables, and developing custom R functions. Dive into the world of Chi-square distributions and learn about the implications of sample size on expectation and variance. Enhance your understanding of statistical genomics and linear algebra with examples, self-defined functions, and practical applications.

Uploaded by boda_eon on Sep 21, 2024


Presentation Transcript

Statistical Genomics Lecture 5: Linear Algebra Zhiwu Zhang Washington State University

Administration Homework1, due next Wednesday, Feb 1, 3:10PM

Outline Example of first question on homework1 Expectation and Variance of random variable Expectation and Variance of function of random variable Covariance Matrix and manipulations Special matrices: Identity, symmetric, diagonal, singular, and orthogonal Rank

Question 1 in Homework1
Starting from random variables with a standard normal distribution, define your own random variable as a function of those normally distributed variables. Name the random variable after your last name and develop an R function to generate it. The inputs of your R function should include n, the number of values to be generated, and the parameters of the distribution you defined. Note: try not to reproduce a known distribution such as Chi-square, F, or t.

Example of Chi-square distribution
#There is a function in R
x=rchisq(n=10000,df=5)
#Expectation is df and var=2df
par(mfrow=c(2,2),mar = c(3,4,1,1))
plot(x)
hist(x)
plot(density(x))
plot(ecdf(x))
mean(x)
var(x)
[Figure: four panels for x: index plot of x, "Histogram of x", "density.default(x = x)", and "ecdf(x)"]

Self-defined function of Chi-square
rZhang=function(n=10,df=2){
  y=replicate(n,{
    x=rnorm(df,0,1)
    y=sum(x^2)
  })
  return(y)
}
x1=rchisq(n=10000,df=5)
x2=rZhang(n=10000,df=5)
plot(density(x1),col="blue")
lines(density(x2),col="red")
[Figure: "density.default(x = x1)", density of x1 in blue with the density of x2 overlaid in red; N = 10000, Bandwidth = 0.4239]

Expectation=Mean when sample size goes to infinity
par(mfrow=c(3,1),mar = c(3,4,1,1))
x=rchisq(n=10,df=5)
hist(x)
abline(v=mean(x), col = "red")
x=rchisq(n=100,df=5)
hist(x)
abline(v=mean(x), col = "red")
x=rchisq(n=10000,df=5)
hist(x)
abline(v=mean(x), col = "red")
[Figure: three panels, each "Histogram of x", for n = 10, 100, and 10000, with the sample mean marked by a red vertical line]

Variance
Range
Average deviation from mean, but it is always zero
Average squared deviation from mean: Variance
Square root of variance = standard deviation
n=100
x=rnorm(n,100,5)
c(min(x),max(x))
sum(x-mean(x))/(n-1)
sum((x-mean(x))^2)/(n-1)
sqrt(sum((x-mean(x))^2)/(n-1))

Expectation and variance of linear function of random variables
y=ax, E(y)=aE(x), Var(y)=a^2*Var(x)
y=x+a, E(y)=E(x)+a, Var(y)=Var(x)
n=10000
df=10
x=rchisq(n,df)
mean(x)
var(x)
y=5*x
mean(y)
var(y)
z=5+x
mean(z)
var(z)

Covariance
n=10000
x=rpois(n, 100)
y=rchisq(n,5)
z=rt(n,100)
par(mfrow=c(3,1),mar = c(3,4,1,1))
plot(x,y)
plot(x,z)
plot(y,z)
var(x)
var(y)
var(z)
cov(x,y)
cov(x,z)
cov(y,z)
[Figure: three scatter plots: y against x, z against x, and z against y]

Covariance
n=10000
a=rnorm(n,100,5)
x=a+rpois(n, 100)
y=a+rchisq(n,5)
z=a+rt(n,100)
par(mfrow=c(3,1),mar = c(3,4,1,1))
plot(x,y)
plot(x,z)
plot(y,z)
var(x)
var(y)
var(z)
cov(x,y)
cov(x,z)
cov(y,z)
[Figure: three scatter plots: y against x, z against x, and z against y]

Formula of covariance
Cov(x,y)= sum( (x- mean(x)) * (y- mean(y)) )/(n-1)
sum((x-mean(x))*(y-mean(y)))/(n-1)
sum((x-mean(x))*(z-mean(z)))/(n-1)
sum((y-mean(y))*(z-mean(z)))/(n-1)

Calculation in R
W=cbind(x,y,z)
dim(W)
cov(W)
var(W)

Element-wise Matrix manipulations
Addition/subtraction
Element-wise (dot) product
Element-wise (dot) division
a=matrix(seq(10,60,10),2,3)
b=matrix(seq(1,6),2,3)
a
b
a+b
a-b
a*b
a/b

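Multiplication
Education codes: AS = 1, BS = 2, MS = 3, PhD = 4

c (2 x 3; columns Mean, Education, Age):
Mean  Education  Age
1     1          30
1     4          50

b (3 x 2; rows Mean, Edu, Age):
      Salary  SQF
Mean  20000   1000
Edu   10000   300
Age   1000    20

t = c %*% b (2 x 2):
Salary  SQF
60000   1900
110000  3200

c=matrix(c(1,1,1,4,30,50),2,3)
b=matrix(c(20000,10000,1000,1000,300,20),3,2)
t=c%*%b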

Inverse
If AB = I (the identity matrix), then B is the inverse of A, and vice versa.
The inverse is defined for square matrices only.

Inverse in R: solve()
t
ti=solve(t)
ti
ti %*% t
t%*%ti

Transpose
c=matrix(c(1,1,1,4,30,50),2,3)
c
t(c)

Properties of transpose
(A^T)^T = A
(A+B)^T = A^T + B^T
(AB)^T = B^T A^T
(cB)^T = c B^T, where c is a scalar
A=matrix(c(1,1,1,4,30,50),2,3)
B=matrix(c(1000,300,20,20000,10000,1000),3,2)
t(A%*%B)
t(B)%*%t(A)

Special matrix
Symmetric: A = transpose(A)
Diagonal matrix: all off-diagonal elements are 0
Identity: diagonal elements = 1 and the rest = 0
Orthogonal: A multiplied by transpose(A) = Identity
Singular: a square matrix that does not have an inverse

Rank
The size of the largest non-singular square submatrix
Full rank matrix: rank = dimension (the smaller of the number of rows and columns)

Highlight Example of first question on homework1 Expectation and Variance of random variable Expectation and Variance of function of random variable Covariance Matrix and manipulations Special matrices: Identity, symmetric, diagonal, singular, and orthogonal Rank
