Statistical Genomics Lecture 5: Linear Algebra Homework Questions

Statistical Genomics
Zhiwu Zhang
Washington State University
Lecture 5: Linear Algebra
Administration
Homework1, due next Wednesday, Feb 1, 3:10 PM
Outline
Example of first question on Homework1
Expectation and variance of a random variable
Expectation and variance of a function of a random variable
Covariance
Matrices and manipulations
Special matrices: identity, symmetric, diagonal, singular, and orthogonal
Rank
Question 1 in Homework1
Starting from random variables with a standard normal distribution,
define your own random variable as a function of those normally
distributed variables. Name the random variable after your last
name and develop an R function to generate it. The inputs of your
R function should include n, the number of values to generate, and
the parameters of the distribution you defined. Note: try not to
reproduce a known distribution such as chi-square, F, or t.
Example of Chi-square distribution
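# R has a built-in chi-square generator, rchisq()
x=rchisq(n=10000,df=5)
# Expectation is df and variance is 2*df
par(mfrow=c(2,2),mar = c(3,4,1,1))
plot(x)            # values in order of generation
hist(x)            # histogram
plot(density(x))   # kernel density estimate
plot(ecdf(x))      # empirical cumulative distribution function
mean(x)            # close to df = 5
var(x)             # close to 2*df = 10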
Self-defined function of Chi-square
rZhang=function(n=10,df=2){
  # Each draw is the sum of df squared standard normal variables
  y=replicate(n,{
    x=rnorm(df,0,1)
    sum(x^2)
  })
  return(y)
}
 
x1=rchisq(n=10000,df=5)
x2=rZhang(n=10000,df=5)
plot(density(x1),col="blue")
lines(density(x2),col="red")
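As a quick numerical companion to the density comparison, the sample mean and variance of the self-defined generator can be compared with the theoretical values df and 2*df. This is only a sketch: it repeats the rZhang() definition so that it runs on its own, and the seed and the names m and v are assumptions added for illustration.

```r
# Self-contained copy of the lecture's generator:
# each draw is the sum of df squared N(0,1) variables
rZhang <- function(n = 10, df = 2) {
  replicate(n, sum(rnorm(df, 0, 1)^2))
}

set.seed(1)                      # assumed seed, only for reproducibility
x2 <- rZhang(n = 10000, df = 5)
m <- mean(x2)                    # expected to be close to df = 5
v <- var(x2)                     # expected to be close to 2*df = 10
c(mean = m, variance = v)
```

With 10000 draws both values should land near 5 and 10, although they vary from run to run when the seed is changed.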
Expectation = mean when the sample size goes to infinity
par(mfrow=c(3,1),mar = c(3,4,1,1))
x=rchisq(n=10,df=5)
hist(x)
abline(v=mean(x), col = "red")
x=rchisq(n=100,df=5)
hist(x)
abline(v=mean(x), col = "red")
x=rchisq(n=10000,df=5)
hist(x)
abline(v=mean(x), col = "red")
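The same convergence can be followed numerically rather than through three histograms. The sketch below tracks the running mean of a single chi-square sample as more draws are included; the seed and the name running_mean are assumptions for illustration.

```r
set.seed(2)                                 # assumed seed
df <- 5
x <- rchisq(n = 10000, df = df)

# running_mean[k] is the mean of the first k draws
running_mean <- cumsum(x) / seq_along(x)

# Means at increasing sample sizes; they should drift toward df = 5
running_mean[c(10, 100, 1000, 10000)]
```

Small samples can sit well away from 5, while the mean over all 10000 draws is expected to be close to it.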
 
Variance
Range
Average deviation from the mean, which is always zero
Average squared deviation from the mean: variance
Square root of variance = standard deviation
 
n=100
x=rnorm(n,100,5)
c(min(x),max(x))                  # range
sum(x-mean(x))/(n-1)              # average deviation: zero up to rounding error
sum((x-mean(x))^2)/(n-1)          # variance
sqrt(sum((x-mean(x))^2)/(n-1))    # standard deviation
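The hand-computed quantities can be checked against R's built-in var() and sd(), which also divide by n-1. A minimal sketch, with an assumed seed and illustrative variable names:

```r
set.seed(3)                        # assumed seed
n <- 100
x <- rnorm(n, 100, 5)

dev <- x - mean(x)                 # deviations from the sample mean
mean_dev <- sum(dev) / n           # zero up to floating-point rounding
v_manual <- sum(dev^2) / (n - 1)   # sample variance by hand
s_manual <- sqrt(v_manual)         # sample standard deviation by hand

all.equal(v_manual, var(x))
all.equal(s_manual, sd(x))
```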
 
Expectation and variance of linear function of random variables
y = a*x: E(y) = a*E(x), Var(y) = a^2*Var(x)
y = x + a: E(y) = E(x) + a, Var(y) = Var(x)
 
n=10000
df=10
x=rchisq(n,df)
mean(x)
var(x)
y=5*x
mean(y)
var(y)
z=5+x
mean(z)
var(z)
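These two rules hold exactly for the sample mean and sample variance as well, not only for the theoretical expectation and variance. The sketch below checks this with all.equal(); the seed and the constant name a are assumptions.

```r
set.seed(4)                        # assumed seed
x <- rchisq(10000, 10)
a <- 5

y <- a * x                         # scaling
z <- a + x                         # shifting

all.equal(mean(y), a * mean(x))    # E(y) = a*E(x)
all.equal(var(y), a^2 * var(x))    # Var(y) = a^2*Var(x)
all.equal(mean(z), a + mean(x))    # E(z) = E(x) + a
all.equal(var(z), var(x))          # Var(z) = Var(x)
```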
Covariance
n=10000
x=rpois(n, 100)
y=rchisq(n,5)
z=rt(n,100)
par(mfrow=c(3,1),mar = c(3,4,1,1))
plot(x,y)
plot(x,z)
plot(y,z)

var(x)
var(y)
var(z)
cov(x,y)
cov(x,z)
cov(y,z)
Covariance
n=10000
a=rnorm(n,100,5)
x=a+rpois(n, 100)
y=a+rchisq(n,5)
z=a+rt(n,100)
par(mfrow=c(3,1),mar = c(3,4,1,1))
plot(x,y)
plot(x,z)
plot(y,z)

var(x)
var(y)
var(z)
cov(x,y)
cov(x,z)
cov(y,z)
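In this second example the three variables share the common component a. Assuming a and the added noise terms are independent, which is the case when each comes from its own generator call, every pairwise covariance is expected to be close to Var(a) = 5^2 = 25. A minimal sketch with an assumed seed and an illustrative vector name covs:

```r
set.seed(5)                        # assumed seed
n <- 10000
a <- rnorm(n, 100, 5)              # shared component, variance 25
x <- a + rpois(n, 100)
y <- a + rchisq(n, 5)
z <- a + rt(n, 100)

var(a)                             # close to 25
covs <- c(xy = cov(x, y), xz = cov(x, z), yz = cov(y, z))
covs                               # each expected to be close to 25
```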
Formula of covariance
Cov(x,y) = sum( (x - mean(x)) * (y - mean(y)) ) / (n-1)
 
sum((x-mean(x))*(y-mean(y)))/(n-1)
sum((x-mean(x))*(z-mean(z)))/(n-1)
sum((y-mean(y))*(z-mean(z)))/(n-1)
Calculation in R
W=cbind(x,y,z)
dim(W)
cov(W)
var(W)
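The covariance matrix returned by cov(W) can also be written with matrix operations: center each column of W, then form the cross-product divided by n-1. The sketch below is one way to do this under assumed inputs; the seed, the smaller n, and the names Wc and S are illustrative.

```r
set.seed(6)                              # assumed seed
n <- 1000
W <- cbind(x = rpois(n, 100), y = rchisq(n, 5), z = rt(n, 100))

Wc <- sweep(W, 2, colMeans(W))           # subtract each column mean
S <- t(Wc) %*% Wc / (n - 1)              # 3 x 3 covariance matrix

all.equal(S, cov(W))                     # same result as the built-in
diag(S)                                  # the diagonal holds the variances
```

The result is symmetric, with variances on the diagonal and pairwise covariances off the diagonal.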
Element-wise matrix manipulations
Addition/subtraction
Element-wise (dot) product
Element-wise (dot) division
 
a=matrix(seq(10,60,10),2,3)
b=matrix(seq(1,6),2,3)
a
b
a+b
a-b
a*b
a/b
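Element-wise operators act entry by entry and need matrices of the same dimensions, whereas the matrix product %*% needs the number of columns of the first matrix to match the number of rows of the second. The sketch below contrasts the two on the same a and b; the names elementwise, ratio, and product are assumptions.

```r
a <- matrix(seq(10, 60, 10), 2, 3)
b <- matrix(seq(1, 6), 2, 3)

elementwise <- a * b       # 2 x 3: entry [i, j] is a[i, j] * b[i, j]
ratio <- a / b             # 2 x 3: entry-by-entry division
product <- a %*% t(b)      # 2 x 2: (2 x 3) times (3 x 2)

dim(elementwise)
dim(product)
```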
Multiplication
 
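# Columns of c: mean (intercept), education (AS=1, BS=2, MS=3, PhD=4), and age
c=matrix(c(1,1,1,4,30,50),2,3)
# Rows of b: effects of mean, education, and age; columns: Salary and SQF
b=matrix(c(20000,10000,1000,1000,300,20),3,2)
# Product: Salary = 60000 and 110000, SQF = 1900 and 3200
t=c%*%b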
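Each entry of a matrix product is the sum of products of one row of the first matrix with one column of the second. The sketch below recomputes one entry by hand; the matrices carry the same values as on the slide but are renamed X, B, and P (assumed names) so that the base functions c() and t() are not masked.

```r
X <- matrix(c(1, 1, 1, 4, 30, 50), 2, 3)                  # 2 x 3
B <- matrix(c(20000, 10000, 1000, 1000, 300, 20), 3, 2)   # 3 x 2
P <- X %*% B                                              # 2 x 2

# Entry [1, 1] by hand: first row of X times first column of B,
# i.e. 1*20000 + 1*10000 + 30*1000
by_hand <- sum(X[1, ] * B[, 1])

P
by_hand
```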
Inverse
If AB = I, where I is the identity matrix, then B is the inverse of A, and vice versa.
The inverse is defined for square matrices only.
Inverse in R: solve()
t
ti=solve(t)
ti
ti %*% t
t%*%ti
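Because of floating-point rounding, the two products are usually only approximately equal to the identity matrix, so a tolerance-based comparison is safer than an exact one. A minimal sketch, assuming the 2 x 2 product from the multiplication slide as input and using the illustrative names M, Mi, and I2:

```r
# Values of the 2 x 2 product from the multiplication slide
M <- matrix(c(60000, 110000, 1900, 3200), 2, 2)

Mi <- solve(M)                 # inverse of M
I2 <- diag(2)                  # 2 x 2 identity matrix

all.equal(Mi %*% M, I2)        # left inverse, within numerical tolerance
all.equal(M %*% Mi, I2)        # right inverse, within numerical tolerance
```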
Transpose
c=matrix(c(1,1,1,4,30,50),2,3)
c
t(c)
Properties of transpose
(A^T)^T = A
(A+B)^T = A^T + B^T
(AB)^T = B^T A^T
(cB)^T = c B^T, where c is a scalar
 
A=matrix(c(1,1,1,4,30,50),2,3)
B=matrix(c(1000,300,20,20000,10000,1000),3,2)
t(A%*%B)
t(B)%*%t(A)
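All four transpose properties can be checked in the same way. In the sketch below the scalar k and the use of t(B) as a second 2 x 3 matrix for the sum rule are assumptions made so that the dimensions match.

```r
A <- matrix(c(1, 1, 1, 4, 30, 50), 2, 3)                  # 2 x 3
B <- matrix(c(1000, 300, 20, 20000, 10000, 1000), 3, 2)   # 3 x 2
k <- 3                                                    # assumed scalar

identical(t(t(A)), A)                    # (A^T)^T = A
identical(t(A + t(B)), t(A) + B)         # (A+B)^T = A^T + B^T, with t(B) as the second 2 x 3 matrix
all.equal(t(A %*% B), t(B) %*% t(A))     # (AB)^T = B^T A^T
identical(t(k * B), k * t(B))            # (cB)^T = c B^T
```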
 
Special matrices
Symmetric: A = transpose(A)
Diagonal: all off-diagonal elements are 0
Identity: diagonal elements are 1 and the rest are 0
Orthogonal: A multiplied by transpose(A) = identity
Singular: a square matrix that does not have an inverse
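Small examples of each special matrix can be built and checked in R. The matrices below are illustrative choices, not taken from the lecture: a rotation matrix stands in for an orthogonal matrix, and a matrix whose second column is twice the first stands in for a singular one.

```r
I3 <- diag(3)                              # identity
D <- diag(c(2, 5, 7))                      # diagonal
S <- matrix(c(2, 1, 1, 3), 2, 2)           # symmetric
theta <- pi / 6                            # assumed rotation angle
Q <- matrix(c(cos(theta), sin(theta),
              -sin(theta), cos(theta)), 2, 2)   # rotation matrix: orthogonal
G <- matrix(c(1, 2, 2, 4), 2, 2)           # singular: column 2 = 2 * column 1

isSymmetric(S)                             # S equals its transpose
all.equal(Q %*% t(Q), diag(2))             # Q times transpose(Q) is the identity
det(G)                                     # a zero determinant means no inverse
```

Calling solve(G) on the singular matrix would stop with an error, which is R's way of reporting that no inverse exists.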
 
Rank
The size of the largest non-singular square submatrix
Full-rank matrix: rank = dimension (for a non-square matrix, the smaller of the number of rows and columns)
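Base R has no dedicated rank function for matrices, but the rank estimated by a QR decomposition is available as qr()$rank. The sketch below applies it to three small matrices; A reuses the values from the transpose slide, while G and the variable names are assumptions for illustration.

```r
A <- matrix(c(1, 1, 1, 4, 30, 50), 2, 3)   # 2 x 3 matrix
G <- matrix(c(1, 2, 2, 4), 2, 2)           # singular: column 2 = 2 * column 1

rank_A <- qr(A)$rank        # compare with min(dim(A)) to test for full rank
rank_G <- qr(G)$rank        # less than 2 because G is singular
rank_I <- qr(diag(3))$rank  # identity matrices are full rank

c(A = rank_A, G = rank_G, I = rank_I)
```

Note that qr() decides the rank numerically with a tolerance, so nearly singular matrices can be reported as rank deficient.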
Highlight
Example of first question on Homework1
Expectation and variance of a random variable
Expectation and variance of a function of a random variable
Covariance
Matrices and manipulations
Special matrices: identity, symmetric, diagonal, singular, and orthogonal
Rank


Uploaded on Sep 21, 2024




