Random Sampling in Probabilistic System Analysis


In the field of statistical inference, random sampling plays a crucial role in drawing conclusions about populations based on representative samples. This lecture by Dr. Erwin Sitompul at President University delves into the concepts of sampling distributions, unbiased sampling procedures, and important statistics derived from random samples. The importance of random selection in research and its application in estimating population characteristics are highlighted through practical examples.

  • Random Sampling
  • Probabilistic System Analysis
  • Statistical Inference
  • Sampling Distributions
  • Population Characteristics

Uploaded on Sep 13, 2024



Presentation Transcript


  1. Probabilistic System Analysis, Lecture 10. Dr.-Ing. Erwin Sitompul, President University, http://zitompul.wordpress.com, 2021. President University Erwin Sitompul PSA 10/1

  2. Chapter 8 Fundamental Sampling Distributions and Data Descriptions

  3. Chapter 8.1 Random Sampling. Populations and Samples. A population consists of the totality of the observations with which we are concerned. In other words, a population is the entire group we are interested in and wish to describe or draw conclusions about. A sample is a subset of a population. In the field of statistical inference, the statistician is interested in arriving at conclusions concerning a population when it is impossible or impractical to observe the entire set of observations that make up the population. This brings us to the notion of sampling. To obtain valid inferences about a population, the samples must be representative of the population.

  4. Chapter 8.1 Random Sampling. Any sampling procedure that produces inferences that consistently overestimate or consistently underestimate some characteristic of the population is said to be biased. To eliminate any possibility of bias in the sampling procedure, it is desirable to choose a random sample, in the sense that the observations are made independently and at random. Let $X_1, X_2, \ldots, X_n$ be $n$ independent random variables, each having the same probability distribution $f(x)$. We then define $X_1, X_2, \ldots, X_n$ to be a random sample of size $n$ from the population $f(x)$ and write its joint probability distribution as
     $$f(x_1, x_2, \ldots, x_n) = f(x_1)\, f(x_2) \cdots f(x_n)$$
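
The factorization of the joint distribution can be checked numerically. The following is a minimal Python sketch and not part of the lecture: it assumes a standard normal population purely for illustration, and `joint_density` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def joint_density(sample, population):
    """Joint density of an i.i.d. sample: the product of the marginal densities."""
    return math.prod(population.pdf(x) for x in sample)


# Assumed population for illustration only: standard normal.
population = NormalDist(mu=0.0, sigma=1.0)
sample = [0.0, 1.0, -1.0]

print(joint_density(sample, population))
```

The product form is valid only because the observations are independent and share the same distribution, which is what the definition of a random sample requires.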

  5. Chapter 8.1 Random Sampling. If one makes a random selection of $n = 8$ storage batteries from a manufacturing process that has maintained the same specifications, and records the length of life of each battery, with the first measurement $x_1$ being a value of $X_1$, the second measurement $x_2$ a value of $X_2$, and so forth, then $x_1, x_2, \ldots, x_8$ are the values of the random sample $X_1, X_2, \ldots, X_8$. If we assume the population of battery lives to be normal, the possible values of any $X_i$, $i = 1, 2, \ldots, 8$, will be precisely the same as those in the original population, and hence $X_i$ has exactly the same normal distribution as $X$.

  6. Chapter 8.2 Some Important Statistics. Random Sampling. Suppose we wish to arrive at a conclusion concerning the proportion of coffee-drinking people in the US who prefer a certain brand of coffee. It is impossible to compute the value of the parameter $p$ that represents the population proportion. Instead, we select a representative random sample and can easily calculate the proportion $\hat{p}$ of people in this sample favoring that brand of coffee. The value $\hat{p}$ is then used to make an inference concerning the true proportion $p$. Why? $\hat{p}$ is a function of the observed values in the random sample. Many different random samples can be taken from the population, and $\hat{p}$ would vary from sample to sample. $\hat{p}$ is a value of a random variable that is represented by $\hat{P}$. Any function of the random variables that constitute a random sample is called a statistic.
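
To see how the sample proportion varies from sample to sample, here is a small simulation sketch in Python. Everything in it is an assumption made for illustration: the population of 10,000 people, the true proportion of 0.30, the sample size of 200, and the helper name `sample_proportion`.

```python
import random


def sample_proportion(population, n, rng):
    """Draw a simple random sample of size n (without replacement) and return p-hat."""
    sample = rng.sample(population, n)
    return sum(sample) / n


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 1 = prefers the brand, 0 = does not (true p = 0.30).
population = [1] * 3000 + [0] * 7000

# Five independent samples give five estimates that scatter around the true p.
p_hats = [sample_proportion(population, 200, rng) for _ in range(5)]
print(p_hats)
```

Each run of `sample_proportion` plays the role of one observed value of the random variable $\hat{P}$.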

  7. Chapter 8.2 Some Important Statistics. Sample Mean and Sample Variance. If $X_1, X_2, \ldots, X_n$ represent a random sample of size $n$, then the sample mean is defined by the statistic
     $$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
     If $X_1, X_2, \ldots, X_n$ represent a random sample of size $n$, then the sample variance is defined by the statistic
     $$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

  8. Chapter 8.2 Some Important Statistics. A comparison of coffee prices at 4 randomly selected grocery stores in San Diego showed increases from the previous month of 12, 15, 17, and 20 cents for a 1-pound bag. Find the mean and the variance of this random sample of price increases.
     $$\bar{x} = \frac{12 + 15 + 17 + 20}{4} = 16 \text{ cents}$$
     $$s^2 = \frac{\sum_{i=1}^{4} (x_i - 16)^2}{3} = \frac{(12-16)^2 + (15-16)^2 + (17-16)^2 + (20-16)^2}{3} = \frac{34}{3}$$
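
The coffee-price calculation can be reproduced with Python's standard `statistics` module. This is only a checking sketch. Converting the data to `Fraction` is a choice made here so that the variance comes out as the exact value 34/3 rather than a rounded float.

```python
from fractions import Fraction
from statistics import mean, variance

increases = [12, 15, 17, 20]  # price increases in cents

x_bar = mean(increases)  # sample mean
# statistics.variance uses the n - 1 divisor, i.e. the sample variance.
s2 = variance([Fraction(x) for x in increases])

print(x_bar, s2)
```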

  9. Chapter 8.2 Some Important Statistics. Sample Variance and Sample Standard Deviation. If $S^2$ is the variance of a random sample of size $n$, we may write
     $$S^2 = \frac{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2}{n(n-1)}$$
     Previously, $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$. The sample standard deviation, denoted by $S$, is the positive square root of the sample variance. Find the variance of the data 3, 4, 5, 6, 6, and 7, representing the number of trout caught by a random sample of 6 fishermen on June 19, 1996, at Lake Muskoka.
     $$\sum_{i=1}^{6} x_i^2 = 171, \quad \sum_{i=1}^{6} x_i = 31$$
     $$s^2 = \frac{(6)(171) - (31)^2}{(6)(5)} = \frac{65}{30} = \frac{13}{6}$$
     Can you obtain the same result with the first formula?
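
As a way to answer the closing question of the trout example, the sketch below implements both forms of the sample variance in Python and compares them on the trout data. The function names are hypothetical, and exact `Fraction` arithmetic is used here only to avoid rounding.

```python
from fractions import Fraction


def variance_definition(xs):
    """First formula: sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    x_bar = Fraction(sum(xs), n)
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


def variance_shortcut(xs):
    """Second formula: [n * sum(x^2) - (sum(x))^2] / [n (n - 1)]."""
    n = len(xs)
    return Fraction(n * sum(x * x for x in xs) - sum(xs) ** 2, n * (n - 1))


trout = [3, 4, 5, 6, 6, 7]
print(variance_definition(trout), variance_shortcut(trout))  # 13/6 13/6
```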

  10. Chapter 8.4 Sampling Distributions. The field of statistical inference is basically concerned with generalizations and predictions. For each sample selected from the population we can compute statistics (i.e., the sample parameters), and from these statistics we make various statements concerning the values of the population parameters that may or may not be true. Since a statistic is a random variable that depends only on the observed sample, it must have a probability distribution. The probability distribution of a statistic is called a sampling distribution. The probability distribution of $\bar{X}$ is called the sampling distribution of the mean, and so on. The sampling distribution of a statistic depends on the size of the population, the size of the samples, and the method of choosing the samples.

  11. Chapter 8.4 Sampling Distributions. Sampling Distribution of $\bar{X}$ and $S^2$. One should view the sampling distributions of $\bar{X}$ and $S^2$ as the means (tools) with which we eventually make inferences on the parameters $\mu$ and $\sigma^2$. The sampling distribution of $\bar{X}$ with sample size $n$ is the distribution that results when an experiment is conducted over and over again (always with sample size $n$) and the many values of $\bar{X}$ result. This sampling distribution, then, describes the variability of the sample mean around the true population mean $\mu$. The same principle applies in the case of the distribution of $S^2$. The sampling distribution produces information about the variability of $s^2$ values around $\sigma^2$ in repeated experiments.

  12. Chapter 8.5 Sampling Distribution of Means. Central Limit Theorem. If $\bar{X}$ is the mean of a random sample of size $n$ taken from a population with mean $\mu$ and finite variance $\sigma^2$, then the limiting form of the distribution of
     $$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
     as $n \to \infty$, is the standard normal distribution $n(z; 0, 1)$.
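
The theorem can be illustrated by simulation. The sketch below is not from the lecture: it assumes an exponential population (chosen because it is clearly non-normal), a sample size of 50, and 2000 repetitions. It standardizes each sample mean as in the formula for $Z$ and then checks that the results have mean near 0 and standard deviation near 1.

```python
import math
import random
from statistics import mean, pstdev


def standardized_means(n, repetitions, rng, rate=1.0):
    """Repeat the experiment: draw n values, average them, standardize the average."""
    mu = sigma = 1.0 / rate  # an exponential population has mean and sd both 1/rate
    zs = []
    for _ in range(repetitions):
        x_bar = sum(rng.expovariate(rate) for _ in range(n)) / n
        zs.append((x_bar - mu) / (sigma / math.sqrt(n)))
    return zs


rng = random.Random(1)  # fixed seed so the run is repeatable
zs = standardized_means(n=50, repetitions=2000, rng=rng)

# For a standard normal limit we expect roughly 0 and 1 here.
print(round(mean(zs), 2), round(pstdev(zs), 2))
```

A histogram of `zs` would show the bell shape more directly. The two summary numbers are only a coarse check.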

  13. Chapter 8.5 Sampling Distribution of Means. An electrical firm manufactures light bulbs that have a length of life that is approximately normally distributed, with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have an average life of less than 775 hours.
     $$\mu_{\bar{X}} = 800, \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{16}} = 10, \quad \bar{x} = 775$$
     $$z = \frac{\bar{x} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{775 - 800}{10} = -2.5$$
     $$P(\bar{X} < 775) = P(Z < -2.5) = 0.0062$$
     It is very unlikely that the average life of a sample of 16 bulbs is less than 775 hours if the claimed mean life of 800 hours is true.
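
The light bulb probability can be checked without a normal table by using `statistics.NormalDist` from the Python standard library. This is a sketch, and `prob_sample_mean_below` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def prob_sample_mean_below(threshold, mu, sigma, n):
    """Return (z, P(X-bar < threshold)) for a sample mean of n observations."""
    standard_error = sigma / math.sqrt(n)  # sigma of X-bar
    z = (threshold - mu) / standard_error
    return z, NormalDist().cdf(z)  # NormalDist() is the standard normal


z, p = prob_sample_mean_below(775, mu=800, sigma=40, n=16)
print(z, round(p, 4))  # -2.5 0.0062
```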

  14. Chapter 8.5 Sampling Distribution of Means. An important manufacturing process produces cylindrical component parts for the automotive industry. It is important that the process produce parts having a mean diameter of 5 mm. An experiment is conducted in which 100 parts produced by the process are selected randomly and the diameter is measured on each. It is known that the population standard deviation is $\sigma = 0.1$. The experiment indicates a sample average diameter $\bar{x} = 5.027$ mm. Does this sample information appear to support or refute the engineer's conjecture?
     $$\mu_{\bar{X}} = 5, \quad \sigma_{\bar{X}} = \frac{0.1}{\sqrt{100}} = 0.01, \quad \bar{x} = 5.027$$
     $$z = \frac{5.027 - 5}{0.01} = 2.7$$
     $$P(|\bar{X} - 5| \ge 0.027) = 1 - P(|\bar{X} - 5| < 0.027)$$

  15. Chapter 8.5 Sampling Distribution of Means.
     $$P(|\bar{X} - 5| \ge 0.027) = 1 - P(|\bar{X} - 5| < 0.027) = 1 - P\left( \left| \frac{\bar{X} - 5}{0.1 / \sqrt{100}} \right| < 2.7 \right)$$
     $$= 1 - P(|Z| < 2.7) = 1 - \left[ P(Z < 2.7) - P(Z < -2.7) \right] = 1 - (0.9965 - 0.0035) = 0.007 = 0.7\%$$
     By chance alone, one would observe an $\bar{x}$ that is at least 0.027 mm from the mean $\mu$ in only 7 out of 1000 experiments. As a result, this experiment with $\bar{x} = 5.027$ certainly does not give supporting evidence for the conjecture that $\mu = 5$. In fact, it strongly refutes the conjecture.
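
The two-sided probability in the cylinder example can be verified in a few lines of Python. This is a sketch under the example's own assumptions ($\mu = 5$, $\sigma = 0.1$, $n = 100$). It uses the symmetry of the standard normal, so the two-tail probability is twice the upper-tail probability.

```python
import math
from statistics import NormalDist

mu, sigma, n = 5.0, 0.1, 100
x_bar = 5.027

standard_error = sigma / math.sqrt(n)  # 0.01
z = (x_bar - mu) / standard_error      # about 2.7

# P(|Z| >= z) = 2 * P(Z >= z) by symmetry of the standard normal.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 3), round(p_two_sided, 3))  # 2.7 0.007
```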

  16. Chapter 8.5 Sampling Distribution of Means

  17. Chapter 8.5 Sampling Distribution of Means. If independent samples of size $n_1$ and $n_2$ are drawn at random from two populations, discrete or continuous, with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then the sampling distribution of the differences of means, $\bar{X}_1 - \bar{X}_2$, is approximately normally distributed with mean and variance given by
     $$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2 \quad \text{and} \quad \sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
     Hence,
     $$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{(\sigma_1^2 / n_1) + (\sigma_2^2 / n_2)}}$$
     is approximately a standard normal variable.

  18. Chapter 8.5 Sampling Distribution of Means. Two independent experiments are being run in which two different types of paint are compared. Eighteen specimens are painted using type $A$ and the drying time in hours is recorded on each. The same is done with type $B$. The population standard deviations are both known to be 1.0. Assuming that the mean drying time is equal for the two types of paint, find $P(\bar{X}_A - \bar{X}_B > 1.0)$, where $\bar{X}_A$ and $\bar{X}_B$ are the average drying times for samples of size $n_A = n_B = 18$.
     $$\mu_{\bar{X}_A - \bar{X}_B} = \mu_A - \mu_B = 0$$
     $$\sigma^2_{\bar{X}_A - \bar{X}_B} = \frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B} = \frac{1}{18} + \frac{1}{18} = \frac{1}{9}$$
     $$z = \frac{(\bar{x}_A - \bar{x}_B) - (\mu_A - \mu_B)}{\sqrt{(\sigma_A^2 / n_A) + (\sigma_B^2 / n_B)}} = \frac{1 - 0}{\sqrt{1/9}} = 3.0$$

  19. Chapter 8.5 Sampling Distribution of Means.
     $$P(Z > 3) = 1 - P(Z < 3) = 1 - 0.9987 = 0.0013$$
     If the mean drying times are equal, the sample averages are very unlikely to differ by more than 1 hour. If a difference of 1 hour is actually measured, then the assumption that $\mu_A = \mu_B$ is questionable.

  20. Chapter 8.5 Sampling Distribution of Means. With the same example as before, what can be inferred if the difference in the two sample averages is only 15 minutes instead of 1 hour?
     $$z = \frac{1/4 - 0}{\sqrt{1/9}} = \frac{3}{4}$$
     $$P(Z > 3/4) = 1 - P(Z < 3/4) = 1 - 0.7734 = 0.2266$$
     In 22.66% of such experiments, the average drying time of paint $A$ will exceed that of paint $B$ by more than 15 minutes, although their means are the same. In other words, a difference in sample means of 15 minutes can happen by chance with probability 22.66% even though $\mu_A = \mu_B$. As a result, that type of difference in average drying time is certainly not a clear indication that $\mu_A \ne \mu_B$.
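
Both paint calculations (a difference of 1 hour and of 15 minutes) follow the same recipe, which can be sketched as one Python function. The function name and its parameter names are hypothetical. The normal probabilities come from `statistics.NormalDist` instead of a table.

```python
import math
from statistics import NormalDist


def prob_difference_exceeds(d, mu1, mu2, sigma1, sigma2, n1, n2):
    """Return (z, P(X1-bar - X2-bar > d)) for two independent samples."""
    standard_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = (d - (mu1 - mu2)) / standard_error
    return z, 1 - NormalDist().cdf(z)


# Equal means assumed, so only the difference mu1 - mu2 = 0 matters.
paint = dict(mu1=0.0, mu2=0.0, sigma1=1.0, sigma2=1.0, n1=18, n2=18)

z_hour, p_hour = prob_difference_exceeds(1.0, **paint)
z_quarter, p_quarter = prob_difference_exceeds(0.25, **paint)

print(round(z_hour, 2), round(p_hour, 4))        # 3.0 0.0013
print(round(z_quarter, 2), round(p_quarter, 4))  # 0.75 0.2266
```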

  21. Chapter 8.5 Sampling Distribution of Means. The television picture tubes of manufacturer $A$ have a mean lifetime of 6.5 years and a standard deviation of 0.9 year, while those of manufacturer $B$ have a mean lifetime of 6.0 years and a standard deviation of 0.8 year. What is the probability that a random sample of 36 tubes from manufacturer $A$ will have a mean lifetime that is at least 1 year more than the mean lifetime of a sample of 49 tubes from manufacturer $B$?
     $$\mu_{\bar{X}_A - \bar{X}_B} = \mu_A - \mu_B = 6.5 - 6.0 = 0.5$$
     $$\sigma^2_{\bar{X}_A - \bar{X}_B} = \frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B} = \frac{(0.9)^2}{36} + \frac{(0.8)^2}{49} = 3.556 \times 10^{-2}, \quad \sigma_{\bar{X}_A - \bar{X}_B} = 0.1886$$
     $$z = \frac{1 - (6.5 - 6.0)}{0.1886} = \frac{1 - 0.5}{0.1886} = 2.651$$
     $$P(Z > 2.651) = 1 - P(Z < 2.651) = 1 - 0.9960 = 0.0040$$

  22. Chapter 8.5 Sampling Distribution of Means.
     $$P(Z > 2.651) = 1 - P(Z < 2.651) = 1 - 0.9960 = 0.0040$$
     It is very unlikely (a chance of only 0.4%) that the mean lifetime of the sample of tubes from manufacturer $A$ will be at least 1 year longer than that of the sample from manufacturer $B$. The sample means are more likely to differ by around 0.5 year, as given by the difference of the population means.
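
The picture tube result can also be checked by letting Python build the distribution of the difference directly. The sketch below relies on the fact that `statistics.NormalDist` objects can be subtracted to give the distribution of the difference of two independent normal variables. Treating each sample mean as normal is the approximation stated in the lecture, and the variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Approximate sampling distributions of the two sample means.
mean_a = NormalDist(mu=6.5, sigma=0.9 / math.sqrt(36))
mean_b = NormalDist(mu=6.0, sigma=0.8 / math.sqrt(49))

# Difference of independent normals: means subtract, variances add.
difference = mean_a - mean_b

p_at_least_one_year = 1 - difference.cdf(1.0)
print(round(difference.mean, 4), round(difference.stdev, 4))
print(round(p_at_least_one_year, 4))
```

The printed standard deviation should agree with the value 0.1886 computed by hand, and the probability with 0.0040.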

  23. Probabilistic System Analysis. Homework 10A.
     1. A company manufactures light bulbs that have a mean operating voltage of 100 volts and a standard deviation of 10 volts. The distribution of light bulb voltage is normal. Find the probability that a random sample of 25 light bulbs will have an average operating voltage of less than 95 volts. (Mont.E7.13)
     2. A first random sample of size 36 is taken from a normal population having a mean of 75 and a standard deviation of 3. A second random sample of size 25 is taken from another normal population having a mean of 80 and a standard deviation of 5. Find the probability that the sample mean computed from the second population will exceed the sample mean computed from the first population by at least 3.4 but less than 5.9. (Wal8.828)
