Spring 2BL : Lecture 5

Vivek Sharma
5561 Mayer Hall Addition
858 822 2918
With grateful thanks to past 2BL instructors for input: Tera Austrum, Jim Branson & Avi Yagil 
Schedule For Week 5
Lecture on probability distributions and confidence levels
This week ("B") you will finalize and submit Experiment #2 by the end of your lab session
Be sure to read & follow the experiment rubric as you write your lab report
Go to the LTAC office hour (today, 2-4 pm, 2722 MHA) if you need help (particularly with Excel or if you need to retake data)
Read Chapter 7 from the Taylor book
Section A02 (Tuesday, 3:30 pm) students should mail to the TA (ngrogers@ucsd.edu) the Lab #2 quiz score and show him the quiz sheet in the next lab.
Histograms & Limiting Distributions
 
Take multiple (N) measurements of the same quantity, x
Calculate the "average" and "spread" of the values
Mean and standard deviation
Determine the uncertainty on the mean
Standard Deviation Of the Mean (SDOM)
As you take more and more measurements, you want to visualize the distribution of the measurements you make
A convenient way to visualize data: plot the measured values in a binned histogram
As the number of measurements N becomes very large, you see a clear shape emerging from the distribution of measured values: the limiting distribution
Histograms
How To Make A Binned Histogram
Determine the range of your data (largest value - smallest value)
Choose the number of bins (≥ 4)
The width of the bins, $\Delta_k$, is the range divided by the number of bins
Usually bin width = ½ σ
List the bin boundaries and count the number of data points, $n_k$, in each bin
Draw the histogram
The x-scale represents the measured values
The y-scale, $f_k$, is the number of measurements in each bin
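The binning steps above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the lab: the function name `make_histogram` is hypothetical, the bins are assumed to have equal width, and the data are the pendulum periods from the Chauvenet example later in this lecture.

```python
def make_histogram(data, num_bins):
    """Return (bin_edges, counts) for equal-width bins spanning the data range.

    Assumes the data are not all identical (the range must be non-zero).
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins              # bin width = range / number of bins
    edges = [lo + k * width for k in range(num_bins + 1)]
    counts = [0] * num_bins
    for x in data:
        k = int((x - lo) / width)             # index of the bin containing x
        k = min(k, num_bins - 1)              # the largest value goes in the last bin
        counts[k] += 1
    return edges, counts


# Pendulum periods in seconds, taken from the Chauvenet example in this lecture.
periods = [2.7, 2.3, 2.9, 2.3, 2.6, 2.9, 2.8, 2.7, 2.8, 3.2, 2.5, 2.9, 2.9, 2.3]
edges, counts = make_histogram(periods, num_bins=5)

# Crude text histogram: one '#' per measurement in each bin.
for k, n_k in enumerate(counts):
    print(f"{edges[k]:.2f} to {edges[k + 1]:.2f}: {'#' * n_k}")
```

Values that sit exactly on a bin boundary can land on either side because of floating-point rounding, so it is worth choosing boundaries that do not coincide with measured values.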
Normalized Histogram
Want the area of each bin to equal the probability of finding a measurement within that bin
Area of rectangle k: $A_k = f_k \Delta_k$
$f_k$ = vertical scale
$\Delta_k$ = width of bins
Fraction of data in bin: $F_k = n_k / N$
$n_k$ = number of measurements in the kth bin
$N$ = total number of measurements
Choose $f_k$ so that $A_k = F_k$ (total area of histogram = 1)
[Figure: five-bin histogram with counts $n_1, \dots, n_5$, showing the height $f_k$, width $\Delta_k$ and area $A_k$ of one bin; $N = n_1 + n_2 + n_3 + n_4 + n_5$]
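As a rough sketch of this normalization, the heights $f_k = n_k / (N \Delta_k)$ make each bin area equal to $F_k$. The counts and bin width below are hypothetical numbers chosen only for illustration, and `normalize_histogram` is an assumed helper name.

```python
def normalize_histogram(counts, bin_width):
    """Convert counts n_k to heights f_k so that f_k * bin_width = n_k / N."""
    total = sum(counts)                        # N = n_1 + n_2 + ... + n_5
    return [n_k / (total * bin_width) for n_k in counts]


counts = [3, 2, 3, 5, 1]                       # hypothetical n_k for five bins
bin_width = 0.18                               # hypothetical Delta_k, same for all bins

heights = normalize_histogram(counts, bin_width)
areas = [f_k * bin_width for f_k in heights]   # A_k = f_k * Delta_k = F_k
total_area = sum(areas)

print("heights f_k:", [round(f, 3) for f in heights])
print("total area :", total_area)
```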
Calculating Mean From A  Histogram
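A minimal sketch of this calculation, assuming the usual relation $\bar{x} = \sum_k x_k n_k / N = \sum_k x_k F_k$, where $x_k$ are the distinct measured values and $n_k$ is how often each occurs. That relation is an assumption here, since it is not spelled out in the text of the slide. The data are the pendulum periods from the Chauvenet example in this lecture.

```python
from collections import Counter

# Pendulum periods in seconds, taken from the Chauvenet example in this lecture.
periods = [2.7, 2.3, 2.9, 2.3, 2.6, 2.9, 2.8, 2.7, 2.8, 3.2, 2.5, 2.9, 2.9, 2.3]
N = len(periods)

# n_k: number of times each distinct value x_k was measured.
counts = Counter(periods)

# Mean from the histogram: sum over distinct values of x_k * n_k / N.
mean_from_counts = sum(x_k * n_k for x_k, n_k in counts.items()) / N

# Mean computed directly from the list of measurements, for comparison.
mean_direct = sum(periods) / N

print(mean_from_counts, mean_direct)
```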
Limiting Distributions
As N increases, a limiting distribution $f(x)$ takes shape
Choose the normalization of the limiting distribution such that
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
$$\int_{A}^{B} f(x)\,dx = P_{AB}$$
where $P_{AB}$ = probability of observing a measured value between A and B
 
The Gaussian ( or Normal) Distribution
 
The limiting distribution for a measurement x subject to many small random errors is bell-shaped and centered on the true value of x
The mathematical function that describes the bell-shaped curve is called the normal distribution or Gauss function:
$$G(x) \propto e^{-(x-X)^2/2\sigma^2}$$
Defined by two parameters:
$\sigma$ = width parameter
$X$ = true value of x
The Normalized Gaussian Distribution
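Normalization
$$\int_{-\infty}^{+\infty} G(x)\,dx = 1$$
The normalized Gaussian function is
$$G_{X,\sigma}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-X)^2/2\sigma^2}$$
For a Gaussian form
Standard deviation $\sigma_x$ is the width parameter $\sigma$
Mean value of x = true value X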
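These three statements can be checked numerically. The sketch below integrates the normalized Gaussian with a simple trapezoid rule; the helper names `gauss` and `integrate`, the integration range of ±10σ and the values X = 9.8, σ = 0.1 are all choices made for this illustration.

```python
import math


def gauss(x, X, sigma):
    """Normalized Gaussian with true value X and width parameter sigma."""
    return math.exp(-(x - X) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def integrate(f, a, b, steps=20000):
    """Trapezoid-rule estimate of the integral of f from a to b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


X, sigma = 9.8, 0.1                            # illustrative values
a, b = X - 10 * sigma, X + 10 * sigma          # tails beyond 10 sigma are negligible

area = integrate(lambda x: gauss(x, X, sigma), a, b)
mean = integrate(lambda x: x * gauss(x, X, sigma), a, b)
variance = integrate(lambda x: (x - X) ** 2 * gauss(x, X, sigma), a, b)

print("area     :", area)                      # should be close to 1
print("mean     :", mean)                      # should be close to X
print("std. dev.:", math.sqrt(variance))       # should be close to sigma
```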
Gaussian Distribution Shape: Changing X
Gaussian Distribution Shape: Changing σ
The Meaning Of σ In A Gaussian Distribution
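$G_{X,\sigma}(x)$ tells us the probability of obtaining any given value x
$$P_{AB} = \int_{A}^{B} G_{X,\sigma}(x)\,dx$$
is the probability that any one measurement gives an answer in the range $A \le x \le B$
The probability that a measurement will fall within $t\sigma$ of X:
$$P(\text{within } t\sigma) = \int_{X-t\sigma}^{X+t\sigma} G_{X,\sigma}(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{X-t\sigma}^{X+t\sigma} e^{-(x-X)^2/2\sigma^2}\,dx$$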
Probability Of A Measurement In Terms Of σ
Taylor book, Table A, page 287
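If the table is not at hand, the probability of a Gaussian measurement falling within $t\sigma$ of the true value can be computed from the error function, $P(\text{within } t\sigma) = \mathrm{erf}(t/\sqrt{2})$. The short sketch below tabulates a few values; the function name `prob_within` and the particular values of t are choices made for this illustration.

```python
import math


def prob_within(t):
    """Probability that a Gaussian measurement lies within t standard deviations."""
    return math.erf(t / math.sqrt(2))


for t in (0.5, 1.0, 1.5, 1.96, 2.0, 2.58, 3.0):
    print(f"t = {t:4.2f}   P(within t sigma) = {100 * prob_within(t):6.2f}%")
```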
Example Of A Normal Distribution
Compatibility Of A Measured Result: t-score
$$t = \frac{\text{actual deviation from mean}}{\text{standard deviation}}$$
Acceptability Of A Measurement
Large probability means a reasonably likely outcome
Small probability means a reasonable chance of a real discrepancy
What is reasonable depends on some convention
We will define
< 5% probability (t > 1.96) as a significant discrepancy
< 1% probability (t > 2.58) as an unreasonably large discrepancy
Prob(outside $t\sigma$) = 1 - Prob(within $t\sigma$)
If the probability is less than 1%, we declare the data incompatible with expectation
t-Score Test And Confidence Level
A student measures g, the acceleration of gravity, repeatedly and carefully, and gets a final answer of 9.5 m/s² with a standard deviation of 0.1 m/s². If his measurement were normally distributed, with a mean at the accepted value of 9.8 m/s² and with σ = 0.1 m/s², what is the probability of getting an answer that differs from 9.8 by as much as (or more than) his result?
$$t = \frac{9.8 - 9.5}{0.1} = 3$$
It's three standard deviations off the mean. Looking up the probability, we see that 99.73% of measurements are within 3σ, so the probability of getting a result at least this far from g = 9.8 m/s² is 0.27%.
The Confidence Level is the probability to get a "worse" result than you measured.
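The same test can be written out as a short sketch. The function names are hypothetical, the tail probability uses the complementary error function in place of the table lookup, and the wording of the verdicts follows the 5% and 1% conventions defined on the previous slide.

```python
import math


def t_score(measured, expected, sigma):
    """Number of standard deviations between the measured and expected values."""
    return abs(measured - expected) / sigma


def prob_outside(t):
    """Probability that a Gaussian measurement is off by t sigma or more (both tails)."""
    return math.erfc(t / math.sqrt(2))


def verdict(p):
    """Apply the lecture's conventions: < 5% significant, < 1% unreasonably large."""
    if p < 0.01:
        return "unreasonably large discrepancy"
    if p < 0.05:
        return "significant discrepancy"
    return "acceptable"


t = t_score(measured=9.5, expected=9.8, sigma=0.1)
p = prob_outside(t)
print(f"t = {t:.2f}, confidence level = {100 * p:.2f}%, {verdict(p)}")
```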
t-Score and Confidence Levels
Two students measure the radius of a planet. Student A gets R = 9000 km and estimates an error of σ = 600 km. Student B gets R = 6000 km with an error of σ = 1000 km. What is the probability that the two measurements would disagree by more than this (given the error estimates)?
Define a quantity $q = R_A - R_B = 3000$ km. The expected q is zero.
Use propagation of errors to determine the error on q:
$$\sigma_q = \sqrt{\sigma_A^2 + \sigma_B^2} \approx 1170\ \text{km}$$
Compute t, the number of standard deviations from the expected q:
$$t = \frac{q}{\sigma_q} = \frac{9000 - 6000}{1170} = 2.56$$
Now look at Table A. 98.95% should be within 2.56σ, so the probability to get a worse result is 1.05%.
We call this the Confidence Level of the measurement, and in this case it is BAD!
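A sketch of the same comparison in Python, using the error function in place of Table A. The variable names are illustrative. Without the intermediate rounding of $\sigma_q$ to 1170 km, the unrounded values come out as $\sigma_q \approx 1166$ km and $t \approx 2.57$, so the probability differs slightly from the 1.05% quoted above.

```python
import math

# Measurements and error estimates from the example (km).
r_a, sigma_a = 9000.0, 600.0
r_b, sigma_b = 6000.0, 1000.0

q = r_a - r_b                                  # expected to be zero if A and B agree
sigma_q = math.hypot(sigma_a, sigma_b)         # sqrt(sigma_a**2 + sigma_b**2)
t = q / sigma_q                                # standard deviations from the expected q
p_worse = math.erfc(t / math.sqrt(2))          # probability of a larger disagreement

print(f"sigma_q = {sigma_q:.0f} km, t = {t:.2f}, confidence level = {100 * p_worse:.2f}%")
```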
Rejection Of Data
Rejecting data in an unwarranted fashion can bias your measurements.
If there is suspicion of a mistake, data should be rejected without looking at the value measured.
If only the measured value is suspicious, we should have a prescription for data rejection. We will use one called Chauvenet's Criterion.
Data are rejected if we expect less than 0.5 measurements with a deviation from the mean as large or larger than the one in question.
The criterion should be reapplied after the worst case is rejected.
What would you call a follower of Chauvenet?
Example Of Chauvenet’s Criterion
A student makes 14 measurements of the period of a pendulum. She gets the following measurements, all with the same estimated error:
T = 2.7, 2.3, 2.9, 2.3, 2.6, 2.9, 2.8, 2.7, 2.8, 3.2, 2.5, 2.9, 2.9, and 2.3 seconds
Should any of these measurements be dropped?
Add up all the periods and divide by 14 to get the average, T = 2.7 seconds:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Compute the standard deviation from the data:
$$\sigma_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = 0.27\ \text{seconds}$$
The measurement furthest from the mean is 3.2 seconds, giving t = 0.5/0.27 = 1.85.
Look up the probability to be further off: P = 6.43%
Multiply by the number of measurements to get the expected number of events that far off: $n_{exp}$ = (14)(0.0643) = 0.9
Do not drop this measurement (or any other)
Assume the student made a 15th measurement but her partner bumped the pendulum during the measurement. She got a period of 2.8 seconds. Should she drop this measurement? Why?
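The worked example can be reproduced with a short sketch. The function name `chauvenet_check` is hypothetical, and the probability is computed from the error function instead of being looked up in the table, so t comes out as 1.84 (the slide rounds σ to 0.27 first and gets 1.85). The conclusion is the same.

```python
import math
import statistics


def chauvenet_check(data):
    """Test the measurement furthest from the mean against Chauvenet's criterion.

    Returns (suspect, t, n_expected, reject), where reject is True if fewer than
    0.5 measurements are expected to deviate by as much as the suspect one.
    """
    mean = statistics.mean(data)
    sigma = statistics.stdev(data)             # sample standard deviation (n - 1)
    suspect = max(data, key=lambda x: abs(x - mean))
    t = abs(suspect - mean) / sigma
    p_outside = math.erfc(t / math.sqrt(2))    # probability of a deviation >= t sigma
    n_expected = len(data) * p_outside
    return suspect, t, n_expected, n_expected < 0.5


periods = [2.7, 2.3, 2.9, 2.3, 2.6, 2.9, 2.8, 2.7, 2.8, 3.2, 2.5, 2.9, 2.9, 2.3]
suspect, t, n_expected, reject = chauvenet_check(periods)
print(f"suspect = {suspect} s, t = {t:.2f}, expected = {n_expected:.2f}, reject = {reject}")
```

To follow the slide's prescription of reapplying the criterion, one would remove a rejected point and call the function again on the remaining data.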
Rejection Of Data: Chauvenet’s Criterion
The Maximum Likelihood Principle
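The best estimates for X and σ from N observed measurements $x_i$ are those for which the probability $P_{X,\sigma}(x_i)$ is maximum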
The Maximum Likelihood Principle
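Recall the probability density for measurements of some quantity x:
$$P_{X,\sigma}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-X)^2/2\sigma^2}$$
The normal distribution is one example of P(x).
Now, let's make repeated measurements of x to help reduce our errors: $x_1, x_2, x_3, \dots, x_n$
We define the Likelihood as the product of the probabilities. The larger L, the more likely a set of measurements is.
$$L = P(x_1)\,P(x_2)\,P(x_3)\cdots P(x_n)$$
Is L a probability?
The best-estimate parameters of P(x) are those that maximize L.
Why does max L give the best estimate?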
Proof That Mean Is Best Estimate of True Value X
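Assume X is a parameter of P(x). When L is maximum, we must have
$$\frac{\partial L}{\partial X} = 0$$
Let's assume a Normal error distribution and find the formula for the best value for X.
$$L = P(x_1)P(x_2)\cdots P(x_n) = \prod_{i=1}^{n} P(x_i) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x_i-X)^2/2\sigma^2} = \frac{1}{(2\pi)^{n/2}\sigma^{n}}\, e^{-\sum_{i=1}^{n}(x_i-X)^2/2\sigma^2}$$
$$L = C e^{-\chi^2/2}, \qquad \chi^2 = \sum_{i=1}^{n} \frac{(x_i-X)^2}{\sigma^2} \quad \text{(definition)}$$
$$\frac{\partial L}{\partial X} = -\frac{1}{2}\, C e^{-\chi^2/2}\, \frac{\partial \chi^2}{\partial X} = 0 \quad\Rightarrow\quad \frac{\partial \chi^2}{\partial X} = 0$$
$$\frac{\partial \chi^2}{\partial X} = -\frac{2}{\sigma^2} \sum_{i=1}^{n} (x_i - X) = 0 \quad\Rightarrow\quad \sum_{i=1}^{n} x_i - nX = 0$$
$$X = \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{(the mean)} \qquad \text{Q.E.D.}$$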
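The result can also be checked numerically. The sketch below scans candidate values of X on a grid and picks the one with the largest log-likelihood (the logarithm is used only to avoid multiplying many small numbers; it has its maximum at the same X as L). The data are the pendulum periods from the Chauvenet example, and σ = 0.27, the grid range and the grid spacing are choices made for this illustration.

```python
import math


def log_likelihood(X, data, sigma):
    """Logarithm of L = product of Gaussian densities P(x_i) with true value X."""
    n = len(data)
    chi2 = sum((x - X) ** 2 for x in data) / sigma ** 2
    return -n * math.log(sigma * math.sqrt(2 * math.pi)) - chi2 / 2


periods = [2.7, 2.3, 2.9, 2.3, 2.6, 2.9, 2.8, 2.7, 2.8, 3.2, 2.5, 2.9, 2.9, 2.3]
sigma = 0.27                                   # assumed common error on each measurement

# Candidate values of X from 2.000 to 3.400 in steps of 0.001.
candidates = [2.0 + 0.001 * k for k in range(1401)]
best_X = max(candidates, key=lambda X: log_likelihood(X, periods, sigma))

mean = sum(periods) / len(periods)
print(f"X that maximizes L: {best_X:.3f}, mean of data: {mean:.3f}")
```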
Error On The Mean
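Formula for the mean of the measurements. (We've shown that this is the best estimate of the true x.)
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Now (simply) use propagation of errors to get the error on the mean.
$$\sigma_{\bar{x}} = \sqrt{\left(\frac{\partial \bar{x}}{\partial x_1}\sigma_{x_1}\right)^2 + \left(\frac{\partial \bar{x}}{\partial x_2}\sigma_{x_2}\right)^2 + \cdots + \left(\frac{\partial \bar{x}}{\partial x_n}\sigma_{x_n}\right)^2}, \qquad \frac{\partial \bar{x}}{\partial x_i} = \frac{1}{n}$$
$$\sigma_{\bar{x}} = \sqrt{\sum_{i=1}^{n} \frac{\sigma_x^2}{n^2}} = \sqrt{\frac{n\,\sigma_x^2}{n^2}} = \frac{\sigma_x}{\sqrt{n}}$$
We got the error on the mean (SDOM) simply by propagating errors.
What would you do if the $x_i$ had different errors?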
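As a small numerical illustration, the SDOM for the pendulum data from the Chauvenet example can be computed both from the closed formula $\sigma_x/\sqrt{n}$ and by summing the propagated contributions term by term. The variable names are illustrative.

```python
import math
import statistics

periods = [2.7, 2.3, 2.9, 2.3, 2.6, 2.9, 2.8, 2.7, 2.8, 3.2, 2.5, 2.9, 2.9, 2.3]
n = len(periods)
sigma_x = statistics.stdev(periods)            # standard deviation of one measurement

# Closed formula: standard deviation of the mean.
sdom = sigma_x / math.sqrt(n)

# Same thing by explicit propagation: each x_i contributes (1/n) * sigma_x.
sdom_propagated = math.sqrt(sum((sigma_x / n) ** 2 for _ in range(n)))

print(f"mean = {statistics.mean(periods):.2f} s, SDOM = {sdom:.3f} s")
```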
Weighted Average
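We can use maximum likelihood ($\chi^2$) to average measurements with different errors.
$$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - X)^2}{\sigma_i^2}, \qquad \frac{\partial \chi^2}{\partial X} = -2 \sum_{i=1}^{n} \frac{x_i - X}{\sigma_i^2} = 0$$
$$\sum_{i=1}^{n} w_i x_i - X \sum_{i=1}^{n} w_i = 0, \qquad w_i = \frac{1}{\sigma_i^2}$$
We derive the result that
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
From error propagation, we can determine the error on the weighted mean.
$$\sigma_{\bar{x}} = \frac{1}{\sqrt{\sum_{i=1}^{n} w_i}}$$
What does this give in the limit where all errors are equal?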
An Example Of Weighted Average
Suppose 2 students measure the radius of Neptune. Student A gets r = 80 Mm with an error of 10 Mm and student B gets r = 60 Mm with an error of 3 Mm. What is the best estimate of the true radius of Neptune?
$$r = \frac{w_A r_A + w_B r_B}{w_A + w_B} = \frac{\frac{1}{100}\,80 + \frac{1}{9}\,60}{\frac{1}{100} + \frac{1}{9}} = 61.65\ \text{Mm}$$
What does this tell you about the importance of error estimates?
Is there a Democracy or a Meritocracy when it comes to measurements?
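A sketch of the weighted average for this example, with weights $w_i = 1/\sigma_i^2$ and an error of $1/\sqrt{\sum w_i}$ on the result. The function name `weighted_average` is hypothetical. The second call uses made-up equal errors of 5 Mm to show the equal-error limit, where the result reduces to the ordinary mean with error $\sigma/\sqrt{n}$.

```python
import math


def weighted_average(values, errors):
    """Return (best estimate, error) using weights w_i = 1 / sigma_i**2."""
    weights = [1 / s ** 2 for s in errors]
    best = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    error = 1 / math.sqrt(sum(weights))
    return best, error


# Neptune example: student A gets 80 +/- 10 Mm, student B gets 60 +/- 3 Mm.
r_best, r_err = weighted_average([80.0, 60.0], [10.0, 3.0])
print(f"best radius = {r_best:.2f} +/- {r_err:.2f} Mm")

# Equal (made-up) errors: the weighted average becomes the ordinary mean.
equal_best, equal_err = weighted_average([80.0, 60.0], [5.0, 5.0])
print(f"equal errors: {equal_best:.2f} +/- {equal_err:.2f} Mm")
```

The precise measurement dominates: the best estimate sits much closer to 60 Mm than to 80 Mm.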

In Lecture 5 of the Spring 2BL course with Vivek Sharma, students will delve into probability distributions, confidence levels, and the visualization of data through histograms. The lecture covers topics like calculating the mean and standard deviation, understanding the uncertainty in measurements, and creating binned histograms. Students are also reminded to submit Experiment #2, attend office hours for assistance, and read Chapter 7 from the Taylor book. The content emphasizes the importance of visualizing data distribution as the number of measurements increases.


Uploaded on Apr 19, 2024




