Examples of Data Analysis Techniques and Linear Regression Models

 
Get out the Notes from Monday Feb. 4th, 2015
 
Example 2: Consider the table below displaying the percentage of recorded music sales coming from music stores from 1998 to 2004.
 
 
 
Let’s create a scatter plot for this data.
 
For our scatter plot in Example 2, the linear function y = -2.97x + 48.89 represents the linear function that best models our data. The graph below shows the scatter plot and function plotted together.
 
What does this equation mean in the context of this problem?
 
 
 
What is the slope of the line? What does it mean in the
context of the problem?
 
What is the y-intercept? What does it mean in the context
of the problem?
 
 
 
What percentage of total sales came from music stores
in 2008?
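
The last question can be checked numerically. The sketch below evaluates the model y = -2.97x + 48.89 for a given year. The slide's table is not reproduced here, so the meaning of x is an assumption: the code treats x as the number of years since 1998 (the first year in the table), and the names `music_store_share`, `share_in_year`, and `BASE_YEAR` are made up for illustration. Note that 2008 lies outside the 1998 to 2004 data, so the result is an extrapolation.

```python
# Assumption: x counts years since 1998. The original table may define x differently.
BASE_YEAR = 1998


def music_store_share(x):
    """Predicted percentage of recorded music sales coming from music stores."""
    return -2.97 * x + 48.89


def share_in_year(year, base_year=BASE_YEAR):
    """Convert a calendar year to x and evaluate the model."""
    return music_store_share(year - base_year)


# Under the assumption above, 2008 corresponds to x = 10.
print(f"{share_in_year(2008):.2f}")  # prints 19.19
```

If x is defined differently in the table (for example, x = 1 for 1998), only `BASE_YEAR` needs to change.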
 
Example 3: The amount of antibiotic that remains in your body over a period of time varies from one drug to the next. The table given shows the amount of Antibiotic X that remains in your body over a period of two days.
 
 
a) Create a scatter plot for the data.
 
b) Create the line of best fit for the data using the instructions from example 1.
 
 
 
 
 
 
c) The exact line of best fit is y = -0.514x + 9.314.
   Use this function to determine the population in 2015.
 
 
d) What year will the population be below 3,000?
 
e) What does the number -0.514 represent? What
does it mean in the context of the problem?
 
 
f) What does the number 9.314 represent? What does
it mean in the context of the problem?
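
Parts (c) and (d) both come down to evaluating the line or solving it for x. The sketch below shows both operations for y = -0.514x + 9.314. The slide does not state the units of x and y (and the wording mixes the antibiotic data with a population), so the code treats them as plain numbers. The target value 3 is only an illustration that would match "3,000" if y were measured in thousands, which is an assumption. The function names are hypothetical.

```python
def line_value(x, slope=-0.514, intercept=9.314):
    """Evaluate y = slope * x + intercept."""
    return slope * x + intercept


def x_when_equal(target, slope=-0.514, intercept=9.314):
    """Solve slope * x + intercept = target for x (slope must be nonzero)."""
    return (target - intercept) / slope


# Assumption: y is in thousands, so "below 3,000" means y < 3.
# Because the slope is negative, y stays below the target for every x past this value.
crossing = x_when_equal(3)
print(f"y drops below 3 once x exceeds about {crossing:.2f}")
```

The slope -0.514 is the change in y for each one-unit increase in x, and 9.314 is the value of y when x = 0, which is what parts (e) and (f) ask about.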
 
 
Residual (Error): the ____________________ distance between a data point and the graph of the __ ________ ______________.
It is the difference between the observed value of y and the predicted value of y.
Your predicted values come from the equation for the line of best fit.
 
exact
 
Line of Best Fit
4.
 
EXAMPLES:
Determine if the linear regression model is appropriate by
looking at the graph of the residuals.
 
 
 
 
Step 1: Use the regression model to find the predicted values (3rd column).
 
Step 2: Find the residuals: 2nd column - 3rd column = 4th column (observed - predicted = residual).
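
Steps 1 and 2 can be sketched in a few lines of Python. The data are the total time and total distance values from this example's table, and the model is y = 1.73x - 0.96 as read from the slide. The minus sign on 0.96 is inferred from the tabulated predicted values (for example, 1.73 * 32 - 0.96 = 54.4), so treat it as an assumption. Variable and function names are illustrative.

```python
# 1st column: total time (minutes); 2nd column: total distance (miles).
times = [32, 19, 28, 36, 17, 23, 41, 22, 37, 28]
distances = [51, 30, 47, 56, 27, 35, 65, 41, 73, 54]


def predict(x):
    """Regression model read from the slide (sign of 0.96 inferred)."""
    return 1.73 * x - 0.96


# Step 1: predicted values (3rd column).
predicted = [predict(x) for x in times]

# Step 2: residual = observed - predicted (4th column).
residuals = [obs - pred for obs, pred in zip(distances, predicted)]

print("time  distance  predicted  residual")
for x, obs, pred, res in zip(times, distances, predicted, residuals):
    print(f"{x:>4}  {obs:>8}  {pred:>9.1f}  {res:>8.1f}")
```

The residuals are computed from the unrounded predictions, so a printed value may differ from the slide's table by 0.1 in a row where the prediction falls exactly halfway between two tenths.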
 
Step 3: Make a residual plot (model): graph the points (1st column, 4th column).
 
A residual plot is another way to help us determine if a linear relationship
exists between our variables. A residual plot is a scatter plot of the
independent variable (x) and the residuals. For residual plots,
the independent variable goes on the x-axis, and the residuals go on
the y-axis.
 
Step 4: Determine if the regression model is appropriate.
No pattern in the residual plot: yes, the linear model is appropriate.
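
Steps 3 and 4 can be tried without graphing software. The sketch below pairs each time value (1st column) with its residual (4th column, copied from this example's table), sorts the points by x, and draws a rough text plot in which bars to the left of the `|` are negative residuals and bars to the right are positive ones. The function names and the text layout are illustrative choices, not part of the original lesson, and a real scatter plot on graph paper or a calculator shows the same information more clearly.

```python
# 1st column (total time) and 4th column (residuals) from the example's table.
times = [32, 19, 28, 36, 17, 23, 41, 22, 37, 28]
residuals = [-3.4, -1.9, -0.5, -5.3, -1.5, -3.8, -5, 3.9, 9.9, 6.5]


def residual_points(xs, res):
    """Step 3: the points (x, residual), sorted by x for plotting."""
    return sorted(zip(xs, res))


def text_residual_plot(points, width=10):
    """Draw one row per point; '|' marks a residual of zero."""
    rows = []
    for x, r in points:
        bar = "#" * min(width, round(abs(r)))
        if r < 0:
            rows.append(f"{x:>4} {bar:>{width}}|")
        else:
            rows.append(f"{x:>4} {'':>{width}}|{bar}")
    return "\n".join(rows)


points = residual_points(times, residuals)
print(text_residual_plot(points))
```

Step 4 is then a visual judgment: look for a curve or a funnel shape in the bars as x increases. If the points fall on both sides of zero with no clear pattern, that supports using a linear model.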
 
Discussion:
What is this residual plot telling us about the relationship between speed and braking distance?
Let’s now end by discussing how to interpret the residual plots below…
 
 
Discussion:
1. Can we come up with any general rules for interpreting residual plots?

In these examples, we explore data analysis techniques and linear regression models using scatter plots, linear functions, and residual calculations. We analyze the trends in recorded music sales, antibiotic levels in the body, and predicted values in a linear regression model. The concepts of slope, y-intercept, residuals, and model appropriateness are explained within the context of each problem.

  • Data Analysis
  • Linear Regression
  • Scatter Plot
  • Residuals
  • Trend Analysis

Uploaded on Sep 11, 2024 | 0 Views



Presentation Transcript


  1. Get out the Notes from Monday Feb. 4th, 2015

  2. Example 2: Consider the table below displaying the percentage of recorded music sales coming from music stores from 1998 to 2004. Let's create a scatter plot for this data.

  3. For our scatter plot in Example 2, the linear function y = -2.97x + 48.89 represents the linear function that best models our data. The graph below shows the scatter plot and function plotted together. What does this equation mean in the context of this problem? What is the slope of the line? What does it mean in the context of the problem? What is the y-intercept? What does it mean in the context of the problem? What percentage of total sales came from music stores in 2008? [Graph: scatter plot with the line f(x) = -2.97x + 48.89]

  4. Example 3: The amount of antibiotic that remains in your body over a period of time varies from one drug to the next. The table given shows the amount of Antibiotic X that remains in your body over a period of two days. a) Create a scatter plot for the data. b) Create the line of best fit for the data using the instructions from example 1. c) The exact line of best fit is y = -0.514x + 9.314. Use this function to determine the population in 2015. d) What year will the population be below 3,000?

  5. y = -0.514x + 9.314. e) What does the number -0.514 represent? What does it mean in the context of the problem? f) What does the number 9.314 represent? What does it mean in the context of the problem?

  6. exact Residual (Error) the ____________________ distance between a data point and the graph of the __ ________ ______________. the difference between the observed value of y and the predicted value of y. Your predicted values come from the equation for the line of best fit. Line of Best Fit

  7. EXAMPLES: Determine if the linear regression model is appropriate by looking at the graph of the residuals. 4.

  8. Step 1: Use the regression model to find the predicted values (3rd column). Model: y = 1.73x - 0.96
     Step 2: Find the residuals: 2nd column - 3rd column = 4th column.

     Total Time   Total Distance   Predicted Total   Residual
     (minutes)    (miles)          Distance          (observed - predicted)
     32           51               54.4              -3.4
     19           30               31.9              -1.9
     28           47               ____              -0.5
     36           56               ____              -5.3
     17           27               ____              -1.5
     23           35               38.8              -3.8
     41           65               70                -5
     22           41               37.1              3.9
     37           73               63.1              9.9
     28           54               47.5              6.5

  9. Step 3: Make a residual plot (model): Graph the points (1st column, 4th column). A residual plot is another way to help us determine if a linear relationship exists between our variables. A residual plot is a scatter plot of the independent variable (x) and the residuals. For residual plots, the independent variable goes on the x-axis, and the residuals go on the y-axis. Step 4: Determine if the regression model is appropriate. No pattern in the residual plot: yes, the linear model is appropriate.

  10. Discussion: What is this residual plot telling us about the relationship between speed and braking distance? Let's now end by discussing how to interpret the residual plots below.

  11. Discussion: 1. Can we come up with any general rules for interpreting residual plots?
