Curvilinear Regression

Curvilinear Regression
Modeling Departures from the
Straight Line
(Curves and Interactions)
Skill Set
How does polynomial
regression model
quadratic and cubic
trends?
 
Describe the sequence
of tests used to model
curves in polynomial
regression.
 
Why is collinearity
likely to be a problem
in using polynomial
regression?
More skills
How do you model
interactions of
continuous variables
with regression?
What is the
difference between
a moderator and a
mediator?  How do
you test for the
presence of each?
 
Linear vs. Nonlinear Models
Typical linear Model:
Typical nonlinear models:
We don’t use models like this:
Nonlinear means in the terms, not the coefficients.
Curvilinear Regression
Uses Polynomial Regression to fit curves.  Polynomials
are formed by taking IVs to successive powers.
Polynomial equation referred to by its degree, determined
by highest exponent.  Power terms introduce bends.
Quadratic Function
Note the bend.  
We can fit data with ceiling effects,
sensation/perception as a function of
stimulus intensity, performance as a
function of practice, etc.
Quadratic Function (2)
The graph shows the effect of changing the b weight for
the squared (quadratic) term.
(original curve)
Cubic Function
Note the two bends.  We get a new bend for
each new power term.  Sharpness of bend
depends on size of 
b
 weights.
Response surfaces (1)
With 1 IV, the relations between X and Y are shown as a
line.  With linear regression and 2 IVs, the response
surface relating Y to the X variables will be a plane, e.g.:
Y = X1+2*X2
Y
X1
X2
The response surface
will be like a stiff
sheet of paper in a
cardboard box.
Response Surface (2)
The same surface from a different angle.
Y
X1
X2
Response Surface (3)
Y=X1+X2-.1*X1*X1.  Relations between X2 and Y are
linear; relations between X1 and Y are curved.
Response surface is like a section of a coffee can.
 
Response Surface (4)
In this graph, the relation between Y and both X
variables is curved.  Response surface is like a
parachute.
Y = X1+X2-.2X1*X1-.2X2*X2
 
Review
 
 
What is polynomial regression?
How does polynomial regression allow us to include
one or more bends into lines and response surfaces?
Nonlinear Relations in
Nonexperimental Research
In experimental research, you can fit
curves with 
orthogonal polynomials
,
which have advantages.  Not covered
here.
Create power terms (IV taken to
successive powers)
Test for increasing numbers of bends by
adding terms
Quit when adding a term does not
increase variance accounted for.
Polynomials to model bends in nonexperimental research
Correlations among terms
Note that terms with higher exponents are VERY highly
correlated.  There WILL be problems with collinearity.
Sequence of tests.  Start with time, add time squared.  If
significant, add time cubed.  Stop when adding a term
doesn’t help.  Each power adds a bend.  Quadratic is one
bend, cubic is two, and so forth.
Results of Polynomial
Regression
Note that polynomial is a special case of hierarchical reg.
Polynomial Results (2)
Suppose it had happened that the term for time-squared had
been significant.  The regression equation is 
Y' = 9.25 -.39X +
.014X
2
.  The results graphed:
 
Interpreting Weights in
Polynomial Regression
All power terms for an IV work together to
define the curve relating Y to X.
Do not interpret 
b
 weights for polynomials.
They change if you subtract the mean from
the raw data.
To estimate ‘importance’ look to the change
in R-square for the block of variables that
represent the IV.
Never use polynomials in a variable selection
algorithm (e.g., stepwise regression).
Specialized literature on nonlinear terms in
path analysis and SEM (hard to do).
Review
Why is collinearity likely to
be a problem in using
polynomial regression?
Describe the sequence of
tests used to model curves in
polynomial regression.
 Review R code.
Interactions
An interaction means that the
‘importance’ of one variable depends
upon the value of another.  Importance
means ‘slope’ in regression with a
continuous IV.
An interaction is also sometimes called
a moderator, as in “Z moderates the
relations between X and Y.”
In regression, we look to see if the
slope relating the DV to the IV changes
depending on the value of a second IV.
Example Interaction
 
 
 
For those with low cog ability, there is a
small correlation between creativity and
productivity.
As cognitive ability
increases, the relations
between creativity and
productivity become
stronger. The slope of
productivity on creativity
depends on cog ability.
Interaction Response Surface
 
The slope of X1 depends on the value of X2 and vice
versa.  Regression is looking to fit this response surface
and no other when we do the customary analysis for
interactions with continuous IVs.  More restrictive than
ANOVA.
 
Significance Tests for
Interactions
Subtract means from each IV (optional).
Compute product of IVs.
Compute significance of change in R-square
using interaction(s).
If R-square change is n.s., no interaction(s)
present.
If R-square change is significant, find the
significant interaction(s).
Graph the interaction(s)
1
 
50
 
40
 
100
2
 
35
 
45
 
80
3
 
40
 
50
 
90
4
 
50
 
55
 
105
5
 
55
 
60
 
110
6
 
35
 
40
 
95
7
 
45
 
45
 
100
8
 
55
 
50
 
105
9
 
50
 
55
 
95
10
 
40
 
60
 
90
11
 
45
 
40
 
110
12
 
50
 
45
 
115
13
 
60
 
50
 
120
14
 
65
 
55
 
125
15
 
55
 
60
 
105
16
 
50
 
40
 
110
17
 
55
 
45
 
95
18
 
55
 
50
 
115
19
 
60
 
60
 
120
20        65        65        140
;
Data to test for interaction between
cognitive ability and creativity on
performance.
 
Pearson Correlation Coefficients, N = 20
                                  Prob > |r| under H0: Rho=0
                        person       product        create           cog         inter
         person        1.00000       0.65629       0.32531       0.66705       0.57538
                                      0.0017        0.1616        0.0013        0.0079
         product       0.65629       1.00000       0.50470       0.83568       0.78465
                        0.0017                      0.0232        <.0001        <.0001
         create        0.32531       0.50470       1.00000       0.38414       0.84954
                        0.1616        0.0232                      0.0945        <.0001
         cog           0.66705       0.83568       0.38414       1.00000       0.80732
                        0.0013        <.0001        0.0945                      <.0001
         inter         0.57538       0.78465       0.84954       0.80732       1.00000
                        0.0079        <.0001        <.0001        <.0001
Correlation Matrix
Results for 2 IVs (Main Effects)
Result for interaction
Go to R code
Moderator and Mediator
Moderator Means Interaction.  Slope of one depends
on the value of the other. Use moderated regression
(test for an interaction) to test.
Mediator means there is a causal chain of events.
The mediating variable is the proximal cause of the
DV.  A more distal cause changes the mediator.  Use
path analysis* to test.  In this graph, X2 is the
mediator.
 
* Modern methods Sobel test; Preacher & Hayes; Bootstrap
Review
How do you model interactions of
continuous variables with regression?
What is the difference between a
moderator and a mediator?  How do
you test for the presence of each?
Slide Note
Embed
Share

Discover the nuances of curvilinear regression, polynomial modeling, and interactions in statistical analysis. Understand the challenges of collinearity, explore quadratic and cubic trends, and learn the sequence of tests to model curves effectively. Dive into the difference between linear and nonlinear models, and delve into the complexities of modeling continuous variable interactions. Expand your knowledge on moderators versus mediators and how to test for their presence. Unravel the power and flexibility of polynomial regression in fitting curves, from linear to quadratic to cubic equations. Learn how to leverage polynomial regression to capture intricate relationships in your data. Visualize the impact of changing weights in quadratic functions and the sharpness of bends in cubic functions. Be equipped with essential skills to enhance your regression modeling techniques.

  • Curvilinear Regression
  • Polynomial Modeling
  • Collinearity
  • Interactions
  • Nonlinear Models

Uploaded on Feb 20, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Curvilinear Regression Modeling Departures from the Straight Line (Curves and Interactions)

  2. Skill Set Why is collinearity likely to be a problem in using polynomial regression? How does polynomial regression model quadratic and cubic trends? Describe the sequence of tests used to model curves in polynomial regression.

  3. More skills How do you model interactions of continuous variables with regression? What is the difference between a moderator and a mediator? How do you test for the presence of each?

  4. Linear vs. Nonlinear Models Typical linear Model: + = + ' Y a b X b X 1 1 2 2 Typical nonlinear models: + + = + 2 ' Y a b X b X b X 1 1 2 1 3 2 = + ' log( ) Y a b X 1 1 We don t use models like this: = + + 2 3 2 ' Y a b X b X 1 1 2 Nonlinear means in the terms, not the coefficients.

  5. Curvilinear Regression Uses Polynomial Regression to fit curves. Polynomials are formed by taking IVs to successive powers. Polynomial equation referred to by its degree, determined by highest exponent. Power terms introduce bends. Linear Quadratic Cubic = = = + + + ' Y a b X 1 1 + + 2 ' Y a b X b X 1 1 2 1 + 2 3 1 ' Y a b X b X b X 1 1 2 1 3

  6. Quadratic Function 8 Note the bend. 6 Y=2+2*X-.25*X*X 4 We can fit data with ceiling effects, sensation/perception as a function of stimulus intensity, performance as a function of practice, etc. 2 0 0 2 3 X 5 6

  7. Quadratic Function (2) Y = 2+ 2X + bX*X 10 b = .125 8 b = -.125 6 b = -.25 (original curve) Y 4 b = -.5 2 0 0 2 3 X 5 6 The graph shows the effect of changing the b weight for the squared (quadratic) term.

  8. Cubic Function Note the two bends. We get a new bend for each new power term. Sharpness of bend depends on size of b weights. 1600 Y=50*X-(5*X*X)+(.09*X*X*X) 1200 800 400 0 -400 -800 0 10 20 30 40 50 X

  9. Response surfaces (1) With 1 IV, the relations between X and Y are shown as a line. With linear regression and 2 IVs, the response surface relating Y to the X variables will be a plane, e.g.: 15 12 Y 9 6 1 Y = X1+2*X2 2 3 3 The response surface will be like a stiff sheet of paper in a cardboard box. X1 4 1 5 2 3 X2 4 5

  10. Response Surface (2) 15 12 Y 9 6 1 2 3 3 X1 4 1 5 2 X2 3 4 5 The same surface from a different angle.

  11. Response Surface (3) 8 8 6 6 5 5 1 3 2 3 1 Nonlinear Relations Y = X1 + X2 - .1*X1**2 3 2 2 2 4 3 4 5 1 1 5 2 3 2 15 4 3 5 4 5 12 9 6 0 3 2 4 0 6 8 Y=X1+X2-.1*X1*X1. Relations between X2 and Y are linear; relations between X1 and Y are curved. Response surface is like a section of a coffee can. 10 0 2 4 6 8 10

  12. Response Surface (4) Y = X1+X2-.2X1*X1-.2X2*X2 Nonlinear Relations Y = X1+X2 -.1*X1**2-.1*X2**2 3 5 4 2 3 1 2 1 0 1 0 0 2 2 -1 4 3 -1 6 4 8 10 5 1 0 2 2 4 6 3 8 4 10 5 In this graph, the relation between Y and both X variables is curved. Response surface is like a parachute.

  13. Review What is polynomial regression? How does polynomial regression allow us to include one or more bends into lines and response surfaces?

  14. Nonlinear Relations in Nonexperimental Research In experimental research, you can fit curves with orthogonal polynomials, which have advantages. Not covered here. Create power terms (IV taken to successive powers) Test for increasing numbers of bends by adding terms Quit when adding a term does not increase variance accounted for.

  15. Polynomials to model bends in nonexperimental research Rating (DV) 10 9 10 8 9 8 7 7 8 9 7 6 8 5 7 5 6 7 7 8 Time 0 0 0 0 0 5 5 5 5 5 10 10 10 10 10 15 15 15 15 15 Time**2 0 0 0 0 0 25 25 25 25 25 100 100 100 100 100 225 225 225 225 225 Time**3 0 0 0 0 0 125 125 125 125 125 1000 1000 1000 1000 1000 3375 3375 3375 3375 3375

  16. Correlations among terms Excite 1 -.72 -.62 -.55 Time Time**2 Time**3 Excite Time Time**2 Time**3 1 .96 .91 1 .99 1 Note that terms with higher exponents are VERY highly correlated. There WILL be problems with collinearity. Sequence of tests. Start with time, add time squared. If significant, add time cubed. Stop when adding a term doesn t help. Each power adds a bend. Quadratic is one bend, cubic is two, and so forth.

  17. Results of Polynomial Regression Excite 1 -.72 -.62 -.55 Time Time**2 Time**3 Excite Time Time**2 Time**3 Note that polynomial is a special case of hierarchical reg. 1 .96 .91 1 .99 1 Model Intercept b1 b2 b3 R2 R2 Ch 1 Time 8.90 -.18 .52 .52 2 Time, Time2 9.25 -.39 .014 .58 .06 3 Time, Time2, Time3 9.20 -.23 -.02 .001 .59 .01

  18. Polynomial Results (2) Suppose it had happened that the term for time-squared had been significant. The regression equation is Y' = 9.25 -.39X + .014X2. The results graphed: Roller Coaster Evaluations 12 9 Excitement 6 3 0 -4 0 4 8 12 16 20 Time

  19. Interpreting Weights in Polynomial Regression All power terms for an IV work together to define the curve relating Y to X. Do not interpret b weights for polynomials. They change if you subtract the mean from the raw data. To estimate importance look to the change in R-square for the block of variables that represent the IV. Never use polynomials in a variable selection algorithm (e.g., stepwise regression). Specialized literature on nonlinear terms in path analysis and SEM (hard to do).

  20. Review Why is collinearity likely to be a problem in using polynomial regression? Describe the sequence of tests used to model curves in polynomial regression. Review R code.

  21. Interactions An interaction means that the importance of one variable depends upon the value of another. Importance means slope in regression with a continuous IV. An interaction is also sometimes called a moderator, as in Z moderates the relations between X and Y. In regression, we look to see if the slope relating the DV to the IV changes depending on the value of a second IV.

  22. Example Interaction Low Cognitive Ability For those with low cog ability, there is a small correlation between creativity and 10 5 Medium Cognitive Ability productivity. Productivity 0 10 -5 5 -10 -10 -6 -2 2 6 10 Productivity Creativity 0 -5 As cognitive ability increases, the relations between creativity and productivity become stronger. The slope of productivity on creativity depends on cog ability. -10 High Cognitive Ability -10 -6 -2 2 6 10 Creativity 10 5 Productivity 0 -5 -10 -10 -6 -2 2 6 10 Creativity

  23. Interaction Response Surface Interaction Interaction 80 60 Y 80 Y 40 60 20 40 0 0 20 2 10 4 10 8 6 8 6 X2 10 0 X2 8 6 8 4 6 X1 10 4 4 2 2 X1 2 0 0 0 The slope of X1 depends on the value of X2 and vice versa. Regression is looking to fit this response surface and no other when we do the customary analysis for interactions with continuous IVs. More restrictive than ANOVA.

  24. Significance Tests for Interactions Subtract means from each IV (optional). Compute product of IVs. Compute significance of change in R-square using interaction(s). If R-square change is n.s., no interaction(s) present. If R-square change is significant, find the significant interaction(s). Graph the interaction(s)

  25. 1 50 40 100 2 35 45 80 3 40 50 90 Data to test for interaction between cognitive ability and creativity on performance. 4 50 55 105 5 55 60 110 6 35 40 95 7 45 45 100 8 55 50 105 9 50 55 95 10 40 60 90 11 45 40 110 12 50 45 115 13 60 50 120 14 65 55 125 15 55 60 105 16 50 40 110 17 55 45 95 18 55 50 115 19 60 60 120 20 65 65 140 ;

  26. Correlation Matrix Pearson Correlation Coefficients, N = 20 Prob > |r| under H0: Rho=0 person product create cog inter person 1.00000 0.65629 0.32531 0.66705 0.57538 0.0017 0.1616 0.0013 0.0079 product 0.65629 1.00000 0.50470 0.83568 0.78465 0.0017 0.0232 <.0001 <.0001 create 0.32531 0.50470 1.00000 0.38414 0.84954 0.1616 0.0232 0.0945 <.0001 cog 0.66705 0.83568 0.38414 1.00000 0.80732 0.0013 <.0001 0.0945 <.0001 inter 0.57538 0.78465 0.84954 0.80732 1.00000 0.0079 <.0001 <.0001 <.0001

  27. Results for 2 IVs (Main Effects) The GLM Procedure Dependent Variable: product Sum of Source DF Squares Mean Square F Value Pr > F Model 2 1080.151718 540.075859 23.93 <.0001 Error 17 383.598282 22.564605 Corrected Total 19 1463.750000 R-Square Coeff Var Root MSE product Mean 0.737935 9.360042 4.750222 50.75000 Source DF Type I SS Mean Square F Value Pr > F create 1 372.8504184 372.8504184 16.52 0.0008 cog 1 707.3012991 707.3012991 31.35 <.0001 Source DF Type III SS Mean Square F Value Pr > F create 1 57.9376107 57.9376107 2.57 0.1275 cog 1 707.3012991 707.3012991 31.35 <.0001 Standard Parameter Estimate Error t Value Pr > |t| Intercept -11.31387459 9.26553773 -1.22 0.2387 create 0.23848700 0.14883268 1.60 0.1275 cog 0.47077911 0.08408699 5.60 <.0001

  28. Result for interaction Dependent Variable: product Sum of Source DF Squares Mean Square F Value Pr > F Model 3 1088.423539 362.807846 15.47 <.0001 Error 16 375.326461 23.457904 Corrected Total 19 1463.750000 R-Square Coeff Var Root MSE product Mean 0.743586 9.543519 4.843336 50.75000 Source DF Type I SS Mean Square F Value Pr > F create 1 372.8504184 372.8504184 15.89 0.0011 cog 1 707.3012991 707.3012991 30.15 <.0001 inter 1 8.2718212 8.2718212 0.35 0.5609 Source DF Type III SS Mean Square F Value Pr > F create 1 15.35701934 15.35701934 0.65 0.4303 cog 1 49.38823497 49.38823497 2.11 0.1661 inter 1 8.27182120 8.27182120 0.35 0.5609 Standard Parameter Estimate Error t Value Pr > |t| Intercept -45.48374780 58.31267387 -0.78 0.4468 create 0.87233592 1.07813934 0.81 0.4303 cog 0.79009054 0.54451482 1.45 0.1661 inter -0.00587585 0.00989498 -0.59 0.5609

  29. Go to R code

  30. Moderator and Mediator Moderator Means Interaction. Slope of one depends on the value of the other. Use moderated regression (test for an interaction) to test. Mediator means there is a causal chain of events. The mediating variable is the proximal cause of the DV. A more distal cause changes the mediator. Use path analysis* to test. In this graph, X2 is the mediator. Mediator Moderator X1 X2 X2 X1 Y Y * Modern methods Sobel test; Preacher & Hayes; Bootstrap

  31. Review How do you model interactions of continuous variables with regression? What is the difference between a moderator and a mediator? How do you test for the presence of each?

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#