Hypothesis testing for zero correlation
Conducting hypothesis tests for zero correlation, analyzing PMCC values, determining critical values, and exploring relationships between variables in statistical data sets for research purposes.
Statistics: PMCC and hypothesis tests
KUS objectives
BAT use a hypothesis test to determine if the pmcc for a sample indicates there is likely to be a linear relationship in the population
Starter: interpret these results
Data for daily rainfall in Perth in the LDS correlates to data for daily sunshine with a pmcc of r = −0.68
Data suggests there may be a negative correlation
Data for cloud cover in Beijing in the LDS correlates to data for humidity with a pmcc of r = 0.02
Data suggests there may not be a correlation (what problems are there with data for cloud cover?)
Data for daily sunshine in Heathrow in the LDS correlates to data for wind direction with a pmcc of r = 0.54
Data suggests there may be a correlation
However, common sense suggests we investigate further
Notes: hypothesis testing for zero correlation
For a one-tailed test, either
H0: ρ = 0, H1: ρ < 0
or
H0: ρ = 0, H1: ρ > 0
For a two-tailed test
H0: ρ = 0, H1: ρ ≠ 0
To find the critical region, use the table of values in the formula booklet, page 37
You need the significance level and the sample size
For a two-tailed test, halve the significance level before reading the table, because the critical region is split between the two tails
The table gives the critical value between 0 and 1 (continuous data)
So for a critical value between −1 and 0 we change the sign of the value
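The decision rule in these notes can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `in_critical_region` and the labels "greater", "less" and "two-sided" are hypothetical, and the critical value is assumed to be read by hand from the formula booklet table (using half the significance level for a two-tailed test).

```python
def in_critical_region(r, critical_value, alternative):
    """Return True if the sample pmcc r lies in the critical region.

    critical_value is the positive entry read from the table for the chosen
    significance level and sample size (hypothetical helper, for illustration).
    alternative is "greater" (H1: rho > 0), "less" (H1: rho < 0)
    or "two-sided" (H1: rho != 0).
    """
    if not 0 < critical_value < 1:
        raise ValueError("critical_value should be the positive table entry")
    if alternative == "greater":
        return r > critical_value
    if alternative == "less":
        # The table only lists positive values, so change the sign.
        return r < -critical_value
    if alternative == "two-sided":
        # Either tail counts; the table entry must be for half the level.
        return abs(r) > critical_value
    raise ValueError(f"unknown alternative: {alternative!r}")


# Values taken from the worked examples WB 12 and WB 14.
print(in_critical_region(0.714, 0.5494, "greater"))
print(in_critical_region(0.1149, 0.5067, "greater"))
```

A result of True means there is evidence to reject H0 at the chosen level; False means there is not enough evidence to reject H0.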
WB 12
A researcher wishes to investigate if there is a positive correlation between the number of vehicles and the number of road fatalities in European countries. He selects a random sample of 10 European countries and records the number of vehicles, v per 1000 people, and the number of road fatalities, r per 100 000 population, for a particular year. These are shown in the table and scatter diagrams.
Country       v     r
Austria       578   5.4
Belgium       559   6.7
France        578   5.1
Germany       572   4.3
Greece        624   9.1
Ireland       513   4.1
Italy         679   6.1
Luxembourg    739   8.7
Spain         593   3.7
UK            519   2.9
a What is the definition of a critical value? (1)
b The product moment correlation coefficient for v and r is 0.714. Use this value to test for positive correlation at the 5% significance level. Interpret your result in context. (3)
c The researcher wishes to predict the number of road fatalities for a country with 650 vehicles per 1000 people. Write down the regression model he should use. (1)
d State the dependent variable for the regression model in part c. (1)
e Monaco has 899 vehicles per 1000 people. Explain why the model stated in c is not reliable for estimating the number of road fatalities in Monaco. (1)
WB 12 ANSWERS
a What is the definition of a critical value? (1)
A critical value is the point (or points) on the scale of the test statistic beyond which we reject the null hypothesis.
b The product moment correlation coefficient for v and r is 0.714. Use this value to test for positive correlation at the 5% significance level. Interpret your result in context. (3)
H0: ρ = 0, H1: ρ > 0
Critical value = 0.5494
0.714 > 0.5494 (test statistic in critical region)
There is evidence to reject H0
There is evidence that there is a positive correlation between the number of vehicles and the number of road fatalities.
c The researcher wishes to predict the number of road fatalities for a country with 650 vehicles per 1000 people. Write down the regression model he should use. (1)
r = −7.0 + 0.02v
d State the dependent variable for the regression model in part c. (1)
Road fatalities per 100 000 population
e Monaco has 899 vehicles per 1000 people. Explain why the model stated in c is not reliable for estimating the number of road fatalities in Monaco. (1)
Outside the range of the data used in the model. (This would require extrapolation)
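The figures in WB 12 can be checked with a short Python sketch. It assumes the ten (v, r) pairs are matched to countries in the order they appear in the WB 12 table, and it fits a least-squares line of r on v; the names `vehicles`, `fatalities` and `predict_fatalities` are illustrative only.

```python
import math

# Data transcribed from the WB 12 table (assumed pairing by country order).
vehicles = [578, 559, 578, 572, 624, 513, 679, 739, 593, 519]   # v per 1000 people
fatalities = [5.4, 6.7, 5.1, 4.3, 9.1, 4.1, 6.1, 8.7, 3.7, 2.9]  # r per 100 000

n = len(vehicles)
mean_v = sum(vehicles) / n
mean_r = sum(fatalities) / n

# Sums of squares and products about the means.
s_vv = sum((v - mean_v) ** 2 for v in vehicles)
s_rr = sum((r - mean_r) ** 2 for r in fatalities)
s_vr = sum((v - mean_v) * (r - mean_r) for v, r in zip(vehicles, fatalities))

pmcc = s_vr / math.sqrt(s_vv * s_rr)
slope = s_vr / s_vv                    # least-squares gradient of r on v
intercept = mean_r - slope * mean_v    # line passes through the mean point


def predict_fatalities(v):
    """Return (estimate, within_range) for v vehicles per 1000 people."""
    within_range = min(vehicles) <= v <= max(vehicles)
    return intercept + slope * v, within_range


print(f"pmcc = {pmcc:.3f}")
print(f"r = {intercept:.1f} + {slope:.2f} v")
print(predict_fatalities(650))   # inside the sampled range (interpolation)
print(predict_fatalities(899))   # outside the sampled range (extrapolation)
```

With this pairing the pmcc rounds to 0.714, matching part b, and the fitted line has a slope of about 0.02 with a negative intercept of about −7.0. The `within_range` flag mirrors the reasoning in part e: 899 lies above the largest sampled value of v, so any estimate there is an extrapolation.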
WB 13
A scientist takes 30 observations of the masses of two reactants in an experiment. She calculates a pmcc of r = −0.45
The scientist believes there is no correlation between the masses of the two reactants. Test, at the 10% level of significance, the scientist's claim, stating your hypotheses clearly
H0: ρ = 0, H1: ρ ≠ 0
Sample size 30, two-tailed, so use 5% in each tail
From the table in the formula booklet: critical values of r for a 5% significance level are r = ±0.3061
The critical region is r < −0.3061 or r > 0.3061
Since −0.45 < −0.3061, reject H0
There is evidence, at the 10% level of significance, that there is a correlation between the masses of the two reactants
WB 14
Data from the LDS is given in the table. X is daily maximum gust (kn) and Y is daily maximum relative humidity (%) in Leeming for a sample of eight days in May.
X   31  28  38  37  18  17  21  29
Y   99  94  87  80  80  89  84  86
a) Find the pmcc for these data
b) Test, at the 10% significance level, whether there is sufficient evidence of a positive correlation in the data. State your hypotheses clearly
a) pmcc r = 0.1149
b) H0: ρ = 0, H1: ρ > 0
Sample size 8, one-tailed
From the table in the formula booklet: the critical value of r for a 10% significance level is r = 0.5067
The critical region is r > 0.5067
Since 0.1149 < 0.5067, there is not enough evidence to reject H0
There is not enough evidence, at the 10% level of significance, of a positive correlation between daily maximum gust and daily maximum relative humidity
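Part a) of WB 14 can be reproduced from summary sums, in the style of a calculator working. The sketch below assumes the eight days pair up as X = 31, 28, 38, 37, 18, 17, 21, 29 with Y = 99, 94, 87, 80, 80, 89, 84, 86, which is how the flattened table is read here; the critical value 0.5067 is the one quoted from the formula booklet.

```python
import math

# Assumed pairing of the eight days from the WB 14 table.
x = [31, 28, 38, 37, 18, 17, 21, 29]   # daily maximum gust (kn)
y = [99, 94, 87, 80, 80, 89, 84, 86]   # daily maximum relative humidity (%)

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(a * a for a in x)
sum_yy = sum(b * b for b in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Sxx, Syy and Sxy from the summary sums.
s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

r = s_xy / math.sqrt(s_xx * s_yy)

critical_value = 0.5067          # one-tailed, 10% level, sample size 8 (from the table)
reject_h0 = r > critical_value   # H1: rho > 0, so only the upper tail counts

print(round(r, 4), reject_h0)
```

Under this pairing the computed value agrees with the stated pmcc of 0.1149 to four decimal places, and it falls well short of the critical value, so H0 is not rejected.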
WB 15
An engineer believes that there is a relationship between the CO2 emissions and fuel consumption for cars. A random sample of 40 different car models (old and new) was taken and the CO2 emission figures, e grams per kilometre, and fuel consumption, f miles per gallon, were recorded. The engineer calculates the pmcc for the 40 cars and obtains r = −0.803
a) State what is measured by the product moment correlation coefficient.
b) State, with a reason, whether a linear regression model based on these data is reliable or not for a car when the fuel consumption is 60 mpg.
c) For the linear regression model e = 198 − 1.71f, write down the explanatory variable.
d) State the definition of a hypothesis test.
e) Test at the 1% significance level whether or not the product moment correlation coefficient for CO2 emissions and fuel consumption is less than zero. State your hypotheses clearly.
WB 15 ANSWERS
a) State what is measured by the pmcc.
The linear association between e and f.
b) State, with a reason, whether a linear regression model based on these data is reliable or not for a car when the fuel consumption is 60 mpg.
Extrapolation, not reliable as 60 mpg is outside the range of the given data
c) For the linear regression model e = 198 − 1.71f, write down the explanatory variable.
Explanatory variable is fuel consumption, f
d) State the definition of a hypothesis test.
A test used to determine if there is enough evidence in a sample of data to infer that a certain condition is true for the entire population
e) Test at the 1% significance level whether or not the product moment correlation coefficient for CO2 emissions and fuel consumption is less than zero. State your hypotheses clearly.
H0: ρ = 0, H1: ρ < 0
Critical value = −0.3665
−0.803 < −0.3665, so the test statistic is in the critical region and we reject H0
Supports that the pmcc is negative
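The critical values quoted in the worked answers (0.5494, 0.3061, 0.5067 and 0.3665) can be reproduced without the booklet. The sketch below rests on a standard result that is not stated in the slides: if the population is bivariate normal and ρ = 0, then t = r√(n − 2)/√(1 − r²) follows a t distribution with n − 2 degrees of freedom, so the critical r is t/√(t² + n − 2). Whether the formula booklet table was built exactly this way is an assumption. The function names are illustrative, and only the Python standard library is used (Simpson's rule for the tail area, bisection for the quantile).

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, integrating the density on [0, t] by Simpson's rule."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def critical_r(n, tail_prob):
    """Positive critical value of r for sample size n and a one-tail probability."""
    df = n - 2
    lo, hi = 0.0, 50.0
    for _ in range(60):              # bisection on the upper-tail probability
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail_prob:
            lo = mid
        else:
            hi = mid
    t_crit = (lo + hi) / 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


# (sample size, probability in one tail) for WB 12, WB 13, WB 14 and WB 15.
for n, p in [(10, 0.05), (30, 0.05), (8, 0.10), (40, 0.01)]:
    print(n, p, round(critical_r(n, p), 4))
```

For a two-tailed test such as WB 13, `tail_prob` is half the significance level, which is why the 10% test there uses 0.05.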
KUS objectives
BAT use a hypothesis test to determine if the pmcc for a sample indicates there is likely to be a linear relationship in the population
Self-assess
One thing learned is
One thing to improve is