Statistical Analysis: Descriptive and Inferential Techniques Overview

 
Descriptive and Inferential Statistics

Dr. Dyal Bhatnagar

STATISTICS
 
Manipulating data to draw meaningful inferences from it.
Is 'data' plural or singular?
Variables
 
Metric (Continuous)
 
Ratio Variable
Interval Variable
 
Non-Metric (Categorical)
 
Ordinal Variable
Nominal Variable
 
"Values that look interval are often ordinal" - Andy Field
 
Continuous variables are continuous (obviously) but also discrete
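A minimal Python sketch, assuming pandas is available and using hypothetical survey columns, showing how the four variable types might be encoded:

```python
import pandas as pd

# Hypothetical survey records (illustrative only)
df = pd.DataFrame({
    "blood_group": ["A", "B", "O", "AB"],               # nominal: categories with no order
    "satisfaction": ["low", "high", "medium", "high"],  # ordinal: ordered categories
    "temperature_c": [36.6, 37.1, 36.9, 38.2],          # interval: equal steps, arbitrary zero
    "income": [42000.0, 55000.0, 38000.0, 61000.0],     # ratio: true zero, ratios meaningful
})

# Encode the non-metric columns explicitly as categorical
df["blood_group"] = pd.Categorical(df["blood_group"])
df["satisfaction"] = pd.Categorical(df["satisfaction"],
                                    categories=["low", "medium", "high"], ordered=True)
print(df.dtypes)
```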
Statistical Techniques
 
Descriptive Statistical Techniques
 
These techniques are used to describe the basic features or characteristics of the data under study.
Measures of Central Tendency
Measures of Dispersion
Measures of Skewness
Measures of Kurtosis
 
Inferential Statistical Techniques
 
Inferential statistics uses a random sample of data taken from a population to make inferences about the population.
Finding relationships among variables
Testing a hypothesis
Measures of Central Tendency
 
Mean: The average
Median: The point in a data set that divides it into two equal halves
Mode: The most frequent value
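A minimal Python sketch, using a small hypothetical sample, showing how the three measures are computed with the standard-library statistics module:

```python
from statistics import mean, median, mode

# Hypothetical sample of exam scores (illustrative only)
scores = [42, 55, 55, 61, 68, 70, 74]

print("Mean:  ", mean(scores))    # average of all values
print("Median:", median(scores))  # middle value of the sorted data
print("Mode:  ", mode(scores))    # most frequent value (55 here)
```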
 
Measures of Dispersion

Series A     Series B     Series C
100          102          10
100          104          6
100          97           480
100          98           3
100          99           1
Mean = 100   Mean = 100   Mean = 100
 
Absolute Measures
 
The output is an absolute value expressed in the same unit of measurement as the data.
 
Relative Measures
 
The coefficient of an absolute measure is calculated.
A coefficient is a pure number, independent of the unit of measurement.
 
 
Variance is the sum of the squared deviations from the central value divided by the number of observations.

σ = √V or V = σ²
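A minimal Python sketch, using the hypothetical Series B values from the dispersion example above, computing variance and standard deviation (absolute measures) and the coefficient of variation (a relative measure):

```python
import statistics

# Series B from the dispersion example above (mean = 100)
series_b = [102, 104, 97, 98, 99]

mean = statistics.mean(series_b)
variance = statistics.pvariance(series_b)   # population variance: sum of squared deviations / n
std_dev = statistics.pstdev(series_b)       # absolute measure, same unit as the data
coeff_of_variation = std_dev / mean         # relative measure: a pure number

print(f"Mean = {mean}, V = {variance}, sigma = {std_dev:.3f}, CV = {coeff_of_variation:.4f}")
```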
 
 
 
Skewness and Kurtosis
 
Dispersion tells about the extent of variation in a distribution, whereas skewness tells about the direction of variation.
Kurtosis tells whether the distribution is peaked or flat.
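A short Python sketch, assuming SciPy is available and using a hypothetical right-skewed sample, showing how skewness and excess kurtosis are commonly computed:

```python
import numpy as np
from scipy.stats import skew, kurtosis

# Hypothetical right-skewed sample (illustrative only)
rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=1000)

print("Skewness:", skew(data))                   # > 0 means a longer right tail
print("Kurtosis:", kurtosis(data, fisher=True))  # excess kurtosis: > 0 peaked, < 0 flat
```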
Correlation
 
Correlation measures the joint variation among two or more variables.
Correlation does not mean there is causation, but causation always implies correlation.
Negative and positive correlation
Linear and non-linear correlation
Simple, partial, or multiple correlation
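A minimal Python sketch, with hypothetical paired observations, computing the Pearson coefficient as a measure of linear joint variation:

```python
import numpy as np

# Hypothetical paired observations (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Pearson correlation coefficient measures linear joint variation (ranges from -1 to +1)
r = np.corrcoef(x, y)[0, 1]
print("Pearson r =", round(r, 3))
```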
Regression
 
Measuring the cause-and-effect relationship among two or more variables.
The direction of causation can be either from X to Y or the other way around.
Y = α + βX
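A minimal Python sketch, reusing hypothetical X and Y values, estimating α and β by least squares:

```python
import numpy as np

# Hypothetical data (illustrative only): Y depends roughly linearly on X
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares fit of Y = alpha + beta * X
beta, alpha = np.polyfit(x, y, deg=1)   # polyfit returns the highest-degree coefficient first
print(f"Y = {alpha:.2f} + {beta:.2f} X")
```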
 
Hypothesis Testing
 
1. Set up a Null Hypothesis
 
H₀ asserts that there is no real difference in the samples, or between the sample and the population, and that any difference found arises out of sampling fluctuations.
H₀: μ₁ = μ₂
Status quo
We generally wish to reject H₀.
 
 
 
2. Set up a significance level (α)

The significance level indicates the probability of rejecting H₀ when it is true.
It is the risk of error the researcher is ready to accept.

             H₀ is true           H₀ is false
Accept H₀    Right decision       Type II error (β)
Reject H₀    Type I error (α)     Right decision

The only way to reduce the probability of both errors is to increase the sample size.
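A short simulation sketch, assuming SciPy is available and using an independent-samples t-test purely as an example, illustrating that α is the long-run rate of rejecting a true H₀:

```python
import numpy as np
from scipy import stats

alpha = 0.05
rng = np.random.default_rng(4)

# Simulate many experiments in which H0 is actually true (both groups share the same mean)
rejections = 0
n_experiments = 10_000
for _ in range(n_experiments):
    a = rng.normal(loc=50, scale=10, size=30)
    b = rng.normal(loc=50, scale=10, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        rejections += 1   # a Type I error: rejecting a true H0

# The observed rejection rate should be close to alpha (about 5%)
print("Observed Type I error rate:", rejections / n_experiments)
```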
3. Decide about the tails of the test

A directional H₀ is tested using a one-tailed test, and a non-directional H₀ is tested with a two-tailed test.
Directional H₀: μ₁ ≥ μ₂ (or μ₁ ≤ μ₂)
Non-directional H₀: μ₁ = μ₂
In a two-tailed test the significance level is split in half between the two tails.
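A short Python sketch, assuming SciPy ≥ 1.6 (where ttest_ind accepts an alternative argument) and hypothetical samples, showing that the one-tailed p-value is half the two-tailed p-value when the effect lies in the hypothesized direction:

```python
import numpy as np
from scipy import stats

# Hypothetical samples (illustrative only)
rng = np.random.default_rng(3)
group_a = rng.normal(loc=50, scale=10, size=30)
group_b = rng.normal(loc=55, scale=10, size=30)

# Two-tailed test: non-directional hypothesis about the means
_, p_two_tailed = stats.ttest_ind(group_a, group_b, alternative="two-sided")

# One-tailed test: directional hypothesis that group_a's mean is less than group_b's
_, p_one_tailed = stats.ttest_ind(group_a, group_b, alternative="less")

# With a symmetric test statistic, the one-tailed p-value is half the two-tailed one
print(f"two-tailed p = {p_two_tailed:.4f}, one-tailed p = {p_one_tailed:.4f}")
```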
4. Choose a test
 
This involves choosing an appropriate distribution for a particular test.

Parametric Tests                 Non-Parametric Tests
Z test                           Sign test
t-test (independent samples)     Mann-Whitney test
t-test (paired samples)          Wilcoxon test
F test (ANOVA)                   Kruskal-Wallis test
Assumptions for parametric tests:
The data are normally distributed
Variance across samples is homogeneous

Choosing a technique by the type of variables:
                                      Dependent Variable: Metric   Dependent Variable: Non-Metric
Independent Variable(s): Metric       Regression                   Discriminant analysis / Binary (logistic) regression
Independent Variable(s): Non-Metric   Hypothesis testing           Chi-square test
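A short Python sketch, assuming SciPy is available, showing one common way to check the two parametric-test assumptions with Shapiro-Wilk (normality) and Levene (homogeneity of variances) tests on hypothetical groups:

```python
import numpy as np
from scipy import stats

# Hypothetical samples from two groups (illustrative only)
rng = np.random.default_rng(1)
group_a = rng.normal(loc=50, scale=10, size=30)
group_b = rng.normal(loc=55, scale=10, size=30)

# Check the parametric-test assumptions before choosing a test
_, p_normal_a = stats.shapiro(group_a)           # H0: the sample is normally distributed
_, p_normal_b = stats.shapiro(group_b)
_, p_equal_var = stats.levene(group_a, group_b)  # H0: variances are homogeneous

parametric_ok = min(p_normal_a, p_normal_b, p_equal_var) > 0.05
print("Parametric assumptions met:", parametric_ok)
```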
5. Run the test
 
6. Interpret the results
The p-value represents the probability of concluding incorrectly that there is a difference in the samples when no true difference exists.
p-value < α: Reject H₀
p-value > α: Accept H₀
As we wish to reject H₀ in most cases, we look for p-values < α (0.05).
A p-value of 0.07 would mean more risk than the tolerable level, i.e. 0.05.
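A minimal Python sketch, assuming SciPy is available and using hypothetical groups, applying the decision rule p < α with an independent-samples t-test:

```python
import numpy as np
from scipy import stats

alpha = 0.05  # significance level

# Hypothetical samples (illustrative only)
rng = np.random.default_rng(2)
group_a = rng.normal(loc=50, scale=10, size=30)
group_b = rng.normal(loc=56, scale=10, size=30)

# Independent-samples t-test: H0 says the group means are equal
t_stat, p_value = stats.ttest_ind(group_a, group_b)

if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject H0")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject H0")
```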
 
Thank you
 