Understanding Central Tendency and Variability in Distributions
Central tendency and variability are fundamental features of statistical distributions. Central tendency, encompassing mean, median, and mode, represents the middle of a distribution, while variability describes the spread of data points. Knowing the effect of distribution shape on these measures helps in choosing the most appropriate statistic for analysis. Additionally, understanding variance, standard deviation, and z-scores aids in assessing the dispersion of data points.
Presentation Transcript
Central Tendency and Variability: The two most essential features of a distribution.
Questions: Define mean, median, and mode. What is the effect of distribution shape on measures of central tendency? When might we prefer one measure of central tendency to another?
Questions (2): Define range, average deviation, variance, and standard deviation. When might we prefer one measure of variability to another? What is a z score?
Variables have distributions: A variable is something that changes or takes different values (e.g., anger). A distribution is a collection of measures, usually across people. Distributions of numbers can be summarized with numbers (called statistics for samples or parameters for populations).
Central Tendency refers to the Middle of the Distribution
1. Central Tendency: Mode, Median, & Mean. The mode is the most frequently occurring score. For grouped data, it is the midpoint of the most populous class interval. A distribution can be bimodal or multimodal.
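As a minimal sketch of finding the mode for ungrouped scores, the following returns every value tied for the highest frequency, so a bimodal set yields two values. The function name `modes` and the sample scores are illustrative assumptions, not part of the original slides.

```python
from collections import Counter


def modes(scores):
    """Return all most-frequent values, sorted; more than one means multimodal."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


# Hypothetical scores chosen for illustration.
unimodal = modes([1, 2, 2, 3])      # one value occurs most often
bimodal = modes([1, 1, 2, 3, 3])    # two values tie for most frequent
```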
Median: The score that separates the top 50% from the bottom 50%. With an even number of scores, the median is halfway between the two middle scores: for 1 2 3 4 | 5 6 7 8, the median is 4.5. With an odd number of scores, the median is the middle score: for 1 2 3 4 5 6 7, the median is 4.
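The even and odd cases can be sketched in a few lines of Python. The helper name `median` is chosen here for illustration; the two score lists are the ones from the slide.

```python
def median(scores):
    """Middle score for odd N; halfway between the two middle scores for even N."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


even_case = median([1, 2, 3, 4, 5, 6, 7, 8])  # 4.5
odd_case = median([1, 2, 3, 4, 5, 6, 7])      # 4
```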
Mean: The sum of scores divided by the number of people. The population mean is $\mu$ (mu) and the sample mean is $\bar{X}$ (X-bar). We calculate the sample mean by $\bar{X} = \frac{\sum X}{N}$. We calculate the population mean by $\mu = \frac{\sum X}{N}$.
Deviation from the mean: $x = X - \bar{X}$. A deviation score is the raw score's deviation from the mean. Deviations sum to zero. Raw scores: 9 9 9 8 8 10 10 7 11. Deviation scores: 0 0 0 -1 -1 1 1 -2 2.
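A short sketch that recomputes the mean and deviation scores for the nine raw scores on this slide and checks that the deviations sum to zero. Variable names are illustrative.

```python
raw_scores = [9, 9, 9, 8, 8, 10, 10, 7, 11]

# Sample mean: sum of scores divided by the number of scores (81 / 9).
mean = sum(raw_scores) / len(raw_scores)

# Deviation score: raw score minus the mean.
deviations = [x - mean for x in raw_scores]

total_deviation = sum(deviations)  # deviations from the mean sum to zero
```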
Comparison of mean, median and mode. Mode: good for nominal variables; good if you need to know the most frequent observation; quick and easy. Median: good for badly behaved (e.g., strongly skewed) distributions; good for distributions with an arbitrary ceiling or floor.
Comparison of mean, median & mode. Mean: used for inference as well as description; best estimator of the parameter; based on all data in the distribution; generally preferred except for badly behaved distributions; the most commonly used statistic for central tendency.
Best Guess interpretations. Mean: the average signed error will be zero. Mode: will be exactly right with the greatest frequency. Median: gives the smallest absolute error.
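The three "best guess" properties can be checked numerically. This is a sketch on a small hypothetical data set (not from the slides), using Python's standard `statistics` module: each measure is used as a single guess for every score, and the signed error, absolute error, and number of exact hits are tallied.

```python
from statistics import mean, median, mode

# Hypothetical, deliberately skewed scores chosen for illustration.
scores = [1, 1, 2, 4, 12]

guesses = {"mean": mean(scores), "median": median(scores), "mode": mode(scores)}

# Total signed error, total absolute error, and exact hits for each guess.
signed_error = {k: sum(x - g for x in scores) for k, g in guesses.items()}
absolute_error = {k: sum(abs(x - g) for x in scores) for k, g in guesses.items()}
exact_hits = {k: sum(x == g for x in scores) for k, g in guesses.items()}
```

With these scores, the mean is the only guess whose signed errors cancel, the median has the smallest total absolute error, and the mode matches the most scores exactly.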
Expectation: For both discrete and continuous variables, the mean is the expected value. Discrete: $E(X) = \sum x\,p(x) = \mu$ (the mean of X). Continuous: $E(X) = \int x\,f(x)\,dx = \mu$ (the mean of X). The integral looks bad but just means take the average.
Review: What is central tendency? Define the mode, median, and mean.
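For the discrete case, the expected value is a probability-weighted sum. The sketch below uses a fair six-sided die as an assumed example (it does not appear in the slides) and exact fractions to avoid rounding.

```python
from fractions import Fraction

# Assumed example: a fair die, so each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# E(X) = sum over x of x * p(x)
expected_value = sum(x * p for x, p in pmf.items())  # Fraction(7, 2), i.e. 3.5
```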
2. Variability (aka Dispersion). Four statistics: range, average deviation, variance, and standard deviation. Range = high score minus low score. For 12 14 14 16 16 18 20, range = 20 - 12 = 8. Average deviation = mean of the absolute deviations from the median: $AD = \frac{\sum |X - Md|}{N}$. Note the difference between this definition and the undergraduate text: deviations are taken from the median rather than from the mean.
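A brief sketch of both statistics on the seven scores used for the range example. Applying the average deviation to these same scores is an illustration added here; the slide only works the range.

```python
from statistics import median

scores = [12, 14, 14, 16, 16, 18, 20]

# Range: high score minus low score.
score_range = max(scores) - min(scores)

# Average deviation: mean absolute deviation from the median (not the mean).
md = median(scores)
average_deviation = sum(abs(x - md) for x in scores) / len(scores)
```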
Variance. Population variance: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$, where $\sigma^2$ means population variance, $\mu$ means population mean, and the other terms have their usual meaning. The variance is equal to the average squared deviation from the mean. To compute it, take each score and subtract the mean. Square the result. Find the average over scores. Ta da! The variance.
Computing the Variance (N=5)
X        X-bar    X - X-bar    (X - X-bar)^2
5        15       -10          100
10       15       -5           25
15       15       0            0
20       15       5            25
25       15       10           100
Total:   75       0            250
Mean of squared deviations: 250 / 5 = 50. The variance is 50.
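The table can be reproduced step by step. This sketch follows the population formula (dividing by N) and cross-checks against `statistics.pvariance` from the Python standard library; variable names are illustrative.

```python
from statistics import pvariance

scores = [5, 10, 15, 20, 25]
n = len(scores)

mean = sum(scores) / n                          # 75 / 5 = 15
deviations = [x - mean for x in scores]         # -10, -5, 0, 5, 10
squared = [d ** 2 for d in deviations]          # 100, 25, 0, 25, 100
variance = sum(squared) / n                     # 250 / 5 = 50

# Library cross-check using the same divide-by-N definition.
library_variance = pvariance(scores)
```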
Standard Deviation: Variance is the average squared deviation from the mean. To return to the original, unsquared units, we just take the square root of the variance. This is the standard deviation. Population formula: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
Standard Deviation: Sometimes called the root-mean-square deviation from the mean. This name says how to compute it from the inside out. Find the deviations (differences between each score and the mean). Square the deviations. Find their mean. Take the square root.
Computing the Standard Deviation (N=5)
X        X-bar    X - X-bar    (X - X-bar)^2
5        15       -10          100
10       15       -5           25
15       15       0            0
20       15       5            25
25       15       10           100
Total:   75       0            250
Mean of squared deviations: the variance is 250 / 5 = 50.
Square root: the SD is $\sqrt{50} = 7.07$.
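Reading "root-mean-square deviation" from the inside out gives a direct recipe. The sketch below applies it to the five scores in the table and compares the result with `statistics.pstdev`, the standard library's population standard deviation.

```python
import math
from statistics import pstdev

scores = [5, 10, 15, 20, 25]
mean = sum(scores) / len(scores)

# Inside out: deviation -> square -> mean -> root.
mean_square = sum((x - mean) ** 2 for x in scores) / len(scores)  # variance, 50
sd = math.sqrt(mean_square)                                       # about 7.07

library_sd = pstdev(scores)
```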
Example: Age Distribution. [Figure: histogram "Distribution of Age: Central Tendency, Variability, and Shape", frequency by age, annotated with Mode = 21, Median = 23, Mean = 25.73, and SD = 6.47 (labeled "Average Distance from Mean").]
Standard or z score: A z score indicates distance from the mean in standard deviation units. Formula: $z = \frac{X - \bar{X}}{S}$ for a sample, or $z = \frac{X - \mu}{\sigma}$ for a population. Converting to standard or z scores does not change the shape of the distribution. z scores are standardized, not normalized: a skewed distribution remains skewed after conversion.
Review: Define each of these in words: range, average deviation, variance, standard deviation, z score.
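A minimal sketch of the conversion, reusing the five scores from the variance example as assumed input. It uses the population standard deviation and checks the usual consequence of standardizing: the z scores have mean 0 and standard deviation 1, while the ordering of the scores is unchanged.

```python
import math
from statistics import pstdev

scores = [5, 10, 15, 20, 25]
mean = sum(scores) / len(scores)
sd = pstdev(scores)  # population SD, divide-by-N definition

# z = (X - mean) / SD: distance from the mean in SD units.
z_scores = [(x - mean) / sd for x in scores]

z_mean = sum(z_scores) / len(z_scores)
z_sd = pstdev(z_scores)
```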