Understanding Quantitative Data Analysis in Research


Dive into the world of quantitative data analysis with a focus on frequencies, central tendency, dispersion, and standard deviation. Explore the collection and analysis of numerical data, levels of measurement, and methods for quantifying social concepts. Learn about the importance of capturing data using various instruments and understanding different levels of measurement in quantitative research. Gain insights into converting non-numerical values to numerical form for statistical analysis purposes. Join Dr. Luke Sloan on a journey through the fundamental aspects of quantitative data collection and analysis.


Uploaded on Nov 27, 2024



Presentation Transcript


  1. Exploring Data: Frequencies, Central Tendency, Dispersion and Standard Deviation. SIT094 The Collection and Analysis of Quantitative Data, Week 3. Luke Sloan.

  2. About Me: Name: Dr Luke Sloan; Office: 0.56 Glamorgan; Email: SloanLS@cardiff.ac.uk; To see me: please email first.

  3. Introduction: Collecting Quantitative Data; Levels of Measurement; Frequencies & Fidelity; Central Tendency; Dispersion; Summary.

  4. Collecting Quantitative Data I: Research involving the collection of data in numerical form; the defining factor is that numbers result from the process, whether the initial data collection produced numerical values, or whether non-numerical values were subsequently converted to numbers as part of the analysis process. Source: Jupp 2006:250
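The conversion of non-numerical values into numbers mentioned on this slide can be sketched in Python. The response labels and the numeric codes below are illustrative assumptions, not a coding scheme taken from the slides.

```python
# Hypothetical codebook: labels and codes are assumptions for illustration only.
LIKERT_CODES = {"Disagree": 1, "Neutral": 2, "Agree": 3}

def code_responses(responses, codebook):
    """Convert text responses to numeric codes; unknown answers become None (missing)."""
    return [codebook.get(answer) for answer in responses]

raw = ["Agree", "Neutral", "Agree", "Disagree", "No answer"]
coded = code_responses(raw, LIKERT_CODES)
print(coded)  # [3, 2, 3, 1, None]
```

Keeping the codebook in one place makes it easy to check later which number stands for which answer.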

  5. Collecting Quantitative Data II: Operationalisation of social concepts. Quantifying fuzzy data into VARIABLES. How to measure feelings, attitudes, behaviours, beliefs and attributes? Numbers allow statistical tests. Statistical tests allow generalisations to be made. Characterisation from samples to populations.

  6. Collecting Quantitative Data III: Capture data using instruments. Surveys (paper, online, telephone, in person). Secondary data analysis. Experiments: difficult outside of the natural sciences. But social scientists try to emulate the natural science model (remember Popper's Falsification Principle?). But not all data is equal (some are more equal than others!).

  7. Levels of Measurement I
      Data Level: Nominal (categorical)
      Description: Response categories cannot be placed in a specific order; impossible to judge distance between categories
      Examples: Sex (Male/Female); Ethnicity (White/Black...); Party (Lab/Con/LD...)
      Data Level: Ordinal (categorical)
      Description: Response categories can be placed in rank order; distance between categories cannot be measured mathematically
      Examples: Likert (Agree/Neutral/Disagree); Rank Preference (Coke/Pepsi...); Education (GCSE/A-Level...)
      Data Level: Interval (or continuous)*
      Description: Responses measured on a continuous scale with rank order; uniform distance between responses allows mathematical measurement
      Examples: Age (in years); Income (in £)
      Source: David & Sutton (2004)
      *NOTE: Interval = no true zero point (e.g. height), Ratio = true zero point (e.g. income)

  8. Levels of Measurement II: Level of measurement for certain variables is not pre-defined: AGE (in years, e.g. 22, 34, 54); AGE (pre-set bands, e.g. 18-30, 31-50); AGE (group membership, e.g. mature student). There is a hierarchy of data: always try to collect the highest level possible to maximise usefulness! Are you bored? (Yes/No). On a scale of 1-10, how bored are you? [where 1 = "practically in tears of boredom" and 10 = "riveted"]
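One way to see why the highest level is worth collecting: exact ages can always be collapsed into bands afterwards, but bands cannot be expanded back into exact ages. The sketch below uses the two bands named on the slide; the "under 18" and "51+" bands are assumptions added so that every age falls somewhere.

```python
def age_band(age):
    """Collapse an exact age (interval level) into a pre-set band (ordinal level)."""
    if age < 18:
        return "under 18"   # assumed band, not on the slide
    if age <= 30:
        return "18-30"      # band from the slide
    if age <= 50:
        return "31-50"      # band from the slide
    return "51+"            # assumed band, not on the slide

ages = [22, 34, 54]          # example ages from the slide
bands = [age_band(a) for a in ages]
print(bands)  # ['18-30', '31-50', '51+']
```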

  9. Frequencies & Fidelity I: Not as interesting as it sounds, sorry! Frequency tables display the number of times that a value appears in your dataset (per variable across all cases). They are always the first thing you do once your data is in electronic form. Highlights data errors. Indicative of potential analysis.

  10. Frequencies & Fidelity II: What can we say about this table?
      Parties coded
                               Frequency  Percent  Valid Percent  Cumulative Percent
      Valid    -9                      1       .0             .0                  .0
               Conservative         1331     29.9           29.9                30.0
               Labour               1103     24.8           24.8                54.8
               Lib Dem              1044     23.5           23.5                78.2
               Green                 368      8.3            8.3                86.5
               UKIP                  171      3.8            3.8                90.4
               BNP                    78      1.8            1.8                92.1
               Independent           216      4.9            4.9                97.0
               Others                135      3.0            3.0               100.0
               Total                4447    100.0          100.0
      Missing  System                  1       .0
      Total                         4448    100.0
      Slide annotations: Error? What we would expect? Look at %s. More than UKIP. What's this? Really? Only 1?
      A simple frequency table can tell you quite a bit!
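A frequency table like this one can be rebuilt with a few lines of Python. This is a minimal sketch, not the software used for the slides: the function name `frequency_table` is hypothetical, `None` is assumed to mark a missing case, and the raw responses are regenerated from the frequencies shown above.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, frequency, percent, valid percent, cumulative percent).

    None marks a missing case: it counts towards Percent but not Valid Percent.
    """
    valid = [v for v in values if v is not None]
    counts = Counter(valid)          # keeps the first-seen order of the values
    rows, running = [], 0
    for value, freq in counts.items():
        running += freq
        rows.append((
            value,
            freq,
            round(100 * freq / len(values), 1),     # share of all cases
            round(100 * freq / len(valid), 1),      # share of valid cases
            round(100 * running / len(valid), 1),   # running share of valid cases
        ))
    return rows

# Rebuild the raw responses from the frequencies on the slide.
slide_counts = [(-9, 1), ("Conservative", 1331), ("Labour", 1103),
                ("Lib Dem", 1044), ("Green", 368), ("UKIP", 171),
                ("BNP", 78), ("Independent", 216), ("Others", 135)]
responses = [party for party, n in slide_counts for _ in range(n)]
responses.append(None)               # the single system-missing case

table = frequency_table(responses)
for row in table:
    print(row)
```

The stray `-9` code shows up as its own row with a frequency of 1, which is the kind of data error a frequency table is meant to expose.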

  11. Central Tendency I: You have all done quantitative research and you all use measures of central tendency in your normal lives: the average, middle and most common values. What to watch on TV with housemates? Decide based on the most popular choice: Most Common (MODE). How long do you cook a chicken? Cookbook says 2 hours but internet says 3: Middle (MEDIAN). Maintenance grant allowance per week? Divide total grant by number of weeks at uni: Average (MEAN).

  12. Central Tendency II: MODE
      Date      High Temperature
      2-Jan     59
      3-Jan     60
      4-Jan     43
      5-Jan     42
      6-Jan     35
      7-Jan     32   <=== Mode
      8-Jan     32   <=== Mode
      9-Jan     46
      10-Jan    41
      11-Jan    52
      MODE: the value that occurs the most frequently in the data. MODE = 32
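The mode of the temperature data can be checked with Python's standard `statistics` module. The variable name `high_temps` is an assumption; the values are the ten readings from the slide.

```python
from statistics import mode, multimode

# High temperatures for 2-Jan to 11-Jan, in the order shown on the slide.
high_temps = [59, 60, 43, 42, 35, 32, 32, 46, 41, 52]

print(mode(high_temps))       # 32
print(multimode(high_temps))  # [32]
```

`multimode` returns every value tied for most frequent, which is safer when a dataset might have more than one mode.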

  13. Central Tendency III: The mode is useful for thinking about NOMINAL data.
      Main reason for going to gym
                               Frequency  Percent  Valid Percent  Cumulative Percent
      Valid    Relaxation              9     10.0           10.0                10.0
               Fitness                31     34.4           34.4                44.4
               Lose weight            33     36.7           36.7                81.1
               Build strength         17     18.9           18.9               100.0
               Total                  90    100.0          100.0
      What is the most frequent (MODAL) response?

  14. Central Tendency IV: NOMINAL data can be displayed using a bar chart. [Bar chart: Count (y-axis, 0 to 40) by main reason for going to gym (Relaxation, Fitness, Lose weight, Build strength)]

  15. Central Tendency V: MEDIAN
      Date      High Temperature
      7-Jan     32
      8-Jan     32
      6-Jan     35
      10-Jan    41
      5-Jan     42   <=== Middle value
      4-Jan     43   <=== Middle value
      9-Jan     46
      11-Jan    52
      2-Jan     59
      3-Jan     60
      MEDIAN: the middle value of the ordered sample data. When the sample size is odd, the median is the middle value. When the sample size is even, the median is the midpoint (mean) of the two middle values. MEDIAN = 42.5
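The odd/even rule for the median can be written out directly. This is a hand-rolled sketch to make the rule visible; in practice `statistics.median` from the standard library does the same job.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                       # odd sample size: one middle value
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2   # even: midpoint of two

high_temps = [59, 60, 43, 42, 35, 32, 32, 46, 41, 52]
print(median(high_temps))  # 42.5
```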

  16. Central Tendency VI: The mode and median are useful for thinking about ORDINAL data.
      There is a general lack of public knowledge about local government
                                  Frequency  Percent  Valid Percent  Cumulative Percent
      Valid    Strongly Agree          1911     41.1           41.8                41.8
               Agree                   2281     49.1           49.9                91.6
               Neutral                  255      5.5            5.6                97.2
               Disagree                 111      2.4            2.4                99.6
               Strongly Disagree         17       .4             .4               100.0
               Total                   4575     98.5          100.0
      Missing  System                    71      1.5
      Total                            4646    100.0
      What is the middle (MEDIAN) response? What is the most frequent (MODAL) response?
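For ordinal data the median is the category that contains the middle case, which is the first category whose cumulative count reaches half of the valid cases. The sketch below applies that idea to the valid frequencies from the slide; the function names are hypothetical.

```python
def median_category(ordered_counts):
    """Return the category holding the middle case of an ordered frequency table.

    ordered_counts: (category, frequency) pairs in rank order, valid cases only.
    If the middle falls exactly between two categories, the lower one is returned.
    """
    total = sum(freq for _, freq in ordered_counts)
    running = 0
    for category, freq in ordered_counts:
        running += freq
        if 2 * running >= total:     # cumulative count has reached the halfway point
            return category

def modal_category(ordered_counts):
    """Return the category with the highest frequency."""
    return max(ordered_counts, key=lambda pair: pair[1])[0]

responses = [("Strongly Agree", 1911), ("Agree", 2281), ("Neutral", 255),
             ("Disagree", 111), ("Strongly Disagree", 17)]
print(median_category(responses))  # Agree
print(modal_category(responses))   # Agree
```

This matches reading the Cumulative Percent column: "Strongly Agree" only reaches 41.8%, so the 50% point falls within "Agree".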

  17. Central Tendency VII: ORDINAL data can also be displayed using a bar chart.

  18. Central Tendency VIII: MEAN
      Date      High Temperature
      2-Jan     59
      3-Jan     60
      4-Jan     43
      5-Jan     42
      6-Jan     35
      7-Jan     32
      8-Jan     32
      9-Jan     46
      10-Jan    41
      11-Jan    52
      Sum       442
      MEAN: sum of the values divided by the number of cases. MEAN = 44.2
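Written as a formula, with the notation (x-bar for the mean, n for the number of cases) being an assumption since the slide gives the definition in words only, the calculation for the ten temperatures is:

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
        = \frac{59 + 60 + 43 + 42 + 35 + 32 + 32 + 46 + 41 + 52}{10}
        = \frac{442}{10}
        = 44.2
```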

  19. Central Tendency IX: The mean, mode and median are useful for thinking about INTERVAL data.
      Statistics: What was your age last birthday
      N Valid      4290
      N Missing     158
      Mean        54.74
      Median      57.00
      Mode           62
      What is the average (MEAN) age? What is the middle (MEDIAN) age? What is the most common (MODAL) age?
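The pairing of measures with levels of measurement across these slides (mode for nominal, mode and median for ordinal, all three for interval) can be captured in a small lookup. This is an illustrative sketch: the names `MEASURES_BY_LEVEL` and `central_tendency` are hypothetical, and ordinal values are assumed to be stored as numeric rank codes.

```python
import statistics

# Measures that are meaningful at each level, following the slides.
MEASURES_BY_LEVEL = {
    "nominal": ("mode",),
    "ordinal": ("mode", "median"),
    "interval": ("mode", "median", "mean"),
}

def central_tendency(values, level):
    """Return only the measures that make sense for the given level.

    Ordinal values are expected as rank codes (e.g. 1 = Strongly Agree);
    median_low is used there so the median is always an observed category.
    """
    median = statistics.median_low if level == "ordinal" else statistics.median
    available = {"mode": statistics.mode, "median": median, "mean": statistics.mean}
    return {name: available[name](values) for name in MEASURES_BY_LEVEL[level]}

nominal = central_tendency(["Lab", "Con", "Lab", "LD"], "nominal")
ordinal = central_tendency([1, 2, 2, 3, 2, 1], "ordinal")
interval = central_tendency([59, 60, 43, 42, 35, 32, 32, 46, 41, 52], "interval")
print(nominal)
print(ordinal)
print(interval)
```

Asking for a mean of party names simply is not offered, which mirrors the point that the level of measurement limits the analysis.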

  20. Central Tendency X: INTERVAL data can be displayed using a histogram.

  21. Dispersion I: Measures of central tendency are heuristics. They can hide important details in the data.
      Dataset 1: 1 2 3 4 5 6 7 8 9     MEAN = 5    MEDIAN = 5
      Dataset 2: 1 2 3 4 5 6 7 8 90    MEAN = 14   MEDIAN = 5
      Need to consider RANGE and STANDARD DEVIATION.
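The two datasets on this slide can be checked quickly with the standard library. The variable names are assumptions; the values are those shown on the slide.

```python
from statistics import mean, median

dataset_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
dataset_2 = [1, 2, 3, 4, 5, 6, 7, 8, 90]   # same values, but the last is an outlier

for name, data in (("Dataset 1", dataset_1), ("Dataset 2", dataset_2)):
    print(name, "mean =", mean(data), "median =", median(data))
```

A single extreme value moves the mean from 5 to 14 while the median stays at 5, so neither figure alone shows how the values are spread.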

  22. Dispersion II: RANGE measures the difference between the lowest and highest values. Large range may reveal outliers (dataset 2!). Small range suggests tight grouping of data. STANDARD DEVIATION (SD) measures the distance (deviation) of each value from the mean. Large SDs occur when data points are a long way from the mean (wide range of different values). Small SDs occur when data points are close to the mean (values do not differ very much).
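The slide describes both measures in words only. As a hedged sketch, the usual formulas are given below; the symbols are assumptions, and the sample form of the standard deviation (dividing by n - 1) is chosen because it is consistent with the values reported on slide 23. Dividing by n instead gives the population version.

```latex
\text{Range} = x_{\max} - x_{\min}
\qquad
s = \sqrt{\frac{1}{n - 1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}
```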

  23. Dispersion III: For example:
      Age (Sample 1): 18 30 23 31 21 19 20 19 28 21
      Age (Sample 2): 8 55 53 13 12 52 7 9 11 10
      Descriptive Statistics
                          N   Range  Minimum  Maximum     Mean  Std. Deviation
      Age (Sample 1)     10   13.00    18.00    31.00  23.0000         4.85341
      Age (Sample 2)     10   48.00     7.00    55.00  23.0000        21.01851
      Valid N (listwise): 10 in each sample
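The descriptive statistics for the two age samples can be reproduced with the standard library. This is a sketch under one assumption: that the reported standard deviation is the sample version (n - 1 in the denominator), which is what `statistics.stdev` computes. The function name `describe` is hypothetical.

```python
from statistics import mean, stdev

sample_1 = [18, 30, 23, 31, 21, 19, 20, 19, 28, 21]
sample_2 = [8, 55, 53, 13, 12, 52, 7, 9, 11, 10]

def describe(ages):
    """Summarise one variable: n, range, minimum, maximum, mean and sample SD."""
    return {
        "n": len(ages),
        "range": max(ages) - min(ages),
        "min": min(ages),
        "max": max(ages),
        "mean": mean(ages),
        "sd": stdev(ages),      # sample standard deviation (n - 1)
    }

summary_1 = describe(sample_1)
summary_2 = describe(sample_2)
print(summary_1)
print(summary_2)
```

Both samples have a mean of 23, yet the second has a range of 48 against 13 and a standard deviation of roughly 21 against roughly 4.9, which is the detail the mean alone hides.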

  24. Summary: Levels of measurement determine how data can be analysed. Vital to understand what your data represents and into which level of measurement it falls. Frequency tables help us to screen data for errors. Frequency tables also help us to identify the median and mode. Central tendency is a heuristic, but very common because of this. Dispersion plays a vital role in critically evaluating central tendency. These modes of analysis are often referred to as DESCRIPTIVE STATISTICS or UNIVARIATE ANALYSIS (literally "one variable"!).

  25. Lies, Damn Lies and Statistics? 90% of Sun readers want a cap on immigration. The average Yale graduate earns $30,000 within six months of graduating. The Green Party is not well supported as it received less than 5% of the national vote in the 2010 General Election. House prices drop by 10% in the UK. 90% of students at Cardiff University are binge drinkers.
