Understanding Assumptions in Measurement and Psychometrics

Explore the fundamental assumptions in measurement, highlighting the interplay between reality, chance, and statistical significance. Learn how measurements are influenced by error, patterns, and the role of calibrated tools in quantifying properties. Delve into the intriguing realm of psychometrics and the complexities of developing indicators for educational research and evaluation.



Presentation Transcript


1. Psychometrics: Developing Indicators. Presentation to PhD students, Faculty of Education, Mahasarakham University (MSU), Thailand, at the Faculty of Education, The University of Auckland. Assoc Prof Gavin T L Brown. Wednesday 30 November, 10am-12noon.

2. Assumptions in Measurement
- If something exists, we can measure it.
- If we can't measure it, we may have failed in our ingenuity and ability to measure it; it doesn't mean the thing doesn't exist.
- We have an unfortunate tendency to treat as valuable only the things we can measure.
- We tend to treat as value-less anything we can't measure, even if it does have value.

3. Assumptions in Measurement
- All measures have some error.
- The length of a certain platinum bar in Paris is a metre. It is supposed to be 1/10,000,000th of the distance from the equator to the pole, but it is short by 0.2 mm according to satellite surveys (a quick arithmetic check follows below).
- There is less error in measures of physical phenomena and more in measures of social phenomena.
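
A quick arithmetic check of the 0.2 mm figure, in Python; the quarter-meridian distance used here is the modern satellite-based estimate and is supplied as an assumption:

```python
# Satellite-era estimate of the equator-to-pole (quarter meridian) distance.
quarter_meridian_m = 10_001_966  # assumed value, in metres

# The 1795 definition: one metre = 1/10,000,000 of that distance.
intended_metre = quarter_meridian_m / 10_000_000

# How far the 1-metre bar falls short, in millimetres (~0.2 mm).
print((intended_metre - 1) * 1000)
```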

4. Assumptions in Measurement
- Reality has patterns partly because random chance has patterns in it.
- Toss a coin enough times and you will get patterns like T H T T H H T T T H H H T T T T H H H H, or T T T T T T T H H H H H H H (a simulation follows below).
- If you measure something often enough, one of your results will appear to be non-chance, even though it actually is a chance event.
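
To see chance producing apparent streaks, a minimal simulation sketch in Python (the seed and run length are arbitrary choices):

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

# 20 independent fair tosses; note the 'streaks' of H or T that
# appear even though every toss is pure chance.
tosses = " ".join(random.choice("HT") for _ in range(20))
print(tosses)
```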

5. Assumptions in Measurement
- Chance plays a significant part in the results we generate.
- Sometimes the result we get could occur by chance anyway; just because something happens doesn't mean it would not have occurred anyway.
- If a result is relatively unlikely to occur by chance, we say it is statistically significant.
- So we create tables of how often things occur by chance as a reference point.
- We can estimate the probability (p) of something happening by chance and use this to determine whether our result is real or a chance artefact (see the sketch below).
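
As an illustration of estimating p, a binomial test sketched in Python (requires SciPy 1.7 or later; the counts are hypothetical):

```python
from scipy import stats

# Suppose we observe 15 heads in 20 tosses of a coin assumed fair.
# How often would a result at least this extreme occur by chance?
result = stats.binomtest(15, n=20, p=0.5)
print(result.pvalue)  # ~0.041: unlikely, though not impossible, by chance
```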

6. Measurement
- Properties are measured with tools that are calibrated into numerical scales.
- Ratio scale: e.g., length in metres; weight in kilograms.
- The distance between any two markers is identical.
- There is a non-arbitrary zero value (there is a real-world state that can be described as lacking any of the property of interest).
- Each point on the scale is a RATIO of a base unit, and addition of the ratio values is possible.
- Full statistical analyses can be done.

7. Other types of scales
- Interval scale: temperature in Celsius (each point is 1/100th of the distance between the freezing and boiling points of fresh water at sea level).
- Equal-size distances between scale points (1 to 5 is the same distance as 6 to 10).
- Differences can be compared as ratios: the interval from 10 to 20 is twice the interval from 1 to 6.
- The zero point is arbitrary; negative scale points are possible.
- Standard arithmetic can be applied to such scales, especially concerning the centre of distributions.

8. Other types of scales
- Ordinal scale: rank-ordered objects.
- 1st, 2nd, 3rd in a race.
- Tall, taller, tallest (not 1.80 m; 2.00 m; 2.26 m).
- Neutral, slightly agree, moderately agree, agree.
- Distances between ranks or orders are not equal.
- The challenge is how to do mathematical operations with something that is not continuous; non-parametric statistics address this (see the sketch below).
- We need evidence that distances are at least approximately equal before treating rank orders arithmetically. What evidence would be good?
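
One common rank-based option is a non-parametric test such as the Mann-Whitney U, sketched here in Python with hypothetical ordinal ratings (1 = neutral ... 4 = agree):

```python
from scipy import stats

# Hypothetical ratings from two groups on a 4-point ordinal scale.
group_a = [1, 2, 2, 3, 3, 3, 4, 4]
group_b = [1, 1, 2, 2, 2, 3, 3, 4]

# The test uses only rank order, so it does not assume equal
# distances between the scale points.
u_stat, p_value = stats.mannwhitneyu(group_a, group_b)
print(u_stat, p_value)
```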

9. Other types of scales
- Nominal scale: categorical or classification naming of objects that are qualitatively different from each other.
- Igneous, sedimentary, metamorphic: these can all weigh the same or have the same length, but on critical features they are not identical.
- Male, female.
- Agree, disagree.
- We can count the frequency of each category and compare the distribution of frequencies in samples (see the sketch below).
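
Comparing frequency distributions across samples is commonly done with a chi-square test; a minimal sketch with hypothetical counts:

```python
from scipy import stats

# Hypothetical counts of agree/disagree responses in two samples.
observed = [[30, 20],   # sample 1: agree, disagree
            [18, 32]]   # sample 2: agree, disagree

# Tests whether the two samples differ in their category frequencies.
chi2, p, dof, expected = stats.chi2_contingency(observed)
print(chi2, p)
```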

10. Measuring Human Mental Properties
- The problem of measuring stuff in-the-head: our mental actions are difficult to observe directly.
- How do we know how much xxx you have? Measure xxx with a recognised tool.
- Measuring tools for social science: answers to questions (paper or oral), self-reports, observational checklists.
- What scale properties do tools like these have?

11. Measuring Human Mental Properties
- What type of scales do we need in order to analyse arithmetically or statistically the central tendency, difference, or mean of test scores?
- What type of mathematical or statistical methods do we need to take chance factors into account?
- Are the scales we use sufficient?

12. Items & Categories
From a qualitative study and literature review, derive:
- Categories of concepts or ideas you wish to generalise
- Language samples that express the ideas that you have created
- Possible relationships to explore

13. Defining a test or inventory
- Domain: a sample of tasks, questions, or items drawn from a domain of interest, intended to elicit information about learner skill, knowledge, and understanding of that domain.
[Diagram: a TEST assembled from item sets for Topic A, Topic B, and Topic C, each topic corresponding to a chapter of the domain.]

14. Basic Principles of Instrument Design
- Know your domain: identify and describe what you want to teach and learn.
- A rose is not a rose; reading is not reading; maths is not arithmetic.
- Select rich ideas for important content.
- What are you really investigating?

15. Test Blueprint/Template
A powerful way to organise writing items and reporting results.
Chan, K. K. (2010, August). Using test blueprint in classroom assessment: Why and how. Paper presented at the annual conference of the International Association for Educational Assessment, Bangkok, Thailand.

16. Define the categories
For each category, create a definition that captures the essential idea(s) of the category.

Conception Categories:
I. Negative Definitions of Conceptions: Assessment is ignored, devalued, or useless because it creates psychological burdens, pressure, or undesirable experiences.
II. Gaming Strategies: Success in assessment rewards and/or requires treating it as a game. Gaming strategies include getting tips or task-oriented skills from tutoring, pleasing or knowing the right person, or even cheating.
III. Effortful Modesty: Success in assessment requires high levels of effort, persistence, or exertion, and one must always be modest about one's effort and success.
IV. Escape System: Assessment is an oppressive system which would be avoided or even escaped from if it were possible.
V. Family Obligation: Achieving high in assessment is an obligation to one's family in order to please, show respect, or build reputation for the family.
VI. Academic Content Only: Assessment is limited because it focuses only on academic content, neglecting other important aspects of human life and development.

17. Draft Items
- For each category, write items which capture how a target participant would talk about the category.
- Use as natural language as possible.
- Keep one idea in each item.
- Avoid writing the same thing in other words.
- Cover the full range of ideas covered by the category.
- Aim for 8-12 items per category; better to have too many than not enough.
- Brainstorm; don't censor too early.

18. Eliciting a response
Consider what mechanism to use to obtain the participant's opinion:
- Number of response points? Mid-point?
- Balanced or packed? (see the sketch below)
- Label every point or only the ends?
- Type of rating? Agreement, frequency, importance, etc.
Brown, G. T. L. (2004). Measuring attitude with positively packed self-report ratings: Comparison of agreement and frequency scales. Psychological Reports, 94(3), 1015-1024. doi:10.2466/pr0.94.3.1015-1024
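
To make the balanced-versus-packed choice concrete, a sketch of two sets of response anchors; the exact labels are illustrative assumptions, loosely following the positively packed scales discussed in Brown (2004):

```python
# A balanced 5-point agreement scale: equal positive and negative options.
balanced = ["strongly disagree", "disagree", "neutral",
            "agree", "strongly agree"]

# A positively packed 6-point scale: more positive than negative options,
# useful for spreading responses when most participants tend to agree.
positively_packed = ["strongly disagree", "mostly disagree",
                     "slightly agree", "moderately agree",
                     "mostly agree", "strongly agree"]
```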

19. Validity Checks
Following procedures outlined by Gable & Wolf (1993):
- Do items clearly belong to the intended categories?
- Are categories understood and accepted by samples from the population of interest?
- If not, then revise the categories, items, and/or language used.
Gable, R. K., & Wolf, M. B. (1993). Instrument development in the affective domain: Measuring attitudes and values in corporate and school settings (2nd ed.). Boston, MA: Kluwer Academic Publishers.

20. Instructions for Category Checking
A. Rating Tasks
- Please indicate the category that each statement best fits by circling the appropriate numeral.
- Please indicate how strongly you feel about your placement of the statement into the category by circling the appropriate number, as follows:
  3 = no question about it; 2 = strongly; 1 = not very sure; 0 = absolutely not

21. Sample Category Selection Task
For each statement, raters circle a category (I-VI) and a confidence rating (3 2 1 0):
1. My classmates and peers are better at assessments than I am.
2. Assessment controls too much what and how students learn.
3. I am embarrassed to let others know how well I am doing academically.
4. Assessment results are filed & ignored.
- Use an odd number of raters (3, 5, 7).
- Accept an item if a majority select the same category as intended, with an average confidence > 2.00 (a sketch of this rule follows below).
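
A minimal sketch of that acceptance rule in Python; the function name is hypothetical, and it assumes confidence is averaged over the raters who chose the majority category (one reading of the confidence scores on the next two slides):

```python
from collections import Counter

def accept_item(ratings, intended):
    # ratings: list of (category, confidence) pairs, one per rater.
    majority_cat, n_agree = Counter(c for c, _ in ratings).most_common(1)[0]
    agree_conf = [conf for cat, conf in ratings if cat == majority_cat]
    mean_conf = sum(agree_conf) / len(agree_conf)
    # Accept only if the majority matches the intended category and
    # the agreeing raters' average confidence exceeds 2.00.
    return (majority_cat == intended
            and n_agree > len(ratings) / 2
            and mean_conf > 2.00)

# Five raters judge an item intended for category IV.
print(accept_item([("IV", 3), ("IV", 3), ("IV", 2), ("V", 2), ("IV", 3)], "IV"))
```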

22. Sample HK SCoA Results: I. Negative (items where raters agreed with the intended category)

Item | No. agreeing | Confidence score
7. When being assessed, I feel alone and abandoned. | 4 | 2.75
49. Assessments cause anxiety, fear, nervousness, and pressure. | 4 | 2.5
26. Assessment interferes with my learning. | 3 | 2.67
61. Assessment is value-less. | 3 | 2.33
23. The importance of assessment is over-rated. | 3 | 2
14. Assessments cannot tell me how well I have achieved. | 3 | 1.33

23. Sample HK SCoA Results (continued): items with weak or conflicting agreement

Item | No. agreeing | Confidence score | Conflicting categories
34. Assessments get in the way of my personal development. | 2 | 3 | V, IV
20. Assessments make things worse for students rather than help them learn. | 2 | 2.5 | V, IV
4. Assessment results are filed & ignored. | 2 | 2 | IV, II
33. I ignore or throw away my assessment results. | 2 | 2 | IV, III
54. I ignore assessment information. | 2 | 2 | IV, III
55. Assessment is used by school leaders to police what students do. | 2 | 2 | V, IV
22. Success in assessment only brings trivial rewards. | 2 | 2 | II / III / V
65. Assessment forces students to learn in a way against their beliefs. | 2 | 1.5 | IVx3 (2)
37. Teachers are over-assessing. | 2 | 1.5 | V, IV
31. It means nothing even if I succeed. | 1 | 3 | III / V
53. Assessment is unfair to students. | 1 | 3 | IV / V
8. Poor performance is irrelevant to me. | 1 | 2 | IVx3 (2), V
42. Assessment has little impact on my learning. | 1 | 2 | V, III / IV
25. The only assessment that matters is my own evaluation of myself. | 1 | 2 | No consensus

24. Prepare questionnaire
- Select items that clearly match the intended category (8 is enough).
- Keep items that did not match your intended category but for which there was strong agreement that they belonged to a plausible category.
- Assemble the items into a sequentially organised questionnaire.
- Administer the questionnaire.

25. Statistical analysis
- Test whether items belong to factors as designed (see the sketch below).
- Drop those that have weak loadings or that cross-load too strongly.
- Check the fit of the model with confirmatory factor analysis.
- If no model matches intentions, then carry out exploratory factor analysis.
- Get a second sample to check that the EFA model is not a chance result.
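
One way to start is an exploratory factor analysis; a minimal sketch assuming the Python factor_analyzer package and a hypothetical responses.csv with one column per item (confirmatory factor analysis would typically use other tools, e.g. semopy in Python or lavaan in R):

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

# Hypothetical data file: one row per respondent, one column per item.
data = pd.read_csv("responses.csv")

# Six factors to match the six intended categories; an oblique rotation
# (oblimin) allows the factors to correlate.
fa = FactorAnalyzer(n_factors=6, rotation="oblimin")
fa.fit(data)

# Inspect loadings for weak (<~0.3) or strongly cross-loading items.
loadings = pd.DataFrame(fa.loadings_, index=data.columns)
print(loadings.round(2))
```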

26. Recommended reading
Bandalos, D. L., & Finney, S. J. (2010). Factor analysis: Exploratory and confirmatory. In G. R. Hancock & R. O. Mueller (Eds.), The reviewer's guide to quantitative methods in the social sciences (pp. 93-114). New York: Routledge.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment Research & Evaluation, 10(7). Available online: http://www.pareonline.net/pdf/v10n17.pdf
Hoyle, R. H., & Duvall, J. L. (2004). Determining the number of factors in exploratory and confirmatory factor analysis. In D. Kaplan (Ed.), The SAGE Handbook of Quantitative Methodology for Social Sciences (pp. 301-315). Thousand Oaks, CA: Sage.
Kline, P. (1994). An easy guide to factor analysis. London: Routledge.
Marsh, H. W., & Hau, K. T. (1999). Confirmatory factor analysis: Strategies for small sample sizes. In R. H. Hoyle (Ed.), Statistical strategies for small sample research (pp. 251-284). London: SAGE Publications.
