Understanding Intercoder Reliability in Data Coding

Learn about intercoder reliability in data coding, including its importance, measurement methods, and tips for improvement. Intercoder reliability ensures consistent coding judgments across multiple coders, enhancing research quality and enabling division of labor.




Presentation Transcript


  1. Coding and Intercoder Reliability
     Su Li, School of Law, U.C. Berkeley
     2/12/2015

  2. Outline
     Basics of data coding
     What is intercoder reliability?
     Why does it matter?
     How to measure and report intercoder reliability?
     How to improve intercoder reliability?
     References

  3. Data Coding Basics
     Start from a codebook.
     Provide exhaustive and mutually exclusive value options for each variable.
     Use multiple variables to code overlapping values or multiple values for one observation.
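
  To make the last point concrete, here is a minimal Stata sketch of coding overlapping values with several 0/1 indicator variables instead of forcing them into one variable. The variable names (pa_whitecollar, profile_id, etc.) are hypothetical, not taken from the project.

      * Hypothetical indicators; each is exhaustive and mutually exclusive on its own 0/1 scale.
      gen byte pa_whitecollar    = 0   // 1 if the profile lists white collar work
      gen byte pa_investigations = 0   // 1 if it lists government/corporate investigations
      gen byte pa_crimdefense    = 0   // 1 if it lists criminal defense

      * A lawyer whose profile lists both white collar work and investigations:
      replace pa_whitecollar    = 1 if profile_id == 17
      replace pa_investigations = 1 if profile_id == 17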

  4. Example Codebook (white collar lawyer project)
     c_graduate_year: Year graduated from law school (or the year the highest degree was received, if not a law degree). Code 999 if the information is not available. Note: type in the applicable year (YYYY).
     c_practice_area: Practice area. Value options:
       1. White collar (includes white collar defense, white collar crime, white collar litigation, etc.)
       2. Government or corporate investigations
       3. White collar and government/corporate investigations (if the practice area is described this way)
       4. Criminal defense (if the practice area is described this way)
     Note: choose one of the above four choices and type in the number. If the practice area has a different title, type in the title.
     See var14-18 in the WC project codebook.
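
  As an illustration only, a few records coded under this codebook might be entered in Stata as shown below; the values are hypothetical, not drawn from the project data.

      clear
      input int c_graduate_year str40 c_practice_area
      1998 "1"
      999  "3"
      2005 "Securities enforcement"
      end
      list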

  5. Example source pages for coding:
     http://www.skadden.com/professionals/jon-l-christianson
     http://www.uria.com/en/oficinas/pekin/abogados.html?iniciales=FMB
     http://www.akingump.com/en/lawyers-advisors.html
     http://www.akingump.com/en/lawyers-advisors/michael-a-asaro.html
     Input data in Stata. Label data in Stata. Recode data in Stata.
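
  A minimal sketch of that input/label/recode workflow, assuming a hypothetical file wc_lawyers.csv and a numeric c_practice_area (free-text practice-area titles would need separate handling):

      import delimited "wc_lawyers.csv", clear            // input data
      label define practice 1 "White collar" 2 "Govt/corporate investigations" ///
          3 "White collar and investigations" 4 "Criminal defense"
      label values c_practice_area practice               // label data
      recode c_graduate_year (999 = .)                    // recode data: 999 to missing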

  6. What is intercoder reliability?
     Intercoder reliability is the widely used term for the extent to which independent coders evaluate a characteristic of a message or artifact and reach the same conclusion (also known as intercoder agreement, according to Tinsley and Weiss, 2000).
     Intercoder reliability is not exactly the same as the correlation coefficient, which measures the degree to which "ratings of different judges are the same when expressed as deviations from their means." Rather, it measures only "the extent to which the different judges tend to assign exactly the same rating to each object" (Tinsley & Weiss, 2000, p. 98).
     http://astro.temple.edu/~lombard/reliability/
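
  A small hypothetical example of this distinction: if one coder always rates one point higher than the other, the two sets of ratings are perfectly correlated, yet the coders never agree exactly.

      clear
      input coder1 coder2
      1 2
      2 3
      3 4
      4 5
      5 6
      end
      correlate coder1 coder2      // Pearson's r = 1.00
      count if coder1 == coder2    // 0 exact agreements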

  7. Why does it matter?
     Coding often involves coders' judgments, which vary across individuals.
     The quality of the research depends on the consistency of those coding judgments.
     Monitoring intercoder reliability also helps control coding accuracy.
     Practically, it makes division of labor among multiple coders possible.

  8. Mathematical measures commonly reported for intercoder reliability
     Popping (1988) identified 39 different "agreement indices" for coding nominal categories. Commonly used ones:
     Percent agreement: PA0 = (number of agreements)/n
     Scott's pi (π): π = (PA0 - PAe)/(1 - PAe), where PAe = Σ p_i² and p_i is the proportion of category i pooled across both coders
     Cohen's kappa (κ): κ = (PA0 - PAe)/(1 - PAe), where PAe = (1/n²) Σ n_i1·n_i2, with n_i1 and n_i2 the two coders' marginal counts for category i
     Krippendorff's alpha (α): see the Krippendorff's Alpha 3.12a software
     There is no consensus on a single "best" index. Percent agreement is widely used but misleading: it tends to overestimate reliability because it does not correct for chance agreement. Cohen's kappa has been criticized but is still the most frequently used.
     Hand calculations: http://astro.temple.edu/~lombard/reliability
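
  In Stata, percent agreement and Cohen's kappa for two coders can be obtained as sketched below; coder1 and coder2 are hypothetical variable names holding the two coders' codes. Scott's pi and Krippendorff's alpha are not built into official Stata, so they call for hand calculation or a user-written package (e.g., kappaetc from SSC).

      gen byte agree = (coder1 == coder2)
      summarize agree              // the mean of agree is the percent agreement PA0
      kap coder1 coder2            // Cohen's kappa for two unique raters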

  9. Example: binary variable coded by two coders (n = 59)

                    coder1 = 0      coder1 = 1      total
     coder2 = 0     50 (94.34%)      3 (5.66%)      53 (89.83%)
     coder2 = 1      4 (66.67%)      2 (33.33%)      6 (10.17%)
     total          54 (91.53%)      5 (8.47%)      59 (100%)

     PA0 = (50 + 2)/59 = 52/59 ≈ 0.88
     PAe (Scott's pi, squared pooled marginal proportions) = ((53 + 54)/118)² + ((6 + 5)/118)² ≈ 0.831
     PAe (Cohen's kappa, products of the two coders' marginals) = (53×54 + 6×5)/(59×59) ≈ 0.831
     so π ≈ κ ≈ (0.88 - 0.83)/(1 - 0.83) ≈ 0.30
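
  A sketch that rebuilds this 2x2 example from its four cell counts and lets Stata check the hand calculation (kap should report a kappa of roughly 0.30):

      clear
      input byte coder1 byte coder2 int freq
      0 0 50
      1 0  3
      0 1  4
      1 1  2
      end
      expand freq                  // turn the cell counts into 59 individual observations
      drop freq
      tabulate coder2 coder1       // reproduces the cross-tabulation above
      kap coder1 coder2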

  10. Use SPSS to calculate Cohen's kappa
      CROSSTABS
        /TABLES=var1_coder2 BY var1_coder1
        /FORMAT=AVALUE TABLES
        /STATISTICS=KAPPA
        /CELLS=COUNT
        /COUNT ROUND CELL.

  11. Use Stata to calculate Cohen's kappa
      kappa varlist   (frequencies format: each variable is a rating category, and its value is the number of raters who assigned that category to the observation)
      kap coder1 coder2   (each variable is a coder; see the Stata demo)
      According to Landis and Koch (1977, p. 165):
      below 0.00   Poor
      0.00-0.20    Slight
      0.21-0.40    Fair
      0.41-0.60    Moderate
      0.61-0.80    Substantial
      0.81-1.00    Almost perfect
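
  For the kappa syntax, the data must already be in the frequencies format. The sketch below assumes a hypothetical setup with five raters and three categories, where cat1-cat3 record how many raters assigned each category to an observation:

      clear
      input byte cat1 byte cat2 byte cat3
      5 0 0
      4 1 0
      2 2 1
      0 5 0
      1 1 3
      0 0 5
      end
      kappa cat1 cat2 cat3         // kappa for multiple (nonunique) raters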

  12. [Table: codes assigned to 20 observations by four coders, illustrating two kinds of disagreement.]
      1. The differences between coder 1 and coder 2 are random.
      2. The differences between coder 1 and coder 3 are systematic (e.g., compared with coder 2, coder 3 always codes 2 as 1 and 3 as 4).
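
  A simulated Stata sketch of the same contrast (hypothetical data, not the table above): coder 2 disagrees with coder 1 at random on roughly 20% of observations, while coder 3 disagrees systematically, always coding 2 as 1 and 3 as 4.

      clear
      set obs 100
      set seed 12345
      gen byte coder1 = ceil(runiform()*5)                                  // categories 1-5
      gen byte coder2 = cond(runiform() < 0.8, coder1, ceil(runiform()*5))  // random disagreement
      gen byte coder3 = coder1                                              // systematic disagreement
      replace coder3 = 1 if coder1 == 2
      replace coder3 = 4 if coder1 == 3
      kap coder1 coder2            // disagreements are random
      kap coder1 coder3            // systematic disagreement lowers kappa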

  13. Acceptance standards: Neuendorf (2002)
      No coherent standard exists. Some rules of thumb: coefficients of .90 or greater would be acceptable to all, .80 or greater would be acceptable in most situations, and below .80 there exists great disagreement (p. 145).
      The criterion of .70 is often used for exploratory research.
      More liberal criteria are usually used for indices known to be more conservative (i.e., Cohen's kappa and Scott's pi).

  14. Hughes and Garrett (1990): recommended indices
      Percent agreement: does not correct for chance agreement; recommended to use: NO.
      Scott's pi (π): addresses chance correction and the systematic coding error problem; acceptance level 0.6; acceptable.
      Cohen's kappa (κ): addresses chance correction and the systematic coding error problem; acceptance level per Landis & Koch (1977): <0.00 poor, 0.00-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect; acceptable (most extensively discussed).
      Krippendorff's alpha (α): addresses chance correction and the systematic coding error problem; acceptable.
      Pearson's correlation: does not consider systematic coding bias; recommended to use: NO.

  15. How to improve intercoder reliability (Lombard et al., 2002)
      In research design:
      1. Assess reliability informally during coder training (detailed instructions, close monitoring, etc.).
      2. Assess reliability formally in a pilot test.
      3. Assess reliability formally during coding of the full sample.
      4. Select and follow an appropriate procedure for incorporating the coding of the reliability sample into the coding of the full sample (e.g., master-coder quality control).
      In reporting results:
      1. Select one or more appropriate indices.
      2. Obtain the necessary tools to calculate the index or indices selected.
      3. Select an appropriate minimum acceptable level of reliability for the index or indices to be used.
      4. Report intercoder reliability in a careful, clear, and detailed manner in all research reports.
      http://astro.temple.edu/~lombard/reliability/

  16. References
      http://astro.temple.edu/~lombard/reliability/
      Lombard, M., Snyder-Duch, J., & Bracken, C. C. (2002). Content analysis in mass communication: Assessment and reporting of intercoder reliability. Human Communication Research, 28, 587-604.
      Tinsley, H. E. A., & Weiss, D. J. (2000). Interrater reliability and agreement. In H. E. A. Tinsley & S. D. Brown (Eds.), Handbook of Applied Multivariate Statistics and Mathematical Modeling (pp. 95-124). San Diego, CA: Academic Press.
      Popping, R. (1988). On agreement indices for nominal data. In W. E. Saris & I. N. Gallhofer (Eds.), Sociometric Research: Volume 1, Data Collection and Scaling (pp. 90-105). New York: St. Martin's Press.
      Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159-174.
      Hughes, M. A., & Garrett, D. E. (1990). Intercoder reliability estimation approaches in marketing: A generalizability theory framework for quantitative data. Journal of Marketing Research, 27(2), 185-195.
