Best Practices for Scale Deployment: Preventing Faulty Measurement
This guide focuses on preventing faulty measurement in the deployment of scales, emphasizing how to identify, choose, and modify scales effectively. Following the systematic approach outlined by Kelly L. Haws, Kevin L. Sample, and John Hulland, researchers can adopt accurate and reliable measurement practices and avoid mismeasuring constructs.
Scale Use and Abuse: Towards Best Practices in the Deployment of Scales
Kelly L. Haws, Kevin L. Sample, John Hulland
JCP Scale Usage Review [4 issues: 30(3) through 31(2)]

76% of the papers using scales in our review are potentially mismeasuring constructs! This has serious (and bad) theoretical implications.

Part A: Summary of Scale Usage

  Scale type    Used as is   Modified   Total
  Validated         16          17        35
  Improvised        11          22        31
  Total             27          39        66

Part B: Summary of Scale Modifications

  Modification type        Validated   Improvised   Total
  Wording only                 3            7         10
  Length only                  2            1          3
  Dimensionality only          8            0          8
  Multiple modifications       3           10         13
  Indeterminate                1            4          5
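The 76% figure can be reproduced from Part A under one reading (my assumption; the slide does not spell out which deployments count): modified deployments plus improvised scales used as-is are the potentially mismeasuring cases. A quick arithmetic sketch:

```python
# Counts taken from Part A of the scale-usage summary.
modified_total = 39       # all modified deployments (validated + improvised)
improvised_as_is = 11     # improvised scales deployed as-is
total_deployments = 66

# Assumption (mine): both groups above are the "potentially
# mismeasuring" papers referenced on the slide.
at_risk = modified_total + improvised_as_is
print(f"{at_risk}/{total_deployments} = {at_risk / total_deployments:.0%}")
```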
In other words, don't use liters to measure grams!
How can we prevent faulty measurement? Step 1: Identify a Scale
Step 1: Choosing a Scale (Identification of Scale and Assessment of Scale Fit for Deployment)

1) Specify Construct: Clearly define the construct to be measured, the domain, and the intended type of measurement.

2) Identify Instrument: Conduct a literature review to find the scale(s) most closely aligned with your construct and domain. Review sources to determine whether the scale was originally validated or improvised.

3) Assess Alignment:
- Highly aligned: If deployment is highly consistent with prior use, appropriate fit may be assumed.
- Uncertain or less closely aligned: If deployment is inconsistent with prior use, or researchers are uncertain of alignment, conduct an informal assessment. Solicit open-ended feedback from 2-3 experts, providing them with the construct definition, domain, and all items from the proposed scale, and asking whether the proposed scale is appropriate for the intended measurement and whether modifications are needed. If modifications are desired, refer to Table 4 for modification guidelines, followed by Table 5 for validation guidelines.
How can we prevent faulty measurement? Step 2: Modify a Scale (if necessary, but should be avoided if possible)
Step 2: Modifying a Scale (if absolutely needed)

Modification of Scales

Wording
- It is acceptable to make minimal changes to reflect a different context or to avoid dated or polarizing language.
- If the scale will be used in a different culture or language than the one in which it was developed, such shifts are acceptable if appropriately modified (see Harkness et al., 2010 for guidelines).
- Arbitrary changes are typically unacceptable. Any wording shifts should be thoroughly explained and justified, as these shifts have enduring effects on scales and their future use.

Length
- Scales should use the full set of original items. Adding new improvised items to an existing scale should be avoided, as such additions require more rigorous scale validation and/or scale development procedures.
- If changes to length are desired, conduct an initial pretest of at least n = 50. CFA can be used to identify items to remove before any validation procedures, as indicated in Table 5.
- When reducing length, maintain a minimum of 3 items per dimension (4-5 is preferred).

Dimensions
- Ensure that your construct definition, domain, and intended measurement fully justify the removal/addition of one or more dimensions, and provide this rationale in your research.
- Best practice entails collecting all subdimensions in at least one study and reporting findings comparing the full set of scale dimensions to the focal subdimensions in the supplemental materials.

Multiple Modifications
- Multiple modifications can quickly erode the theoretical foundations of scales. If absolutely necessary, and if the consequences are only transitory within the current context, multiple modifications can be made, supported with additional validation procedures as outlined in Table 5. Otherwise, avoid multiple modifications.
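The length-reduction pretest described above can be sketched numerically. Table 5 points to CFA for identifying removable items; as a lighter-weight stand-in, the sketch below (all data hypothetical) flags items whose removal would raise Cronbach's alpha, which a researcher would then weigh against the 3-items-per-dimension floor:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def flag_removal_candidates(items: np.ndarray) -> list[int]:
    """Indices of items whose removal would raise alpha (candidates to
    drop, subject to keeping at least 3 items per dimension)."""
    base = cronbach_alpha(items)
    flags = []
    for j in range(items.shape[1]):
        reduced = np.delete(items, j, axis=1)
        if cronbach_alpha(reduced) > base:
            flags.append(j)
    return flags

# Hypothetical pretest: n = 60 respondents, 5 items; the last item is
# pure noise, so dropping it should improve internal consistency.
rng = np.random.default_rng(7)
trait = rng.normal(size=(60, 1))
items = np.hstack([trait + 0.5 * rng.normal(size=(60, 1)) for _ in range(4)]
                  + [rng.normal(size=(60, 1))])
print(flag_removal_candidates(items))
```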
How can we prevent faulty measurement? Step 3: Validate a Scale (necessary for any modifications or usage of scales not previously validated)
Step 3: Validating Modified/Improvised Scales

Validation of Scales*

1) Face Validity: Do the scale items align with intended deployment?
- Conduct a formal fit assessment: provide 2-3 experts (e.g., trained academics, field experts) or a panel of at least n = 50 respondents (if appropriate) with your construct definition and domain to evaluate the scale items.
- Randomly present all items from the proposed scale of deployment on a seven-point scale ranging from "very bad fit" (-3) to "very good fit" (+3). Acceptable fit for an item occurs when its mean is greater than 0.
- For poor face validity results, revisit the modification guidelines in Table 4, then reassess face validity.

2) Internal Reliability: Do the scale items hold together?
- Pretest (at least n = 50): examine the internal consistency of the entire measure with Cronbach's alpha (or potentially omega; Hayes & Coutts, 2020). Results should be .70 or greater.
- Conduct confirmatory factor analysis (CFA). Item loadings should be greater than .70.
- For poor internal reliability, revisit the modification guidelines in Table 4, then reassess internal reliability.

3A) Convergent Validity: Does the scale measure the same construct as a validated scale?
- Assessment (at least n = 50), typically conducted in a separate study from internal reliability.
- Run a within-subjects study in which participants respond to both your scale and an established, validated scale (typically the original, validated scale) that measures the same construct. Correlations between the two scales should be significant and high (r = .70 or higher).
- More than one validated scale may be needed depending on the dimensionality of the intended scale of deployment. This may not be necessary if only minor modifications to a validated scale have occurred, and may not be possible with improvised scales; if not possible, assess discriminant validity with at least 2 related, validated scales.
- For poor convergence, revisit the modification guidelines in Table 4 and reassess convergent validity, or choose an alternative scale, returning to Table 3.

3B) Discriminant Validity: Does the scale measure a distinct construct from a validated, related, yet different scale?
- Assessment (at least n = 50); can be run simultaneously with the convergent validity assessment.
- Run a within-subjects study in which participants respond to both your scale and an established, validated scale that measures a related, yet distinct construct. Correlations between the two scales should be low (r below .70).
- More than one validated scale may be needed depending on the dimensionality of the intended scale of deployment. This assessment may not be necessary if only minor modifications to a validated scale have occurred.
- For poor discrimination, revisit the modification guidelines in Table 4 and reassess discriminant validity, or choose an alternative scale, returning to Table 3.

* In addition to modified scales, we recommend these procedures for all as-is usages of improvised scales that have not previously been subjected to a form of this validation.
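The face-validity criterion (per-item mean above zero on the -3 to +3 fit scale) is simple to compute. A toy check with hypothetical ratings from five expert judges on four proposed items:

```python
import numpy as np

# Hypothetical fit ratings: 5 expert judges (rows) rate 4 proposed
# items (columns) on the seven-point scale from Table 5:
# -3 = "very bad fit" ... +3 = "very good fit".
ratings = np.array([
    [ 3,  2,  1, -2],
    [ 2,  3,  2, -1],
    [ 2,  2,  1,  0],
    [ 3,  1,  2, -3],
    [ 1,  2,  3, -1],
])

item_means = ratings.mean(axis=0)
acceptable = item_means > 0   # Table 5 criterion: mean greater than 0
print(item_means)             # the fourth item fails and would be
                              # revisited under Table 4's guidelines
```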
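The convergent and discriminant thresholds come down to plain correlations between participants' scores on the two scales. A sketch with simulated (hypothetical) per-participant scale scores:

```python
import numpy as np

# Simulated within-subjects data (hypothetical): 100 participants'
# scores on the proposed scale, an established scale for the SAME
# construct, and an established scale for a RELATED but DISTINCT one.
rng = np.random.default_rng(3)
construct = rng.normal(size=100)
proposed = construct + 0.3 * rng.normal(size=100)
same_construct = construct + 0.3 * rng.normal(size=100)
related_construct = 0.5 * construct + rng.normal(size=100)

r_convergent = np.corrcoef(proposed, same_construct)[0, 1]
r_discriminant = np.corrcoef(proposed, related_construct)[0, 1]

# Table 5 thresholds: convergent r >= .70, discriminant r < .70
print(f"convergent   r = {r_convergent:.2f}")
print(f"discriminant r = {r_discriminant:.2f}")
```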
How can we prevent faulty measurement? Step 4: Appropriately Report
Step 4: Scale Usage Reporting

Scale Deployment Reporting

Scale Content
- "As Is" Scale Deployment (Validated & Improvised Scales): 1) Provide the name, citation, and number of items of the as-is scale in the manuscript. 2) Report the scale items as utilized in an appendix or supplemental online material. 3) Note the anchors used, reporting measures in the order assessed or noting that the order was randomized.
- Modified Scale Deployment (Validated & Improvised Scales): 1) Provide the name, citation, and number of items of the modified scale in the manuscript. 2) Note the exact modifications to the scale and the reasoning behind them in the body of the manuscript. 3) Report the scale as fully utilized in an appendix, ideally including additional explanation and procedures for the modifications.

Scale Validity and Reliability
- "As Is" Deployment, Validated Scales: 1) Provide reliability results (alpha or omega) in the manuscript.
- "As Is" Deployment, Improvised Scales: 1) Provide face validity results in an appendix or supplemental online material. 2) Provide item loadings and either alpha or omega in the manuscript. 3) Provide convergent and discriminant validity results and details in the manuscript.
- Modified Deployment, Validated Scales: 1) Provide face validity results in an appendix or supplemental online material. 2) Provide item loadings and either alpha or omega in the manuscript. 3) Provide convergent and discriminant validity results in an appendix or supplemental online material.
- Modified Deployment, Improvised Scales: Report in line with modified validated scales (1, 2, & 3 immediately above), but report convergent and discriminant validity results in the manuscript.