Pitfalls of Multiple Modes in Questionnaire Design by Dr. Pamela Campanelli

The research discusses challenges in questionnaire design when using multiple modes, such as sensitivity to question content and mode effects. Recommendations are provided to address issues related to sensitive questions, factual versus subjective inquiries, and positivity bias in Telephone (TEL) and Face-to-Face (F2F) interviews.

Uploaded on Oct 05, 2024

The Questionnaire Design Pitfalls of Multiple Modes
Dr Pamela Campanelli, Survey Methods Consultant, Chartered Statistician, Chartered Scientist

Acknowledgements
Other main members of the UK Mixed Modes and Measurement Error grant team: Gerry Nicolaas (Ipsos MORI), Peter Lynn (University of Essex), Annette Jäckle (University of Essex), Steven Hope (University College London).
Grant funding from: UK Economic and Social Research Council (Award RES-175-25-0007).

Larger Project
Looked for evidence of mode differences by: question content; question format; type of task; characteristics of the task; implementation of the task.
Made recommendations.
Today: a few highlights.

Sensitive Questions
Mixed Mode Context: It is very well known that sensitive questions are prone to social desirability effects in interviewer modes (see Tourangeau and Yan, 2007; Kreuter, Presser and Tourangeau, 2008). But not all questions (Fowler, Roman, and Di, 1998). Difference by time frame (Fowler, Roman, and Di, 1998).
Modes to Use: SA (self-administered).
Recommendations: If the mixed mode design includes F2F interviews, ask sensitive questions in a paper SA form or use CASI. If it includes TEL interviews, pre-test sensitive questions across the modes that will be used to see if there are differences.

Non-Sensitive: Factual Versus Subjective (1)
Mixed Mode Context: Subjective questions are more prone to mode effects than factual questions (see Lozar Manfreda and Vehovar, 2002; Schonlau et al, 2003). But factual questions are also susceptible (Campanelli, 2010). Subjective scalar questions can be prone to TEL positivity bias.

TEL (and F2F) Positivity Bias
Dillman et al (2009): aural versus visual effect; TEL Rs giving more extreme positive answers.
Ye et al (2011): TEL Rs giving more extreme positive answers, but found that F2F was like TEL. Concluded it was caused by a MUM effect.
Hope et al (2011): TEL Rs giving more extreme positive answers, but no trace of this in F2F (with a showcard and without a showcard).
Thus, the actual cause of the TEL positivity bias is still unclear.

Non-Sensitive: Factual Versus Subjective (2)
Modes to Use: F2F, TEL?, SA.
Recommendations:
Factual questions: Use Dillman's uni-mode principles and test to see if there are differences across modes.
Subjective scalar questions: Avoid TEL, if possible, due to TEL positivity bias. Test F2F to see if positivity bias is present.

Inherently Difficult Questions (1)
General Questionnaire Design Context: Inherent difficulty means the question is difficult due to conceptual, comprehension and/or recall issues. Survey satisficing should be greater for inherently difficult questions (Krosnick, 1991). But this is not true for all inherently difficult questions (Hunt et al, 1982; Sangster and Fox, 2000; Nicolaas et al, 2011).

Inherently Difficult Questions (2)
EXAMPLE (Nicolaas et al, 2011):
N56y. What are the things that you like about your neighbourhood? Do you like your neighbourhood because of its community spirit?
  Yes  1
  No   2
N57y. Do you like your neighbourhood because it feels safe?
  Yes  1
  No   2
Etc.

Inherently Difficult Questions (3)
Modes to Use: F2F, TEL?, SA?
Recommendations:
General Questionnaire Design Context: In general, before use, test questions that are inherently difficult to see how feasible the questions are for Rs. (Testing can be done with cognitive interviewing or Belson's (1981) respondent debriefing method.)
Mixed Modes Context: In a mixed mode design, pre-test questions with inherent difficulty across the modes that will be used to see if there are differences.

Mark All That Apply vs. Yes/No for Each (1)
Mark all that apply:
This card shows a number of different ways for reducing poverty. In your opinion, which of the following would be effective in reducing poverty? MARK ALL THAT APPLY.
  1  Increasing pensions
  2  Investing in education for children
  3  Improving access to childcare
  4  Redistribution of wealth
  5  Increasing trade union rights
  6  Reducing discrimination
  7  Increasing income support
  8  Investing in job creation
  9  None of these
Yes/No for each:
Next are a number of questions about different ways for reducing poverty. In your opinion, which of the following would be effective?
Would increasing pensions reduce poverty?
  Yes  1
  No   2
Would investing in education for children reduce poverty?
  Yes  1
  No   2
Etc.

Mark All That Apply vs. Yes/No for Each (2)
General Questionnaire Design Context: Mark all that apply is problematic.
Sudman and Bradburn (1982); Rasinski et al (1994), Smyth et al (2006) and Thomas and Klein (2006); Thomas and Klein (2006); Smyth et al (2006); Nicolaas et al (2011).

Mark All That Apply vs. Yes/No for Each (3)
Mixed Mode Context:
Smyth et al (2008): student sample.
Nicolaas et al (2011): probability sample of the adult population.
More research needed.

Mark All That Apply vs. Yes/No for Each (4)
Mark all that apply
Modes to Use: F2F?, SA?
Recommendations: The mark all that apply format is prone to lower reporting of items, quicker processing time and primacy effects. Therefore it is probably best to avoid. However, it may be less likely to show mode effects in a mixed mode design (F2F with showcard versus SA).
Yes/No for each
Modes to Use: F2F, TEL, SA.
Recommendations: The Yes/No for each format is strongly supported as superior to mark all that apply by Smyth et al (2006, 2008). But it can add to the time taken to complete a questionnaire. Long lists of items should be avoided to reduce potential R burden. The results from Nicolaas et al (2011) suggest that the format should be tested across modes before use.

Ranking versus Rating (1)
Ranking:
What would you consider most important in improving the quality of your neighbourhood? Please rank the following 7 items from 1 (meaning most important) to 7 (meaning least important).
  Less traffic
  Less crime
  More / better shops
  Better schools
  More / better facilities for leisure activities
  Better transport links
  More parking spaces
Battery of rating questions:
Next are a number of questions about improving your neighbourhood. How important would less traffic be for improving the quality of your neighbourhood?
  Very important               1
  Moderately important         2
  Somewhat important           3
  Or not important at all?     4
Etc.

Ranking versus Rating (2)
General Questionnaire Design Context: Ranking is difficult (Fowler, 1995). Primacy effects (see Stern, Dillman & Smyth, 2007). Better quality (see Alwin and Krosnick, 1985; Krosnick, 1999; Krosnick, 2000).

Ranking versus Rating (3)
Mixed Modes Context: Rating is more susceptible to non-differentiation in Web than TEL (Fricker et al, 2005). Similarly, rating is sometimes more susceptible to non-differentiation in Web or TEL than F2F (Hope et al, 2011). Ranking is more susceptible to non-differentiation in Web than F2F (TEL not tested) (Hope et al, 2011).
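
A minimal sketch of how non-differentiation in a rating battery might be compared across modes. It assumes one common operationalisation (the share of Rs who give the identical answer to every item in the battery); the slides do not say which measure the cited studies used. The function name and all data are hypothetical.

```python
def non_differentiation_rate(responses):
    """Share of respondents who gave the same answer to every item in a battery."""
    if not responses:
        return 0.0
    straightliners = sum(1 for answers in responses if len(set(answers)) == 1)
    return straightliners / len(responses)


# Hypothetical data: each inner list is one R's answers to a 4-item rating battery.
battery_by_mode = {
    "F2F": [[1, 2, 2, 4], [3, 3, 3, 3], [2, 1, 4, 3], [1, 3, 2, 2]],
    "TEL": [[2, 2, 2, 2], [1, 2, 3, 4], [4, 4, 4, 4], [3, 2, 3, 1]],
    "Web": [[2, 2, 2, 2], [3, 3, 3, 3], [1, 1, 1, 1], [4, 2, 3, 1]],
}

rates = {mode: non_differentiation_rate(rs) for mode, rs in battery_by_mode.items()}
for mode, rate in rates.items():
    print(f"{mode}: {rate:.2f}")
```

With real data, differences in these rates would still need testing, since sample composition can differ between modes.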

Ranking versus Rating (4)
Ranking
Modes to Use: F2F, NOT TEL, SA?
Recommendations: Avoid use of ranking in mixed mode studies. Ranking could be considered for F2F surveys if the list is short. Ranking is not feasible for TEL surveys (unless 4 categories or fewer). Ranking is often problematic in SA modes. Ranking with programme controls in Web may irritate or confuse some Rs.
Rating
Modes to Use: F2F, TEL?, SA?
Recommendations: Avoid long sequences of questions using the same rating scale in mixed mode designs that include Web and possibly TEL. Could try a rating task followed by ranking of the duplicates (except in postal, where skip patterns would be too difficult).

Agree/Disagree Questions
General Questionnaire Design Context: Agree/Disagree questions are a problematic format in all modes.
This neighbourhood is not a bad place to live.
  Strongly agree                 1
  Agree                          2
  Neither agree nor disagree     3
  Disagree                       4
  Or strongly disagree?          5
They create a cognitively complex task and are susceptible to acquiescence bias. For additional problems see Fowler (1995), Converse and Presser (1986), Saris et al (2010) and the recent Holbrook AAPOR Webinar.
Mixed Modes Context: Differences across modes were found, with more acquiescence bias in the interview modes and, curiously, more middle category selection in SA (Hope et al, 2011).
Modes to Use: Should not be used in any mode.
Recommendations: Avoid use of agree-disagree scales and use alternative formats, such as questions with item specific (IS) response options.

Use of Middle Category (1)
And how satisfied or dissatisfied are you with street cleaning?
  Very satisfied                         1
  Moderately satisfied                   2
  Slightly satisfied                     3
  Neither satisfied nor dissatisfied     4
  Slightly dissatisfied                  5
  Moderately dissatisfied                6
  Very dissatisfied                      7

Use of Middle Category (2)
General Questionnaire Design Context:
Kalton et al (1980); Krosnick (1991) and Krosnick and Fabrigar (1997); Schuman and Presser (1981); Krosnick and Presser (2010); Krosnick and Fabrigar (1997); O'Muircheartaigh, Krosnick and Helic (1999); Hope et al (2011).

Use of Middle Category (3)
Mixed Modes Context: More use of the middle category in visual (as opposed to aural) mode (Tarnai and Dillman, 1992). More selection of middle categories on end-labelled scales than fully labelled scales, but less so for TEL (Hope et al, 2011). More use of the middle category in Web as opposed to F2F or TEL (Hope et al, 2011).

Use of Middle Category (4)
Modes to Use: F2F, TEL?, SA?
Recommendations: Probably best not to use middle categories in a mixed modes study with SA. If the mixed mode design includes TEL interviews, be cautious of the use of end-labelled scales.

Overall Typology of Questions

A classification of question characteristics relevant to measurement error
Question content
  Topic: behaviour, other factual, attitude, satisfaction, other subjective
  Sensitivity
  Inherent difficulty: conceptual, comprehension, recall
Question format
  Type of task
    Closed: nominal; ordinal; ratio/interval; visual analogue scale; yes/no; mark all; ranking; agree/disagree; rating-unipolar; rating-bipolar; numeric bands; battery of rating scales
    Open: number; date; short textual/verbal; unconstrained textual/verbal
  Characteristics of the task: number of categories; middle categories; full/end labels; branching
Implementation of question
  Use of instructions, probes, clarification, etc.
  Edit checks
  DK/refused explicit or implicit
  Formatting of response boxes
  Labelling of response boxes
  Size of answer box/text field
  Delineation of answer space
  Formatting of response lists
  Showcards

In Summary 1) Mode is a characteristic of a question 2) Good questionnaire design is key to minimising many measurement differences 3) But we are unlikely to eliminate all differences as there are different types of satisficing in different modes 4) We need to do more to assess any remaining differences and find ways to adjust for these (more on this in the next few slides)

Assessing Mixed Mode Measurement Error (1)
Quality indicators, for example: mean item nonresponse rate; mean length of responses to an open question; mean number of responses in mark all that apply; psychometric scaling properties; comparison of survey estimates to a gold standard (de Leeuw, 2005; Kreuter et al, 2008; Voogt and Saris, 2005), although validation data are often hard or impossible to obtain; etc.
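
As a rough illustration, the first three quality indicators could be computed per mode along the following lines. The record layout, field names and data are all hypothetical, and word count is only one possible way to measure the length of an open response.

```python
from statistics import mean

# Hypothetical records: None marks item nonresponse on a closed item.
records = [
    {"mode": "F2F", "items": [1, 3, None, 2],
     "open_text": "Friendly neighbours and good parks", "marked": [1, 4, 6]},
    {"mode": "F2F", "items": [2, 2, 4, 1],
     "open_text": "Quiet streets", "marked": [2, 3]},
    {"mode": "Web", "items": [None, 3, None, 2],
     "open_text": "Shops", "marked": [1]},
    {"mode": "Web", "items": [1, 1, 1, 1],
     "open_text": "", "marked": [2, 5]},
]


def quality_indicators(rows):
    """Compute three simple data quality indicators for one group of Rs."""
    return {
        # Each R's share of unanswered items, averaged over Rs.
        "item_nonresponse": mean(
            sum(v is None for v in r["items"]) / len(r["items"]) for r in rows
        ),
        # Mean number of words given in the open question.
        "open_length": mean(len(r["open_text"].split()) for r in rows),
        # Mean number of options ticked in the mark-all-that-apply question.
        "n_marked": mean(len(r["marked"]) for r in rows),
    }


by_mode = {
    mode: quality_indicators([r for r in records if r["mode"] == mode])
    for mode in sorted({r["mode"] for r in records})
}
for mode, indicators in by_mode.items():
    print(mode, indicators)
```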

Assessing Mixed Mode Measurement Error (2)
How was the mixed mode data collected? What are the confounding factors or limitations?
Random assignment: Rs randomly assigned to mode (Nicolaas et al, 2011), but this is not always possible; or a random group changes mode during the interview (Heerwegh, 2009). In both cases non-compatibility can occur due to differential nonresponse bias.
R chooses mode of data collection: May reduce nonresponse, but selection and measurement error effects are confounded (Vannieuwenhuyze et al, 2010).
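
A small sketch of random assignment of Rs to mode, assuming a simple shuffle-and-deal scheme; the cited studies may well have used a different procedure. The function name, mode labels and seed are hypothetical.

```python
import random


def assign_modes(respondent_ids, modes=("F2F", "TEL", "Web"), seed=2011):
    """Randomly assign each respondent to one mode, keeping group sizes balanced."""
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    ids = list(respondent_ids)
    rng.shuffle(ids)
    # Deal the shuffled ids round-robin, so group sizes differ by at most one.
    return {rid: modes[i % len(modes)] for i, rid in enumerate(ids)}


assignment = assign_modes(range(1, 11))
group_sizes = {
    mode: sum(1 for m in assignment.values() if m == mode)
    for mode in ("F2F", "TEL", "Web")
}
print(group_sizes)
```

Random assignment only balances the groups at allocation; as the slide notes, differential nonresponse after assignment can still make the groups differ.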

Assessing Mixed Mode Measurement Error (3)
Ways to separate sample composition from mode effects:
Compare mixed mode data to that of a comparable single-mode survey (Vannieuwenhuyze et al, 2010).
Statistical modelling: weighting (Lee, 2006); multivariate model (Dillman et al, 2009); latent variable models (Biemer, 2001).
Propensity score matching (Lugtig et al, 2011): matching Rs from two survey modes which share the same background characteristics; identify Rs who are unique to a specific survey mode and those who are found in both modes. May be a useful technique.
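
The matching idea can be sketched with exact matching on a few categorical background characteristics. This is a much simpler stand-in for propensity score matching, not the method of Lugtig et al (2011); it only illustrates separating Rs whose profile is found in both modes from those unique to one mode. All variable names and data are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical Rs: (mode, age group, education, answer on a 1-5 rating scale).
respondents = [
    ("TEL", "18-34", "degree", 5),
    ("TEL", "35-64", "no degree", 4),
    ("TEL", "65+", "no degree", 5),
    ("Web", "18-34", "degree", 3),
    ("Web", "18-34", "degree", 4),
    ("Web", "35-64", "no degree", 4),
    ("Web", "35-64", "degree", 2),
]


def split_by_profile(rows):
    """Group ratings by background profile, then by mode within each profile."""
    profiles = defaultdict(lambda: defaultdict(list))
    for mode, age, education, rating in rows:
        profiles[(age, education)][mode].append(rating)
    return profiles


profiles = split_by_profile(respondents)

# Profiles found in both modes can be compared; the rest are unique to one mode.
shared = {p: m for p, m in profiles.items() if len(m) == 2}
unique = {p: next(iter(m)) for p, m in profiles.items() if len(m) == 1}

# TEL minus Web mean rating within each shared profile, averaged over profiles.
diffs = [mean(m["TEL"]) - mean(m["Web"]) for m in shared.values()]
matched_diff = mean(diffs)

print("Profiles unique to one mode:", unique)
print("Matched TEL - Web difference:", matched_diff)
```

Because the comparison is made within profiles, any remaining difference is less likely to reflect the measured background characteristics, although unmeasured selection effects can still be present.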

Assessing Mixed Mode Measurement Error (4)
The size of effects between modes depends on the type of analyses, which depends on the type of reporting needed. For example, reporting of: means; percentages for extreme categories; percentages for all categories.
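
To illustrate why the apparent size of a mode effect depends on what is reported, the sketch below uses invented answers on a 5-point scale, constructed so that the two modes have the same mean but different shares in the extreme positive category. The data and names are hypothetical and are not results from the project.

```python
from statistics import mean

# Hypothetical answers on a 5-point satisfaction scale (5 = most positive).
answers = {
    "TEL": [5, 5, 5, 4, 3, 3, 2, 1],
    "Web": [5, 4, 4, 4, 4, 3, 2, 2],
}


def pct(values, category):
    """Percentage of answers falling in one response category."""
    return 100 * sum(v == category for v in values) / len(values)


# Reporting of means: TEL minus Web.
mean_diff = mean(answers["TEL"]) - mean(answers["Web"])

# Reporting of the percentage in the extreme positive category.
top_diff = pct(answers["TEL"], 5) - pct(answers["Web"], 5)

# Reporting of percentages for all categories.
all_diffs = {c: pct(answers["TEL"], c) - pct(answers["Web"], c) for c in range(1, 6)}

print("Difference in means:", mean_diff)
print("Difference in % choosing 5:", top_diff)
print("Differences by category:", all_diffs)
```

In this constructed case a report based on means would show no mode difference, while a report based on the top category would show a 25 percentage point gap.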

We hope that today's talk has given you...
More understanding of the theoretical and practical differences in how Rs react to different modes of data collection.
More awareness of specific question attributes that make certain questions less portable across modes.
More knowledge and confidence in executing your own mixed modes questionnaires.

Thank you all for listening
dr.pamela.campanelli@thesurveycoach.com
Complete table of results and recommendations available upon request.

Appendix

Open Questions (1)
Option 1: Unconstrained textual/verbal open questions (i.e., fully open questions)
General Questionnaire Design Context - SA: Lines in text boxes versus an open box: Christian and Dillman (2004), but Ciochetto et al (2006). Slightly larger answer spaces (Christian and Dillman, 2004).

Open Questions (2)
Option 1: Fully open questions (continued)
Mixed Mode Context: TEL Rs give less detailed answers to open-ended questions than F2F Rs (Groves and Kahn, 1979; Sykes & Collins, 1988; de Leeuw and van der Zouwen, 1988). Paper SA Rs give less complete answers to open-ended questions than F2F or TEL Rs (Dillman, 2007; de Leeuw, 1992; Groves and Kahn, 1979). Web Rs provide 30 more words on average than paper SA Rs (Schaeffer and Dillman, 1998). Positive effects of larger answer spaces may also apply to interview surveys (Smith, 1993; 1995).

Open Questions (3)
Option 1: Fully open questions (continued)
Modes to Use: F2F, TEL, SA?
Recommendations: If the mixed mode design includes SA, minimise the use of open questions (as less complete answers are obtained). Pre-test the SA visual layout 1) to ensure that the question is understood as intended and 2) to check if there are differences across modes.

Open Questions (4)
Option 2: Open question requiring a number, date, or short textual/verbal response
General Questionnaire Design Context - SA: Small changes in visual design can have a large impact on measurement. Examples: Couper, Traugott and Lamias (2001); Smith (1993; 1995); Dillman et al (2004); Martin et al (2007).

Open Questions (5)
Option 2: Short number, date or textual/verbal response (continued)
Mixed Modes Context
Modes to Use: F2F, TEL, SA?
Recommendations: Test the SA visual layout 1) to ensure that the question is understood as intended and 2) to check if there are differences across modes.

End-labelled versus Fully-labelled (1)
On the whole, how satisfied are you with the present state of the economy in Great Britain, where 1 is very satisfied and 7 is very dissatisfied?
General Questionnaire Design Context: Krosnick and Fabrigar (1997) suggest that fully-labelled scales are easier to answer and more reliable and valid. The two formats are not equivalent. Fully-labelled scales produce more positive responses (Dillman and Christian, 2005; Campanelli et al, 2012). End-labelled scales have a higher percentage of Rs in the middle category (Campanelli et al, 2012; not discussed in the text but in the tables of Dillman and Christian, 2005).

End-labelled versus Fully-labelled (2)
Mixed Modes Context: Although there is higher endorsement of middle categories on end-labelled scales, this is less true for TEL Rs (Campanelli et al, 2012).
Modes to Use: F2F, TEL?, SA.
Recommendations: Be careful of the use of end-labelled scales, as these are more difficult for Rs. If the mixed mode design includes TEL interviews, be cautious of the use of end-labelled scales.

Branching versus No Branching (1)
In the last 12 months would you say your health has been good or not good?
  Good      1
  Not good  2
IF GOOD: Would you say your health has been fairly good or very good?
  Fairly good  1
  Very good    2
IF NOT GOOD: Would you say your health has been not very good or not good at all?
  Not very good    1
  Not good at all  2
General Questionnaire Design Context: In TEL surveys, ordinal scales are often changed into a sequence of two or more branching questions in order to reduce the cognitive burden.
Krosnick and Berent (1993); Malhotra et al (2009); Hunter (2005); Nicolaas et al (2011).

Branching versus No Branching (2)
Mixed Modes Context: Nicolaas et al (2000) found more extreme responses to attitude questions in the branched format in TEL mode (but it is unclear whether these are more valid). Nicolaas et al (2011) found mode differences between F2F, TEL and Web, but with no clear patterns, and no mode difference for the non-branching format. More research needed.

Branching versus No Branching (3)
Modes to Use: F2F, TEL, SA.
Recommendations: As branching may improve reliability and validity, if used, it should be used across all modes. But testing is recommended to see if mode differences are present.
Due to R non-compliance with skip patterns in paper SA [1], Dillman (2007) recommends avoidance of branching questions in mixed mode surveys that include a postal component. Instead, reduce the number of categories so that branching is not required.
[1] Dillman (2007) shows that the skips after a filter question can be missed by a fifth of postal survey Rs.

Implementation of task

Use of instructions, probes, clarifications, etc. (1)
Can I check, is English your first or main language?
INTERVIEWER: If 'yes', probe - 'Is English the only language you speak or do you speak any other languages, apart from languages you may be learning at school as part of your studies?'
  Yes - English only                                           1
  Yes - English first/main and speaks other languages          2
  No, another language is respondent's first or main language  3
  Respondent is bilingual                                      4

Use of instructions, probes, clarifications, etc. (2)
It is common practice to provide interviewers with additional information that can be used if necessary to improve the quality of information from Rs. Although not yet studied in mixed modes, it is likely that this may result in differences across modes in a study that uses SA alongside interviewer modes.

Use of instructions, probes, clarifications, etc. (3)
Modes to Use: F2F, TEL, SA.
Recommendations: Where possible, all instructions and clarifications should be added to the question for all modes (rather than being left to the discretion of the interviewer) or excluded from all modes. Dillman (2007) recommends that interviewer instructions be evaluated for unintended response effects and their use for SA modes considered.

Don't Know (1)
What, if any, is your religion?
  None              1
  Christian         2
  Buddhist          3
  Hindu             4
  Jewish            5
  Muslim            6
  Sikh              7
  Another religion  8
  (Spontaneous only)
  (Don't know       98)
  (Refused          99)
General Questionnaire Design Context: Offering an explicit don't know response greatly increases cases in this category. This is particularly true for Rs with lower educational attainment (see Schuman and Presser, 1981; Krosnick et al, 2002). It is common practice not to provide an explicit don't know in TEL and F2F. In SA modes, the don't know option tends to be either an explicit response option or it is omitted altogether.

Don't Know (2)
Mixed Mode Context: Treating don't know differently in different modes may result in different rates of don't know across the modes.
Fricker et al (2005); Dennis and Li (2007); Bishop et al (1980); Vis-Visschers (2009).

Don't Know (3)
Modes to Use: F2F, TEL, SA.
Recommendations: Spontaneous don't know can be offered in mixed mode designs that include only interviewer-administered modes (i.e., TEL & F2F). For mixed mode designs that include both interviewer-administered and SA modes, it is generally recommended not to allow don't know as a response option. Further research is required to compare spontaneous don't know in TEL and F2F with alternative methods of dealing with don't know in Web questionnaires (e.g. allowing questions to be skipped without further prompting). For questions where it is likely that many Rs may not know the answer, explicit don't knows should be used across all modes.
