THINK BEYOND THE CURBSTONE

TAKING FABRICATION DETECTION AND PREVENTION BEYOND THE INTERVIEWER LEVEL
 
STEVE KOCZELA
12/2/2014, WSS
 
BEYOND THE CURBSTONE?
 
Quality control problems appear to extend beyond the interviewer, but
most published detection methods focus on the interviewer and
curbstoning or interviewer deviations
Computers appear to be in use for creating or duplicating data rather
than collecting it, but most methods focus on human factors
There is an apparent lack of basic quality control processes
and a lack of awareness regarding the warning signs of
problematic data
 
PREVALENCE OF DUPLICATE CASES
 
Duplicates are easy to spot with stats programs, but checking for them
is a routine quality control step that is often missed
There are many examples of these patterns in datasets available
online, sponsored by survey research heavyweights
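
A minimal sketch of the routine duplicate check described above, in plain Python. The field names (case_id, q1, q2, q3) and the toy records are hypothetical; a real survey file would have many more variables and would usually be checked in a stats package.

```python
from collections import defaultdict

def find_duplicate_cases(records, id_field="case_id"):
    """Group case IDs whose answers are identical on every other field."""
    groups = defaultdict(list)
    for rec in records:
        # Build a hashable key from every field except the case ID.
        key = tuple(sorted((k, v) for k, v in rec.items() if k != id_field))
        groups[key].append(rec[id_field])
    # Keep only answer patterns shared by two or more cases.
    return [ids for ids in groups.values() if len(ids) > 1]

# Hypothetical toy data: cases 1 and 3 have identical answers.
records = [
    {"case_id": 1, "q1": 3, "q2": "yes", "q3": 5},
    {"case_id": 2, "q1": 1, "q2": "no", "q3": 2},
    {"case_id": 3, "q1": 3, "q2": "yes", "q3": 5},
]
print(find_duplicate_cases(records))  # [[1, 3]]
```

In practice the comparison would likely exclude administrative fields (dates, interviewer codes) so that only substantive answers are matched.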
 
DUPLICATES AS A SET
 
Duplicated “sets” suggest cut and paste of whole blocks of interviews
 
 
 
 
DUPLICATES AS A SET
 
Here, a set of 13 consecutive cases is duplicated
Suggests someone with access to the data-file is cutting and pasting
whole blocks of interviews
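
One way to look for duplicated sets is to search the file, in its stored order, for runs of consecutive cases that reappear later. The sketch below is an illustration under assumptions: rows are tuples of answers, the minimum run length of 2 is arbitrary, and the pairwise scan is quadratic, so it suits modest file sizes only.

```python
def find_duplicated_blocks(rows, min_len=2):
    """Return (start_a, start_b, length) for runs of consecutive rows
    that are repeated later in the file. Indexes are 0-based."""
    n = len(rows)
    blocks = []
    for a in range(n):
        for b in range(a + 1, n):
            # Skip pairs that merely continue a run that started earlier.
            if a > 0 and rows[a - 1] == rows[b - 1]:
                continue
            length = 0
            # Extend the run while rows match and the two blocks do not overlap.
            while (a + length < b and b + length < n
                   and rows[a + length] == rows[b + length]):
                length += 1
            if length >= min_len:
                blocks.append((a, b, length))
    return blocks

# Hypothetical toy file: rows 0-2 are repeated as rows 4-6.
rows = [(1, 2), (3, 4), (5, 6), (7, 8), (1, 2), (3, 4), (5, 6), (9, 9)]
print(find_duplicated_blocks(rows))  # [(0, 4, 3)]
```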
 
CUT AND PASTE
 
Here, blocks of data are cut and pasted, but not in a way that creates
full duplicates
Arrows show differences between cases, though major
segments of the 2 data blocks are identical
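
Partial cut and paste will not show up in an exact-duplicate check, so a near-duplicate check can help. The sketch below scores each pair of cases by the share of variables on which they agree and flags pairs above a cutoff. The 0.8 threshold, the case IDs, and the answers are all assumptions for illustration; this is not the method used in the slides.

```python
from itertools import combinations

def percent_match(a, b):
    """Share of positions where two equal-length answer tuples agree."""
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / len(a)

def flag_near_duplicates(cases, threshold=0.8):
    """Return (id_a, id_b, share) for case pairs at or above the threshold."""
    flagged = []
    for (id_a, a), (id_b, b) in combinations(cases.items(), 2):
        share = percent_match(a, b)
        if share >= threshold:
            flagged.append((id_a, id_b, share))
    return flagged

# Hypothetical toy data: case 102 differs from case 101 in one position.
cases = {
    101: (1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
    102: (1, 2, 3, 4, 5, 1, 2, 3, 4, 1),
    103: (5, 4, 3, 2, 1, 5, 4, 2, 2, 3),
}
print(flag_near_duplicates(cases))  # [(101, 102, 0.9)]
```

A sensible cutoff depends on the questionnaire: short instruments with few response options produce high match rates by chance.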
 
SUGGESTIONS
 
Require detailed administrative data for each record
including all persons responsible for each line of data
 
Interviewer, keypuncher, supervisor, start/stop time, PSU, back-
check type (if any), manager, region, etc.
 
Any level at which data is collected, processed, or verified
 
Run quality control checks for every level of staff
 
Research sponsors should add their own checks to
supplement contractor-led quality control efforts
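
With administrative data attached to each record, the same check can be run for every level of staff. The sketch below computes, for each administrative field, the share of each person's records whose answers also appear in another record. The field names (interviewer, keypuncher) and the data are hypothetical.

```python
from collections import Counter

def duplicate_share_by_role(records, role_fields, answer_fields):
    """For each administrative field, return the share of each staff
    member's records whose answers are duplicated elsewhere in the file."""
    answers = [tuple(r[f] for f in answer_fields) for r in records]
    counts = Counter(answers)
    report = {}
    for role in role_fields:
        totals = Counter()
        dupes = Counter()
        for rec, ans in zip(records, answers):
            totals[rec[role]] += 1
            if counts[ans] > 1:  # this answer pattern occurs more than once
                dupes[rec[role]] += 1
        report[role] = {who: dupes[who] / totals[who] for who in totals}
    return report

# Hypothetical toy data: the duplicated pair shares a keypuncher, not an interviewer.
records = [
    {"interviewer": "I1", "keypuncher": "K1", "q1": 1, "q2": 2},
    {"interviewer": "I2", "keypuncher": "K1", "q1": 1, "q2": 2},
    {"interviewer": "I1", "keypuncher": "K2", "q1": 3, "q2": 4},
    {"interviewer": "I2", "keypuncher": "K2", "q1": 5, "q2": 6},
]
report = duplicate_share_by_role(
    records, ["interviewer", "keypuncher"], ["q1", "q2"]
)
print(report)
```

In this toy file both interviewers show the same duplicate share, while the duplicates concentrate entirely under one keypuncher, a pattern an interviewer-only check would miss.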
 
 

Quality control problems in data collection extend beyond interviewers, with a focus needed on basic processes for detection and prevention of data manipulation. Duplicate cases and sets, as well as cut-and-paste methods, indicate potential issues that can be addressed with detailed administrative data and thorough quality checks at every level of data handling.

  • Quality Control
  • Data Fabrication
  • Duplicate Cases
  • Administrative Data
  • Data Manipulation

Uploaded on Feb 17, 2025 | 0 Views



PREVALENCE OF DUPLICATE CASES (slide table)

Country     Duplicates             Variables in duplicated data
Country A   777 of 1,182 total     180
Country B   208 of 750 total       180
Country C   230 of 850 cases       78
Country D   1,774 of 2,669 cases   66

The "Case" column lists Survey 1, Survey 2, and Survey 3; the flattened text does not show which rows each survey covers.
