Jumping into Statistics: Study Design & Statistical Analysis in Medical Research

Jumping into Statistics:
Introduction to Study Design and
Statistical Analysis for Medical
Research Using JMP Pro
Statistical Software
WINTER/SPRING 2021
DR. CYNDI GARVAN & DR. TERRIE VASILOPOULOS
Meet the Instructors
CYNTHIA GARVAN, MA, PHD
Research Professor in Anesthesiology
TERRIE VASILOPOULOS, PHD
Research Assistant Professor in Anesthesiology and Orthopaedics and Rehabilitation
Course Objectives
• Review fundamentals of study design and research methodology
• Understand how to choose the best statistical test for your research question
• Practice basic statistical analysis using JMP Pro Software
Course Topics
• Asking a Good Research Question
• Life Cycle of Research and the Scientific Method
• Study Design
• Data Types and Database Construction
• Descriptive Statistics
• Data Visualization
• Population and Sample, Probability, Statistical Inference
• How to Choose the Correct Statistical Method and Run Some Analyses
    T-tests, ANOVA, Non-Parametric
    Chi-square, odds ratio, relative risk
    Regression and Correlation
    Survival Analysis
    Test Diagnostics (e.g., sensitivity, specificity)
• Comparing Statistical Modeling and Machine Learning
Data, Data, Data
Data Types
Databases
Data Dictionary
2/24/2021
Why is this topic important?
The fundamental unit of statistics and statistical analyses is the variable.
The variable type (or data type) determines:
• how the variable can be described (summary statistics)
• how the variable can be analyzed (variable-appropriate analytical methods)
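As a rough illustration of how the variable type drives the summary statistics, the sketch below reports counts and proportions for a categorical variable and the mean and standard deviation for a numerical one. The function name `describe` and the sample values are hypothetical and are not taken from the course materials.

```python
from collections import Counter
from statistics import mean, stdev


def describe(values, variable_type):
    """Return summary statistics suited to the stated variable type."""
    if variable_type == "categorical":
        # Categorical variables are summarized by counts and proportions.
        counts = Counter(values)
        total = len(values)
        return {level: (n, n / total) for level, n in counts.items()}
    if variable_type == "numerical":
        # Numerical variables are summarized by center and spread.
        return {"n": len(values), "mean": mean(values), "sd": stdev(values)}
    raise ValueError(f"unknown variable type: {variable_type}")


# Hypothetical data, for illustration only.
surgery_type = ["cardiac", "ortho", "cardiac", "general"]
heart_rate = [72, 80, 64, 76]

surgery_summary = describe(surgery_type, "categorical")
heart_rate_summary = describe(heart_rate, "numerical")
```

Calling `describe` on a categorical variable with `"numerical"` would fail or give meaningless numbers, which is the point: the type has to be known before the variable is summarized.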
Why is this topic important?
Developing a thorough understanding of variable types will improve skills in:
• troubleshooting data
• statistical analysis
• identification of analytic pitfalls
• presentation and interpretation of results
• critical assessment of published results
Data is the foundation of statistical analysis.
BUT, we need to trust the data!
What is needed to establish trust in our data:
1) Understand data types
2) Understand databases
3) Understand how to document data in a data dictionary
4) Look at our data
Data Types
Four Classification Systems
There are four ways to classify data type:
I. Classify data as Qualitative or Quantitative
II. Classify data as Categorical or Numerical
III. Classify data as Discrete or Continuous
IV. Classify data as Nominal, Ordinal, Interval, or Ratio
I: Qualitative or Quantitative
Qualitative: A scale of measurement whose levels are a set of categories that vary in some quality but not in magnitude.
Quantitative: A scale of measurement whose levels are a set of categories that vary in magnitude.
II: Categorical or Numerical
Categorical: A scale of measurement where levels are a set of categories.
Numerical: A scale of measurement where levels are a set of meaningful numbers such as integers or decimals.
III: Discrete or Continuous
Discrete: A variable that can take only selected values.
Continuous: A numerical variable whose levels include (conceptually) all values between any two levels.
Discrete data are counted.
Continuous data are measured.
IV: Nominal, Ordinal, Interval, Ratio
Nominal: A scale of measurement where levels are distinct but do not vary in magnitude.
Ordinal: A scale of measurement where levels vary in order of magnitude but equal intervals between levels cannot be assumed.
Interval: The interval level of measurement has the characteristics of distinct levels, ordering in magnitude, and equal intervals.
Ratio: The ratio level of measurement has characteristics of distinct levels, ordering in magnitude, equal intervals, and an absolute zero. A measurement has an absolute zero when a measurement of zero represents the absence of the property being measured.
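The four levels of measurement build on one another, each adding one characteristic to the level before it. The sketch below encodes that cumulative structure so it can be checked directly against the definitions above. The names `Level` and `properties` are made up for this illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    """Levels of measurement, ordered from fewest to most characteristics."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def properties(level):
    """List which characteristics a level of measurement has."""
    return {
        "distinct_levels": True,                      # all four levels
        "ordered": level >= Level.ORDINAL,            # ordinal and above
        "equal_intervals": level >= Level.INTERVAL,   # interval and above
        "absolute_zero": level >= Level.RATIO,        # ratio only
    }


ordinal_properties = properties(Level.ORDINAL)
ratio_properties = properties(Level.RATIO)
```

For example, `properties(Level.ORDINAL)` reports ordering but not equal intervals, which matches the definition of an ordinal scale.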
EXAMPLES
Database Construction
1. Use top row for variable names
2. Use consistent codes for variable values
3. There are two types of files:
Wide
Long
4. It is important to have good naming
conventions for your study variables
BAD EXAMPLE
GOOD EXAMPLE
Stats software likes good variable names
Lists are helpful (e.g., Q1 – Q20)
Short but meaningful names are very helpful
Variable names should be spelled correctly and used consistently
Stats software has rules for naming variables. For example, the rules for SAS and JMP are:
1. All variable names must start with a letter or an underscore (_).
2. Names can contain only letters, numerals, and the underscore. No %$!*&#@.
3. All variable and data set names must be thirty-two (32) or fewer characters in length.
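The three naming rules can be checked automatically before data are loaded into stats software. The sketch below is one way to do that with a regular expression. It encodes only the rules as stated above, and the function name `is_valid_name` is hypothetical.

```python
import re

# Rule 1: first character is a letter or underscore.
# Rule 2: remaining characters are letters, numerals, or underscores.
# Rule 3: at most 32 characters in total (1 + up to 31 more).
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")


def is_valid_name(name):
    """Return True when the name satisfies all three rules."""
    return NAME_PATTERN.fullmatch(name) is not None


# Candidate names to screen before building the database.
candidates = ["level34", "_age", "gender of patient", "# ID", "1st_visit"]
valid_names = [name for name in candidates if is_valid_name(name)]
```

Names with spaces, a leading numeral, or special characters are rejected, so only `level34` and `_age` pass here.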
Practice good ID hygiene on paper forms:
Labels
ID numbers on each page
Use the same ID at different time points
Don’t let participants make up their own ID
Example Wide Data
Example Long Data
Data Dictionary
Example from WHI: https://www.whi.org/dataset/26
Data Management Plan
From Wikipedia, the free encyclopedia
A data management plan or DMP is a formal document that outlines how data are to be handled both during a research project, and after the project is completed. The goal of a data management plan is to consider the many aspects of data management, metadata generation, data preservation, and analysis before the project begins; this ensures that data are well-managed in the present, and prepared for preservation in the future.
Resources for Writing a Data Management Plan
https://libraries.mit.edu/data-management/plan/write/
https://library.stanford.edu/research/data-management-services/data-management-plans
http://www.lib.ncsu.edu/data-management/dmp_examples
Best Practices in Data Management
Thursday, March 11, 12-1pm
Learn practical strategies for best managing your research data. Several U.S.
funding agencies such as the National Science Foundation and the National
Institutes of Health require researchers to supply plans for managing
research data, called Data Management Plans (DMP), for all new grant
proposals. This workshop will provide an overview of the questions to
consider when creating a data management plan, with a focus on the
DMPTool and tools for sharing your data at the University of Florida (e.g.,
subject-specific repositories). Topics include metadata and annotation, file
formats and organization, storage, backups and security, and data sharing.
The workshop is geared toward graduate students, faculty, and researchers.
 
January: https://ufl.libcal.com/event/7374027
February: https://ufl.zoom.us/j/92336146853
March: https://ufl.zoom.us/j/94686576308
Questions?
Summary Tips
• Learn as much as you can about data types
• Make a data management plan before starting your study!
• Consult with a statistician when constructing a database
JMP Pro!
https://software.ufl.edu/

Explore the fundamentals of study design & research methodology, learn to select appropriate statistical tests, and practice statistical analysis using JMP Pro Software. Topics include research question formulation, statistical methods, regression, survival analysis, data visualization, and more. Understand the importance of variable types in statistical analysis and how they impact data interpretation.

  • Statistics
  • Study Design
  • Medical Research
  • Statistical Analysis
  • Data Analysis

Uploaded on Sep 15, 2024 | 0 Views



Presentation Transcript



  16. EXAMPLES

      Variable | I: Qualitative or Quantitative | II: Categorical or Numerical | III: Discrete or Continuous | IV: Nominal, Ordinal, Interval, Ratio
      Heart rate (bpm) | Quantitative | Numerical | Continuous | Ratio
      History of MI | Qualitative | Categorical | Discrete | Nominal
      ASA classification | Quantitative | Categorical | Discrete | Ordinal
      Number of pRBCs given in surgery | Quantitative | Numerical | Discrete | Ratio
      Modified Fatigue Impact Scale (MFIS) | Quantitative | Numerical | Continuous | Interval
      Cancer stage | Quantitative | Categorical | Discrete | Ordinal
      Surgery type | Qualitative | Categorical | Discrete | Nominal
      Pain reported on Visual Analog Scale | Quantitative | Categorical | Discrete | Ordinal


  18. BAD EXAMPLE

      # ID | gender of patient | age | f location
      34 |  | 34 | 45
      42 | male | 33 | 56,67
      12 | F | 23 | 34,56
      102 | M | 26 | 34,45,67
      86 | woman | 47 | 56

  19. GOOD EXAMPLE

      ID | gender | age | level34 | level45 | level56 | level67
      34 | 1 | 34 | 0 | 1 | 0 | 0
      42 | 2 | 33 | 0 | 0 | 1 | 1
      12 | 1 | 23 | 1 | 0 | 1 | 0
      102 | 2 | 26 | 1 | 1 | 0 | 1
      86 | 1 | 47 | 0 | 0 | 1 | 0
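The step from the bad example to the good example can be sketched in code: free-text gender entries are mapped to one consistent code, and the comma-separated location entry is split into one 0/1 column per level. The coding 1 = female, 2 = male is inferred by comparing the two tables and is an assumption, as are the names `GENDER_CODES`, `LEVELS`, and `clean_row`. Only the four rows with a gender entry are used.

```python
# Assumed coding, inferred from the good example: 1 = female, 2 = male.
GENDER_CODES = {"f": 1, "woman": 1, "m": 2, "male": 2}
LEVELS = (34, 45, 56, 67)


def clean_row(patient_id, gender, age, location):
    """Recode one free-text row into consistently coded columns."""
    row = {
        "ID": patient_id,
        "gender": GENDER_CODES[gender.strip().lower()],
        "age": age,
    }
    # Split "56,67" into a set of levels, then emit one 0/1 column per level.
    recorded = {int(value) for value in location.split(",")}
    for level in LEVELS:
        row[f"level{level}"] = int(level in recorded)
    return row


bad_rows = [
    (42, "male", 33, "56,67"),
    (12, "F", 23, "34,56"),
    (102, "M", 26, "34,45,67"),
    (86, "woman", 47, "56"),
]
good_rows = [clean_row(*row) for row in bad_rows]
```

An unrecognized gender entry raises a `KeyError` rather than being guessed, which surfaces inconsistent codes early.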


  21. Example Wide Data

      ID | gender | age | level34 | level45 | level56 | level67
      34 | 1 | 34 | 0 | 1 | 0 | 0
      42 | 2 | 33 | 0 | 0 | 1 | 1
      12 | 1 | 23 | 1 | 0 | 1 | 0
      102 | 2 | 26 | 1 | 1 | 0 | 1
      86 | 1 | 47 | 0 | 0 | 1 | 0

  22. Example Long Data

      ID | Injection Type | Date
      34 | Particulate steroid | 1/12/2020
      34 | Non-particulate steroid | 3/22/2020
      34 | Non-particulate steroid | 7/06/2020
      42 | Non-particulate steroid | 1/24/2020
      42 | Particulate steroid | 3/26/2020
      42 | Non-particulate steroid | 8/26/2020
      102 | Particulate steroid | 3/01/2020
      102 | Non-particulate steroid | 8/10/2020
      86 | Non-particulate steroid | 1/05/2020
      86 | Particulate steroid | 2/20/2020
      86 | Particulate steroid | 6/04/2020
      86 | Non-particulate steroid | 11/16/2020
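Long data has one row per event, so a patient appears once per injection. Wide data has one row per patient. The sketch below reshapes the long example into wide form by numbering each patient's injections in the order listed. The column names `injection_1`, `date_1`, and so on, and the function name `long_to_wide`, are illustrative choices rather than anything prescribed in the slides.

```python
from collections import defaultdict

P = "Particulate steroid"
NP = "Non-particulate steroid"

# Rows from the long-format example: (ID, injection type, date).
long_rows = [
    (34, P, "1/12/2020"), (34, NP, "3/22/2020"), (34, NP, "7/06/2020"),
    (42, NP, "1/24/2020"), (42, P, "3/26/2020"), (42, NP, "8/26/2020"),
    (102, P, "3/01/2020"), (102, NP, "8/10/2020"),
    (86, NP, "1/05/2020"), (86, P, "2/20/2020"),
    (86, P, "6/04/2020"), (86, NP, "11/16/2020"),
]


def long_to_wide(rows):
    """Collapse one-row-per-injection data into one row per patient."""
    visits_by_id = defaultdict(list)
    for patient_id, injection, date in rows:
        visits_by_id[patient_id].append((injection, date))

    wide = []
    for patient_id, visits in visits_by_id.items():
        record = {"ID": patient_id}
        # Number each patient's injections 1, 2, 3, ... in listed order.
        for number, (injection, date) in enumerate(visits, start=1):
            record[f"injection_{number}"] = injection
            record[f"date_{number}"] = date
        wide.append(record)
    return wide


wide_rows = long_to_wide(long_rows)
```

Patients with fewer injections simply have fewer columns in their record. In a rectangular wide file those cells would be missing values, which is one reason repeated-event data are often easier to keep in long form.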

