Jumping into Statistics: Study Design & Statistical Analysis in Medical Research
Explore the fundamentals of study design & research methodology, learn to select appropriate statistical tests, and practice statistical analysis using JMP Pro Software. Topics include research question formulation, statistical methods, regression, survival analysis, data visualization, and more. Understand the importance of variable types in statistical analysis and how they impact data interpretation.
Presentation Transcript
Jumping into Statistics: Introduction to Study Design and Statistical Analysis for Medical Research Using JMP Pro Statistical Software WINTER/SPRING 2021 DR. CYNDI GARVAN & DR. TERRIE VASILOPOULOS
Meet the Instructors CYNTHIA GARVAN, MA, PHD TERRIE VASILOPOULOS, PHD Research Assistant Professor in Anesthesiology and Orthopaedics and Rehabilitation Research Professor in Anesthesiology
Course Objectives: Review fundamentals of study design and research methodology. Understand how to choose the best statistical test for your research question. Practice basic statistical analysis using JMP Pro Software.
Course Topics: Asking a Good Research Question; Life Cycle of Research and the Scientific Method; Study Design; Data Types and Database Construction; Descriptive Statistics; Data Visualization; Population and Sample, Probability, Statistical Inference; How to Choose the Correct Statistical Method and Run Some Analyses (T-tests, ANOVA, Non-Parametric; Chi-square, odds ratio, relative risk; Regression and Correlation; Survival Analysis; Test Diagnostics, e.g. sensitivity, specificity); Comparing Statistical Modeling and Machine Learning.
Data, Data, Data: Data Types; Databases; Data Dictionary (2/24/2021)
Why is this topic important? The fundamental unit of statistics and statistical analyses is the variable. The variable type (or data type) determines how the variable can be described (summary statistics) and how the variable can be analyzed (variable-appropriate analytical methods).
Why is this topic important? Developing a thorough understanding of variable types will improve skills in troubleshooting data, statistical analysis, identification of analytic pitfalls, presentation and interpretation of results, and critical assessment of published results. Data is the foundation of statistical analysis.
BUT, we need to trust the data!
What is needed to establish trust in our data: 1) Understand data types. 2) Understand databases. 3) Understand how to document data in a data dictionary. 4) Look at our data.
Data Types: Four Classification Systems
There are four ways to classify data type: (I) classify data as Qualitative or Quantitative; (II) classify data as Categorical or Numerical; (III) classify data as Discrete or Continuous; (IV) classify data as Nominal, Ordinal, Interval, or Ratio.
I: Qualitative or Quantitative Qualitative: A scale of measurement is a set of categories that vary in some quality but not in magnitude. Quantitative: A scale of measurement is a set of categories that vary in magnitude.
II: Categorical or Numerical Categorical: A scale of measurement where levels are a set of categories. Numerical: A scale of measurement where levels are a set of meaningful numbers such as integers or decimals.
III: Discrete or Continuous Discrete: A variable that can take only selected values. Continuous: A numerical variable whose levels include (conceptually) all values between any two levels. Discrete data are counted; continuous data are measured.
IV: Nominal, Ordinal, Interval, Ratio Nominal: A scale of measurement where levels are distinct but do not vary in magnitude. Ordinal: A scale of measurement where levels vary in order of magnitude but equal intervals between levels cannot be assumed. Interval: The interval level of measurement has the characteristics of distinct levels, ordering in magnitude, and equal intervals. Ratio: The ratio level of measurement has characteristics of distinct levels, ordering in magnitude, equal intervals, and an absolute zero. A measurement has an absolute zero when a measurement of zero represents the absence of the property being measured.
EXAMPLES
Variable | I: Qualitative or Quantitative | II: Categorical or Numerical | III: Discrete or Continuous | IV: Nominal, Ordinal, Interval, Ratio
Heart rate (bpm) | Quantitative | Numerical | Continuous | Ratio
History of MI | Qualitative | Categorical | Discrete | Nominal
ASA classification | Quantitative | Categorical | Discrete | Ordinal
Number of pRBCs given in surgery | Quantitative | Numerical | Discrete | Ratio
Modified Fatigue Impact Scale (MFIS) | Quantitative | Numerical | Continuous | Interval
Cancer stage | Quantitative | Categorical | Discrete | Ordinal
Surgery type | Qualitative | Categorical | Discrete | Nominal
Pain reported on Visual Analog Scale | Quantitative | Categorical | Discrete | Ordinal
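The earlier point that the variable type determines how a variable can be described can be sketched in a few lines of Python. The mapping from measurement level to summary statistics below is the conventional one and is an assumption, not something stated on the slides; the data values are made up for illustration.

```python
from statistics import mean, median, mode, stdev

# Assumed (conventional) mapping: which summaries are usually considered
# meaningful at each level of measurement. Not taken from the slides.
SUMMARIES_BY_LEVEL = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean", "stdev"],
    "ratio": ["mode", "median", "mean", "stdev"],
}
FUNCS = {"mode": mode, "median": median, "mean": mean, "stdev": stdev}


def describe(values, level):
    """Return only the summaries that suit the given measurement level."""
    return {name: FUNCS[name](values) for name in SUMMARIES_BY_LEVEL[level]}


# Hypothetical data for three of the variables in the examples table.
heart_rate = [62, 70, 70, 85, 93]            # ratio
asa_class = [1, 2, 2, 3, 2]                  # ordinal
surgery = ["hip", "knee", "knee", "spine"]   # nominal

print(describe(heart_rate, "ratio"))
print(describe(asa_class, "ordinal"))
print(describe(surgery, "nominal"))
```

A mean is never computed for the nominal or ordinal variables here, which is the practical consequence of classifying a variable before analyzing it.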
Database Construction: 1. Use the top row for variable names. 2. Use consistent codes for variable values. 3. There are two types of files: wide and long. 4. It is important to have good naming conventions for your study variables.
BAD EXAMPLE
# ID | gender of patient | age | location
34 | f | 34 | 45
42 | male | 33 | 56,67
12 | F | 23 | 34,56
102 | M | 26 | 34,45,67
86 | woman | 47 | 56
GOOD EXAMPLE
ID | gender | age | level34 | level45 | level56 | level67
34 | 1 | 34 | 0 | 1 | 0 | 0
42 | 2 | 33 | 0 | 0 | 1 | 1
12 | 1 | 23 | 1 | 0 | 1 | 0
102 | 2 | 26 | 1 | 1 | 0 | 1
86 | 1 | 47 | 0 | 0 | 1 | 0
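One way to get from the bad layout to the good one is sketched below in Python. The raw rows mirror the bad example as best it can be read from the slide; the coding 1 = female, 2 = male is an assumption inferred by comparing the two tables, and the function and field names are hypothetical.

```python
# Raw rows mirroring the "bad example": free-text gender and several
# location levels packed into one comma-separated cell.
raw_rows = [
    {"id": "34", "gender": "f", "age": "34", "location": "45"},
    {"id": "42", "gender": "male", "age": "33", "location": "56,67"},
    {"id": "12", "gender": "F", "age": "23", "location": "34,56"},
    {"id": "102", "gender": "M", "age": "26", "location": "34,45,67"},
    {"id": "86", "gender": "woman", "age": "47", "location": "56"},
]

# Assumed coding (inferred, not stated on the slides): 1 = female, 2 = male.
GENDER_CODES = {"f": 1, "female": 1, "woman": 1, "m": 2, "male": 2, "man": 2}
LEVELS = ["34", "45", "56", "67"]


def clean_row(row):
    """Recode gender consistently and split location into 0/1 indicator columns."""
    levels_present = {part.strip() for part in row["location"].split(",")}
    cleaned = {
        "ID": int(row["id"]),
        "gender": GENDER_CODES[row["gender"].strip().lower()],
        "age": int(row["age"]),
    }
    for level in LEVELS:
        cleaned[f"level{level}"] = int(level in levels_present)
    return cleaned


clean_rows = [clean_row(r) for r in raw_rows]
for r in clean_rows:
    print(r)
```

An unrecognized gender spelling raises a KeyError rather than being silently coded, which keeps inconsistent entries visible.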
Stats software likes good variable names. Lists are helpful (e.g., Q1-Q20). Short but meaningful names are very helpful. Variable names should be spelled correctly and should be consistent. Stats software has rules for naming variables. For example, the rules for SAS and JMP are: 1. All variable names must start with a letter or an underscore (_). 2. Names can contain only letters, numerals, and the underscore. No %$!*&#@. 3. All variable and data set names must be thirty-two (32) or fewer characters in length. Practice good ID hygiene on paper forms: Label each page with the ID number. Use the same ID at different time points. Don't let participants make up their own ID.
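The three naming rules listed above can be checked mechanically. The following Python sketch encodes them as a regular expression; the function name is hypothetical, and the check reflects only the rules as stated on the slide, not every rule a given package may apply.

```python
import re

# Rule 1: starts with a letter or underscore.
# Rule 2: only letters, numerals, and underscores.
# Rule 3: at most 32 characters in total (1 leading + up to 31 more).
NAME_RULE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")


def is_valid_name(name):
    """Return True if the name satisfies the three rules listed above."""
    return NAME_RULE.fullmatch(name) is not None


for candidate in ["level34", "gender of patient", "# ID", "2age", "_visit_1"]:
    print(candidate, is_valid_name(candidate))
```

Running a check like this over the header row before importing a file catches names such as "gender of patient" early.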
Example Wide Data
ID | gender | age | level34 | level45 | level56 | level67
34 | 1 | 34 | 0 | 1 | 0 | 0
42 | 2 | 33 | 0 | 0 | 1 | 1
12 | 1 | 23 | 1 | 0 | 1 | 0
102 | 2 | 26 | 1 | 1 | 0 | 1
86 | 1 | 47 | 0 | 0 | 1 | 0
Example Long Data
ID | Injection Type | Date
34 | Particulate steroid | 1/12/2020
34 | Non-particulate steroid | 3/22/2020
34 | Non-particulate steroid | 7/06/2020
42 | Non-particulate steroid | 1/24/2020
42 | Particulate steroid | 3/26/2020
42 | Non-particulate steroid | 8/26/2020
102 | Particulate steroid | 3/01/2020
102 | Non-particulate steroid | 8/10/2020
86 | Non-particulate steroid | 1/05/2020
86 | Particulate steroid | 2/20/2020
86 | Particulate steroid | 6/04/2020
86 | Non-particulate steroid | 11/16/2020
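The relationship between the two file types can be illustrated by reshaping the long injection data into one row per participant. This is a minimal Python sketch, assuming the rows are already in date order within each ID; the column names injection_1, date_1, and so on are hypothetical.

```python
from collections import defaultdict

# Long data from the slide: one row per injection (ID, type, date).
long_rows = [
    (34, "Particulate steroid", "1/12/2020"),
    (34, "Non-particulate steroid", "3/22/2020"),
    (34, "Non-particulate steroid", "7/06/2020"),
    (42, "Non-particulate steroid", "1/24/2020"),
    (42, "Particulate steroid", "3/26/2020"),
    (42, "Non-particulate steroid", "8/26/2020"),
    (102, "Particulate steroid", "3/01/2020"),
    (102, "Non-particulate steroid", "8/10/2020"),
    (86, "Non-particulate steroid", "1/05/2020"),
    (86, "Particulate steroid", "2/20/2020"),
    (86, "Particulate steroid", "6/04/2020"),
    (86, "Non-particulate steroid", "11/16/2020"),
]


def long_to_wide(rows):
    """Collapse repeated rows per ID into one record with numbered columns."""
    grouped = defaultdict(list)
    for pid, injection, date in rows:
        grouped[pid].append((injection, date))
    wide = []
    for pid, visits in grouped.items():
        record = {"ID": pid}
        for i, (injection, date) in enumerate(visits, start=1):
            record[f"injection_{i}"] = injection
            record[f"date_{i}"] = date
        wide.append(record)
    return wide


wide_rows = long_to_wide(long_rows)
for record in wide_rows:
    print(record)
```

Participants with fewer injections simply have fewer numbered columns, which in a spreadsheet would appear as empty cells; this is one reason long format is often easier for repeated measurements.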
Data Dictionary Example from WHI: https://www.whi.org/dataset/26
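A data dictionary can also be kept in machine-readable form and used to check records. The sketch below is purely illustrative: the labels, types, and code meanings are hypothetical entries written for the good example table above, not taken from the WHI dictionary or from the course.

```python
# Hypothetical data dictionary: one entry per variable, with a label,
# a measurement level, and (where applicable) the allowed codes.
DATA_DICTIONARY = {
    "ID": {"label": "Participant identifier", "type": "nominal", "allowed": None},
    "gender": {"label": "Gender", "type": "nominal",
               "allowed": {1: "female", 2: "male"}},  # assumed coding
    "age": {"label": "Age in years", "type": "ratio", "allowed": None},
    "level34": {"label": "Level 34 involved", "type": "nominal",
                "allowed": {0: "no", 1: "yes"}},
}


def find_problems(record):
    """List variables that are undocumented or hold an undocumented code."""
    problems = []
    for name, value in record.items():
        entry = DATA_DICTIONARY.get(name)
        if entry is None:
            problems.append(f"{name}: not in data dictionary")
        elif entry["allowed"] is not None and value not in entry["allowed"]:
            problems.append(f"{name}: unexpected code {value!r}")
    return problems


print(find_problems({"ID": 34, "gender": 1, "age": 34, "level34": 0}))
print(find_problems({"ID": 42, "gender": "male", "age": 33}))
```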
Data Management Plan. From Wikipedia, the free encyclopedia: A data management plan or DMP is a formal document that outlines how data are to be handled both during a research project, and after the project is completed. The goal of a data management plan is to consider the many aspects of data management, metadata generation, data preservation, and analysis before the project begins; this ensures that data are well-managed in the present, and prepared for preservation in the future.
Resources for Writing Data Management Plan: https://libraries.mit.edu/data-management/plan/write/ https://library.stanford.edu/research/data-management-services/data-management-plans http://www.lib.ncsu.edu/data-management/dmp_examples
Best Practices in Data Management. Thursday, March 11, 12-1pm. Learn practical strategies for best managing your research data. Several U.S. funding agencies such as the National Science Foundation and the National Institutes of Health require researchers to supply plans for managing research data, called Data Management Plans (DMP), for all new grant proposals. This workshop will provide an overview of the questions to consider when creating a data management plan, with a focus on the DMPTool and tools for sharing your data at the University of Florida (e.g., subject-specific repositories). Topics include metadata and annotation, file formats and organization, storage, backups and security, and data sharing. The workshop is geared toward graduate students, faculty, and researchers. January: https://ufl.libcal.com/event/7374027 February: https://ufl.zoom.us/j/92336146853 March: https://ufl.zoom.us/j/94686576308
Questions?
Summary Tips: Learn as much as you can about data types. Make a data management plan before starting your study! Consult with a statistician when constructing a database.
JMP Pro! https://software.ufl.edu/