Machine Learning for Improved Risk Stratification in Health Care


Explore the use of machine learning for risk stratification of patients with non-communicable diseases in Estonia. This study showcases the application of big data and machine learning in healthcare, emphasizing the benefits of personalized care, proactive disease prevention, and efficient interventions based on patient data.


Uploaded on Aug 14, 2024 | 1 Views



Presentation Transcript


  1. MACHINE LEARNING FOR IMPROVED RISK STRATIFICATION OF NCD PATIENTS IN ESTONIA
     Big Data and Machine Learning in Health Care
     Marvin Ploetz, Philip Docena, Ojaswi Pandey, Aakash Mohpal
     23 April 2019

  2. Objectives
     - Propose an alternative, machine-learning-based approach to patient risk stratification for ECM in Estonia
     - Illustrate the use and applicability of machine learning to other areas of work relevant to EHIF

  3. Overview
     - Big Data and Machine Learning in Health Care
     - Machine Learning Basics
     - Context of ECM
     - Research Question
     - Data Overview & Sample Construction
     - Feature Engineering
     - Evaluation & Modelling Choices
     - Results
     - Conclusions

  4. Big Data and Machine Learning in Health Care

  5. Big Data and Machine Learning in Health Care
     - Vital signs, activity data, behavioral data, nutritional data, EMR, clinical notes, medical images, genome data
     - Take advantage of massive amounts of data and provide the right intervention to the right patient at the right time
     - Potentially benefit all agents in the health care system: patient, provider, payer, management stakeholders
     - Patients, providers: personalized care to the patient
     - Payers: claims and billing, approvals and denials, population health and risk
     - Other: public health, pharma companies and drug discoveries

  6. Uses of Machine Learning in Health Care
     - Big data and machine learning analytics in health care: right patient, right intervention, right time
     - Agents: patients, providers, payers
     - Benefits: improve care and efficiency, lower costs; personalized medicine; proactively prevent diseases; assist diagnostics; improve clinical trials; predict disease risk; study population health; find cures for conditions

  7. Example 1: Hip and knee replacement in the US
     - Osteoarthritis: a common and painful chronic condition
     - Often requires replacement of hips and knees
     - More than 500,000 Medicare beneficiaries receive replacements each year
     - Medical costs: roughly $15,000 per surgery
     - Medical benefits: accrue over time, since the first months after surgery are painful and spent in disability
     - Therefore, a joint replacement only makes sense if you will live long enough to enjoy it. If you die soon after, it could be futile and painful
     - Prediction/classification problem: Can we predict which surgeries will be futile using only data available at the time of surgery?

  8. Example 1: Hip and knee replacement in the US
     - Sample: 20% of 7.4 million beneficiaries
     - 98,090 had hip or knee replacement in 2010
     - 1.4% died within one month of surgery; 4.2% died within 1-12 months
     - 3,305 independent variables
     - Train data: 65,395 observations; test data: 32,695 observations
     - Model to predict riskiest patients
     - Traditional analysis: about averages. Big data and ML analytics: predict individual risks

  9. Example 1: Hip and knee replacement in the US
     Predicted mortality   Observed mortality   Futile procedures   Futile spending
     percentile            rate                 averted             ($ millions)
     1                     43.5%                1,984               30
     2                     42.2%                3,844               58
     5                     35.8%                8,061               121
     10                    24.2%                10,512              158
     20                    15.2%                12,317              185
     30                    13.6%                16,151              242
     The first column sorts the test sample by risk percentiles. In the riskiest 1 percent of the population, the observed mortality rate within 1 year was 43.5%. Reallocating these surgeries to patients at the median risk level (50th percentile) would have averted 1,984 futile procedures and reallocated $30m to other beneficiaries.
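The spending column on slide 9 appears consistent with multiplying the number of averted procedures by the "roughly $15,000 per surgery" quoted on slide 7. The sketch below checks that arithmetic. It is an illustration under that assumption, not the original study's calculation, and the names `ROWS` and `futile_spending_millions` are made up for this example.

```python
# Rows from the slide-9 table:
# (predicted mortality percentile, futile procedures averted, futile spending in $ millions).
ROWS = [
    (1, 1_984, 30),
    (2, 3_844, 58),
    (5, 8_061, 121),
    (10, 10_512, 158),
    (20, 12_317, 185),
    (30, 16_151, 242),
]

# "Roughly $15,000 per surgery" (slide 7); applying it here is an assumption.
COST_PER_SURGERY_USD = 15_000


def futile_spending_millions(procedures, unit_cost=COST_PER_SURGERY_USD):
    """Spending attributable to futile procedures, in millions of dollars."""
    return procedures * unit_cost / 1_000_000


for percentile, procedures, reported in ROWS:
    estimate = futile_spending_millions(procedures)
    print(f"riskiest {percentile:>2}%: {procedures:>6} procedures "
          f"-> ${estimate:.1f}m (slide reports ${reported}m)")
```

Rounded to whole millions, each estimate can be compared with the figure reported on the slide.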

  10. Example 2: Diagnoses of pediatric conditions
     - Apply natural language processing algorithms to extract data from EHRs
     - Extract 101.6m data points from 1.3m EHRs of pediatric patients
     - High diagnostic accuracy among multiple organ systems and comparable to performance of experienced pediatric physicians

  11. Example 2: Diagnoses of pediatric conditions

  12. Example 3: Breast cancer screening
     - Most common form of cancer, afflicting 2.5 million patients worldwide in 2015
     - Need to distinguish malignant tumors from benign ones
     - Early detection is key
     - Data: 62,219 mammography findings from the Wisconsin State Cancer Reporting System
     - A neural-network-based algorithm does as well as radiologists in classifying the tumors

  13. Machine Learning Basics

  14. Definition of Big Data
     - Collection of large and complex data sets which are difficult to process using common database management tools or traditional data processing applications
     - Big Data: volume, variety, velocity
     - Not only about size: finding insights from complex, noisy, heterogeneous, and longitudinal data sets
     - This includes capturing, storing, searching, sharing and analyzing

  15. Types of Machine Learning Problems
     1) Supervised: making predictions using labeled/structured data
        - Classification: use data to predict which category something falls into. Examples: whether an image contains a store front or not; whether a patient is high risk or not
        - Regression: use data to make predictions on a continuous scale. Examples: predict the stock price of a company; given historical data, what will the temperature be tomorrow
     2) Unsupervised: detecting patterns from unstructured data
        - Problems where we have little or no idea what the results should look like
        - Provide algorithms with data and ask them to look for hidden features and cluster the data in a way that makes sense
        - Examples: identify patterns from genomics data; separate voice from noise in audio files

  16. Machine Learning Implementation
     - Collect data
     - Standardize and clean data
     - Feature engineering / data construction
     - Split data in test/train: train data 80%, test data 20%
     - Build machine learning model using train data
     - Validate model results using test data
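The 80/20 split in the workflow above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' pipeline: the function name `train_test_split`, the fixed seed, and the integer "patient IDs" are assumptions made for the example.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)          # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical patient identifiers standing in for real records.
patients = list(range(1000))
train, test = train_test_split(patients)
print(len(train), "train rows;", len(test), "test rows")
```

The model is fitted on `train` only; `test` is held back so that validation reflects performance on data the model has not seen.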

  17. Assessing Model Performance: Precision and Recall
                          Actual: True          Actual: False
     Predicted: True      True positive (TP)    False positive (FP)
     Predicted: False     False negative (FN)   True negative (TN)
     Accuracy = (TP + TN) / All
     Precision = TP / (TP + FP)
     Recall = TP / (TP + FN)

  18. Assessing Model Performance: Precision and Recall
     Case I: High recall, low precision
                          Actual: True   Actual: False
     Predicted: True      100 TP         45 FP
     Predicted: False     5 FN           80 TN
     Accuracy = 180/230 = 78%; Precision = 100/145 = 69%; Recall = 100/105 = 95%
     Case II: Low recall, high precision
                          Actual: True   Actual: False
     Predicted: True      90 TP          5 FP
     Predicted: False     35 FN          100 TN
     Accuracy = 190/230 = 83%; Precision = 90/95 = 95%; Recall = 90/125 = 72%
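The two cases can be recomputed directly from the confusion-matrix counts on the slide using the formulas from slide 17. The helper name `classification_metrics` is an assumption for this sketch. With these counts, Case I accuracy works out to (100 + 80) / 230.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Counts as read from slide 18.
case_1 = classification_metrics(tp=100, fp=45, fn=5, tn=80)   # high recall, low precision
case_2 = classification_metrics(tp=90, fp=5, fn=35, tn=100)   # low recall, high precision

for name, metrics in (("Case I", case_1), ("Case II", case_2)):
    summary = ", ".join(f"{key} = {value:.0%}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```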

  19. Assessing Model Performance: ROC Curve
     - Plot the true and false positive rate for every classification threshold
     - A perfect model has a curve that passes through the upper left corner (AUC = 1)
     - The diagonal (red line) represents random guessing (AUC = 0.5)
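To make the construction concrete, the sketch below builds ROC points by sweeping the threshold over every distinct score and integrates them with the trapezoidal rule. It is a plain-Python illustration with made-up labels and scores. The function names `roc_points` and `auc` are assumptions, and the code expects at least one positive and one negative label.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points from (0, 0) to (1, 1)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold past all examples that share this score.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data: 1 = admitted, 0 = not admitted.
labels = [1, 1, 0, 0]
perfect_auc = auc(roc_points(labels, [0.9, 0.8, 0.3, 0.1]))      # positives ranked first
uninformative_auc = auc(roc_points(labels, [0.5, 0.5, 0.5, 0.5]))  # all scores tied
print(perfect_auc, uninformative_auc)
```

A scorer that ranks every positive above every negative traces the upper left corner, while one that cannot separate the classes stays on the diagonal.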

  20. Decision Tree: Playing Golf
     - A non-parametric supervised learning method used for classification and regression
     - Built in the form of a tree structure
     - Breaks data down into smaller and smaller subsets while incrementally building the tree
     - Final result is a tree with decision nodes and leaf nodes
     Outlook    Temperature   Humidity   Windy   Play Golf
     Rainy      Hot           High       False   No
     Rainy      Hot           High       True    No
     Overcast   Hot           High       False   Yes
     Sunny      Mild          High       False   Yes
     Sunny      Cool          Normal     False   Yes
     Sunny      Cool          Normal     True    No
     Overcast   Cool          Normal     True    Yes
     Rainy      Mild          High       False   No
     Rainy      Cool          Normal     False   Yes
     Sunny      Mild          Normal     False   Yes

  21. Decision Tree: Playing Golf
     Outlook
     - Rainy -> No Golf
     - Overcast -> Golf
     - Sunny -> Windy
        - False -> Play Golf
        - True -> No Golf
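One common way to choose the root of such a tree is to pick the attribute with the highest information gain (the ID3 criterion). The slides do not state which splitting criterion was used, so the sketch below is an assumption-laden illustration. It encodes the ten rows of the golf table from slide 20 and ranks the four attributes.

```python
from collections import Counter
from math import log2

COLUMNS = ("Outlook", "Temperature", "Humidity", "Windy")

# Rows from the slide-20 table; the last element is the "Play Golf" label.
ROWS = [
    ("Rainy", "Hot", "High", False, "No"),
    ("Rainy", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),
    ("Sunny", "Mild", "High", False, "Yes"),
    ("Sunny", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),
    ("Rainy", "Mild", "High", False, "No"),
    ("Rainy", "Cool", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", False, "Yes"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column_index):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column_index], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(COLUMNS)}
best = max(gains, key=gains.get)
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name:<12} gain = {gain:.3f}")
print("Root attribute:", best)
```

On this table the highest-gain attribute is Outlook, which is also the root shown on slide 21.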

  22. Decision tree to Random Forest
     - A collection of decision trees whose results are aggregated into one final output
     - Uses different sub-samples of the data and different sets of features
     - Helps reduce overfitting and variance relative to a single tree
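The three ingredients named above (sub-samples of the data, subsets of the features, aggregation of results) can be sketched without any library. This is only a skeleton of the idea, with the tree-fitting step left out. The helper names and the toy inputs are assumptions for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: one tree's training sub-sample."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(feature_names, k, rng):
    """Pick k distinct features for one tree to consider."""
    return rng.sample(feature_names, k)


def majority_vote(predictions):
    """Aggregate the class predictions of all trees into one output."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
rows = list(range(10))                                   # stand-in for training rows
features = ["Outlook", "Temperature", "Humidity", "Windy"]

sample = bootstrap_sample(rows, rng)
subset = random_feature_subset(features, 2, rng)
forest_output = majority_vote(["Yes", "No", "Yes"])      # votes from three hypothetical trees
print(sample, subset, forest_output)
```

Each tree would be trained on its own `sample` and `subset`; because the trees see different data, their individual errors tend to differ, and voting averages part of that variation out.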

  23. Context of ECM

  24. A Big Challenge of the Estonian Healthcare System
     - Changes in the demand for health care due to population ageing and rise of non-communicable diseases
     - Chronic conditions as the driving force behind needs for better care integration
     - Low coverage of preventive services and considerable share of avoidable specialist and hospital care
     - Opportunity to improve management of specific patient groups at the PHC level -> care management for empaneled patients
     - Prediction for which patients breaches in care coordination will occur -> risk-stratification of patients

  25. Risk Stratification Until Now
     Eligibility steps, each a yes/no check that either ends in "Not eligible" or passes the patient to the next step:
     1) DM / Hypertension / Hyperlipidemia
     2) Min. and max. number/combination of: CVD / Respiratory / Mental Health / Functional Impairment
     3) Dominant/complex condition (cancer, schizophrenia, rare disease etc.)
     4) Review by GPs (behavioral & social factors, information not in data)
     -> ECM Candidate
     - No actual prediction analysis done
     - Involvement of providers to gain trust/understanding
     - Behavioral and social criteria are key, but sparsely available -> use insider knowledge of doctors

  26. Enhanced Care Management So Far In Estonia
     - Successful enhanced care management pilot with 15 GPs and < 1,000 patients to assess the feasibility and acceptability of enhanced care management
     - Commitment of the Estonian Health Insurance Fund (EHIF) to scale up the care management pilot
     - Model for risk stratification: clinical algorithm + provider intuition
     - Need for a better risk-stratification approach!?

  27. Research Question

  28. The Prediction Problem
     - Target patients: Who benefits from care management? A combination of disease, social and behavioral factors
     - Objective of ECM: Ultimately improve health outcomes for patients with cardiovascular, respiratory, and mental disease
     - What is the right proxy prediction variable in the data? There is not one single relevant adverse event (e.g. death, hospital admission, health complication, high healthcare spending)
     - Some discussions on how to choose the dependent variable -> Unplanned hospital admissions have a large negative impact on patient lives, are costly and relatively frequent. Some are also avoidable

  29. Patients with an Admission in 2011 - Subsequent Hospitalization Rates
     Many patients repeatedly have hospitalizations
     Time since 2011 admission   Percentage of patients who were hospitalized in 2011
     One year later              22.9
     Two years later             20.2
     Three years later           18.8
     Four years later            17.7
     Five years later            16.3
     Six years later             13.6
     Seven years later           9.3
     22 percent of patients need to be hospitalized again in the following year

  30. Average costs (in euros) in different types of care in 2016
     Hospitalizations account for the bulk of healthcare costs
     Type of care                      Average cost
     Inpatient Care                    167.63
     Outpatient Care                   148.17
     Day Care                          37.72
     PHC                               22.53
     Inpatient Nursing Care            10.01
     Outpatient Rehabilitation Care    6.41
     Inpatient Rehabilitation Care     6.01
     Outpatient Nursing Care           4.94
     Chart legend: ML Sample (N=712,104), General Population (N=1,260,630)

  31. Predicting Hospital Admissions
     - Hospital admissions are the main (avoidable) adverse health event
     - But predicting hospitalizations is a hard problem
     - Social factors matter a lot; patients may have many contacts with the healthcare system or none at all
     - Tradeoff in choosing which hospitalizations we want to predict
     - Admissions due to specific conditions vs. hospitalizations in general

  32. Predicting Hospital Admissions
     Key question
     - Not: What is the best algorithm for predicting hospital admissions?
     - But: How can we obtain the most useful prediction of hospital admissions for a specific purpose?
     Hospital admissions excluded
     ICD-10 Chapter   Title
     A00-B99          Certain infectious and parasitic diseases
     C00-D48          Neoplasms
     O00-O99          Pregnancy, childbirth and the puerperium
     P00-P96          Certain conditions originating in the perinatal period
     S00-T98          Injury, poisoning
     V01-X59          Accidents

  33. Data Overview & Sample Construction

  34. Administrative Claims Data (in Estonia)
     - Very reliable
     - High-quality data availability as of 2007/2008
     - Comprehensive coding requirements for providers
     - Reporting lag of data is on average 2 weeks
     - No info on clinical outcomes (i.e. test results)
     - Limited information on social conditions and behavioral characteristics
     - Need for a lot of feature engineering to create meaningful variables at the patient level

  35. Description of Available Data
     - Administrative
     - Beneficiary
     - Family Doctor
     - Patient-Year Level
     - Types of Care:
        1. Day Care
        2. Inpatient Care
        3. Inpatient Nursing Care
        4. Inpatient Rehabilitation Care
        5. Outpatient Care
        6. Outpatient Nursing Care
        7. Outpatient Rehabilitation Care
        8. Primary Health Care
     - Utilization, Diagnosis, Procedures (Surgical and Other)
     - Medications: Prescriptions and Filling of Prescriptions

  36. Patient Cohort Selection for the ML Analysis

  37. Characteristics of Patients in the ML sample vs. Total Population
     [Figure: age distribution of the population in the data, percentage of population vs. percentage of sample, age groups 18-24 through 85+]
     [Figure: gender distribution, % of women, General Pop. vs. ML Sample]
     Relative to the population, the ML sample is older and more likely to be female.

  38. Characteristics of Patients in the ML sample vs. Total Population
     Insurance type (%)   General Population   ML Sample
     1 = General          49.39                43.63
     2 = Unemployed       2.18                 2.04
     3 = Pensioner        28.94                37.59
     4 = Disabled         9.67                 12.51
     5 = Welfare          0                    0.00
     6 = Widow            0.28                 0.07
     Uninsured            9.54                 4.15
     [Figure: average costs (in euros) for prescriptions prescribed to patients in 2016; bars for total price of prescriptions and total price share of prescriptions by patients; series: ML Sample (N=712,104) and ML Sample, conditional on patients being hospitalized at least once in 2016; values shown: 375.99, 180.21, 141.58, 75.01]

  39. Top-20 Chronic Conditions
     Most common chronic conditions: percentages of patients with condition
     Hypertension                      48.1
     Joint Arthrosis                   31.1
     Hyperlipidemia                    26.7
     Chronic Gastritis/GERD            25.0
     Congestive Heart Failure          15.7
     Neuropathies                      15.2
     Thyroid Diseases                  14.5
     Mood Disorders                    14.3
     Ischemic Heart Disease            13.8
     Cardiac Arrhythmias               13.0
     Dizziness                         11.4
     Obesity                           10.9
     Diabetes Mellitus                 10.8
     Anemia                            10.2
     Migraine                          9.75
     Hemorrhoids                       9.59
     Vision And Hearing Impairments    8.87
     COPD                              8.53
     Stroke                            8.41
     Asthma                            8.10
     Chart legend: General Population (N=1,260,630), ML Sample (N=712,104)
     The ML sample population is also sicker on average (i.e. the prevalence of chronic conditions is higher)

  40. Characteristics of Patients in the ML sample vs. Total Population
     Percentage of people living in a given county (%)
     Name of county   Poverty rate (%)   General Population   ML sample
     Harju            1 = 12.6           43.15                43.23
     Saare            2 = 12.6-15.8      2.65                 2.53
     Tartu            3 = 15.8-17.63     11.22                10.86
     Järva            3 = 15.8-17.63     2.39                 2.4
     Rapla            4 = 17.63-18.3     2.55                 2.45
     Pärnu            5 = 18.3-21.7      6.63                 6.53
     Lääne            5 = 18.3-21.7      1.62                 1.61
     Viljandi         5 = 18.3-21.7      3.75                 3.7
     Hiiu             5 = 18.3-21.7      0.78                 0.75
     Lääne-Viru       5 = 18.3-21.7      4.65                 4.61
     Jõgeva           6 = 21.7-24.7      2.38                 2.34
     Põlva            6 = 21.7-24.7      2.07                 2.06
     Võru             7 = 24.7-25.1      2.85                 2.8
     Ida-Viru         8 = 25.1-26.9      11.05                11.88
     Valga            8 = 25.1-26.9      2.25                 2.24

  41. Feature Selection & Engineering

  42. Feature Selection & Engineering
     - Series of attempts with interim features to extract better performance
     - Final set: 141 features

  43. Features Used
     Feature category 1: Healthcare utilization
     - Total number of hospital admissions: inpatient admissions, inpatient nursing admissions, inpatient rehab admissions
     - Total number of hospital stay days: stay days in inpatient care, in inpatient nursing care, in inpatient rehabilitation care
     - Total number of PHC visits
     - Total number of specialist visits: outpatient specialist visits, outpatient rehabilitation specialist visits
     - Total number of surgeries: emergency surgeries, surgeries that took between 1 and 3 hours, respiratory surgeries
     - Whether lab tests were done: cholesterol fractions, cholesterol, glucose
     - Total number of prescriptions

  44. Features Used
     Feature category 2: Health status
     - Total number of major chronic conditions
     - Any of: joint arthrosis, chronic gastritis
     - Whether a patient had a major chronic condition: hypertension, diabetes mellitus, hyperlipidemia, COPD, asthma, dementia, vision and hearing impairments
     - Prescriptions: diabetic agents, diuretics, NSAIDs, anticoagulants, antiplatelets, antihypertensives, antidepressants, narcotics
     - Total price of prescriptions (in euros)
     - Total out-of-pocket expenditures of prescriptions by patients (in euros)

  45. Features Used
     Feature category 3: Patient behavior
     - % of prescriptions picked up by patients
     Feature category 4: Socioeconomic status
     - Insurance status: 1 = General, 2 = Unemployed, 3 = Pensioner, 4 = Disabled, 5 = Welfare, 6 = Widow, Uninsured
     - Poverty rates at the county level: 1 = 12.6, 2 = 12.6-15.8, 3 = 15.8-17.63, 4 = 17.63-18.3, 5 = 18.3-21.7, 6 = 21.7-24.7, 7 = 24.7-25.1, 8 = 25.1-26.9
     Feature category 5: Quality of care received
     - Total number of family doctors utilized across time
     - Admission rates of GPs, standardized by age and gender of patients in the patient list
     - Compliance with diabetes guidelines by PHC doctor: all tests done, no tests done

  46. Getting to Know the Data: Diagnosis and Admissions
     [Figure panels: Single DGN; Pairs of DGNs]
     - Afib (Atrial Fibrillation And Flutter), Chf (Congestive Heart Failure), Htn (Hypertension), and Ischemic Htd (Ischemic Heart Disease) are strong indicators of potential admissions in the following year (2017)
     - Patient groups with these conditions have a non-trivial likelihood (~10%) of hospital admission
     - This likelihood increases to ~20-30% with one hospital admission in 2016 and to >50% with three or more admissions in 2016

  47. Evaluation & Modelling Choices

  48. ML Models Selected for Evaluation
     Selection criteria:
     - Algorithms are readily available in easy-to-use, comprehensive and well-tested open-source libraries in Python (scikit)
     - Algorithms and results are relatively easy to describe/explain (common algorithms)
     - For interpretability and model familiarity, no attempt at exploring more complex models; no deep networks
     Included in comparison:
     - Decision Tree
     - Random Forest and Extremely Randomized Trees (ExtraTrees)
     - k-Nearest Neighbors*
     - Gaussian Naïve Bayes**
     - Logistic Regression (L1, L2)
     - SVM (RBF, polynomial)***
     - Multi-layer Perceptrons (1 hidden layer)
     - AdaBoost (Decision Tree and Random Forest)
     - Gradient Boosted Trees (scikit GBT, not XGBoost)
     - Calibrated (isotonic) variations of above classifiers
     - Neural Networks
     Eventually excluded: *kNN for execution time and memory requirements, **NB for weak performance, and ***SVMs for very slow training (but considered for final paper)

  49. Evaluation metrics
     - Variable to be predicted: Yes/No hospital admission in 2017
     - Use data from 2011-2016
     - We deal with an unbalanced sample (i.e. 7.5% of patients had an admission in 2017)
     - Appropriate metrics of model performance in an unbalanced dataset: precision, recall, ROC curve and area under the curve (AUC)
     - Problem-specific custom metric to penalize one type of error more heavily: cost of a false positive (cost of ECM) vs. cost of a missed positive (cost of subsequent hospitalization)
     - Different ML models have different strengths, but differences should not be huge
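A short sketch shows why plain accuracy is a poor guide when only 7.5% of patients are admitted, and how a cost-weighted metric can rank models differently. The 7.5% rate comes from the slide. The "flagging model" counts and both unit costs are hypothetical values chosen for illustration, not results or prices from the study.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall; precision/recall are None when undefined."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else None,
        "recall": tp / (tp + fn) if (tp + fn) else None,
    }


def misclassification_cost(fp, fn, cost_false_positive, cost_missed_positive):
    """Custom metric: total cost when the two error types are priced differently."""
    return fp * cost_false_positive + fn * cost_missed_positive


# Per 1,000 patients with a 7.5% admission rate: 75 admitted, 925 not admitted.
# Baseline that predicts "no admission" for everyone.
always_no = confusion_metrics(tp=0, fp=0, fn=75, tn=925)

# Hypothetical model that flags 150 patients and catches 45 of the 75 admissions.
flagging_model = confusion_metrics(tp=45, fp=105, fn=30, tn=820)

# Hypothetical unit costs in euros (NOT taken from the slides).
COST_ECM = 200          # cost of enrolling a patient who would not have been admitted
COST_ADMISSION = 2_000  # cost of a hospitalization that was not anticipated

cost_always_no = misclassification_cost(0, 75, COST_ECM, COST_ADMISSION)
cost_flagging = misclassification_cost(105, 30, COST_ECM, COST_ADMISSION)

print("always-no accuracy:", always_no["accuracy"], "cost:", cost_always_no)
print("flagging accuracy: ", flagging_model["accuracy"], "cost:", cost_flagging)
```

Under these assumed costs the trivial baseline has the higher accuracy but the higher total cost, which is the motivation for looking beyond accuracy on unbalanced data.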

  50. Intuitive Interpretation of Metrics
     - Precision is the probability that a patient classified by an algorithm as having a hospital admission is actually going to have a hospital admission.
     - Recall is the probability that a patient who is going to have a hospital admission is classified as such by an algorithm.
     - Which one is more important? It depends a lot on the application. There is a tradeoff between maximizing either of them
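The tradeoff can be seen by moving the classification threshold over a set of predicted risks. The labels and scores below are invented toy values, and `precision_recall_at` is a hypothetical helper. The point is only the direction of the effect: a stricter threshold raises precision and lowers recall.

```python
def precision_recall_at(threshold, labels, scores):
    """Precision and recall when every score >= threshold is flagged as an admission."""
    flagged = [score >= threshold for score in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


# Toy data: 1 = admitted the following year, 0 = not admitted.
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.95, 0.85, 0.80, 0.70, 0.55, 0.45, 0.40, 0.30, 0.20, 0.10]

strict = precision_recall_at(0.90, labels, scores)
middle = precision_recall_at(0.50, labels, scores)
lenient = precision_recall_at(0.25, labels, scores)

for name, (precision, recall) in (("0.90", strict), ("0.50", middle), ("0.25", lenient)):
    print(f"threshold {name}: precision = {precision:.2f}, recall = {recall:.2f}")
```

Which threshold is preferable depends on the application, for example on how costly an unnecessary ECM enrollment is relative to a missed admission.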
