PhD Dissertation

The Application of Machine Learning and Causal Inference to Improve Suicide Outcomes Among U.S. Veterans

The dissertation is divided into three parts:

Aim 1: Theory-based improvements to suicide risk prediction models. Current suicide risk prediction models rely primarily on individual-level factors such as demographics, prior suicide attempts, diagnoses, healthcare utilization, and medication. Adding institutional factors to these models may improve their performance. We will compare several algorithms, including Naïve Bayes, SVM, k-NN, and Random Forest, as well as ensemble methods such as Super Learner.

Aim 2: Exploring bias in machine learning-based suicide risk prediction models. There is increasing concern that prediction models in healthcare perpetuate existing health disparities. Poor accuracy in certain subpopulations may expose patients to unwanted or unnecessary care. Awareness of bias in the algorithms would allow us to respond appropriately and adjust them. We propose to stratify suicide risk prediction models by race, ethnicity, sex, and age, and to evaluate accuracy and fairness in each of these subgroups.
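
As a rough illustration of the subgroup evaluation described above, the following minimal Python sketch computes sensitivity, false positive rate, and accuracy within each subgroup, plus the largest between-group gap in sensitivity as one possible fairness summary. The function names, the toy labels, and the choice of metrics are assumptions made for illustration only; they are not the dissertation's actual data or fairness criteria.

```python
from collections import defaultdict


def subgroup_metrics(y_true, y_pred, groups):
    """Return sensitivity (TPR), false positive rate, and accuracy per subgroup."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            key = "tp" if pred == 1 else "fn"
        else:
            key = "fp" if pred == 1 else "tn"
        counts[group][key] += 1

    results = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        results[group] = {
            # None marks a rate that is undefined for this subgroup.
            "tpr": c["tp"] / positives if positives else None,
            "fpr": c["fp"] / negatives if negatives else None,
            "accuracy": (c["tp"] + c["tn"]) / (positives + negatives),
        }
    return results


def tpr_gap(metrics):
    """Largest difference in sensitivity between any two subgroups."""
    rates = [m["tpr"] for m in metrics.values() if m["tpr"] is not None]
    return max(rates) - min(rates)


# Hypothetical toy data: 1 = event (or predicted high risk), 0 = otherwise.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

metrics = subgroup_metrics(y_true, y_pred, groups)
gap = tpr_gap(metrics)
for name, values in sorted(metrics.items()):
    print(name, values)
print("TPR gap:", gap)
```

In this toy example both subgroups have the same overall accuracy but different sensitivity, which is why accuracy alone may not reveal subgroup disparities.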

Aim 3: Understanding the causal effects of changes to facility-level virtual mental health care on individual-level suicide-related events. The third aim takes a closer look at one of the facility-level factors from the machine learning models in Aims 1 and 2. We will use an instrumental variables approach with a Heckman correction for partially observed outcomes to estimate the potential causal relationship between telehealth availability (an important predictor of access) and suicide-related events.
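
To show the intuition behind the instrumental variables approach in Aim 3, here is a minimal Python sketch on simulated data. It uses the simple single-instrument estimator cov(z, y) / cov(z, x), which coincides with two-stage least squares when there is one instrument, one endogenous regressor, and no other covariates. It deliberately omits the Heckman correction, and the instrument, the confounder, the linear continuous outcome, and all effect sizes are hypothetical assumptions chosen only to make the bias visible.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def ols_slope(x, y):
    """Naive regression slope of y on x (biased under confounding)."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


# Simulated data; every quantity below is an assumption for illustration.
rng = random.Random(0)
TRUE_EFFECT = -0.5  # assumed causal effect of the exposure on the outcome
z, x, y = [], [], []
for _ in range(20000):
    instrument = rng.gauss(0, 1)   # shifts exposure, unrelated to the confounder
    confounder = rng.gauss(0, 1)   # unobserved; affects exposure and outcome
    exposure = instrument + confounder + rng.gauss(0, 1)
    outcome = TRUE_EFFECT * exposure + 1.5 * confounder + rng.gauss(0, 1)
    z.append(instrument)
    x.append(exposure)
    y.append(outcome)

ols_estimate = ols_slope(x, y)
iv_estimate = iv_slope(z, x, y)
print("naive OLS estimate:", round(ols_estimate, 3))
print("IV estimate:", round(iv_estimate, 3))
```

Because the simulated confounder raises both the exposure and the outcome, the naive slope is pulled toward zero, while the IV estimate should land close to the assumed effect. A real analysis would also need covariates, a defensible instrument, and appropriate standard errors.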

Other ongoing projects

The Use of Crisis Line Services during the COVID-19 Pandemic

My role: lead author
The Veterans Crisis Line (VCL) is an important public health tool to mitigate suicide risk among the veteran population. The COVID-19 pandemic has been associated with increased mental health challenges; however, we have limited knowledge about how the pandemic affected contact with the VCL. The objective of this study is to explore whether the characteristics of veterans who contacted the VCL changed, whether the volume of unique and new contacts changed, and whether counties with a higher COVID-19 burden experienced an increase in contact volume with the VCL.

Excess Mortality at VHA Facilities During the COVID-19 Pandemic

My role: lead author
The objective of this study is to identify excess mortality among VHA-enrolled veterans during the pandemic for each medical center, and to correlate these estimates with facility characteristics and rates of COVID-19 cases and deaths within each facility’s catchment area. We use VHA administrative data on 11.4 million veterans from January 2016 through February 2020 to estimate a mortality risk prediction model using ten-fold cross-validation and Poisson quasi-likelihood regression.
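
As a minimal sketch of the excess mortality calculation, the Python snippet below assumes that a fitted risk model has already produced a predicted probability of death for each veteran, and then compares observed deaths with the sum of predicted risks (expected deaths) within each facility. The record layout, the facility identifiers, and the toy numbers are hypothetical; the cross-validated Poisson quasi-likelihood model itself is not reproduced here.

```python
from collections import defaultdict


def excess_mortality_by_facility(records):
    """Summarize observed vs. expected deaths per facility.

    records: iterable of (facility_id, predicted_risk, died), where
    predicted_risk is a model-based probability of death and died is 0 or 1.
    """
    expected = defaultdict(float)
    observed = defaultdict(int)
    for facility, risk, died in records:
        expected[facility] += risk   # expected deaths = sum of predicted risks
        observed[facility] += died   # observed deaths = count of actual deaths

    return {
        facility: {
            "observed": observed[facility],
            "expected": expected[facility],
            "excess": observed[facility] - expected[facility],
            "oe_ratio": observed[facility] / expected[facility],
        }
        for facility in expected
    }


# Hypothetical toy records: (facility_id, predicted_risk, died).
records = [
    ("F1", 0.25, 1),
    ("F1", 0.25, 0),
    ("F1", 0.50, 1),
    ("F2", 0.50, 0),
    ("F2", 0.50, 1),
]

summary = excess_mortality_by_facility(records)
for facility, stats in sorted(summary.items()):
    print(facility, stats)
```

An observed-to-expected ratio above 1 indicates more deaths than the pre-pandemic model would predict for that facility's population.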

Delivery System Emergency Department Capacity and its Effect on Non-system Service Utilization

My role: lead author
Emergency department (ED) use is often seen as a source of excess healthcare spending, prompting managers to limit ED capacity in their health systems. However, if limited ED capacity in a delivery system leads patients to seek emergency care elsewhere, then it may compromise healthcare quality and efficient management within the system. This study uses instrumental variables to answer the question: “What is the effect of VHA in-house ED capacity on VHA community care ED claims?”

Medical Training Programs’ Effect on Productivity and Turnover at the VHA

My role: lead author
This project studies the effect of the size of physician training programs on productivity and turnover. To do so, we use training allocations and administrative data from the VHA from 2011 to 2021 (N=136,282) and instrumental variables methods to account for the endogeneity of facility-specialty-year training program allocations. Specifically, we construct an exogenous predicted training allocation treatment variable as a function of the total national training program allocation in a given year.

A few examples from past projects

Randomized Evaluation of the Stratification Tool for Opioid Risk Mitigation (STORM)

My role: coauthor
The VHA developed a dashboard, the Stratification Tool for Opioid Risk Mitigation (STORM), to guide clinical practice interventions and released a policy mandating that patients identified by the STORM dashboard as being at high risk of an adverse event be reviewed by an interdisciplinary team of clinicians. The aim of this randomized evaluation was to assess whether patients in the oversight arm had a lower risk of opioid-related serious adverse events or death than patients in the non-oversight arm.

Associations Between Neighborhood-Level Factors and Opioid-Related Mortality

My role: coauthor
The primary aims of this study were to (1) identify associations between opioid-related mortality and neighborhood-level risk factors and (2) graphically illustrate the distribution of opioid-related and non-natural premature deaths at the neighborhood level. Using state death certificate data linked to area-level data and multi-level models, we compared the influence of individual-level variables with that of neighborhood-level factors and identified associations that may be useful to public health practitioners, policymakers, and clinicians seeking to address opioid-related mortality.