At the intersection of epidemiology and machine learning stands Zachary Butzin-Dozier, a postdoctoral scholar with the Biostatistics Division at the UC Berkeley School of Public Health. His research focuses on applying advanced machine learning techniques to answer critical public health questions, particularly in the domain of COVID-19 and long COVID.
The crux of Zachary's research lies in targeted learning, a methodology that combines the predictive power of machine learning with the scientific rigor of causal inference.
According to Zachary, causal inference is a way to figure out if one thing (like a treatment or action) actually causes another thing (like a health outcome). It's different from just noticing that two things happen together. For instance, if we see that people who take a new medicine also get better more often, causal inference helps us determine if the medicine is the reason they get better, not just a coincidence.
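The gap between "happening together" and "causing" can be made concrete with a small simulation. The sketch below is illustrative only: the data are synthetic, every probability is invented, and the adjustment shown is plain stratification on a single known confounder rather than the targeted learning methods Zachary's team uses. It assumes a hypothetical medicine that truly raises the chance of recovery by 10 percentage points, and patients who are more likely to take it if they were healthier to begin with.

```python
import random


def simulate(n=100_000, seed=0):
    """Generate synthetic patients. All probabilities are invented for illustration."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        healthy = rng.random() < 0.5            # confounder: baseline health
        p_treat = 0.8 if healthy else 0.2       # healthier patients take the medicine more often
        treated = rng.random() < p_treat
        # True causal effect of the medicine: +0.10 on the probability of recovery.
        p_recover = (0.6 if healthy else 0.2) + (0.1 if treated else 0.0)
        recovered = rng.random() < p_recover
        rows.append((healthy, treated, recovered))
    return rows


def recovery_rate(rows, treated, healthy=None):
    """Share of patients who recovered, optionally within one health stratum."""
    selected = [r for h, t, r in rows
                if t == treated and (healthy is None or h == healthy)]
    return sum(selected) / len(selected)


def naive_difference(rows):
    """Compare treated and untreated patients directly, ignoring baseline health."""
    return recovery_rate(rows, True) - recovery_rate(rows, False)


def adjusted_difference(rows):
    """Compare within each health stratum, then average weighted by stratum size."""
    total = len(rows)
    diff = 0.0
    for healthy in (True, False):
        share = sum(1 for h, _, _ in rows if h == healthy) / total
        diff += share * (recovery_rate(rows, True, healthy)
                         - recovery_rate(rows, False, healthy))
    return diff


rows = simulate()
print(f"naive difference:    {naive_difference(rows):.2f}")
print(f"adjusted difference: {adjusted_difference(rows):.2f}")
```

Under these invented probabilities, the naive comparison lands near 0.34, more than three times the true effect, because treated patients were healthier from the start. Comparing like with like brings the estimate back to roughly the true 0.10.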
His primary focus is the National COVID Cohort Collaborative (N3C), a vast data source that consolidates electronic health records from disparate healthcare systems across the United States. Despite the challenges of accessing and integrating such fragmented data, Zachary sees immense potential in the dataset, which comprises information on over 22 million patients, including more than 8 million COVID-positive cases.
"The sheer volume of data allows us to use advanced methods to retain generalizability and make strong inferences," Zachary explains. However, he also notes the inherent biases in the dataset, which predominantly includes individuals who are sicker, older, and more privileged.
One of his projects involves investigating whether selective serotonin reuptake inhibitors (SSRIs) can mitigate the risk of long COVID. SSRIs are a type of medication commonly used to treat depression and anxiety. Drawing on recent hypotheses linking low serotonin levels to a wide array of long COVID symptoms, Zachary’s team analyzed N3C data to assess the protective effect of SSRIs. They found that SSRI use correlated with a roughly 10% lower risk of long COVID among patients with depression, a finding that contributes to the ongoing exploration of long COVID mechanisms and treatments.
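For readers unfamiliar with how such a figure is expressed, here is a minimal sketch of the arithmetic. It assumes the "roughly 10% lower risk" is a relative reduction, which the description above does not specify. The counts are hypothetical and chosen only to make the numbers round. They are not N3C results, and the function name is an illustrative choice.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Ratio of the risk in the exposed group to the risk in the unexposed group."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


# Hypothetical counts (not from the study): 90 long COVID cases among
# 1,000 SSRI users versus 100 cases among 1,000 non-users.
rr = relative_risk(90, 1000, 100, 1000)
reduction = 1 - rr

print(f"relative risk:      {rr:.2f}")
print(f"relative reduction: {reduction:.0%}")
```

A relative risk of about 0.90 corresponds to a 10% relative reduction. A raw ratio like this describes an association only. Attributing it to the medication requires accounting for differences between the two groups.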
Zachary is driven by both the significance of his research and the collaborative environment at Berkeley. He expresses deep gratitude for his mentors and colleagues.
“I think having an incredible research team supporting me is really motivating. I’m very excited about this generally and want to find good answers to these questions,” he says.
Zachary plans to continue his research and to enter the faculty job market. His ongoing projects include exploring the timing of COVID-19 vaccinations and their impact on long COVID risk, as well as examining racial biases in long COVID diagnoses.