The evidence is mixed but suggests
that these overlooked variables have a profound impact on each patient’s
This article was written by Tim
Suther, Nicole Hobbs, Jeff McGinn, and Matt Turner of Change Healthcare; John
Halamka, MD, MS, president of Mayo Clinic Platform; and Paul Cerrato, senior
research analyst and communications specialist, Mayo Clinic Platform.
By one estimate, social determinants
of health (SDoH) influence up to 80% of health outcomes. Although reports like this suggest that these
social factors have a major impact, thought leaders continue to debate whether
they can also enhance the accuracy of predictive models. Resolving that debate
is far from simple because the answer depends on the type, source and quality
of the data, and the design of the model under consideration.
In general, we derive SDoH from subjective
and objective sources. Subjective data includes self-reported or clinician-collected
data such as patient-reported outcomes, ICD-10-CM Z codes that report
factors influencing health status and contact with health service
providers, and other unstructured EHR data. Objective data includes individual-level
and community-level data from government, public, and private sources, including
consumer behavior data; it’s usually more structured and often derived from national-level
Unfortunately, research findings on the value of SDoH in predictive models vary widely. Some
studies report no appreciable differences when SDoH are added to models,
while others report significant enhancements to predictive power.
Unsurprisingly, these varying results depend in part on how heavily a study relies
on traditional clinical models and, most importantly, on the types and sources
of SDoH data it employs.
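To make the comparison these studies perform more concrete, here is a minimal sketch in Python. It is not drawn from any of the studies discussed; it uses synthetic data in which a hypothetical patient-level SDoH score is built to influence the outcome, so it illustrates the evaluation design (same patients, same outcome, model with and without the SDoH feature, scored on held-out data) rather than providing evidence that SDoH helps. All variable names and coefficients are assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=200):
    """Batch gradient descent for logistic regression; bias is the last weight."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def predict(w, X):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) for xi in X]

def auc(scores, labels):
    """Chance that a random positive case outranks a random negative case."""
    ranked = sorted(zip(scores, labels))
    rank_sum = sum(i + 1 for i, (_, lab) in enumerate(ranked) if lab == 1)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Synthetic cohort: both scores are hypothetical and standardized.
random.seed(0)
clinical, sdoh, outcome = [], [], []
for _ in range(2000):
    c = random.gauss(0, 1)  # e.g. a comorbidity score (assumed)
    s = random.gauss(0, 1)  # e.g. an economic-instability score (assumed)
    clinical.append(c)
    sdoh.append(s)
    # By construction, the outcome depends on both scores.
    outcome.append(1 if random.random() < sigmoid(-1.0 + c + s) else 0)

X_clinical = [[c] for c in clinical]
X_full = [[c, s] for c, s in zip(clinical, sdoh)]
split = 1500  # first 1,500 patients train the model, the rest test it

w_clinical = fit_logistic(X_clinical[:split], outcome[:split])
w_full = fit_logistic(X_full[:split], outcome[:split])

auc_clinical = auc(predict(w_clinical, X_clinical[split:]), outcome[split:])
auc_full = auc(predict(w_full, X_full[split:]), outcome[split:])
print(f"clinical only: {auc_clinical:.3f}  clinical + SDoH: {auc_full:.3f}")
```

With real data, the gap between the two scores could shrink to nothing if the SDoH variable is noisy, missing for many patients, or already captured by the clinical features, which is one way to read the conflicting results in the literature.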
For example, a group from Johns
Hopkins Bloomberg School of Public Health demonstrated that predictive models using SDoH
can fail, partly because of model design and partly because EHR-level data
is unstructured and collected inconsistently. They also demonstrated that depending on data
from EHR-derived population health databases for SDoH can be problematic
because the data tends to be used as a proxy for individual-level social factors.
The problem is that these
proxies are often based on assumptions, not evidence. Other research supports these findings and highlights the
challenges of using SDoH data from sources that traditionally struggle with the
comprehensive collection and standardization of these data types.
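The proxy problem described above can be illustrated with a small, purely synthetic sketch. Assume each patient is assigned the average income of their area in place of their own income; the area count, income figures, and spread below are invented for illustration and do not come from any study cited here. The sketch measures how much of the person-to-person variation the area average actually captures.

```python
import random
import statistics

random.seed(1)

# Hypothetical population: 50 areas, 40 patients each.
patients = []  # (area_id, individual_income)
for area_id in range(50):
    area_mean = random.gauss(60_000, 10_000)  # areas differ modestly
    for _ in range(40):
        # Individuals differ a lot within the same area (assumed).
        patients.append((area_id, random.gauss(area_mean, 20_000)))

# Area-level proxy: every patient gets their area's average income.
incomes_by_area = {}
for area_id, income in patients:
    incomes_by_area.setdefault(area_id, []).append(income)
area_average = {a: statistics.fmean(v) for a, v in incomes_by_area.items()}

individual = [income for _, income in patients]
proxy = [area_average[area_id] for area_id, _ in patients]

# Share of individual-level variance that the proxy reproduces.
share = statistics.pvariance(proxy) / statistics.pvariance(individual)
print(f"variance captured by the area-level proxy: {share:.0%}")
```

Under these assumptions the area average reproduces only a minority of the individual-level variation, so a model fed the proxy sees two neighbors with very different circumstances as identical.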
On a more positive note, several studies and
healthcare articles have reported success by relying on objectively
collected and/or highly structured and consistent data. For example, one study that used EHR-derived SDoH data sources found that the addition of
structured data on median income, unemployment rate, and education from trustworthy non-EHR sources
enhanced its model’s health predictions for
some of the most vulnerable subgroups of patients. In another study, a collaboration among Stanford, Harvard, and Imperial
College London found that adding structured SDoH data from the US Census, along
with using machine learning techniques, improved risk prediction model accuracy
for hospitalization, death, and costs. They also showed that their models based
on SDoH alone, as well as those based on clinical comorbidities alone, could predict
health outcomes and costs. Similarly, researchers at The Ohio State University
College of Medicine
added community-level and consumer behavior data not available in standard EHR
data and found it enhanced the study of and impact on obesity prevention. Juhn et al. at Mayo Clinic tapped telephone survey data and appended housing and
neighborhood characteristic data from local government sources to create a
socioeconomic status index (HOUSES). They first showed that HOUSES correlated
well with outcome measures and later showed that HOUSES could even serve as a predictive tool for graft
failure in patients.
Patient Level SDoH + Clinical Data =
Incorporating social factors into the healthcare equation can
fill gaps at the point of care and generate better healthcare predictions,
but only when these determinants are patient-level and linked to robust clinical
data. Change Healthcare, for example, has curated such an integrated
national-level dataset, linking billions of historical de-identified distinct
medical claims with patient-level social, physical and behavioral determinants
of health. One of this dataset’s most important uses is to understand the
relative weight of specific patient SDoH factors, in comparison to clinical
factors alone, for various therapeutic conditions, including COVID-19. For
example, across Change Healthcare’s research, economic stability is repeatedly
ranked as the highest or among the highest predictors of the healthcare
experience. Despite this realization, most end users, including providers and
payers, lack such visibility (or rely on geographic averages that are unhelpful
in making accurate predictive models).
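As a rough sketch of what "patient level and linked" means in practice, the Python below joins de-identified claims to SDoH attributes on a shared patient token. This is not Change Healthcare's method or schema; the field names, token values, and categories are hypothetical and chosen only to show the shape of the linkage and why unmatched patients need to be tracked.

```python
from collections import defaultdict

# Hypothetical de-identified claims; "patient_token" is an assumed join key.
claims = [
    {"patient_token": "a1", "dx": "E11.9", "paid": 120.0},
    {"patient_token": "a1", "dx": "I10", "paid": 80.0},
    {"patient_token": "b2", "dx": "J45.909", "paid": 200.0},
    {"patient_token": "c3", "dx": "E11.9", "paid": 50.0},
]

# Hypothetical patient-level SDoH attributes keyed by the same token.
sdoh = {
    "a1": {"economic_stability": "low", "housing": "renter"},
    "b2": {"economic_stability": "high", "housing": "owner"},
}

def link_claims_to_sdoh(claims, sdoh):
    """Aggregate claims per patient, then attach SDoH fields where available."""
    per_patient = defaultdict(lambda: {"n_claims": 0, "total_paid": 0.0})
    for claim in claims:
        agg = per_patient[claim["patient_token"]]
        agg["n_claims"] += 1
        agg["total_paid"] += claim["paid"]

    linked, unmatched = [], []
    for token, agg in per_patient.items():
        if token in sdoh:
            linked.append({"patient_token": token, **agg, **sdoh[token]})
        else:
            unmatched.append(token)  # no patient-level SDoH for this person
    return linked, unmatched

linked, unmatched = link_claims_to_sdoh(claims, sdoh)
for row in linked:
    print(row)
print("patients without SDoH data:", unmatched)
```

Each linked row carries both clinical utilization and that same patient's social attributes, which is what allows a model to weigh one against the other. Patients in the unmatched list are the ones for whom a modeler would otherwise be tempted to fall back on geographic averages.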
Incorporating SDoH data into
predictive models holds much promise. Given the relative newness of SDoH data
in predictive analytics, along with a lack of data standardization and scale, it’s
not surprising to find varying degrees of success in using it to improve predictive
health models. But as researchers learn more about the best types and sources of
SDoH data to use, along with developing better-suited models for these types of
data, we’re likely to see significant advances in healthcare predictive models.
When the right data are combined with the right models, SDoH become a powerful asset in
predictive models of health, outcomes, and potential health disparities.
If you’re still with us . . .
Please consider supporting Dr. Steve Parodi, Reed Abelson, and me by “voting up” our panel at the upcoming South by Southwest conference in March of 2022. Our proposed panel, “Extending the Stethoscope Into the Home,” will dive into a discussion about acute health
care for patients in their home and the infrastructures needed to support it. If you are inclined to vote, please do so here.