Recap: CAFE University’s Pollution and Health—From Data to Evidence

March 18, 2026

Thank you to all who joined CAFE University’s Pollution and Health—From Data to Evidence webinar with Dr. Michael Cork, a postdoctoral research associate at the Harvard T.H. Chan School of Public Health! If you missed it, or want to review key points, here's a recap:

Why Exposure–Response Curves are Useful

Fine particulate matter (PM2.5) drives around 90% of the health burden from air pollution globally, and approximately 40% of Americans live in areas with unhealthy levels. In this presentation, “exposure” refers to something in the environment that people experience (e.g., air pollution); “outcome” is the health effect we measure in response to the exposure (e.g., hospital admissions); and an exposure–response curve (ERC) is a function that describes how the expected outcome changes as exposure varies, capturing the shape, magnitude, and direction of the relationship across the full exposure range.

ERCs can directly inform major policy benchmarks like the WHO’s Global Air Quality Guidelines and the U.S. National Ambient Air Quality Standards (NAAQS). Because the shape of an ERC informs where a safety threshold is set, using the correct methodology to model PM2.5 concentrations can lead to more accurate thresholds that better protect exposed populations.

Comparing Methods: Regression vs. Causal Inference

Much of the existing literature relies on regression-based approaches, which estimate the exposure–outcome relationship while adjusting for confounders as additional model terms. Dr. Cork's research evaluates these against causal inference methods, which add an explicit design phase to make observational data resemble a randomized experiment before analysis. Across a simulation study of 72 scenarios varying ERC shape, confounding complexity, and sample size, key findings included:

In simple settings: Regression models performed well and were competitive with more complex approaches.
In realistic settings: With nonlinear relationships and complex confounding, causal inference methods (particularly entropy balancing and GPS matching) outperformed regression.
Sample size matters: Entropy balancing was most reliable at moderate sample sizes; generalized propensity score (GPS) matching performed best with larger datasets.
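
To make the "design phase" more concrete, here is a minimal Python sketch of entropy balancing for a continuous exposure. It is an illustration on simulated data, not the implementation used in Dr. Cork's study: the single confounder, the simulated coefficients, the function names (`standardize`, `entropy_balance`), and the plain gradient-descent solver are all assumptions made for this example. The idea is to find weights, as close to uniform as possible, under which the exposure is no longer correlated with the confounder.

```python
import math
import random


def standardize(values):
    """Center values to mean 0 and scale them to unit variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def entropy_balance(exposure, covariate, steps=1000, lr=0.2):
    """Return weights (summing to 1) plus the standardized inputs.

    The weights are chosen so that the weighted means of exposure and
    covariate stay at 0 and their weighted covariance is 0. This is done
    by minimizing the dual objective log(sum_i exp(lam . g_i)) with
    gradient descent (an assumed, deliberately simple solver).
    """
    a = standardize(exposure)
    x = standardize(covariate)
    # Balance targets per unit: exposure, covariate, and their product.
    g = [(ai, xi, ai * xi) for ai, xi in zip(a, x)]
    lam = [0.0, 0.0, 0.0]
    for _ in range(steps):
        scores = [sum(l * gk for l, gk in zip(lam, gi)) for gi in g]
        top = max(scores)  # subtract the max for numerical stability
        raw = [math.exp(s - top) for s in scores]
        total = sum(raw)
        w = [r / total for r in raw]
        # The gradient of the dual is the weighted mean of each target.
        grad = [sum(wi * gi[k] for wi, gi in zip(w, g)) for k in range(3)]
        lam = [l - lr * gr for l, gr in zip(lam, grad)]
    return w, a, x


# Simulated data (hypothetical): one confounder that also drives exposure.
random.seed(0)
n = 300
confounder = [random.gauss(0, 1) for _ in range(n)]
exposure = [0.6 * c + random.gauss(0, 1) for c in confounder]

weights, a_std, x_std = entropy_balance(exposure, confounder)

unweighted_cov = sum(a * x for a, x in zip(a_std, x_std)) / n
weighted_cov = sum(w * a * x for w, a, x in zip(weights, a_std, x_std))
print("exposure-confounder association before weighting:", round(unweighted_cov, 3))
print("exposure-confounder association after weighting:", round(weighted_cov, 6))
```

In a full analysis, weights like these would feed a weighted, flexible outcome model to estimate the ERC, and the balancing step would cover many covariates at once rather than one.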

Real-World Example: PM2.5 and Medicare Mortality

In Dr. Cork’s paper, “Methods for estimating the exposure–response curve to inform the new safety standards for fine particulate matter,” he uses a causal inference approach, including GPS matching and entropy-based weighting, combined with flexible modeling of nonlinear exposure–response relationships. The methodology was applied to data on ~68 million Medicare beneficiaries across 31,000+ ZIP codes (2000–2016).

Using this framework, Dr. Cork identifies a nonlinear relationship between long-term PM2.5 exposure and all-cause mortality, with the largest marginal increases in risk occurring at lower exposure levels. This risk gradient appeared well below the former NAAQS limit of 12 µg/m³, supporting the EPA's recent decision to tighten the standard to 9 µg/m³ and providing a case for continued reductions below the current standard. While this is just one example, it demonstrates how causal inference methods can be applied to develop a more accurate ERC, better informing policymakers and ultimately protecting exposed populations.
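
To see what "largest marginal increases at lower exposure levels" means in practice, the short Python sketch below uses a made-up concave curve. The functional form, the `beta` value, and the 5 µg/m³ reference level are assumptions chosen only to show the shape; they are not estimates from the Medicare analysis.

```python
def hazard_ratio(pm25, beta=0.1, reference=5.0):
    """Hypothetical concave ERC: hazard ratio relative to `reference` µg/m³.

    `beta` and `reference` are illustrative values, not study estimates.
    """
    return (pm25 / reference) ** beta


def marginal_increase(pm25, step=1.0):
    """Change in hazard ratio when exposure rises by `step` µg/m³."""
    return hazard_ratio(pm25 + step) - hazard_ratio(pm25)


# Compare a 1 µg/m³ increase at a low level and at a higher level.
increase_low = marginal_increase(6.0)
increase_high = marginal_increase(12.0)
print("increase in hazard ratio from 6 to 7 µg/m³:", round(increase_low, 4))
print("increase in hazard ratio from 12 to 13 µg/m³:", round(increase_high, 4))
```

Under a curve of this shape, the same 1 µg/m³ change moves risk more at 6 µg/m³ than at 12 µg/m³, which is why the shape of the ERC matters when deciding where to set a standard.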
