Difference-in-Differences Analysis in Policy Evaluation

Parallel Trends: How to Trust Difference-in-Differences Analysis in Policy Evaluation

Avery Sinclair in Politics & Policy December 2025 • 4 min read.

"Uncover the pitfalls of relying on parallel trends assumptions with covariates in policy analysis and explore robust alternative strategies to ensure accurate results."

In the world of policy evaluation, the difference-in-differences (DiD) method stands out as a popular tool. It's used to measure the impact of a specific intervention or treatment by comparing changes in outcomes between a treated group and a control group over time. But what happens when we need to account for other factors that might influence these outcomes? This is where covariates come in. Covariates are additional variables that researchers include to ensure a more accurate and reliable analysis.

Imagine you're evaluating a new education policy aimed at improving student test scores. Simply comparing the change in scores before and after the policy in schools that adopted it versus those that didn't might not tell the whole story. Factors like the socioeconomic status of students, prior educational resources, or even regional differences could also play a significant role. To address this, researchers often include covariates in their DiD analysis to isolate the true effect of the policy.
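To make the basic comparison concrete, here is a minimal sketch of the two-group, two-period DiD calculation for the test-score example. The scores, the group labels, and the `did_estimate` helper are all made up for illustration; nothing here comes from real data.

```python
from statistics import mean

# Hypothetical school-level test scores; every number is illustrative.
scores = {
    ("treated", "before"): [61.0, 64.0, 58.0],
    ("treated", "after"): [70.0, 72.0, 65.0],
    ("control", "before"): [66.0, 69.0, 63.0],
    ("control", "after"): [69.0, 73.0, 65.0],
}

def did_estimate(data):
    """Change in the treated group's mean minus change in the control group's mean."""
    treated_change = mean(data[("treated", "after")]) - mean(data[("treated", "before")])
    control_change = mean(data[("control", "after")]) - mean(data[("control", "before")])
    return treated_change - control_change

estimate = did_estimate(scores)
print(f"DiD estimate: {estimate:.1f} points")
```

In this toy data the treated schools improve by 8 points and the control schools by 3, so the DiD estimate is 5 points. That number only has a causal reading if the two groups would have moved in parallel without the policy.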

However, including covariates isn't always straightforward. The underlying assumption in DiD analysis is that, without the intervention, the treated and control groups would have followed parallel trends. But what if these trends are only parallel after accounting for the covariates? In a recent research paper, economists Carolina Caetano and Brantly Callaway shed light on the challenges and potential pitfalls of DiD analysis when parallel trends are conditional on covariates, offering alternative strategies for more trustworthy policy evaluations.
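The difference between the two versions of the assumption can be written down compactly. The notation below is an assumption made for this sketch and may not match the paper's: Y_t(0) is the outcome a unit would have in period t without the intervention, D = 1 marks treated units, and X collects the covariates.

```latex
% Unconditional parallel trends: untreated outcomes evolve the same way,
% on average, in the treated and control groups.
\begin{align*}
\mathbb{E}[\,Y_{t}(0) - Y_{t-1}(0) \mid D = 1\,]
  &= \mathbb{E}[\,Y_{t}(0) - Y_{t-1}(0) \mid D = 0\,]
\end{align*}

% Conditional parallel trends: the same equality is only required to hold
% among units that share the same covariate values X.
\begin{align*}
\mathbb{E}[\,Y_{t}(0) - Y_{t-1}(0) \mid X, D = 1\,]
  &= \mathbb{E}[\,Y_{t}(0) - Y_{t-1}(0) \mid X, D = 0\,]
\end{align*}
```

The conditional version is weaker in one sense: the groups may trend differently overall, as long as the difference is explained by a different mix of covariates.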

The Hidden Weaknesses of TWFE Regressions


One of the most common ways to implement DiD with covariates is through two-way fixed effects (TWFE) regressions. A TWFE regression includes unit fixed effects, which absorb time-invariant differences between units, and time fixed effects, which absorb period-specific shocks that affect all units equally. It's widely used because it's relatively easy to implement and interpret.
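A typical TWFE specification with covariates looks roughly like the following. The symbols are assumptions chosen for this sketch: i indexes units, t indexes periods, D_it is the treatment indicator, and X_it holds the covariates.

```latex
% Y_{it}: outcome for unit i in period t
% \theta_t: time fixed effect, \eta_i: unit fixed effect
% \alpha: coefficient usually read as the treatment effect
% X_{it}'\beta: covariates entering linearly
\begin{align*}
Y_{it} = \theta_t + \eta_i + \alpha D_{it} + X_{it}'\beta + e_{it}
\end{align*}
```

The covariates enter this equation linearly and additively, which is the starting point for the weaknesses discussed next.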

Caetano and Callaway's work, however, reveals several weaknesses of TWFE regressions when covariates are involved. Even in simple cases with just two time periods, TWFE can produce misleading results. These weaknesses arise from several sources:

Linearity Assumptions: TWFE regressions assume a linear relationship between the covariates and the outcome variable. If this assumption is violated, the results can be biased.
Time-Varying Covariates: When covariates change over time, TWFE regressions may not fully capture their impact. The transformation used to eliminate fixed effects can also eliminate important information about the level of the covariates.
Time-Invariant Covariates: TWFE regressions typically absorb time-invariant covariates into the fixed effects, meaning their influence isn't directly estimated. This can be problematic if these time-invariant factors are important confounders.

The researchers highlight a collection of additional issues they refer to as 'hidden linearity bias.' This bias arises because the transformations used to eliminate the unit fixed effect also transform the covariates, potentially distorting their relationship with the outcome variable. For example, taking first differences (calculating the change over time) drops time-invariant covariates altogether and leaves the estimating equation with only the change in covariates over time, not their level.
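A small sketch can show how a time-invariant covariate disappears under first differencing. Everything below is a made-up illustration, not the paper's data or code: the covariate `z`, the assumed trend of 1 + 2*z, and the true effect of 3 are all invented for the example.

```python
from statistics import mean

TRUE_EFFECT = 3.0

# Made-up units: z is a time-invariant covariate (say, an urban indicator),
# d marks treated units. Treated units are more often urban.
units = [
    {"z": 1, "d": 1}, {"z": 1, "d": 1}, {"z": 1, "d": 1}, {"z": 0, "d": 1},
    {"z": 1, "d": 0}, {"z": 0, "d": 0}, {"z": 0, "d": 0}, {"z": 0, "d": 0},
]

for u in units:
    # Assumed data-generating process: untreated outcomes grow by 1 + 2*z,
    # so trends are parallel only among units that share the same z.
    u["y_pre"] = 10.0 + 5.0 * u["z"]
    u["y_post"] = u["y_pre"] + 1.0 + 2.0 * u["z"] + TRUE_EFFECT * u["d"]
    # First differencing removes the unit fixed effect...
    u["dy"] = u["y_post"] - u["y_pre"]
    # ...and the covariate too: its change is zero for every unit.
    u["dz"] = u["z"] - u["z"]

treated = [u for u in units if u["d"] == 1]
controls = [u for u in units if u["d"] == 0]

# With dz identically zero, the differenced regression has nothing left to
# adjust for and collapses to a raw comparison of mean changes.
naive = mean(u["dy"] for u in treated) - mean(u["dy"] for u in controls)

# Conditioning on the level of z instead: compare each treated unit with
# the average change among controls that share its z.
control_trend = {
    z: mean(u["dy"] for u in controls if u["z"] == z) for z in (0, 1)
}
adjusted = mean(u["dy"] - control_trend[u["z"]] for u in treated)

print(f"naive={naive:.2f} adjusted={adjusted:.2f} truth={TRUE_EFFECT:.2f}")
```

In this toy setup the differenced comparison returns 4.0 while the version that conditions on the level of `z` recovers the true effect of 3.0. The gap comes entirely from the covariate's level mattering for the trend even though its change is zero.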

Alternative Estimation Strategies

Given these challenges, Caetano and Callaway propose alternative estimation strategies that can circumvent the limitations of TWFE regressions. These strategies build on recent developments in the DiD literature, particularly augmented inverse propensity score weighting (AIPW) estimators. AIPW estimators combine an outcome regression model with a propensity score model, offering a more robust approach to causal inference. Because they work with the covariates directly rather than with a transformed version of them, these estimators are designed to avoid the hidden linearity bias.
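The sketch below shows one common doubly robust form of this idea for two-period data. It is an illustration under stated assumptions, not necessarily the exact estimator in the paper: the data are invented, the single covariate `z` is binary, and both the propensity score and the outcome model are estimated with simple cell averages. The helper names `propensity`, `control_trend`, and `aipw_att` are hypothetical.

```python
from statistics import mean

# Made-up two-period data: (z, d, dy) = covariate level, treatment flag,
# change in outcome between the two periods. All values are illustrative.
data = [
    (1, 1, 6.5), (1, 1, 5.5), (1, 1, 6.0), (0, 1, 4.0),
    (1, 0, 2.5), (1, 0, 3.5),
    (0, 0, 0.5), (0, 0, 1.0), (0, 0, 1.5), (0, 0, 1.0),
]

def propensity(z):
    """Propensity score model: share of treated units among those with value z."""
    return mean(d for zi, d, _ in data if zi == z)

def control_trend(z):
    """Outcome model: average change among untreated units with value z."""
    return mean(dy for zi, d, dy in data if zi == z and d == 0)

def aipw_att():
    # Treated units: observed change minus the predicted untreated change.
    treated_part = mean(dy - control_trend(z) for z, d, dy in data if d == 1)

    # Control units: outcome-model residuals, reweighted by the odds of
    # treatment so the controls resemble the treated group's covariate mix.
    weights, residuals = [], []
    for z, d, dy in data:
        if d == 0:
            p = propensity(z)
            weights.append(p / (1.0 - p))
            residuals.append(dy - control_trend(z))
    control_part = sum(w * r for w, r in zip(weights, residuals)) / sum(weights)

    return treated_part - control_part

att = aipw_att()
print(f"AIPW estimate of the effect on the treated: {att:.2f}")
```

In this tiny example the outcome model fits each covariate cell exactly, so the reweighted control residuals average to zero and the estimate is 3.0. The correction term earns its keep in realistic settings, where one of the two models may be misspecified and the other can compensate.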

About this Article

This article was crafted using a human-AI hybrid and collaborative approach. AI assisted our team with initial drafting, research insights, identifying key questions, and image generation. Our human editors guided topic selection, defined the angle, structured the content, ensured factual accuracy and relevance, refined the tone, and conducted thorough editing to deliver helpful, high-quality information. See our About page for more information.

This article is based on research published under:

DOI-LINK: https://doi.org/10.48550/arXiv.2406.15288

Title: Difference-In-Differences When Parallel Trends Holds Conditional On Covariates

Subject: econ.EM

Authors: Carolina Caetano, Brantly Callaway

Published: 21-06-2024

Everything You Need To Know

What is the main objective of the Difference-in-Differences (DiD) method in policy evaluation?

The primary goal of the Difference-in-Differences (DiD) method is to measure the impact of a specific intervention or treatment. It achieves this by comparing the changes in outcomes between a treated group and a control group over a specific period. Provided the two groups would have followed parallel trends without the intervention, this comparison allows researchers to isolate the effect of the intervention from other factors that might influence the outcomes.

How do covariates influence the accuracy of Difference-in-Differences (DiD) analysis, and why are they important?

Covariates are additional variables included in DiD analysis to enhance accuracy and reliability. They account for factors beyond the intervention that could influence the outcomes. These could include socioeconomic status, prior educational resources, or regional differences. By incorporating covariates, researchers aim to isolate the true effect of the intervention, ensuring a more accurate policy evaluation.

What are the primary weaknesses of using Two-Way Fixed Effects (TWFE) regressions with covariates in Difference-in-Differences (DiD) analysis, as highlighted by Caetano and Callaway?

Caetano and Callaway identified several weaknesses of TWFE regressions when covariates are involved. Firstly, TWFE assumes a linear relationship between covariates and the outcome, which, if violated, can introduce bias. Secondly, TWFE may not fully capture the impact of time-varying covariates. Finally, time-invariant covariates are often absorbed into fixed effects, meaning their influence is not directly estimated, potentially missing significant confounding factors. They also mention hidden linearity bias, which can distort the relationship between covariates and outcomes due to the transformations used to eliminate fixed effects.

Why does the 'hidden linearity bias' occur in Two-Way Fixed Effects (TWFE) regressions, and what impact does it have on the analysis?

The 'hidden linearity bias' arises because the transformations used in TWFE regressions to eliminate unit fixed effects also transform the covariates. This transformation can distort their relationship with the outcome variable. For instance, taking first differences (calculating the change over time) can eliminate time-invariant covariates altogether, causing the estimating equation to include only changes in the covariates rather than their levels. This can lead to misleading or biased results in policy evaluation.

What alternative estimation strategies do Caetano and Callaway suggest to overcome the limitations of Two-Way Fixed Effects (TWFE) regressions in Difference-in-Differences (DiD) analysis?

To circumvent the limitations of TWFE regressions, Caetano and Callaway propose alternative estimation strategies based on augmented inverse propensity score weighting (AIPW) estimators. These estimators combine outcome regression models with propensity score models, offering a more robust approach to causal inference. AIPW estimators are designed to mitigate the issues of hidden linearity bias that plague TWFE regressions, providing a more accurate and reliable method for policy evaluation.