Kernel Choice: The Unsung Hero of Accurate Boundary Inference
"Why selecting the right kernel can dramatically improve the reliability of local polynomial density estimation."
In fields such as economics, statistics, and the social sciences, estimating a probability density at boundary points is crucial for accurate analysis. For example, economists study the lower tail of wage distributions to understand poverty and inequality. In regression discontinuity (RD) analysis, testing for manipulation involves examining the density of the 'running' variable at a specific threshold. A popular tool for this task is the local polynomial density (LPD) estimator.
The LPD estimator, prized for features such as automatic boundary adaptation, has become a go-to method for manipulation testing in RD designs, and its widespread adoption is evident across empirical economics and related fields. Despite this popularity, however, the estimator sometimes exhibits poor numerical performance at boundaries, leading to unreliable results.
Recent research reveals that kernel selection has a profound impact on the performance of LPD estimators at boundary points. Commonly used kernels with compact support lead to larger asymptotic and finite-sample variances, which undermines manipulation testing and often results in low statistical power. Conversely, a kernel with unbounded support, such as the spline-type kernel (the Laplace density), can significantly improve accuracy and power.
Why Kernel Choice Matters: Beyond the Standard Assumptions

Kernel selection often takes a backseat in discussions of kernel-based estimators, on the general belief that it does not significantly affect performance. For LPD estimators at boundary points, however, this assumption fails. Compactly supported kernels, the common choice in practice, produce larger asymptotic and finite-sample variances, a non-negligible efficiency loss that makes the estimator less reliable.
- Asymptotic Efficiency: Contrary to common assumptions, the asymptotic efficiency of the LPD estimator is heavily influenced by the kernel choice. Compactly supported kernels have a larger asymptotic variance than spline-type kernels.
- Finite-Sample Variance: LPD estimators based on compactly supported kernels can fail to have finite variance in finite samples, a property inherited from local polynomial regression techniques.
- Impact on Manipulation Testing: These undesirable variance properties can substantially lower the power of manipulation tests in RD designs, especially when few observations lie near the cutoff.
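The variance gap between compact-support and unbounded-support kernels can be illustrated with a small Monte Carlo sketch. The implementation below is a simplified version of the LPD idea only (regress the empirical CDF on a local polynomial around the boundary point; the slope coefficient estimates the density), not the full estimator from the literature. The triangular and Laplace kernels, the bandwidth, and the sample sizes are illustrative assumptions.

```python
import numpy as np

def lpd_density(x, data, h, kernel, p=2):
    """Simplified LPD estimate of f(x): locally regress the empirical CDF on a
    polynomial in (X_i - x); the coefficient on the linear term estimates f(x)."""
    data = np.sort(data)
    n = len(data)
    F = np.arange(1, n + 1) / n              # empirical CDF at each order statistic
    w = kernel((data - x) / h)               # kernel weights around the boundary point
    keep = w > 0                             # drop zero-weight points (compact kernels)
    Z = np.vander(data[keep] - x, p + 1, increasing=True)  # [1, (X-x), (X-x)^2, ...]
    ZtW = Z.T * w[keep]
    beta = np.linalg.solve(ZtW @ Z, ZtW @ F[keep])         # weighted least squares
    return beta[1]                           # slope coefficient estimates f(x)

triangular = lambda u: np.clip(1 - np.abs(u), 0, None)  # compact support on [-1, 1]
laplace    = lambda u: 0.5 * np.exp(-np.abs(u))         # unbounded (spline-type) support

rng = np.random.default_rng(0)
reps, n, h = 200, 500, 0.3
est = {"triangular": [], "laplace": []}
for _ in range(reps):
    sample = rng.uniform(0, 1, n)            # true density at the boundary x=0 is 1
    est["triangular"].append(lpd_density(0.0, sample, h, triangular))
    est["laplace"].append(lpd_density(0.0, sample, h, laplace))

for name, vals in est.items():
    print(f"{name:10s} mean={np.mean(vals):.3f}  sd={np.std(vals):.3f}")
```

Comparing the two standard deviations across replications gives a rough finite-sample picture of the efficiency loss discussed above; in a full analysis the bandwidth would be chosen in a data-driven way rather than fixed.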
The Path Forward: Kernel Selection as a Key Tool
Kernel selection is a critical aspect of boundary inference with the LPD estimator. Choosing a kernel with unbounded support, such as the spline-type kernel, can markedly improve the accuracy of both estimation and inference. This has far-reaching implications for manipulation testing in RD designs, where accurate and powerful tests underpin reliable policy and research conclusions.