Posts

Showing posts from January 2026.

Support recovery for kernel density estimation - easy in theory, impossible in practice?

Here's a question: Suppose we are given a smoothed density, such as one obtained from kernel density estimation (KDE). Additionally, we have a set of data points that is a superset of the points from which the density was estimated. Is it possible to separate the original support, i.e. the points that went into the estimation, from those that didn't? Besides showing the possibilities and limitations of signal recovery in an idealized setting, the question is of practical interest in the context of privacy-preserving data publishing and statistical disclosure control.

A motivating example

Consider a data set of sensitive address coordinates, for example the coordinates of households with disease cases in an epidemiological context, or a collection of burglary cases in crime analysis (both contexts rely heavily on density estimation as an analytical tool). Let $x_i \in \mathbb{R}^2, i = 1, \ldots, n$ denote the sensitive coordinates in question. The KDE is on a regular grid of evalu...
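The setup described so far can be sketched in a few lines of Python. This is only an illustration under assumptions the excerpt does not state: an isotropic Gaussian kernel, a hand-picked bandwidth, synthetic uniform coordinates, and a hypothetical helper named `gaussian_kde_on_grid`. It builds the two ingredients of the question, a density evaluated on a regular grid and a candidate set that is a superset of the support. It does not attempt the recovery itself.

```python
import numpy as np


def gaussian_kde_on_grid(points, grid_x, grid_y, bandwidth):
    """Evaluate a 2D KDE on a regular grid.

    Assumption: isotropic Gaussian kernel with a single bandwidth;
    the original post may use a different kernel or bandwidth rule.
    """
    gx, gy = np.meshgrid(grid_x, grid_y, indexing="xy")
    grid = np.column_stack([gx.ravel(), gy.ravel()])  # shape (m, 2)
    # Squared distances between every grid node and every data point: (m, n)
    sq_dist = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-0.5 * sq_dist / bandwidth**2) / (2.0 * np.pi * bandwidth**2)
    # Average the kernel contributions over the n support points
    return kernel.mean(axis=1).reshape(gx.shape)


# Synthetic stand-in for the sensitive coordinates (purely illustrative).
rng = np.random.default_rng(0)
candidates = rng.uniform(0.0, 1.0, size=(200, 2))  # superset of points
is_support = np.zeros(len(candidates), dtype=bool)
is_support[:50] = True                             # points that enter the KDE
support = candidates[is_support]

# Regular evaluation grid with a margin around the unit square.
grid_x = np.linspace(-0.5, 1.5, 81)
grid_y = np.linspace(-0.5, 1.5, 81)
density = gaussian_kde_on_grid(support, grid_x, grid_y, bandwidth=0.1)

# Sanity check: the gridded density should integrate to roughly one.
cell_area = (grid_x[1] - grid_x[0]) * (grid_y[1] - grid_y[0])
total_mass = density.sum() * cell_area
```

The support recovery question then reads: given only `density`, the grid, and `candidates`, can `is_support` be reconstructed?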