Derivation of the expected nearest neighbor distance in a homogeneous Poisson process

The average distance between nearest neighbors among points placed at random in $m$-dimensional real space $\mathbb{R}^m$ is a common test statistic for assessing clustering or avoidance (for an introduction see Getis & Ord, 1992). It is used in fields ranging from geostatistics to population ecology and particle physics. This post treats the planar case $m = 2$.

Though the theoretical value used in inference is well known, its derivation is skipped in nearly every textbook on the subject. For the plane, the value in question is $\frac{1}{2} \sqrt{\frac{1}{\lambda}}$, with $\lambda$ the intensity (number of events per unit area) of the assumed underlying point process. The derivation is shown below.

Background

The $G$-function (not to be confused with others of the same name) is a distance-based measure in the analysis of point processes. More specifically, it is the distribution function of distances between nearest neighbors. (For details, see e.g. the introductory overview by FU Berlin.)

Empirically, it is defined as \[\hat{G}(r) = \frac{1}{n} \, \big|\{i \in \{1,\dots,n\} : d_i \leq r\}\big|\] with $d_i = \min_{j \neq i} \{d_{ij}\}$, where $d_{ij}$ is the distance between the $i$th and $j$th of $n$ different events. $d_i$ is the nearest neighbor distance of event $i$ and $\hat{G}(r)$ the share of events for which this distance is at most some radius $r$ (with $|\cdot|$ denoting set size). For $r \to 0$ we have $G(r) \to 0$, and for $r \to \infty$ we have $G(r) \to 1$. Intuitively, if $G(r)$ is close to 1 for relatively small $r$, most events find their nearest neighbors close by, hinting at strong spatial clustering.
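The empirical definition translates almost literally into code. The following is a minimal sketch in plain Python, assuming planar events given as coordinate tuples; the function names and the toy pattern `events` are made up for illustration, and the brute-force search is only meant for small $n$.

```python
import math

def nearest_neighbor_distances(points):
    """Return d_i = min over j != i of d_ij for every event (brute force, O(n^2))."""
    dists = []
    for i, p in enumerate(points):
        d_i = min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        dists.append(d_i)
    return dists

def empirical_G(points, r):
    """Share of events whose nearest neighbor lies within distance r."""
    d = nearest_neighbor_distances(points)
    return sum(1 for d_i in d if d_i <= r) / len(d)

# Hypothetical toy pattern: two tight pairs and one isolated event.
events = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.0, 1.1), (5.0, 5.0)]

for r in (0.05, 0.2, 6.0):
    print(r, empirical_G(events, r))
```

In this toy pattern four of the five events have a neighbor roughly 0.1 away, so $\hat{G}$ jumps to 0.8 at small radii and only reaches 1 once $r$ exceeds the distance from the isolated event to its nearest neighbor.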

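In the case of randomly distributed (homogeneous) events in the plane, the theoretical form of $G$ is derived from the Poisson process of constant intensity $\lambda$ (Stoyan 2006, p.18). The number of further events within a disc of radius $r$ around an event is Poisson distributed with mean $\lambda \pi r^2$, so the disc is empty with probability $\exp(-\lambda \pi r^2)$. The nearest neighbor lies within distance $r$ exactly when the disc is not empty: \[G(r) = 1 - \exp(-\lambda \pi r^2)\]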
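To get a feeling for the theoretical $G$, it can be evaluated and inverted directly. The sketch below assumes the planar formula above; the function names and the intensity `lam = 100.0` are hypothetical choices for illustration. The inverse follows from solving $p = 1 - \exp(-\lambda \pi r^2)$ for $r$.

```python
import math

def G_theoretical(r, lam):
    """P(nearest neighbor within r) = 1 - P(no event in a disc of radius r)."""
    return 1.0 - math.exp(-lam * math.pi * r * r)

def G_quantile(p, lam):
    """Inverse of G: the radius r with G(r) = p, for 0 <= p < 1."""
    return math.sqrt(-math.log(1.0 - p) / (lam * math.pi))

lam = 100.0  # hypothetical intensity: 100 events per unit area

# Median nearest neighbor distance under spatial randomness.
median_nn = G_quantile(0.5, lam)
print(median_nn, G_theoretical(median_nn, lam))
```

For this intensity the median nearest neighbor distance comes out at about 0.047, i.e. half of all events have a neighbor within that radius.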

Derivation 

We are interested in the expected nearest neighbor distance under the assumption of spatial randomness. The following derivations are built on the contributions by Clark & Evans (1954) as well as Hertz (1909):

From the theoretical distribution function $G(r)$ we get the density $g(r)$ via differentiation: \[g(r) = \frac{\mathrm{d}}{\mathrm{d}r} \big(1- \exp(-\lambda \pi r^2)\big) = 2 \lambda \pi r \, \exp(-\lambda \pi r^2)\] The expectation is then found as the first moment, using integration by parts with $u = r$ and $v' = r \exp(-\lambda \pi r^2)$: \[\begin{aligned} \textrm{E}(r) & = \int_0^\infty 2 \lambda \pi r^2 \, \exp(-\lambda \pi r^2) \; \mathrm{d}r \\ & = 2\lambda \pi \, \bigg(\Big[-\frac{r \, \exp(-\lambda \pi r^2)}{2\lambda \pi} \Big]_0^\infty - \int_0^\infty - \, \frac{\exp(-\lambda \pi r^2)}{2\lambda \pi} \; \mathrm{d}r \bigg) \\ & = \int_0^\infty \exp(-\lambda \pi r^2) \; \mathrm{d}r \end{aligned}\] The boundary term vanishes because $r \exp(-\lambda \pi r^2)$ is zero at $r = 0$ and tends to zero as $r \to \infty$. After substituting $z := \sqrt{\lambda \pi} \, r$, so that $\mathrm{d}r = \mathrm{d}z / \sqrt{\lambda \pi}$, we get a Gaussian integral, which has no elementary antiderivative but a known value. From $\int_{-\infty}^\infty \exp(-x^2) \; \mathrm{d}x = \sqrt{\pi}$ and the symmetry of the integrand we have $\int_0^\infty \exp(-x^2) \; \mathrm{d}x = \frac{\sqrt{\pi}}{2}$, and it follows: \[\textrm{E}(r) = \frac{1}{\sqrt{\lambda \pi}} \; \int_0^\infty \exp(-z^2) \; \mathrm{d}z = \frac{1}{\sqrt{\lambda \pi}} \cdot \frac{\sqrt{\pi}}{2} = \underline{\underline{\frac{1}{2} \sqrt{\lambda^{-1}}}}\]
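The result can be checked numerically in two independent ways. The sketch below is not part of the original derivation: it integrates the first moment with a simple midpoint rule and, separately, simulates uniformly distributed points on the unit square. Two assumptions are made for the simulation: the number of points `n` is fixed (so $\lambda = n$, which only approximates a Poisson process), and distances wrap around the edges (a torus) to avoid edge effects. All function names are hypothetical.

```python
import math
import random

def expected_nn_distance(lam):
    """Closed-form result of the derivation: 1 / (2 * sqrt(lam))."""
    return 0.5 / math.sqrt(lam)

def first_moment_by_quadrature(lam, steps=100_000):
    """Midpoint rule for the integral of r * g(r) over [0, 10 / sqrt(lam)]."""
    upper = 10.0 / math.sqrt(lam)  # the integrand is negligible beyond this point
    h = upper / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += 2.0 * lam * math.pi * r * r * math.exp(-lam * math.pi * r * r)
    return total * h

def simulated_mean_nn_distance(n, seed=0):
    """Mean nearest neighbor distance of n uniform points on the unit torus."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    total = 0.0
    for i, (xi, yi) in enumerate(pts):
        best = float("inf")
        for j, (xj, yj) in enumerate(pts):
            if i == j:
                continue
            dx = abs(xi - xj)
            dy = abs(yi - yj)
            dx = min(dx, 1.0 - dx)  # wrap around horizontally
            dy = min(dy, 1.0 - dy)  # wrap around vertically
            d2 = dx * dx + dy * dy
            if d2 < best:
                best = d2
        total += math.sqrt(best)
    return total / n

n = 1000  # n points on the unit square, so lam = n
print("closed form:", expected_nn_distance(n))
print("quadrature: ", first_moment_by_quadrature(n))
print("simulation: ", simulated_mean_nn_distance(n))
```

The quadrature should agree with the closed form to many digits, while the simulated mean fluctuates around it from seed to seed; a deviation of a few percent is expected for a single pattern of this size.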

Literature

P.J. Clark & F.C. Evans, "Distance to nearest neighbors as a measure of spatial relationships in populations," Ecology, vol.35, no.4, pp.445-453, 1954.

A. Getis & J.K. Ord, "The analysis of spatial association by use of distance statistics," Geographical Analysis, vol.24, no.3, pp.189-206, 1992.

P. Hertz, "Über den gegenseitigen durchschnittlichen Abstand von Punkten, die mit bekannter mittlerer Dichte im Raume angeordnet sind," Mathematische Annalen, vol.67, pp.387-398, 1909.

D. Stoyan, "Fundamentals of point process statistics," in Case Studies in Spatial Point Process Modeling, Lecture Notes in Statistics 185 (A. Baddeley, P. Gregori, J. Mateu, R. Stoica, and D. Stoyan, eds.), ch.1, pp.3-22, Springer, 2006.
