Generally, a statistic of a point process refers to the stochastic variable defined by evaluating a function at the points. In this setting the class of functions of a single variable are referred to as linear statistics. One example is the counting function for eigenvalues in a specified region. The large region size form of the variance of this statistic, when proportional to the surface area, is used to specify the state as being hyperuniform. In the case of GinUE eigenvalues, a central limit theorem for the distribution can be established, and this can be strengthened to a local limit theorem for the underlying probabilities. Moreover, large deviation formulas for the latter are obtained. The counting function linear statistic is discontinuous. Smooth linear statistics with global scaling exhibit distinct large N behaviours; in particular, the variance is now an O(1) quantity for GinUE eigenvalues. A decomposition of this variance as a contribution from the bulk, and a contribution from the boundary, is exhibited. The chapter concludes with a discussion of GinUE eigenvalues in spatial modelling.

1 Counting Function in General Domains

The eigenvalues of a non-Hermitian matrix are examples of point processes in the plane. Statistical quantities characterising the point process are functions of the eigenvalues \(\{z_j\}\) of the form \(\sum _{j=1}^N f(z_j)\)—referred to as linear statistics—for a given f. Such statistics are closely related to the correlation functions. Thus

$$\begin{aligned} \Big \langle \sum _{j=1}^N f(z_j) \Big \rangle = \int _{\mathbb C} f(z) \rho _{(1),N}(z)\, d^2 z, \end{aligned}$$
(3.1)

while

$$\begin{aligned} & \textrm{Cov} \, \Big ( \sum _{l=1}^N f(z_l), \sum _{l=1}^N g(z_l) \Big ) \nonumber \\ &\quad = \int _{\mathbb C} d^2z \int _{\mathbb C} d^2z' \, f(z) g(z') \Big ( \rho _{(2),N}^T(z,z') + \rho _{(1),N}(z) \delta (z - z') \Big ) \nonumber \\ & \quad = - {1\over 2} \int _{\mathbb C} d^2z \int _{\mathbb C} d^2z' \, (f(z) -f(z'))(g(z) - g(z')) \rho _{(2),N}^T(z,z'), \end{aligned}$$
(3.2)

see e.g. [250, Sect. 2.1]. Here \( \rho _{(2),N}^T(z,z') := \rho _{(2),N}(z,z') - \rho _{(1),N}(z) \rho _{(1),N}(z')\).

One of the most prominent examples of a linear statistic is the choice \(f(z) = \chi _{z \in \mathcal {D}}\), where \(\mathcal {D} \subseteq \mathbb C\). The linear statistic is then the counting function for the number of eigenvalues in \(\mathcal {D}\), \(\mathcal {N}(\mathcal {D})\) say. Let \(E_N(n; \mathcal {D})\) denote the probability that there are exactly n eigenvalues in \(\mathcal {D}\), so that \(E_N(n; \mathcal {D}) = \textrm{Pr} \, (\sum _{j=1}^N \chi _{z_j \in \mathcal {D}} = n)\). Denote the corresponding generating function (in the variable \(1 - \xi \)) by \(\tilde{E}_N(\xi ; \mathcal {D})\) so that

$$\begin{aligned} \tilde{E}_N (\xi ; \mathcal {D}) = \sum _{n=0}^N E_N(n; \mathcal {D}) (1 - \xi )^n. \end{aligned}$$
(3.3)

Note that with \(1 - \xi = e^{it}\) this corresponds to the characteristic function for the probability mass functions \(\{ E_N(n;\mathcal {D}) \}\). A straightforward calculation (see e.g. [237, Proposition 9.1.1]) shows that \(\tilde{E}_{N} (\xi ;\mathcal {D})\) can be expressed in terms of the correlation functions according to

$$\begin{aligned} \tilde{E}_{N} (\xi ;\mathcal {D} ) = 1 + \sum _{n=1}^N {(-\xi )^n \over n!} \int _{\mathcal {D}} d^2z_1 \cdots \int _{\mathcal {D}} d^2z_n \, \rho _{(n),N}(z_1,\dots ,z_n). \end{aligned}$$
(3.4)

It can readily be checked that (3.1) and (3.2) in the special case \(f(z) = g(z) = \chi _{z \in \mathcal {D}}\) are consistent with (3.4).

Specialising now to the circumstance that the PDF for the eigenvalues is of the form (2.87), a particular product formula for \(\tilde{E}_N (\xi ; \mathcal {D})\) can be deduced, which was known to Gaudin [286].

Proposition 3.1

Let \(\{ \rho _{(n),N} \}\) in (3.4) be given by (2.9) where the correlation kernel \(K_N\) corresponds to (2.87). Let \(\mathbb K_{N,\mathcal {D}}\) denote the integral operator supported on \(z_2 \in \mathcal {D}\) with kernel \(K_N(z_1,z_2)\). This integral operator has at most N non-zero eigenvalues \(\{ \lambda _j(\mathcal {D}) \}_{j=1}^N\), where \( 0 \le \lambda _j(\mathcal {D}) \le 1\), and

$$\begin{aligned} \tilde{E}_{N} (\xi ;\mathcal {D} ) = \prod _{j=1}^N (1 - \xi \lambda _j(\mathcal {D})). \end{aligned}$$
(3.5)

Proof

Let \(\{p_s(z)\}_{s=0}^\infty \) be a set of orthogonal polynomials with respect to the inner product \(\langle f, g \rangle := \int _{\mathbb C}w(|z|^2) f(z) g(\bar{z}) \, d^2 z\) with corresponding normalisation denoted by \(h_s\). Taking as a basis \(\{ (w(|z|^2))^{1/2} p_k(z) \}_{k=0}^\infty \) it is straightforward to check that

$$\begin{aligned} K_{N}(z_1,z_2) = (w(|z_1|^2) w(|z_2|^2) )^{1/2} \sum _{s=0}^{N-1} {p_s(z_1) \overline{p_s(z_2)} \over h_s}, \end{aligned}$$
(3.6)

and so the eigenfunctions of \(\mathbb K_{N,\mathcal {D}}\) are of the form \((w(|z|^2))^{1/2} \sum _{s=0}^{N-1} c_s p_s(z)\) (see e.g. [237, proof of Proposition 5.2.2]). Hence there are at most N nonzero eigenvalues, which moreover can be related to a Hermitian matrix and so must be real. In terms of these eigenvalues, the determinantal form (2.9) substituted in (3.4) implies (3.5); this is a result from the theory of Fredholm integral operators (see e.g. [532]). Since by definition each n-point correlation is non-negative, we see from the RHS of (3.4) that \(\tilde{E}_{N} (\xi ;\mathcal {D} ) > 0\) for \(\xi < 0\), and so \( \lambda _j(\mathcal {D}) \ge 0\) (\(j=1,\dots ,N\)). Also, the definition (3.3) tells us that \(\tilde{E}_N (\xi ; \mathcal {D}) > 0\) for all \(\xi < 1\). This would contradict (3.5) if any \( \lambda _j(\mathcal {D}) > 1\), since in this circumstance there would be a \(\xi \) in this range such that \(\tilde{E}_{N} (\xi ;\mathcal {D} ) \) vanishes.    \(\square \)

Consider \(\sum _{j=1}^N x_j\) where the \(x_j \in \{0,1\}\) are independent Bernoulli random variables with \(\textrm{Pr} \,(x_j=1) = \lambda _j(\mathcal {D})\). The characteristic function is \(\prod _{j=1}^N(1 - \lambda _j(\mathcal {D}) + e^{it} \lambda _j(\mathcal {D}))\). With \(e^{it} = 1 - \xi \) this gives the RHS of (3.5). But it has already been noted that \(\tilde{E}_{N} (\xi ;\mathcal {D} )\) with \(\xi \) related to \(e^{it}\) in this way is the characteristic function for the counting statistic \(\mathcal {N}(\mathcal {D})\), and hence the equality in distribution \(\mathcal {N}(\mathcal {D}) \mathop {=}\limits ^\textrm{d} \sum _{j=1}^N \textrm{Bernoulli} (\lambda _j(\mathcal {D}))\) [317, 326]. From this, it follows using the standard arguments (see [223, Sect. XVI.5, Theorem 2]) that a central limit theorem holds for \(\mathcal {N}(\mathcal {D}_N)\) (here the subscript N on \(\mathcal {D}_N\) is to indicate that the region \(\mathcal {D}\) depends on N),

$$\begin{aligned} \lim _{N \rightarrow \infty } { \mathcal {N}(\mathcal {D}_N) - \langle \mathcal {N}(\mathcal {D}_N) \rangle \over (\textrm{Var} \, \mathcal {N}(\mathcal {D}_N))^{1/2}} \mathop {=}\limits ^\textrm{d} \textrm{N}[0,1], \end{aligned}$$
(3.7)

valid provided \( \textrm{Var} \, \mathcal {N}(\mathcal {D}_N) \rightarrow \infty \) as \(N \rightarrow \infty \); see also [172, 498, 510].

A stronger result, extending the central limit theorem (3.7), follows from the fact that (3.5) in the variable \(z=1 - \xi \) has all its zeros on the negative real axis [262].

Proposition 3.2

In the setting of the applicability of Proposition 3.1, and with \(\sigma _{\mathcal {N}_D} := (\textrm{Var} \, {\mathcal {N}}(\mathcal {D}_N))^{1/2}\), we then have that \(\{E_N(k;\mathcal {D}_N) \}\) satisfy the local central limit theorem

$$\begin{aligned} \lim _{N \rightarrow \infty } \, \mathop {\sup }\limits _{x \in (-\infty , \infty )} \Big | \sigma _{\mathcal {N}_D} E_N([\sigma _{\mathcal {N}_D} x + \langle {\mathcal {N}}(\mathcal {D}_N) \rangle ]; \mathcal {D}_N) - {1 \over \sqrt{2 \pi }} e^{- x^2/2} \Big | = 0. \end{aligned}$$
(3.8)

Proof

The fact that the zeros of (3.5) with the variable \(z = 1 - \xi \) are on the negative real axis implies, by Newton’s theorem on log-concavity of the sequence of elementary symmetric functions [448], that \(\{E_N(k;\mathcal {D})\}\) is log concave. It is known that log-concavity is a sufficient condition for extending a central limit theorem to a local limit theorem [84].    \(\square \)

For large N, inside the disk of radius \(\sqrt{N}\), the eigenvalue density for GinUE is constant and the full distribution is rotationally invariant. In such circumstances, for a two-dimensional point process in general, it is known [76] that Var\(\, \mathcal {N}(\mathcal {D}_N)\) cannot grow slower than of order \(| \partial \mathcal {D}_N|\), i.e. the length of the boundary of \(\mathcal {D}_N\). Thus the variance diverges, and both (3.7) and (3.8) are valid, for any region \(\mathcal {D}_N\) constrained strictly inside the disk of radius \(\sqrt{N}\) and with a boundary of length tending to infinity with N. In fact for GinUE more precise asymptotic information is available [148, Eq. (11)], [224, Eq. (2.7)], which gives that for any \(D_0 \subseteq \{z: |z| \le 1 \}\),

$$\begin{aligned} \textrm{Var} \, \mathcal {N}(\sqrt{N} D_0) = \sqrt{N} {| \partial D_0| \over 2 \pi } \int _{-\infty }^\infty \Big ( \textrm{Var} \, \chi _{U \le (1+\textrm{erf}(t/\sqrt{2}))/2} \Big ) \, dt + \textrm{O} \Big ( {1 \over N^{1/2}} \Big ). \end{aligned}$$
(3.9)

Here U is a random variable uniform in [0, 1]. A direct calculation gives that the integral evaluates to \(\sqrt{1 \over \pi }\)—the advantage of the form (3.9) is that it remains true if the appearance of the variance throughout is replaced by any even cumulant [148, 224]. In particular this shows that the growth of the variance with respect to the region is the smallest order possible—the corresponding point process is then referred to as being hyperuniform [289, 523, 524]. A corollary of the property of being hyperuniform, together with the fast decay of the correlations, is that the bulk-scaled GinUE exhibits number rigidity [290, 292]. This means that conditioning on the positions of the (infinite number of) points outside a region \(\mathcal {D}\) fully determines the number of (but not positions of) the points inside \(\mathcal {D}\), and their centre of mass.
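The value \(\sqrt{1 \over \pi }\) quoted for the integral in (3.9) can be confirmed numerically. The following is a minimal sketch using only the Python standard library; the function names, the cutoff and the step count are illustrative choices and not part of the cited works. It uses the fact that the indicator \(\chi _{U \le p}\) is a Bernoulli variable with variance \(p(1-p)\).

```python
import math


def bernoulli_variance(t: float) -> float:
    """Variance p(1 - p) of the indicator chi_{U <= p}, with p = (1 + erf(t / sqrt(2))) / 2."""
    p = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return p * (1.0 - p)


def boundary_integral(cutoff: float = 12.0, steps: int = 24000) -> float:
    """Composite trapezoidal rule for the t-integral in (3.9), truncated to [-cutoff, cutoff].

    The integrand decays like a Gaussian, so the truncation error is negligible
    for the (assumed) default cutoff.
    """
    h = 2.0 * cutoff / steps
    total = 0.5 * (bernoulli_variance(-cutoff) + bernoulli_variance(cutoff))
    for i in range(1, steps):
        total += bernoulli_variance(-cutoff + i * h)
    return h * total


numerical = boundary_integral()
exact = 1.0 / math.sqrt(math.pi)
print(numerical, exact)
```

With this value, (3.9) gives the leading variance \(\sqrt{N} | \partial D_0| / (2 \pi ^{3/2})\).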

1.1 Counting Function in a Disk and Scaled Asymptotics

In the special case that \(\mathcal {D}_N\) is a disk of radius R centred at the origin (we write this as \( D_R\)), the polynomials in (3.6) are simply the monomials \(p_s(z) = z^s\). The eigenfunctions of \(\mathbb K_{N,\mathcal {D}}\) are also given in terms of the monomials as \(\{(w(|z|^2))^{1/2} z^{j-1} \}_{j=1,\dots ,N}\), and hence for the corresponding eigenvalues we have

$$\begin{aligned} \lambda _j( D_R) = \int _0^{R^2} r^{j-1} w(r) \, dr \Big / \int _0^{\infty } r^{j-1} w(r) \, dr, \qquad j=1,\dots ,N. \end{aligned}$$

Substituting in (3.5) and choosing \(w(|z|^2) = \exp (-|z|^2)\) then shows that for the GinUE [232]

$$\begin{aligned} \tilde{E}_N(\xi ;D_R) = \prod _{j=1}^N \Big ( 1 - \xi {\gamma (j;R^2) \over \Gamma (j)} \Big ), \end{aligned}$$
(3.10)

where \(\gamma (j;x)\) denotes the (lower) incomplete gamma function. Note that this remains valid for \(N \rightarrow \infty \) in keeping with the discussion of the previous paragraph. Setting \(\xi = 1\), asymptotic expansions for the incomplete gamma function can be used to deduce the \(N \rightarrow \infty \) asymptotic expansion of \(E_N(0; D_{\alpha \sqrt{N}})\),

$$\begin{aligned} E_N(0; D_{\alpha \sqrt{N}}) = \exp \Big ( C_1N^2 + C_2 N \log N + C_3 N + C_4 \sqrt{N} + {1 \over 3} \log N + \textrm{O}(1) \Big ), \end{aligned}$$
(3.11)

where \(0 < \alpha < 1\). Here the constants \(C_1,\dots ,C_4\) depend on \(\alpha \) and are known explicitly (e.g. \(C_1=-\alpha ^4/4\)), being first given in [232]. The first two of these can be deduced from the large R expansion of the quantity \(F_\infty (0;D_R)\), defined in Remark 3.1.1 below, given in the still earlier work [304]. The \(\log N\) term was determined recently in [149], as too was the explicit form of the next order term, a constant with respect to N.
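The product formula (3.10), read through the Bernoulli decomposition following Proposition 3.1, makes the full distribution \(\{E_N(n; D_R)\}\) easy to tabulate for moderate N. The sketch below is illustrative only: the helper names and the sample values N = 30, R = 3 are assumptions, and it relies on the elementary identity \(\gamma (j;x)/\Gamma (j) = 1 - e^{-x} \sum _{k=0}^{j-1} x^k/k!\) for integer j.

```python
import math
from typing import List


def disk_eigenvalues(N: int, R: float) -> List[float]:
    """lambda_j(D_R) = gamma(j; R^2) / Gamma(j), j = 1..N, for the weight exp(-|z|^2)."""
    x = R * R
    term = math.exp(-x)          # exp(-x) x^k / k! at k = 0
    partial = 0.0
    lams = []
    for j in range(1, N + 1):
        partial += term          # now equals the sum over k = 0..j-1
        lams.append(min(1.0, max(0.0, 1.0 - partial)))
        term *= x / j            # advance to k = j
    return lams


def counting_distribution(lams: List[float]) -> List[float]:
    """Coefficients of prod_j (1 - lam_j + lam_j s) in powers of s.

    With s = 1 - xi this is the right-hand side of (3.5), so entry n is E_N(n; D).
    """
    probs = [1.0]
    for lam in lams:
        nxt = [0.0] * (len(probs) + 1)
        for n, p in enumerate(probs):
            nxt[n] += (1.0 - lam) * p
            nxt[n + 1] += lam * p
        probs = nxt
    return probs


N, R = 30, 3.0                   # sample values, chosen for illustration
lams = disk_eigenvalues(N, R)
probs = counting_distribution(lams)
mean = sum(n * p for n, p in enumerate(probs))
var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
print("E_N(0; D_R) =", probs[0])
print("mean =", mean, " variance =", var)
```

The entry `probs[0]` is the gap probability obtained by setting \(\xi = 1\) in (3.10), while the mean and variance reduce to \(\sum _j \lambda _j\) and \(\sum _j \lambda _j (1 - \lambda _j)\) respectively.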

Let \(\bar{ D}_{\alpha \sqrt{N}}\) denote the region \(\{z: |z| > \alpha \sqrt{N} \}\), i.e. the region outside the disk of radius \(\alpha \sqrt{N}\), where it is assumed \(0 < \alpha < 1\). Note that then \(E_N(0;\bar{ D}_{\alpha \sqrt{N}}) = E_N(N; D_{\alpha \sqrt{N}})\). The analogue of (3.11) has been calculated in [149], where in particular it is found that

$$\begin{aligned} C_1=\alpha ^4/4-\alpha ^2+(1/2)\log \alpha ^2 + 3/4, \end{aligned}$$
(3.12)

the coefficient \(C_4\) is unchanged, while the coefficient \({1 \over 3}\) for \(\log N\) seen in (3.11) is to be replaced by \(- {1 \over 4}\). (We also refer to [175] for an earlier work for which this expansion was obtained up to \(C_3\).) One sees from (3.12) that \(C_1=0\) for \(\alpha = 1\), and the result of [149] gives that \(C_2,C_3\) similarly vanish, giving that \(E_N(N; D_{ \sqrt{N}}) \sim e^{C_4 \sqrt{N}}\). Extending \(\alpha \) to be larger than 1 according to the precise N dependent value

$$\begin{aligned} \alpha = 1 + {1 \over 2 \sqrt{N}} \Big ( \sqrt{\gamma _N} + {x \over \sqrt{\gamma _N}} \Big ), \qquad \gamma _N = \log {N \over 2 \pi } - 2 \log \log N, \end{aligned}$$
(3.13)

gives the extreme value result \(\lim _{N \rightarrow \infty } E_N(N; D_{ \alpha _N \sqrt{N}}) = \exp (-\exp (-x))\) [469] (see too [145, Th. 1.3 with \(\alpha = 2\)] for a generalisation to the case of (2.81), considered also in [149] for \(\alpha <1\)), which is the Gumbel law; here \(\alpha _N\) denotes the value of \(\alpha \) specified in (3.13). Other references on fluctuations of the spectral radius under various boundary conditions include [63, 105, 128, 142, 283, 489]. Furthermore, an intermediate fluctuation regime which interpolates between the Gumbel law and the large deviation regime (3.12) was investigated in [382]. Another case considered in [149] is when \(\mathcal {D}_N\) is specified as the outside of an annulus contained inside of the disk of radius \(\sqrt{N}\). Two features of the corresponding asymptotic expansion (3.11) are: (1) the absence of a term proportional to \(\log N\), and (2) the presence of oscillations of order 1 that are described in terms of the Jacobi theta function. We also refer to [54–56, 151] for further recent studies in this direction in the presence of hard edges.

For general \(\mathcal {D}_N\) with \(| \mathcal {D}_N | \rightarrow \infty \) the coefficient \(C_1\) in (3.11) relates to an energy minimisation (electrostatics) problem, and similarly for the \(| \mathcal {D}| \rightarrow \infty \) expansion of \(E_\infty (0;\mathcal {D})\) [3, 4, 175, 207, 342]. Thus for \(\mathcal {D} = D_{\alpha \sqrt{N}}\) the electrostatics problem is to compute the potential due to a uniform charge density \(1/\pi \) inside a disk of radius \(\alpha \), with a neutralising uniform surface charge density \(-\alpha /2\pi \) on the boundary. The applicability of electrostatics remains true for the asymptotic expansion of \(E_N(k; D_{\alpha \sqrt{N}})\) (and \(E_\infty (k;\mathcal {D})\)) in the so-called large deviation regime, when \(k \ll N \alpha ^2\) or \(k \gg N \alpha ^2\). For a disk the electrostatics problem can be solved explicitly to give [45]

$$\begin{aligned} \begin{aligned} & E_N(k; D_{\alpha \sqrt{N}}) \sim e^{-N^2 \psi _0(\alpha ;k/N)}, \\ &\psi _0(\alpha ;x) = {1 \over 4} \Big | (\alpha ^2 - x)(\alpha ^2 - 3 x) -2 x^2 \log (x/\alpha ^2) \Big |. \end{aligned} \end{aligned}$$
(3.14)

Note in particular that \(\psi _0(\alpha ;0) = \alpha ^4/4\), which is the value of \(-C_1\) in (3.11), while setting \(x=1\) gives the value of \(-C_1\) noted in the above paragraph. There is also a scaling regime, where \(|k - N \alpha ^2| = \textrm{O}(N^{1/2})\), for the asymptotic value of \(E_N(k; D_{\alpha \sqrt{N}})\) which interpolates between (3.14) and the local central limit theorem result (3.8) [224, 383]. In the case of \(E_\infty (k; D_R)\), it makes sense to consider k proportional to not only \(\alpha R^2\) but also to \(\alpha R^\gamma \) with \(\gamma > 2\). Then [224, 342]

$$\begin{aligned} E_\infty (\alpha R^\gamma ; D_R) \sim e^{- {1 \over 2} (\gamma - 2) \alpha ^2 R^{2 \gamma } \log R (1 + \textrm{o}(1) )}. \end{aligned}$$
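Returning to the rate function in (3.14), the two special values noted above, together with the vanishing of \(\psi _0\) at the typical value \(x = \alpha ^2\), can be checked directly. This is a small sketch; the function names are illustrative, and the convention \(x^2 \log x \rightarrow 0\) at \(x = 0\) is made explicit in the code.

```python
import math


def psi0(alpha: float, x: float) -> float:
    """Rate function psi_0(alpha; x) of (3.14), with x^2 log(x) taken as 0 at x = 0."""
    a2 = alpha * alpha
    log_term = 0.0 if x == 0.0 else 2.0 * x * x * math.log(x / a2)
    return 0.25 * abs((a2 - x) * (a2 - 3.0 * x) - log_term)


def c1_all_inside(alpha: float) -> float:
    """Leading coefficient (3.12) for E_N(N; D_{alpha sqrt(N)})."""
    a2 = alpha * alpha
    return a2 * a2 / 4.0 - a2 + 0.5 * math.log(a2) + 0.75


for alpha in (0.3, 0.5, 0.9):
    print(alpha, psi0(alpha, 0.0), psi0(alpha, 1.0), -c1_all_inside(alpha))
```

The printed columns compare \(\psi _0(\alpha ;1)\) with \(-C_1\) from (3.12). Since \(\psi _0(\alpha ;\alpha ^2) = 0\), the exponential decay in (3.14) disappears exactly when k/N equals the expected fraction \(\alpha ^2\).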

Remark 3.1

1. Closely related to the probability \(E_N(n;\mathcal {D})\) is the conditioned quantity \(F_N(n;\mathcal {D}) :=\textrm{Pr} (\sum _{j=1}^N \chi _{z_j \in \mathcal {D}} = n | z_j = 0)\). Denote the corresponding generating function by \(\tilde{F}_N(\xi ;\mathcal {D})\). Proceeding as in the derivation of (3.10) shows \(\tilde{F}_N(\xi ; D_R) = \tilde{E}_N(\xi ; D_R)/(1 - \xi (1 - e^{-R^2}))\). Thus in particular \({F}_N(0; D_R) = e^{R^2} E_N(0; D_R)\) [304]. Note that \(-{d \over d R} F_N(0;D_R)\) gives the spacing distribution between an eigenvalue conditioned to be at the origin, and its nearest neighbour at a distance R. The work [495] gives results relating to the PDF for the minimum of all the nearest neighbour spacings with global scaling, obtaining a scale of \(N^{-3/4}\) and a PDF proportional to \(x^3 e^{-x^4}\).

2. For any \(p \ge 2\), the p-th cumulant \(\kappa _p(D_R)\) of the number of eigenvalues in \(D_R\) can be written as

$$\begin{aligned} \kappa _p(D_R) = (-1)^{p+1} \sum _{j = 1}^{N} \mathrm{{Li}}_{1 - p}\Big ( 1 - \frac{1}{ \lambda _j( D_R) } \Big ), \end{aligned}$$
(3.15)

where \(\mathrm{{Li}}_s(x) = \sum _{k = 1}^{\infty } k^{-s} x^k\) is the polylogarithm function. The formula (3.15) as well as its large N behaviour both in the bulk and at the edge were obtained in [384].
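The structure of (3.15) is that of a sum of cumulants of independent Bernoulli variables, in keeping with the decomposition following Proposition 3.1. As a sketch (the function names are illustrative), the identity for a single Bernoulli variable can be checked for p = 2, 3, 4 using the standard rational closed forms of the polylogarithm at negative integer order. These closed forms supply the analytic continuation needed when \(|1 - 1/\lambda | \ge 1\), where the defining series does not converge.

```python
def bernoulli_cumulants(lam: float):
    """Cumulants kappa_2, kappa_3, kappa_4 of a Bernoulli(lam) random variable."""
    k2 = lam * (1.0 - lam)
    k3 = k2 * (1.0 - 2.0 * lam)
    k4 = k2 * (1.0 - 6.0 * lam * (1.0 - lam))
    return k2, k3, k4


def polylog_negative(order: int, x: float) -> float:
    """Li_{-order}(x) for order = 1, 2, 3 via rational closed forms (x != 1)."""
    if order == 1:
        return x / (1.0 - x) ** 2
    if order == 2:
        return x * (1.0 + x) / (1.0 - x) ** 3
    if order == 3:
        return x * (1.0 + 4.0 * x + x * x) / (1.0 - x) ** 4
    raise ValueError("only orders 1, 2, 3 are implemented in this sketch")


def cumulant_via_polylog(p: int, lam: float) -> float:
    """Single-eigenvalue summand of (3.15): (-1)^(p+1) Li_{1-p}(1 - 1/lam), 0 < lam <= 1."""
    return (-1.0) ** (p + 1) * polylog_negative(p - 1, 1.0 - 1.0 / lam)


for lam in (0.1, 0.3, 0.5, 0.8):
    direct = bernoulli_cumulants(lam)
    via_li = tuple(cumulant_via_polylog(p, lam) for p in (2, 3, 4))
    print(lam, direct, via_li)
```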

1.2 Counting Function in the Infinite System

We turn our attention now to the circumstance that a (possibly scaled) large N limit has already been taken, and ask about the fluctuations of the number of particles in a region \(\mathcal {N}(\mathcal {D})\) for large values of \(|\mathcal {D}|\), i.e. the volume of \(\mathcal {D}\). The first point to note is that if the coefficient of \(\xi ^n\) in (3.4) tends to zero as \(N \rightarrow \infty \), then the expansion remains valid in this limit [402]. The decay is easy to establish in the determinantal case, since then (see e.g. [237, Eq. (9.13)]) \(\rho _{(n),N}(z_1,\dots ,z_n) \le \prod _{l=1}^n \rho _{(1),N}(z_l)\). Hence it is sufficient that \(\int _{\mathcal {D}} \rho _{(1),N}(z) \, d^2z\) be bounded for \(N \rightarrow \infty \). With the limiting form of (3.4) valid, the theory of Fredholm integral operators [532] tells us that the limit of (3.5) is well defined with the RHS identified as the Fredholm determinant \(\det (\mathbb I - \xi \mathbb K_{\infty ,\mathcal {D}})\). In (3.10) the limit corresponds to simply replacing the upper terminal of the product by \(\infty \). We stipulate the further structure that the correlation kernel be Hermitian, as holds for the appropriately scaled form of (3.6). Then the argument of the proof of Proposition 3.1 tells us that the eigenvalues of \( \mathbb K_{\infty ,\mathcal {D}}\) are all between 0 and 1, which in turn allows the reasoning leading to (3.7) to be repeated. The conclusion is, assuming \(\textrm{Var} \, \mathcal {N}(\mathcal {D}) \rightarrow \infty \) as \(|\mathcal {D}| \rightarrow \infty \) which as already remarked is guaranteed by the results of [76], that (3.7) remains valid with \(\mathcal {D}_N\) replaced by \(\mathcal {D}\), and the limit \(N \rightarrow \infty \) replaced by the limit \(|\mathcal {D}| \rightarrow \infty \) [172, 326, 510].

It is moreover the case that in the above setting and with these modifications the local central limit theorem of Proposition 3.2 remains valid [262]. Another point of interest is that the expansion (3.11) is uniformly valid in the variable \(R=\alpha \sqrt{N}\), provided this quantity grows with N, and hence also provides the large R expansion of \(E_\infty (0; D_R)\). Finally, we consider results of [395] as they apply to number fluctuations in the infinite GinUE. The plane is to be divided into squares \(\Gamma _j\) of area \(L^2\) with centres at \(L \mathbb Z^2\). Define \(\Upsilon _j = \mathcal {N}(\Gamma _j)/\sqrt{\textrm{Var} \, \mathcal {N}(\Gamma _j)}\). For large L, in keeping with (3.9) applied to a boundary of length 4L, we have \(\textrm{Var} \, \mathcal {N}(\Gamma _j) \sim 2L/\pi ^{3/2}\). The question of interest is the joint distribution of \(\{ \Upsilon _j \}\). It is established in [395] that for \(L \rightarrow \infty \) this distribution is Gaussian, with covariance \({1 \over 4} [-\Delta ]_{j,k}\), where \(\Delta \) is the discrete Laplacian on \(\mathbb Z^2\). Consequently, fluctuations of \(\mathcal {N}(\Gamma _j)\) induce opposite fluctuations in the regions neighbouring \(\Gamma _j\).

For the infinite GinUE the exact result in terms of modified Bessel functions

$$\begin{aligned} \textrm{Var} \, \mathcal {N}( D_R) = R^2 e^{-2 R^2} \Big ( I_0(2 R^2) + I_1(2 R^2) \Big ) = \sum _{j=1}^\infty \frac{ \gamma (j; R^2) }{ \Gamma (j) } \Big ( 1- \frac{ \gamma (j; R^2) }{ \Gamma (j) } \Big ) \end{aligned}$$
(3.16)

is known [496, Th. 1.3], [224, Appendix B]. The second expression in (3.16) also appears in [384] as a large N limit of the number variance of the finite Ginibre ensemble in the deep bulk regime. This exhibits the leading large R form \(R/\sqrt{\pi }\) which is consistent with identifying \(\sqrt{N} | \partial D_0|\) as \(2 \pi R\) on the RHS of (3.9); see also [20].
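Both expressions in (3.16), and the leading large R form \(R/\sqrt{\pi }\), can be compared numerically. The sketch below uses only the standard library: the modified Bessel functions are summed from their power series, and the truncation parameters `terms` and `jmax` are assumptions chosen generously for the sample radii, not values from the references.

```python
import math


def bessel_i(nu: int, x: float, terms: int = 200) -> float:
    """Modified Bessel function I_nu(x), integer nu >= 0, from its power series."""
    half = 0.5 * x
    term = half ** nu / math.factorial(nu)      # k = 0 term
    total = 0.0
    for k in range(terms):
        total += term
        term *= half * half / ((k + 1) * (k + 1 + nu))
    return total


def variance_bessel(R: float) -> float:
    """First expression in (3.16)."""
    x = R * R
    return x * math.exp(-2.0 * x) * (bessel_i(0, 2.0 * x) + bessel_i(1, 2.0 * x))


def variance_sum(R: float, jmax: int = 400) -> float:
    """Second expression in (3.16), truncated at j = jmax."""
    x = R * R
    term = math.exp(-x)                         # exp(-x) x^k / k! at k = 0
    partial = 0.0
    total = 0.0
    for j in range(1, jmax + 1):
        partial += term
        lam = min(1.0, max(0.0, 1.0 - partial)) # gamma(j; x) / Gamma(j)
        total += lam * (1.0 - lam)
        term *= x / j
    return total


for R in (0.5, 1.0, 2.0, 3.0):
    print(R, variance_bessel(R), variance_sum(R), R / math.sqrt(math.pi))
```

The power series is adequate for the moderate radii used here; for large R an asymptotic or scaled Bessel evaluation would be the safer choice, since \(e^{-2R^2}\) and \(I_\nu (2R^2)\) individually underflow and overflow.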

Remark 3.2

Consider the weakly non-Hermitian limit of the elliptic Ginibre ensemble specified by the correlation kernel (2.42), parametrised by \(\alpha > 0\). Let \(E_\infty ^{\alpha }(0;\mathcal {D}(\chi _{|x|<s}))\) denote the probability that there are no eigenvalues in the strip \(|\textrm{Re}\, z| < s\) of the complex plane. It is a celebrated result of the Kyoto school that for \(\alpha \rightarrow 0\) this gap probability can be expressed as a Painlevé V transcendent in sigma form (\(\sigma \)PV function) [347]. Recently, Bothner and Little [106] extended this result to all \(\alpha > 0\), with the role of the \(\sigma \)PV function now played by a certain integro-differential Painlevé function. In [106] the same authors have obtained an analogous characterisation of \(E_\infty ^{\textrm{edge}, \alpha }(0, \mathcal {D}(\chi _{x > s}))\), i.e. the probability of no eigenvalues with \(\textrm{Re} \, (z) > s\) for the edge-scaled weakly non-Hermitian limit of the elliptic Ginibre ensemble.

2 Smooth Linear Statistics

The theory of fluctuation formulas for GinUE in the case that f(z) in (3.1) is smooth has some different features to the discontinuous case \(f(z) = \chi _{z \in \mathcal {D}}\). This can be seen by considering the bulk-scaled limit, and in particular the truncated two-point correlation (2.24). From this we can compute the structure factor

$$\begin{aligned} S_\infty ^\textrm{GinUE}(\textbf{k}) := \int _{\mathbb R^2} \Big ( \rho _{(2), \infty }^{\textrm{b} ,T}(\textbf{0}, \textbf{r}) + {1 \over \pi } \delta (\textbf{r}) \Big ) e^{i \textbf{k}\cdot \textbf{r}} \, d \textbf{r}= {1 \over \pi } \Big ( 1 - e^{-|\textbf{k}|^2/4} \Big ). \end{aligned}$$
(3.17)

The knowledge of the structure factor allows the limiting covariance (3.2) to be computed using the Fourier transform

$$\begin{aligned} \textrm{Cov}^\mathrm{GinUE_\infty } \, \Big ( \sum f(\textbf{r}_l), \sum g(\textbf{r}_l) \Big ) = {1 \over (2 \pi )^2} {1 \over \pi } \int _{\mathbb R^2} \hat{f}(\textbf{k}) \hat{g}(- \textbf{k}) \Big ( 1 - e^{-|\textbf{k}|^2/4} \Big ) \, d \textbf{k}, \end{aligned}$$
(3.18)

valid provided the integral converges. Here, with \(z = x + i y\), \(\textbf{r}= (x,y)\) and the Fourier transform \(\hat{f}(\textbf{k})\) is defined by integrating \(f(\textbf{r})\) times \(e^{i\textbf{k}\cdot \textbf{r}}\) over \(\mathbb R^2\)—thus according to (3.17) \(S_\infty ^\textrm{GinUE}(\textbf{k})\) is a particular Fourier transform. Now introduce a scale R so that \(f(\textbf{r}) \mapsto f(\textbf{r}/R), \, g(\textbf{r}) \mapsto g(\textbf{r}/R)\). It follows from (3.18) that

$$\begin{aligned} \lim _{R \rightarrow \infty } \textrm{Cov}^\mathrm{GinUE_\infty } \, \Big ( \sum f(\textbf{r}_l/R), \sum g(\textbf{r}_l/R) \Big ) = {1 \over (2 \pi )^2} {1 \over 4 \pi } \int _{\mathbb R^2} \hat{f}(\textbf{k}) \hat{g}(- \textbf{k}) |\textbf{k}|^2 \, d \textbf{k}, \end{aligned}$$
(3.19)

again provided the integral converges. Most noteworthy is that this limiting quantity is O(1). In contrast, with \(f(\textbf{r}) = g(\textbf{r}) = \chi _{|\textbf{r}| < 1}\), and then introducing R as prescribed above, we know that (3.18) has the evaluation (3.16). As previously commented, the large R form of the latter is proportional to R, which in turn is proportional to the circumference of the disk-shaped region implied by the linear statistic \(\chi _{|\textbf{r}| < R}\).
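The passage from (3.18) to (3.19) is a change of variables. As a sketch, assuming \(\hat{f}, \hat{g}\) decay quickly enough for the limit to be taken under the integral sign: the scaled function \(f(\textbf{r}/R)\) has Fourier transform \(R^2 \hat{f}(R \textbf{k})\), so substituting in (3.18) and then changing variables \(\textbf{k} \mapsto \textbf{k}/R\) gives

$$\begin{aligned} \textrm{Cov}^\mathrm{GinUE_\infty } \, \Big ( \sum f(\textbf{r}_l/R), \sum g(\textbf{r}_l/R) \Big ) &= {1 \over (2 \pi )^2} {1 \over \pi } \int _{\mathbb R^2} R^4 \hat{f}(R \textbf{k}) \hat{g}(- R \textbf{k}) \Big ( 1 - e^{-|\textbf{k}|^2/4} \Big ) \, d \textbf{k} \\ &= {1 \over (2 \pi )^2} {1 \over \pi } \int _{\mathbb R^2} \hat{f}(\textbf{k}) \hat{g}(- \textbf{k}) \, R^2 \Big ( 1 - e^{-|\textbf{k}|^2/(4 R^2)} \Big ) \, d \textbf{k}. \end{aligned}$$

Since \(R^2 ( 1 - e^{-|\textbf{k}|^2/(4R^2)} ) \rightarrow |\textbf{k}|^2/4\) as \(R \rightarrow \infty \), (3.19) follows. For \(f = g = \chi _{|\textbf{r}| < 1}\) one has \(\hat{f}(\textbf{k}) = 2 \pi J_1(|\textbf{k}|)/|\textbf{k}|\), so that \(|\hat{f}(\textbf{k})|^2 |\textbf{k}|^2\) decays only like \(|\textbf{k}|^{-1}\) on average and the integral in (3.19) diverges, in keeping with the growth proportional to R of the variance in that case.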

Remark 3.3

Consider the linear statistic \(A(\textbf{x}) = - \sum _{j=1}^N ( \log | \textbf{x}- \textbf{r}_j| - \log | \textbf{r}_j| )\). In the Coulomb gas picture relating to (1.12), this corresponds to the difference in the potential at \(\textbf{x}\) and the origin. For bulk-scaled GinUE, one has from (3.18) the exact result [42]

$$\begin{aligned} \textrm{Var}^\mathrm{GinUE_\infty } \, A(\textbf{x}) = {1 \over 2} \Big ( 2 \log | \textbf{x}| + ( |\textbf{x}|^2 + 1 ) \int _{|\textbf{x}|^2}^\infty {e^{-t} \over t} \, dt - e^{- | \textbf{x}|^2} + C + 1 \Big ), \end{aligned}$$
(3.20)

where C denotes Euler’s constant. In particular, for large \(|\textbf{x}|\), \( \textrm{Var}^\mathrm{GinUE_\infty } \, A(\textbf{x}) \sim \log | \textbf{x}|\). This last point shows that the introduction of a scale R as in (3.19) would give rise to a divergence proportional to \(\log R\). Such a log-correlated structure underlies a relationship between the logarithm of the absolute value of the characteristic polynomial for GinUE and Gaussian multiplicative chaos [387].

The covariance with test functions \(f(\textbf{r}) \mapsto f(\textbf{r}/\sqrt{N})\), \(g(\textbf{r}) \mapsto g(\textbf{r}/\sqrt{N})\) assumed to take on real or complex values is also an order-one quantity for GinUE in the \(N \rightarrow \infty \) limit, upon the additional assumption that f, g are differentiable and do not grow too fast at infinity [58, 59, 235, 472].

Proposition 3.3

Require that f, g have the properties as stated above. Let

$$\begin{aligned} f(\textbf{r}) |_{\textbf{r}= (\cos \theta , \sin \theta )} = \sum _{n=-\infty }^\infty f_n e^{i n \theta } \end{aligned}$$

and similarly for the Fourier expansion of \(g(\textbf{r}) \) for \(\textbf{r}= (\cos \theta , \sin \theta )\). We have

$$\begin{aligned} &\lim _{N \rightarrow \infty } \textrm{Cov}^\textrm{GinUE} \Big ( \sum _{j=1}^N f(\textbf{r}_j/\sqrt{N}), \sum _{j=1}^N \bar{g}(\textbf{r}_j/\sqrt{N}) \Big ) \nonumber \\ & \qquad \qquad \qquad \qquad \qquad = {1 \over 4 \pi } \int _{ |\textbf{r}| < 1} \nabla f \cdot \nabla \bar{g} \, dx dy + {1 \over 2} \sum _{n=-\infty }^\infty |n| f_n \bar{g}_{-n}. \end{aligned}$$
(3.21)

Proof

(Sketch) In the method of [472], a direct calculation using (3.2), (2.18) and (2.10) allows (3.21) to be established for f, g polynomials jointly in \(z=x+iy\) and \(\bar{z} = x - iy\). The required integrations can be computed exactly using polar coordinates. To go beyond the polynomial case, the so-called dbar (Cauchy–Pompeiu) representation is used. This gives that for any once continuously differentiable f in the unit disk \( D_1\), and z contained in the interior of the disk,

$$\begin{aligned} f(z) = - {1 \over \pi } \int _{ D_1} {\partial _{\bar{w}} f(w) \over w - z} \, d^2 w + {1 \over 2 \pi i} \int _{\partial D_1} {f(w) \over w - z} \, dw, \end{aligned}$$
(3.22)

where with \(w = \alpha + i \gamma \), \(\partial _{\bar{w}} = {1\over 2} ( {\partial \over \partial \alpha } + i {\partial \over \partial \gamma } )\). The covariance problem is thus reduced to the particular class of linear statistics of the functional form \(h(z) = 1/(w-z)\). The required analysis in this case is facilitated by the use of the corresponding Laurent expansion, with only a finite number of terms contributing after integration.    \(\square \)
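As a concrete check of (3.21), take \(f(\textbf{r}) = g(\textbf{r}) = x\) (a worked example added for illustration, not taken from the references). Then \(\nabla f = (1,0)\), and on the unit circle \(f = \cos \theta \), so \(f_{\pm 1} = 1/2\) with all other Fourier coefficients vanishing. The two terms on the RHS of (3.21) therefore evaluate to

$$\begin{aligned} {1 \over 4 \pi } \int _{ |\textbf{r}| < 1} | \nabla f |^2 \, dx dy + {1 \over 2} \sum _{n=-\infty }^\infty |n| f_n f_{-n} = {1 \over 4} + {1 \over 4} = {1 \over 2}, \end{aligned}$$

with equal contributions from the bulk and the boundary. This agrees with a direct computation: \(\sum _{j=1}^N z_j\) is the trace of the GinUE matrix, which (assuming the normalisation of unit variance complex Gaussian entries consistent with the weight \(e^{-|z|^2}\)) is a sum of N independent standard complex Gaussians, so that \(\textrm{Var} \, \sum _{j=1}^N x_j/\sqrt{N} = 1/2\) for every N.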

Remark 3.4

1. As predicted in [235], upon multiplying the RHS by \(2/\beta \), (3.21) remains valid for the Coulomb gas model (1.12) [75, 394, 491]. In the case of the elliptic GinUE, a simple modification of (3.21) holds true. Thus the domain \(| \textbf{r}| < 1\) in the first term is to be replaced by the appropriate ellipse, and the Fourier components of the second term are now in the variable \(\eta \), where \((A \cos \eta , B \sin \eta )\), \(0 \le \eta \le 2 \pi \) parametrises the boundary of the ellipse. The results of [59, 235, 394] also cover this case.

2. In the case of an ellipse, major and minor axes A, B say, there is particular interest in the linear statistic \(P_x := \sum _{j=1}^N x_j\) [155]. Linear response theory gives for the xx component of the susceptibility tensor \(\chi \) (relating the polarisation density to the applied electric field) the formula \(\chi _{xx} = (\beta /(\pi A B)) \lim _{N \rightarrow \infty } \textrm{Var} \, P_x\) (and similarly for the xy and yy components). This same quantity can be computed by consideration of macroscopic electrostatics, which gives \(\chi _{xx} = (A+B)/(2 \pi B)\). Using (3.21) modified as in the above paragraph, for which the bulk and boundary terms contribute \(AB/4\) and \(A^2/4\) respectively before multiplication by \(2/\beta \), the consistency of these formulas can be verified.

3. Considering further the case of elliptic GinUE, dividing the eigenvalues by \(\sqrt{N}\) and taking the limit \(\tau \rightarrow 1\) gives the GUE with eigenvalues supported on \((-2,2)\), and similarly for the \(\beta \) generalisation limiting to (1.12) restricted to this interval. For this model it is known (see e.g. [250, Eq. (3.2) with the identification \(x= 2 \cos \theta \)]) that

$$\begin{aligned} \lim _{N \rightarrow \infty } \textrm{Cov} \Big ( \sum _{j=1}^N f( x_j), \sum _{j=1}^N {g}(x_j) \Big ) = {2 \over \beta } \sum _{n=1}^\infty n f_n^\textrm{c} g_n^\textrm{c}, \end{aligned}$$

where \(f(x) |_{x = 2 \cos \theta } = f_0^\textrm{c} + 2 \sum _{n=1}^\infty f_n^\textrm{c} \cos n \theta \) and similarly for g(x). We observe that this is identical to the final term in (3.21), modified according to the specifications of point 1. above.

4. There has been a recent application of Proposition 3.3 in relation to the computation of the analogue of the Page curve for a density matrix constructed out of GinUE matrices [164].

We turn our attention now to the limiting distribution of a smooth linear statistic. By way of introduction, consider the particular linear statistic \({1 \over N} \sum _{j=1}^N | \textbf{r}_j |^2\) for GinUE. An elementary calculation gives that the corresponding characteristic function, \(\hat{P}_N(k)\) say, has the exact functional form

$$\begin{aligned} \hat{P}_N(k) = (1 - ik/N)^{-N(N+1)/2}. \end{aligned}$$
(3.23)

It follows from this that after centring by the mean, the limiting distribution is a Gaussian with variance given by (3.21) (which in this specific case evaluates to \(1/2\)). A limiting Gaussian form holds in the general case of the applicability of (3.21), as first proved by Rider and Virág [472].
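The Gaussian limit can be read off from (3.23) by expanding its logarithm; the following short derivation is included as a check and uses only (3.23) and (3.21). Using \(-\log (1 - u) = u + u^2/2 + \textrm{O}(u^3)\) with \(u = ik/N\),

$$\begin{aligned} \log \hat{P}_N(k) = {N(N+1) \over 2} \Big ( {ik \over N} - {k^2 \over 2 N^2} + \textrm{O}\Big ( {k^3 \over N^3} \Big ) \Big ) = ik \, {N+1 \over 2} - {k^2 \over 2} \, {N+1 \over 2N} + \textrm{O}\Big ( {k^3 \over N} \Big ). \end{aligned}$$

The first term identifies the mean as \((N+1)/2\). After centring, the characteristic function tends to \(e^{-k^2/4}\) for each fixed k, which is that of a Gaussian with variance \(1/2\). On the side of (3.21), the choice \(f(\textbf{r}) = g(\textbf{r}) = |\textbf{r}|^2\) is constant on the unit circle, so the boundary term vanishes, while \(| \nabla f|^2 = 4 |\textbf{r}|^2\) gives

$$\begin{aligned} {1 \over 4 \pi } \int _{ |\textbf{r}| < 1} 4 |\textbf{r}|^2 \, dx dy = {1 \over 4 \pi } \cdot 2 \pi = {1 \over 2}. \end{aligned}$$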

Proposition 3.4

Let f be subject to the same conditions as in Proposition 3.3, and denote the case \(f = g\) of (3.21) by \(\sigma _f^2\). For the GinUE, if f takes on complex values then as \(N \rightarrow \infty \)

$$\begin{aligned} \sum _{j=1}^N f(\textbf{r}_j/\sqrt{N}) - \Big \langle \sum _{j=1}^N f(\textbf{r}_j/\sqrt{N}) \Big \rangle \mathop {\rightarrow }\limits ^\textrm{d} \textrm{N}[0,\sigma _f] + i \textrm{N}[0,\sigma _f], \end{aligned}$$

while if f is real-valued the RHS of this expression is to be replaced by \( \textrm{N}[0,\sigma _f]\). Moreover, this same limit formula holds for the elliptic GinUE [59] and its \(\beta \) generalisation [394] (both subject to further technical restrictions on f), with the variance modified according to Remark 3.4.1.

Proof

(Comments only) The proof of [472] proceeds by establishing that the higher-order cumulants beyond the variance tend to zero as \(N \rightarrow \infty \). Essential use is made of the rotation invariance of GinUE. The method of [59] uses a loop equation strategy, while [394] involves energy minimisers and transport maps. For GinUE with f a function of \(| \textbf{r}|\), a simple derivation can be given based on the proof of Proposition 2.14 together with a Laplace approximation of the integrals [235] (see also [120, Appendix B]).    \(\square \)

Remark 3.5

Other settings in which Proposition 3.4 has proved to be valid include products of GinUE matrices [171, 371] (with the additional assumption that the test function has support strictly inside the unit circle), and the complex spherical ensemble of Sect. 2.5 after stereographic projection onto the sphere [88, 471].

3 Spatial Modelling and the Thinned GinUE

The GinUE viewed as a point process in the plane has been used to model geographical regions by way of the corresponding Voronoi tessellation [391], the positions of objects, for example trees in a plantation [390] or the nests of birds of prey [12], and the spatial distribution of base stations in modern wireless networks [188, 440], amongst other examples. The wireless network application has made use of the thinned GinUE, whereby each eigenvalue is independently deleted with probability \((1 - \zeta )\), \(0 < \zeta \le 1\). The effect of this is simple to describe in terms of the correlation functions, according to the replacement

$$\begin{aligned} \rho _{(n),N}(z_1,\dots ,z_n) \mapsto \zeta ^n \rho _{(n),N}(z_1,\dots ,z_n). \end{aligned}$$
(3.24)

With the bulk density of GinUE uniform and equal to \(1/\pi \), we can also rescale the positions \(z_j \mapsto \sqrt{\zeta } z_j\) so that this remains true in the thinned ensemble. For this, (3.24) is to be updated to read

$$\begin{aligned} \rho _{(n),N}(z_1,\dots ,z_n) \mapsto \rho _{(n),N}(z_1/\sqrt{\zeta },\dots ,z_n/\sqrt{\zeta }). \end{aligned}$$
(3.25)
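The thinning and rescaling steps leading to (3.25) can be sketched in Python as follows. This is a minimal illustration only: the helper name `thin_and_rescale` and the sample points are hypothetical, and the points are assumed to be supplied as complex numbers by whatever sampler of the eigenvalues is at hand.

```python
import math
import random


def thin_and_rescale(points, zeta, rng):
    """Keep each point independently with probability zeta, then multiply by sqrt(zeta).

    Deleting a fraction (1 - zeta) of the points lowers the density by the factor zeta;
    contracting the positions by sqrt(zeta) restores the original density.
    """
    scale = math.sqrt(zeta)
    return [scale * z for z in points if rng.random() < zeta]


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [complex(x, y) for x in range(-3, 4) for y in range(-3, 4)]  # placeholder points
    thinned = thin_and_rescale(sample, 0.5, rng)
    print(len(sample), len(thinned))
```

With \(\zeta = 1\) no point is deleted and no rescaling takes place, so the input is returned unchanged.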

Recalling now (2.9) and (2.18), for the bulk-scaled limit of the thinned GinUE we have in particular

$$\begin{aligned} \rho _{(1),\infty }^\textrm{tGinUE}(z) = {1 \over \pi }, \qquad \rho _{(2),\infty }^{\textrm{tGinUE}, T}(w,z) = - {1 \over \pi ^2} e^{-|w - z |^2/\zeta }. \end{aligned}$$

From these functional forms we see

$$\begin{aligned} \int _{\mathbb C} \rho _{(2),\infty }^{\textrm{tGinUE}, T}(w,z) \, d^2z = - {\zeta \over \pi } \ne - \rho _{(1),\infty }^{\textrm{tGinUE}}(w) \qquad \textrm{unless} \, \zeta = 1, \end{aligned}$$

where the superscript “tGinUE” denotes the thinned GinUE. Equivalently, in terms of the structure factor (3.17),

$$\begin{aligned} S_\infty ^{\textrm{tGinUE}}(\textbf{0}) = {1 - \zeta \over \pi } \ne 0 \qquad \textrm{unless} \, \zeta = 1. \end{aligned}$$
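The two displayed evaluations can be confirmed numerically. The sketch below integrates the truncated two-point function in polar coordinates with a midpoint rule. It takes the zero-wavenumber structure factor to be \(\rho _{(1),\infty }\) plus this integral, an assumption about the normalisation in (3.17) which is consistent with the value displayed above. The function names and the quadrature parameters are ours.

```python
import math


def truncated_two_point_integral(zeta, r_max=12.0, steps=20_000):
    """Midpoint-rule value of the integral over C of rho_(2)^T for the thinned GinUE."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        # rho_(2)^T depends only on r = |w - z|; the area element is 2 pi r dr
        total += -math.exp(-r * r / zeta) / math.pi**2 * 2 * math.pi * r * h
    return total


def structure_factor_at_zero(zeta):
    """Bulk density 1/pi plus the integral of the truncated two-point function."""
    return 1 / math.pi + truncated_two_point_integral(zeta)


if __name__ == "__main__":
    for zeta in (0.3, 0.7, 1.0):
        print(zeta, structure_factor_at_zero(zeta), (1 - zeta) / math.pi)
```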

Due to this last fact, the O(1) scaled covariance for smooth linear statistics (3.18) is no longer true, and now reads instead

$$\begin{aligned} \textrm{Cov}^\textrm{tGinUE} \, \Big ( \sum f(\textbf{r}_l/R), \sum g(\textbf{r}_l/R) \Big ) \mathop {\sim }\limits _{R \rightarrow \infty } {R^2 \over (2 \pi )^2} {(1 - \zeta ) \over \pi } \int _{\mathbb R^2} \hat{f}(\textbf{k}) \hat{g}(- \textbf{k}) \, d \textbf{k}. \end{aligned}$$
(3.26)

This leading dependence on \(R^2\) holds too for the counting function \(f(z) = \chi _{|z| < 1}\), since in distinction to (3.18) the integral now converges. Hence, in the terminology of the text introduced below (3.9), the statistical state is no longer hyperuniform. There is an analogous change to the O(1) scaled covariance (3.21), which is now proportional to N and reads

$$\begin{aligned} \textrm{Cov}^\textrm{tGinUE} \Big ( \sum _{j=1}^N f(\textbf{r}_j/\sqrt{N}), \sum _{j=1}^N \bar{g}(\textbf{r}_j/\sqrt{N}) \Big ) \mathop {\sim }\limits _{N \rightarrow \infty } N {(1 - \zeta )\over \pi } \int _{ |\textbf{r}| < 1} f \bar{g} \, dx dy. \end{aligned}$$
(3.27)

Notwithstanding this difference, the corresponding limiting distribution function is still Gaussian [386]. A more subtle limit, also considered in [386], is when \(N \rightarrow \infty \) and \(\zeta \rightarrow 1^-\) simultaneously, with \(N(1-\zeta )\) fixed. The quantity (3.27) then returns to being O(1), but consists of a contribution of the form (3.21), and a term characteristic of a Poisson process; see also [455].

We turn our attention now to the probabilities \(\{ E_N^\textrm{tGUE}(k;D_{\alpha \sqrt{N}}) \}\). Consideration of the thinning prescription (3.25), the proof of Proposition 3.1, and (3.10) shows that the corresponding generating function is given by

$$\begin{aligned} \tilde{E}_N^\textrm{tGUE}(\xi ;D_{\alpha \sqrt{\zeta N}}) = \prod _{j=1}^N \bigg ( 1 - \xi \zeta {\gamma (j;\alpha ^2 N) \over \Gamma (j)} \bigg ). \end{aligned}$$
(3.28)

Setting \(\xi = 1\) in this gives the probability \( E_N^\textrm{tGUE}(0;D_{\alpha \sqrt{\zeta N}})\). Note that the implied formula shows \(E_N^\textrm{tGUE}(0;D_{\alpha \sqrt{\zeta N}}) = \tilde{E}_N^\textrm{GUE}(\zeta ;D_{\alpha \sqrt{ N}})\). The large N asymptotics of \(\tilde{E}_N^\textrm{GUE}(\zeta ;D_{\alpha \sqrt{ N}})\), and various generalisations, are available in the literature [116, 150]. Here we present a self-contained derivation of the first two terms (cf. (3.11)).
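For moderate N the product (3.28) can be evaluated directly. The Python sketch below does so under two stated assumptions. First, the generating function is taken to be \(\sum _k (1 - \xi )^k E(k)\), which is consistent with \(\xi = 1\) giving \(E(0)\) but should be checked against (3.10). Second, for integer j the ratio \(\gamma (j;x)/\Gamma (j)\) is computed as the probability that a Poisson variable of mean x is at least j. The function names are ours.

```python
import math


def poisson_upper_tail(j, x):
    """gamma(j; x) / Gamma(j) for integer j >= 1 and x > 0, i.e. P(Poisson(x) >= j)."""
    below = sum(math.exp(-x + k * math.log(x) - math.lgamma(k + 1)) for k in range(j))
    return min(1.0, max(0.0, 1.0 - below))


def counting_probabilities(N, alpha, zeta):
    """Coefficients [E(0), ..., E(N)] of (3.28) expanded in powers of (1 - xi)."""
    x = alpha**2 * N
    probs = [1.0]
    for j in range(1, N + 1):
        p = zeta * poisson_upper_tail(j, x)
        # Each factor is 1 - xi * p = (1 - p) + p * (1 - xi)
        new = [0.0] * (len(probs) + 1)
        for k, c in enumerate(probs):
            new[k] += c * (1 - p)
            new[k + 1] += c * p
        probs = new
    return probs


if __name__ == "__main__":
    probs = counting_probabilities(40, 0.5, 0.7)
    print("E(0) =", probs[0])
    print("mean =", sum(k * p for k, p in enumerate(probs)))
```

Under this reading the counting statistic is a sum of independent Bernoulli variables, which is why a simple polynomial multiplication suffices.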

Proposition 3.5

For large N and with \(0 < \alpha , \zeta < 1\) we have

$$\begin{aligned} E_N^\textrm{tGUE}(0;D_{\alpha \sqrt{\zeta N}}) = \exp \Big ( {\alpha ^2 N } \log (1 - \zeta ) + \sqrt{\alpha ^2 N } \, h(\zeta ) + \textrm{O}(1) \Big ), \end{aligned}$$
(3.29)

 where

$$\begin{aligned} h(\zeta ) & = \int _0^\infty \log \Big ( {1 - (\zeta /2) (1 + \textrm{erf}(t/\sqrt{2})) \over 1 - \zeta } \Big ) \, dt \nonumber \\ &\quad + \int _0^\infty \log \Big ( 1 - (\zeta /2) (1 - \textrm{erf}(t/\sqrt{2})) \Big ) \, dt. \end{aligned}$$
(3.30)

Proof

Our main tool is the uniform asymptotic expansion [529]

$$\begin{aligned} {\gamma (M-j+1;M) \over \Gamma (M-j+1)} \mathop {\sim }\limits _{M \rightarrow \infty } {1 \over 2} \Big ( 1 + \textrm{erf} \Big ( {j \over \sqrt{2M}} \Big ) \Big ); \end{aligned}$$
(3.31)

cf. the leading term in (2.20). Here it is known that the error term has the structure \((1/\sqrt{M})g(j/\sqrt{2M})\), where g(t) is integrable on \(\mathbb R\) and decays rapidly at infinity. This expansion suggests we rewrite the product in (3.28) with \(\xi = 1\) in the form

$$\begin{aligned} (1 - \zeta )^{[M^*]} \bigg ( \prod _{j=1}^{M^*} {1 - \zeta \gamma (j;M^*)/\Gamma (j) \over 1 - \zeta } \bigg ) \bigg ( \prod _{j=M^*+1}^{N} (1 - \zeta \gamma (j;M^*)/\Gamma (j)) \bigg ), \end{aligned}$$

where \(M^* = [\alpha ^2 N]\).

We see that the first factor in this expression gives the leading-order term in (3.29). In the first product we change labels \(j \mapsto M^* - j + 1\) \((j=1,\dots ,M^*)\). In the second we change labels \(j \mapsto M^* + j + 1\) (\(j=0,\dots ,N - M^* - 1\)). Writing both these products as exponentials of sums and applying (3.31) then gives, upon recognising the sums as Riemann sums with spacing \(1/\sqrt{M^*}\), the O\(( \sqrt{\alpha ^2 N } )\) term in (3.29), with the two products corresponding to the two integrals in (3.30). The stated structure of the error term in (3.31) ensures that its contribution is O(1).    \(\square \)

The leading term in (3.29) is consistent with the general form expected for thinned log-gas systems, being of the form of the area of the rescaled excluded region, times the density, times \(\log (1 - \zeta )\) [242, Conj. 10].
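A numerical sanity check of (3.29) is sketched below in Python. It compares the logarithm of the product (3.28) at \(\xi = 1\) with the two displayed terms. The assumptions are that \(\alpha ^2 N = M\) is an integer, that \(h(\zeta )\) is evaluated by a midpoint rule with the integrals in (3.30) truncated at \(t = 12\), and that the function names are ours. The printed residual is the quantity expected to remain bounded as N grows.

```python
import math


def log_hole_probability(N, M, zeta):
    """Logarithm of the product (3.28) at xi = 1, with alpha^2 N = M a positive integer."""
    total, below = 0.0, 0.0  # below accumulates P(Poisson(M) < j)
    for j in range(1, N + 1):
        below += math.exp(-M + (j - 1) * math.log(M) - math.lgamma(j))
        # gamma(j; M) / Gamma(j) = P(Poisson(M) >= j) = 1 - below
        total += math.log(1 - zeta * max(0.0, 1.0 - below))
    return total


def h(zeta, t_max=12.0, steps=24_000):
    """Midpoint-rule evaluation of the two integrals in (3.30)."""
    dt = t_max / steps
    total = 0.0
    for i in range(steps):
        e = math.erf((i + 0.5) * dt / math.sqrt(2))
        total += math.log((1 - 0.5 * zeta * (1 + e)) / (1 - zeta)) * dt
        total += math.log(1 - 0.5 * zeta * (1 - e)) * dt
    return total


def residual(N, M, zeta):
    """Exact logarithm minus the two leading terms of (3.29)."""
    return log_hole_probability(N, M, zeta) - M * math.log(1 - zeta) - math.sqrt(M) * h(zeta)


if __name__ == "__main__":
    for N, M in ((400, 100), (1600, 400)):
        print(N, M, residual(N, M, 0.5))
```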

Remark 3.6

1. The topic of spatial modelling using Ginibre eigenvalues naturally leads to questions on the efficient simulation of the point process confined to a compact subset of \(\mathbb C\). Practical algorithms for this task have been given in [181, 182].

2. Beyond Voronoi cells as a geometrical measure, a topological measure known as persistence diagrams has been introduced in the context of Ginibre point processes in [291].

3. A question of long-standing interest is the spectrum of an adjacency matrix for a random graph or network; see for example [196]. In the particular case of directed random networks, for which the adjacency matrix is asymmetric, relationships to bulk GinUE spectral statistics have been found [460, 537].