
The maximum of $X_1,\dots,X_n \sim$ i.i.d. standard normals converges to the standard Gumbel distribution, according to Extreme Value Theory.

How can we show that?

We have

$$P(\max X_i \leq x) = P(X_1 \leq x, \dots, X_n \leq x) = P(X_1 \leq x) \cdots P(X_n \leq x) = F(x)^n $$

We need to find/choose sequences of constants $a_n>0$, $b_n\in\mathbb{R}$ such that: $$F\left(a_n x+b_n\right)^n \xrightarrow{\,n\rightarrow\infty\,} G(x) = e^{-\exp(-x)}$$

Can you solve it, or find it in the literature?

There are some worked examples (p. 6/71), but not for the Normal case:

$$\Phi\left(a_n x+b_n\right)^n=\left(\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{a_n x+b_n} e^{-\frac{y^2}{2}}dy\right)^n\rightarrow e^{-\exp(-x)}$$
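Before any derivation, the claimed convergence can be checked numerically. The sketch below uses one classical choice of normalizing constants from the extreme-value literature, $a_n = (2\ln n)^{-1/2}$ and $b_n = \sqrt{2\ln n} - \frac{\ln\ln n + \ln 4\pi}{2\sqrt{2\ln n}}$ (an assumption of this sketch, not part of the question), and measures the sup-distance between $\Phi(a_n x+b_n)^n$ and $e^{-e^{-x}}$ on a grid:

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal CDF

def gumbel_cdf(x):
    return exp(-exp(-x))

def classical_ab(n):
    # One classical choice of normalizing constants for Normal maxima
    # (an assumption of this sketch, standard in the EV literature):
    c = sqrt(2 * log(n))
    return 1 / c, c - (log(log(n)) + log(4 * pi)) / (2 * c)

grid = [i / 10 for i in range(-20, 51)]
for n in (10**2, 10**4, 10**8):
    a, b = classical_ab(n)
    # sup-distance between Phi(a x + b)^n and the Gumbel CDF on the grid
    err = max(abs(Phi(a * x + b) ** n - gumbel_cdf(x)) for x in grid)
    print(n, round(err, 4))
```

The sup-distance shrinks on the order of $1/\log n$, which is why the convergence, though real, is notoriously slow.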


3 Answers


An indirect way is as follows:
For absolutely continuous distributions, Richard von Mises (in a 1936 paper, "La distribution de la plus grande de n valeurs", which appears to have been reproduced, apparently in English, in a 1964 edition of his selected papers) provided the following sufficient condition for the maximum of a sample to converge to the standard Gumbel, $G(x)$:

Let $F(x)$ be the common distribution function of $n$ i.i.d. random variables, and $f(x)$ their common density. Then, if

$$\lim_{x\rightarrow F^{-1}(1)}\left (\frac d{dx}\frac {(1-F(x))}{f(x)}\right) =0 \Rightarrow X_{(n)} \xrightarrow{d} G(x)$$

Using the usual notation for the standard normal and calculating the derivative, we have

$$\frac d{dx}\frac {(1-\Phi(x))}{\phi(x)} = \frac {-\phi(x)^2-\phi'(x)(1-\Phi(x))}{\phi(x)^2} = \frac {-\phi'(x)}{\phi(x)}\frac {(1-\Phi(x))}{\phi(x)}-1$$

Note that $\frac {-\phi'(x)}{\phi(x)} =x$. Also, for the normal distribution, $F^{-1}(1) = \infty$. So we have to evaluate the limit

$$\lim_{x\rightarrow \infty}\left (x\frac {(1-\Phi(x))}{\phi(x)}-1\right) $$

But $\frac {(1-\Phi(x))}{\phi(x)}$ is Mills' ratio, and we know that the Mills' ratio for the standard normal is asymptotically $1/x$ as $x$ grows. So

$$\lim_{x\rightarrow \infty}\left (x\frac {(1-\Phi(x))}{\phi(x)}-1\right) = 1 - 1 = 0$$

and the sufficient condition is satisfied.
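This Mills'-ratio limit is easy to check numerically; a minimal sketch using only the Python standard library:

```python
from statistics import NormalDist

N = NormalDist()  # standard normal: N.cdf is Phi, N.pdf is phi

# x * (1 - Phi(x)) / phi(x) should approach 1 as x grows, i.e. the
# Mills' ratio (1 - Phi(x))/phi(x) behaves like 1/x in the right tail.
xs = (1.0, 2.0, 4.0, 6.0)
vals = [x * (1 - N.cdf(x)) / N.pdf(x) for x in xs]
for x, v in zip(xs, vals):
    print(x, round(v, 4))
```

The printed values increase monotonically toward $1$, consistent with the asymptotic expansion of the normal tail.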

The associated sequences are given as $$a_n = \frac 1{n\phi(b_n)},\;\;\; b_n = \Phi^{-1}(1-1/n)$$
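These sequences can also be checked numerically: $\Phi(a_n x + b_n)^n$ with $b_n = \Phi^{-1}(1-1/n)$ and $a_n = 1/(n\phi(b_n))$ approaches the Gumbel CDF. A sketch using only the standard library:

```python
from math import exp
from statistics import NormalDist

N = NormalDist()  # standard normal

def gumbel_cdf(x):
    return exp(-exp(-x))

def mises_ab(n):
    b = N.inv_cdf(1 - 1 / n)   # b_n = Phi^{-1}(1 - 1/n)
    a = 1 / (n * N.pdf(b))     # a_n = 1/(n * phi(b_n))
    return a, b

for n in (10**2, 10**4, 10**8):
    a, b = mises_ab(n)
    # sup-distance to the Gumbel CDF on a grid of x values
    err = max(abs(N.cdf(a * x + b) ** n - gumbel_cdf(x))
              for x in [i / 10 for i in range(-20, 51)])
    print(n, round(err, 4))
```

Note that at $x=0$ the fit is exact by construction up to a limit: $\Phi(b_n)^n = (1-1/n)^n \rightarrow e^{-1} = G(0)$.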

ADDENDUM

This is from ch. 10.5 of the book H.A. David & H.N. Nagaraja (2003), "Order Statistics" (3rd edition).

Here $\xi_a = F^{-1}(a)$. Also, the de Haan reference is: de Haan, L. (1976), "Sample extremes: an elementary introduction", Statistica Neerlandica, 30(4), 161-172. But beware, because some of the notation has a different meaning in de Haan: for example, in the book $f(t)$ is the probability density function, while in de Haan $f(t)$ denotes the function $w(t)$ of the book (i.e. Mills' ratio). Also, de Haan examines the sufficient condition already differentiated.

[Image: excerpt from ch. 10.5 of David & Nagaraja, "Order Statistics"]

  • I'm not quite sure I understood your solution. So you took $F$ to be the standard normal CDF. I followed through and agree that the sufficient condition is satisfied. But how are the associated sequences $a_n$ and $b_n$ all of a sudden given by those? (Jul 7, 2014 at 14:16)
  • @renrenthehamster I think these two parts are independently stated (no direct connection). – emcor (Jul 7, 2014 at 15:12)
  • And so how might the associated sequences be obtained? Anyway, I opened a question about this issue (and more generally, for other distributions beyond the standard normal). (Jul 7, 2014 at 15:29)
  • @renrenthehamster I have added relevant material. I don't believe there is a standard recipe for all cases to find these sequences. (Jul 7, 2014 at 16:10)

The question asks two things: (1) how to show that the maximum $X_{(n)}$ converges, in the sense that $(X_{(n)}-b_n)/a_n$ converges (in distribution) for suitably chosen sequences $(a_n)$ and $(b_n)$, to the Standard Gumbel distribution and (2) how to find such sequences.

The first is well-known and documented in the original papers on the Fisher-Tippett-Gnedenko theorem (FTG). The second appears to be more difficult; that is the issue addressed here.

Please note, to clarify some assertions appearing elsewhere in this thread, that

  1. The maximum does not converge to anything: it diverges (albeit extremely slowly).

  2. There appear to be different conventions concerning the Gumbel distribution. I will adopt the convention that the CDF of a reversed Gumbel distribution is, up to scale and location, given by $1-\exp(-\exp(x))$. A suitably standardized maximum of iid Normal variates converges to a reversed Gumbel distribution.


Intuition

When the $X_i$ are iid with common distribution function $F$, the distribution of the maximum $X_{(n)}$ is

$$F_n(x) = \Pr(X_{(n)}\le x) = \Pr(X_1 \le x)\Pr(X_2 \le x) \cdots \Pr(X_n \le x) = F^n(x).$$

When the support of $F$ has no upper bound, as with a Normal distribution, the sequence of functions $F^n$ marches forever to the right without limit:

Figure 1

Partial graphs of $F_n$ for $n=1,2,2^2, 2^4, 2^8, 2^{16}$ are shown.

To study the shapes of these distributions, we can shift each one back to the left by some amount $b_n$ and rescale it by $a_n$ to make them comparable.

Figure 2

Each of the previous graphs has been shifted to place its median at $0$ and to make its interquartile range of unit length.

FTG asserts that sequences $(a_n)$ and $(b_n)$ can be chosen so that these distribution functions converge pointwise at every $x$ to some extreme value distribution, up to scale and location. When $F$ is a Normal distribution, the particular limiting extreme value distribution is a reversed Gumbel, up to location and scale.


Solution

It is tempting to emulate the Central Limit Theorem by standardizing $F_n$ to have zero mean and unit variance. This is inappropriate, though, in part because FTG applies even to (continuous) distributions that have no first or second moments. Instead, use a percentile (such as the median) to determine the location and a difference of percentiles (such as the IQR) to determine the spread. (This general approach should succeed in finding $a_n$ and $b_n$ for any continuous distribution.)

For the standard Normal distribution, this turns out to be easy! Let $0 \lt q \lt 1$. A quantile of $F_n$ corresponding to $q$ is any value $x_q$ for which $F_n(x_q) = q$. Recalling the definition of $F_n(x) = F^n(x)$, the solution is

$$x_{q;n} = F^{-1}(q^{1/n}).$$

Therefore we may set

$$b_n = x_{1/2;n},\ a_n = x_{3/4;n} - x_{1/4;n};\ G_n(x) = F_n(a_n x + b_n).$$
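These quantile computations are immediate with the Python standard library (`statistics.NormalDist` supplies $F$ and $F^{-1}$ for the standard Normal); a sketch:

```python
from statistics import NormalDist

N = NormalDist()  # standard normal: N.cdf is F, N.inv_cdf is F^{-1}

def x_q(q, n):
    # Quantile of F_n = F^n: solve F(x)**n == q, i.e. x = F^{-1}(q**(1/n)).
    return N.inv_cdf(q ** (1 / n))

def ab(n):
    b = x_q(1/2, n)                 # median of F_n
    a = x_q(3/4, n) - x_q(1/4, n)   # IQR of F_n
    return a, b

for n in (10, 10**3, 10**5):
    a, b = ab(n)
    # By construction, G_n(x) = F(a*x + b)**n has median 0 and IQR 1;
    # G_n(0) = F(b)**n should equal 1/2.
    print(n, round(N.cdf(b) ** n, 6), round(a, 4), round(b, 4))
```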

Because, by construction, the median of $G_n$ is $0$ and its IQR is $1$, the median of the limiting value of $G_n$ (which is some version of a reversed Gumbel) must be $0$ and its IQR must be $1$. Let the scale parameter be $\beta$ and the location parameter be $\alpha$. Since the median is $\alpha + \beta \log\log(2)$ and the IQR is readily found to be $\beta(\log\log(4) - \log\log(4/3))$, the parameters must be

$$\alpha = \frac{\log\log 2}{\log\log(4/3) - \log\log(4)};\ \beta = \frac{1}{\log\log(4) - \log\log(4/3)}.$$
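These parameter values can be verified directly. The sketch below encodes the reversed-Gumbel CDF in the convention stated above, $G(x) = 1-\exp(-\exp((x-\alpha)/\beta))$ (that convention, not the code, is the assumption here), and checks that the median is $0$ and the IQR is $1$:

```python
from math import exp, log

# Convention from this answer: reversed-Gumbel CDF
#   G(x) = 1 - exp(-exp((x - alpha)/beta))
beta = 1 / (log(log(4)) - log(log(4/3)))
alpha = log(log(2)) / (log(log(4/3)) - log(log(4)))

def G_inv(q):
    # Invert G: exp((x - alpha)/beta) = log(1/(1 - q))
    return alpha + beta * log(log(1 / (1 - q)))

print("alpha =", round(alpha, 4), " beta =", round(beta, 4))
print("median:", G_inv(0.5))
print("IQR:   ", G_inv(0.75) - G_inv(0.25))
```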

It is not necessary for $a_n$ and $b_n$ to be exactly these values: they need only approximate them, provided the limit of $G_n$ is still this reversed Gumbel distribution. Straightforward (but tedious) analysis for a standard normal $F$ indicates that the approximations

$$a_n^\prime = \frac{\log \left(\left(4 \log^2(2)\right)/\left(\log^2\left(\frac{4}{3}\right)\right)\right)}{2\sqrt{2\log (n)}},\ b_n^\prime = \sqrt{2\log (n)}-\frac{\log (\log (n))+\log \left(4 \pi \log ^2(2)\right)}{2 \sqrt{2\log (n)}}$$

will work fine (and are as simple as possible).

Figure 3

The light blue curves are partial graphs of $G_n$ for $n=2, 2^6, 2^{11}, 2^{16}$ using the approximate sequences $a_n^\prime$ and $b_n^\prime$. The dark red line graphs the reversed Gumbel distribution with parameters $\alpha$ and $\beta$. The convergence is clear (although the rate of convergence for negative $x$ is noticeably slower).


References

B. V. Gnedenko, On The Limiting Distribution of the Maximum Term in a Random Series. In Kotz and Johnson, Breakthroughs in Statistics Volume I: Foundations and Basic Theory, Springer, 1992. Translated by Norman Johnson.

  • @Vossler The formula in Alecos's post for $a_n$ converges to $0$ as $n\to\infty$. It behaves like $\left(2 \log(n) - \log(2\pi)\right)^{-1/2}$ for large $n$. – whuber (Mar 16, 2016 at 17:42)
  • Yes, that's true; I realized this shortly after I posted my comment, so I deleted it immediately. Thank you! – Vossler (Mar 16, 2016 at 17:49)
  • @Jess I had hoped that this answer would be understood as showing, among other things, that there is no such thing as "the" formula: there are uncountably many correct formulas for the $a_n$ and $b_n$. – whuber (Oct 23, 2019 at 21:11)
  • @Jess That's better, because demonstrating an alternative approach was the motivation to write this answer. I don't understand your insinuation that I considered it "useless to write down an answer," because that's explicitly what I have done here. – whuber (Oct 23, 2019 at 21:21)
  • @Jess I cannot continue this conversation because it's entirely one-sided: I have yet to recognize anything I have written in any of your characterizations. I'm quitting while I'm behind. – whuber (Oct 23, 2019 at 21:39)

Here is a "direct" approach. Let $a_n > 0$ and $b_n$ be constants, to be determined so that $a_n x+b_n \rightarrow +\infty$ for all $x$.

By L'Hôpital's rule, $$ \lim_{A \rightarrow \infty} \frac{\int_A^{+\infty} e^{-u^2/2}\, du}{A^{p} e^{-A^2/2}} = 1 $$ when $p=-1$, so we have: $$ F(a_n x + b_n) = 1 - \frac{1}{\sqrt{2\pi}}\, \frac{e^{-(a_nx+b_n)^2/2}}{a_nx+b_n}(1+o(1)) $$ where $o(1) \rightarrow 0$ under the running assumption on $a_n, b_n$.

Then $$ \begin{align*} \ln\left( F(a_n x + b_n)^n \right) &= n\, \ln\left( 1 - \frac{1}{\sqrt{2\pi}}\, \frac{e^{-(a_nx+b_n)^2/2}}{a_nx+b_n}(1+o(1))\right) \\ & = - \frac{n}{\sqrt{2\pi}}\, \frac{e^{-(a_nx+b_n)^2/2}}{a_nx+b_n}(1+o(1)) \\ & = - \frac{n}{\sqrt{2\pi}}\, \frac{e^{-a_n^2x^2/2-b_n^2/2 - a_nb_nx}}{a_nx+b_n}(1+o(1)) \end{align*} $$ If we take $a_n = 1/b_n \rightarrow 0^+$, the required assumption $a_nx+b_n \rightarrow +\infty$ for all $x$ is satisfied, and:

$$ \ln\left( F(a_n x + b_n)^n \right) = - \frac{n}{\sqrt{2\pi}\, b_n \,e^{b_n^2/2} }\, e^{-x}(1+o(1)) $$

Now all that remains is to show that one can choose $b_n$ so that $$ b_n \,e^{b_n^2/2} ~=~ \frac{n}{\sqrt{2\pi}}(1+o(1)), $$ which is clearly feasible.

Solving this equation "asymptotically" is amusing. Obviously the dominant factor on the left is $e^{b_n^2/2}$, which suggests $b_n \approx \sqrt{2\ln(n)}$. After some trial and error, one possible explicit solution is:

$$ b_n = \sqrt{ \ln\!\left( \frac{n^2}{4\pi\, \ln(n)} \right) } = \sqrt{2 \ln(n) \left( 1 - \frac{\ln(4\pi\,\ln(n))}{2\, \ln(n)} \right)} $$
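One can confirm numerically that this explicit $b_n$ satisfies the required relation $b_n e^{b_n^2/2} = \frac{n}{\sqrt{2\pi}}(1+o(1))$: the ratio of the two sides tends to $1$, although (as with everything logarithmic here) very slowly. A standard-library sketch:

```python
from math import exp, log, pi, sqrt

def b(n):
    # Explicit choice: b_n = sqrt(ln(n^2 / (4*pi*ln n)))
    return sqrt(log(n * n / (4 * pi * log(n))))

ratios = []
for n in (1e6, 1e12, 1e100):
    # Ratio of b_n * exp(b_n^2 / 2) to n / sqrt(2*pi); should creep toward 1.
    r = b(n) * exp(b(n) ** 2 / 2) / (n / sqrt(2 * pi))
    ratios.append(r)
    print(f"n = {n:.0e}  ratio = {r:.4f}")
```

Algebraically the ratio equals $\sqrt{1 - \ln(4\pi\ln n)/(2\ln n)}$, which explains the $O(1/\ln n)$ creep toward $1$.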

