
I am having some trouble proving Corollary 6.3.2 in Borovkov's Probability Theory (for reference, this material is on pages 147 to 149 in the book). For convenience, I provide some definitions and related theorems. Skip to the end for a TLDR.

Definition: We say a function $G$ is a generalized distribution function if it is nondecreasing and right-continuous. We denote the class of generalized distribution functions by $\mathcal{G}$ and the class of distribution functions by $\mathcal{F}$. Of course, $\mathcal{F}\subseteq \mathcal{G}$ and the only difference is that distribution functions have $\lim_{x\to\infty}F(x) = 1$ and $\lim_{x\to-\infty}F(x) = 0$.

Definition: We say that a sequence of generalized distribution functions $\{G_n\}\subseteq\mathcal{G}$ converges weakly to some $G\in \mathcal{G}$ if, for all points of continuity $x\in\mathbb{R}$ of $G$, we have $G_n(x)\to G(x)$.

Note that the above definition is not as "nice" as weak convergence of distribution functions. Recall that, for $\{F_n\}\subseteq \mathcal{F}$ and $F\in\mathcal{F}$, the condition $F_n(x)\to F(x)$ at each point of continuity of $F$ is equivalent to the condition that, for all bounded continuous functions $f$, we have $$\int f\ \mathrm{d}F_n \to \int f\ \mathrm{d}F.$$ However, this equivalence is not true for the case where $G_n,G\in\mathcal{G}$. Some extra definitions:

Definition: A sequence of probability measures $\{\mathbb{F}_n\}_{n=1}^\infty$ is said to be tight if, for all $\varepsilon>0$, there exists some $N\in\mathbb{N}$ such that $$\inf_n \mathbb{F}_n([-N,N]) > 1- \varepsilon.$$

Definition: A class of bounded continuous functions $\mathcal{L}$ is said to be distribution determining if, for $F\in \mathcal{F}$ and $G\in \mathcal{G}$, $$\int f\ \mathrm{d}F = \int f\ \mathrm{d}G \qquad (\forall f\in \mathcal{L})$$ implies that $F=G$.

The book then presents a variation of Helly's selection theorem and a corollary:

Theorem (Helly's Selection Theorem): Let $\{G_n\}_{n=1}^\infty\subseteq \mathcal{G}$ be a sequence of generalized distribution functions. Then there exist a subsequence $\{G_{n_k}\}_{k=1}^\infty$ and some $G\in \mathcal{G}$ such that $G_{n_k}$ converges weakly to $G$. That is, the space $\mathcal{G}$ is sequentially compact with respect to weak convergence.

Corollary: If every convergent subsequence of a sequence $G_n$ converges weakly to the same $G\in\mathcal{G}$ then the entire sequence $G_n$ converges weakly to $G$.

The following theorem is true:

Theorem: Let $\mathcal{L}$ be a distribution determining class and $\{\mathbb{F}_n\}_{n=1}^\infty$ be a sequence of probability measures. Then $\mathbb{F}_n$ converges weakly to some probability measure $\mathbb{F}$ if and only if the sequence $\{\mathbb{F}_n\}_{n=1}^\infty$ is tight and $\lim_n \int f\ \mathrm{d}\mathbb{F}_n$ exists for all $f\in \mathcal{L}$.

I will provide a proof of the converse that basically outlines Borovkov's proof:

Proof. By Helly's selection theorem, there exists a subsequence of distribution functions (corresponding to the probability measures) $F_{n_k}$ that converges weakly to some $F\in \mathcal{G}$. Now, let $\varepsilon>0$. By the tightness of $\{\mathbb{F}_{n_k}\}$, we can find some $M$ such that, for all $x\geq M$, we have $$\inf_k F_{n_k}(x) \geq \inf_k \mathbb{F}_{n_k}([-M,M]) > 1- \varepsilon.$$ Let $x$ be a point of continuity of $F$ with $x\geq M$ (such a point exists because $F$ has only countably many points of discontinuity). We then have $$F(x) = \lim_k F_{n_k}(x) \geq \inf_k F_{n_k}(x) >1- \varepsilon.$$ Further, by monotonicity, $F(y) > 1- \varepsilon$ for all $y\geq x$. Since $\varepsilon$ was arbitrary and $F\leq 1$, we have $\lim_{x\to\infty}F(x) = 1$. A similar argument shows that $\lim_{x\to-\infty}F(x) = 0$. Hence $F\in \mathcal{F}$, that is, $F$ is actually a distribution function.

It remains to show that the entire sequence $\mathbb{F}_n$ converges weakly to the $\mathbb{F}$ (given by the distribution function $F$) above. By the Corollary to Helly's selection theorem, it suffices to show that, for an arbitrary subsequence $\{F_{n_j}\}_{j=1}^\infty$ that converges weakly to some $G\in \mathcal{G}$, we have $F=G$. First, notice that the above argument ensures the limit function is actually a distribution function, so we actually have $G\in\mathcal{F}$.

Let $\mathbb{F}$ and $\mathbb{G}$ denote the probability measures induced by distribution functions $F$ and $G$, respectively. Since $F_{n_j}$ converges weakly to $G \in\mathcal{F}$, we have, for all $f\in \mathcal{L}$, $$\lim_j \int f\ \mathrm{d}\mathbb{F}_{n_j} = \int f\ \mathrm{d}\mathbb{G}.$$ Further, the weak convergence of our original subsequence $\{\mathbb{F}_{n_k}\}_{k=1}^\infty$ to $\mathbb{F}$ gives us that, for all $f\in \mathcal{L}$, $$\lim_k \int f\ \mathrm{d}\mathbb{F}_{n_k} = \int f\ \mathrm{d}\mathbb{F}.$$ Since $\lim_n \int f\ \mathrm{d}\mathbb{F}_n$ exists for all $f\in \mathcal{L}$, all its subsequences must converge to the same limit, which gives us $$\int f\ \mathrm{d}\mathbb{F} = \lim_k \int f\ \mathrm{d}\mathbb{F}_{n_k} = \lim_j \int f\ \mathrm{d}\mathbb{F}_{n_j} = \int f\ \mathrm{d}\mathbb{G}.$$ As the above equality holds for all $f\in \mathcal{L}$, a distribution determining class, we have $\mathbb{F} = \mathbb{G}$ and thus $F=G$, as desired.

However, I am having trouble proving the following corollary (copied and pasted straight from the text): [image: statement of Corollary 6.3.2]

I am having trouble showing that (2) is sufficient. The argument they gave appears to be along the lines of the following (just basing this off the proof of the previous theorem):

Consider the sequence $F_n$. By Helly's selection theorem, some subsequence $F_{n_k}$ converges weakly to some $G\in \mathcal{G}$. Further, for each such weakly convergent subsequence, we have $$\lim_{k\to\infty}\int f\ \mathrm{d}F_{n_k} = \int f\ \mathrm{d}F.$$ Now, it would be nice if we could also write (I think the author just assumes this, but it does not appear to be true to me) $$\lim_{k\to\infty}\int f\ \mathrm{d}F_{n_k} = \int f\ \mathrm{d}G.$$ If that were true, then we would be done, because it would imply $G=F$, so every weakly convergent subsequence of $F_n$ converges to the same distribution function. But we cannot do that, because $G\in \mathcal{G}$ and we know that weak convergence (defined as pointwise convergence at all points of continuity of the limit function) is no longer equivalent to convergence of integrals.
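To make the obstruction concrete, here is a minimal Python sketch, assuming the illustrative choice $F_n = \mathbf{1}_{[n,\infty)}$ (the point mass at $n$). The helper names `F` and `total_mass` are hypothetical and only serve the illustration. At every fixed $x$ the values $F_n(x)$ are eventually $0$, so the weak limit in $\mathcal{G}$ is $G\equiv 0$, yet $\int 1\ \mathrm{d}F_n = 1$ for every $n$ while $\int 1\ \mathrm{d}G = 0$.

```python
def F(n, x):
    """Distribution function of the point mass at n: F_n(x) = 1 if x >= n, else 0."""
    return 1.0 if x >= n else 0.0


def total_mass(n):
    """Integral of the constant function 1 against dF_n, i.e. F_n(+inf) - F_n(-inf)."""
    return F(n, float("inf")) - F(n, float("-inf"))


# Pointwise values at a few fixed x for a large index n: all are 0,
# matching the weak limit G = 0 (which has total mass 0).
xs = [-10.0, 0.0, 3.5, 100.0]
values_for_large_n = [F(10**6, x) for x in xs]

# The total mass stays equal to 1 along the whole sequence.
masses = [total_mass(n) for n in (1, 10, 10**6)]

print(values_for_large_n)
print(masses)
```

The constant function $1$ is bounded and continuous, so this shows that pointwise convergence at continuity points does not by itself give convergence of integrals once the limit is allowed to lose mass.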

TLDR: I am having trouble proving that (2) is sufficient in the above corollary, and the hint the author gave is confusing.

  I looked up Helly's selection theorem on wikipedia and it states that the functions need to be uniformly bounded. Are we assuming elements of $\mathcal{G}$ are uniformly bounded by some constants $a$ and $b$, or is there some stronger version of Helly's Theorem?
  Ok, I looked up the book. They are assuming that for any $G \in\mathcal{G}$, $0\leq G\leq 1$, so that gives us the uniform bound.
  For a family $\mathcal{G}$ of uniformly bounded generalized distributions, every subsequence has a subsequence that converges vaguely (i.e. the test functions are continuous functions with bounded support); this is basically Alaoglu's theorem. If, in addition, $\mathcal{G}$ is tight, then the convergence is weak (test functions are bounded continuous functions).
    – Mittens
    Commented Apr 4, 2021 at 18:17

Let $\mathcal{L} = \{f \in C_b(\mathbb{R}) : f \text{ has period $k$ for some }k \in \mathbb{N}\}$. This family is distribution determining because sub-probability measures on $\mathbb{R}$ are inner regular. The basic idea is that you can approximate the integral of $f \in C_c(\mathbb{R})$ by one of these functions in both $L^1(\eta)$ and $L^1(\mu)$ simultaneously by working on a compact set $K$ which contains the support of $f$ and for which $\eta(K^c),\mu(K^c)< \epsilon$. I can add more details if you need them. Note also that $1 \in \mathcal{L}$.

Call $k_f$ the minimal positive integer period of $f$ and let $\mu_n(dx) = \delta_{n!}(dx)$. For each function $f \in \mathcal{L}$ and all $n \geq k_f$, since $k_f$ divides $n!$, we have $\int f(x)\mu_n(dx) = f(n!) = f(0) = \int f(x) \delta_0(dx)$, so $\lim_n \int f\ d\mu_n$ exists and equals $\int f\ d\delta_0$. Clearly $\delta_0$ is a probability measure.

On the other hand, the family $\mu_n$ converges to the zero measure vaguely and in particular is not tight.
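A small numerical sanity check of this counterexample, as a sketch only: it assumes the test functions $f(x)=\cos(2\pi x/k)$ as representatives of period-$k$ elements of $\mathcal{L}$, and the function names are hypothetical. The argument is reduced mod $k$ in exact integer arithmetic so that the size of $n!$ causes no floating-point trouble.

```python
import math


def f(x_int, k):
    """cos(2*pi*x/k) at an integer x: bounded, continuous, with integer period k.

    x is reduced mod k first (exact integer arithmetic), so huge values
    such as n! do not lose precision.
    """
    return math.cos(2 * math.pi * (x_int % k) / k)


def integral_against_mu_n(n, k):
    """Integral of f against mu_n = delta_{n!}, which is just f(n!)."""
    return f(math.factorial(n), k)


def mass_on_interval(n, N):
    """mu_n([-N, N]) for mu_n = delta_{n!}: 1 if n! <= N, else 0."""
    return 1 if math.factorial(n) <= N else 0


# For n >= k the integral already equals f(0), the integral against delta_0.
for k in range(1, 8):
    for n in range(k, k + 6):
        assert abs(integral_against_mu_n(n, k) - f(0, k)) < 1e-12

# Yet the mass escapes every bounded interval, so the family is not tight.
print([mass_on_interval(n, 10**6) for n in range(1, 15)])
```

For any fixed $N$ the printed list is eventually all zeros, which is the failure of tightness, while the integrals against every period-$k$ test function have already stabilised at $f(0)$.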

  Nice and simple counterexample.

Basically I was able to prove my case of interest, which is just Lévy's continuity theorem. It's easy to see that Lévy's theorem falls out of this corollary because $\{e^{itx}\}_{t\in\mathbb{R}}$ is a distribution determining class. However, this is a "special" distribution determining class because, as it turns out, if $\varphi_n(t)\to\varphi(t)$, where $\varphi_n$ and $\varphi$ are the characteristic functions of $\mathbb{F}_n$ and $\mathbb{F}$, respectively, one can prove that $\{\mathbb{F}_n\}$ is a tight sequence of measures. The proof of tightness relies on the fact that $\varphi(t)$ is a continuous function; in particular, it is continuous at $0$. Of course, this is a very special case of the Corollary and I'm not sure how far it generalizes.

Edit: David Williams' Probability with Martingales proves this small bit in the Central Limit Theorem chapter.
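One standard route to this tightness (stated here as an assumption about which argument is meant, not necessarily the one in Borovkov or Williams) is the tail estimate $\mathbb{F}(\{x : |x| > 2/u\}) \le \frac{1}{u}\int_{-u}^{u} (1-\varphi(t))\ \mathrm{d}t$ for $u>0$. The Python sketch below checks it numerically for centred normal laws and shows that the bound stays large for a point mass far from the origin. All function names are hypothetical, and the integral is approximated with a plain midpoint rule.

```python
import cmath
import math


def levy_tail_bound(phi, u, steps=20000):
    """Midpoint-rule approximation of (1/u) * int_{-u}^{u} (1 - Re phi(t)) dt."""
    h = 2 * u / steps
    total = 0.0
    for i in range(steps):
        t = -u + (i + 0.5) * h
        total += (1 - phi(t)).real * h
    return total / u


def normal_cf(sigma):
    """Characteristic function of N(0, sigma^2)."""
    return lambda t: math.exp(-0.5 * (sigma * t) ** 2)


def normal_tail(a, sigma):
    """P(|X| > a) for X ~ N(0, sigma^2)."""
    return math.erfc(a / (sigma * math.sqrt(2)))


def point_mass_cf(n):
    """Characteristic function of the point mass at n."""
    return lambda t: cmath.exp(1j * t * n)


# The tail mass beyond 2/u is controlled by the behaviour of phi near 0.
for sigma in (0.5, 1.0, 3.0):
    for u in (0.1, 1.0, 5.0):
        assert normal_tail(2 / u, sigma) <= levy_tail_bound(normal_cf(sigma), u) + 1e-9

# For a point mass far out, the bound does not become small even for small u.
print(levy_tail_bound(point_mass_cf(10**4), 0.01))
```

When the limit $\varphi$ is continuous at $0$, the right-hand side can be made small uniformly in large $n$ by choosing $u$ small, which is what yields tightness. The point-mass case illustrates how this fails when mass escapes to infinity.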


Edit: The initial argument I suggested was incorrect. I am not deleting the answer because it has accumulated some relevant comments. See the answer by Chris Janjigian that gives a counterexample to Cor 6.3.2 parts (2) and (3).

  I don't think it is true that if $F_{n_k}$ converges weakly to a generalized distribution $G$, you have the convergence of integrals (a fact that OP stated in their question). (One can think of $F_n$ being delta masses at $n$, which weakly converge to the generalized distribution that puts 0 mass on all reals, but $\int 1 dF_n \not \rightarrow \int 1 dG$)
    – E-A
    Commented Apr 3, 2021 at 7:33
    can you explain your specific situation further as your answer if you get the chance? will definitely upvote it (and appreciate it) as I spent quite a bit thinking about this and even a special, relevant case could be interesting. I do believe theorem as stated is false (if only for the simple reason that Durrett or Chung would have included it); I think some additional property should be imposed on L precisely to allow these limits.
    – E-A
    Commented Apr 4, 2021 at 5:00
    sure, but I am not so sure as to whether you'll find it super enlightening as it's basically just a part of Levy's continuity theorem.
    – varpi
    Commented Apr 4, 2021 at 5:10
    added a brief outline of my special case and solution
    – varpi
    Commented Apr 4, 2021 at 5:17
    much appreciated; you were right; I am not super enlightened since I went over that case trying to answer your question. I believe the theorem as stated is false, but I could not come up with a class L that violates it.
    – E-A
    Commented Apr 4, 2021 at 5:23

