0
$\begingroup$

Introduction (part 1). In the following excerpts from Villani (2008), Optimal transport, old and new, Villani (i) defines the Wasserstein distance between two probability measures $\mu$ and $\nu$ on a ${\color{red}{\textbf{Polish space}}}$ $(\mathcal{X}, d)$, where $d$ is (or should be) a metric, and (ii) shows a relationship between the Wasserstein distance and the total variation distance. Therefore, for both the Wasserstein distance and the total variation distance, the two probability measures $\mu$ and $\nu$ are defined on a ${\color{red}{\textbf{Polish space}}}$ $(\mathcal{X}, d)$:

Page 10:

All measures considered in the text are Borel measures on ${\color{red}{\textbf{Polish spaces}}}$, which are complete, separable metric spaces, equipped with their Borel $\sigma$-algebra.

Page 105:

Definition 6.1 (Wasserstein distances).

Let $(\mathcal{X}, d)$ be a ${\color{red}{\textbf{Polish metric space}}}$, and let $p \in [1, \infty)$. For any two probability measures $\mu$, $\nu$ on $\mathcal{X}$, the Wasserstein distance of order $p$ between $\mu$ and $\nu$ is defined by the formula ...

Page 115:

Theorem 6.15 (Wasserstein distance is controlled by weighted total variation).

Let $\mu$ and $\nu$ be two probability measures on a ${\color{red}{\textbf{Polish space}}}$ $(\mathcal{X}, d)$. Let $p \in [1, \infty)$ and $x_0 \in \mathcal{X}$. Then ...

Page 115:

Particular Case 6.16.

In the case $p=1$, if the diameter of $\mathcal{X}$ is bounded by $D$, this bound implies $W_1(\mu,\nu) \leq D \lVert \mu - \nu \rVert_{TV}$ (Limone's note: $\lVert \mu - \nu \rVert_{TV}$ is, or should be, the total variation distance).

Introduction (part 2). In the following excerpts from Tsybakov (2009), Introduction to nonparametric estimation, Tsybakov defines the total variation distance between two probability measures $P$ and $Q$ on a ${\color{blue}{\textbf{measurable space}}}$ $\left(\mathcal{X},\mathcal{A}\right)$ (with "The sample space $\mathcal{X}$, the $\sigma$-algebra $\mathcal{A}$", on page 121):

Page 83:

2.4 Distances between probability measures

Let $\left(\mathcal{X},\mathcal{A}\right)$ be a ${\color{blue}{\textbf{measurable space}}}$ and let $P$ and $Q$ be two probability measures on $\left(\mathcal{X},\mathcal{A}\right)$. Suppose that $\nu$ is a $\sigma$-finite measure on $\left(\mathcal{X},\mathcal{A}\right)$ satisfying $P\ll\nu$ and $Q\ll\nu$. Define $p = \frac{dP}{d\nu}$, $q = \frac{dQ}{d\nu}$. Observe that such a measure $\nu$ always exists since we can take, for example, $\nu = P + Q$.

Page 83:

Definition 2.4

The total variation distance between P and Q is defined as follows: $V(P,Q)=$...

Question. Since both Villani and Tsybakov introduce the total variation distance in their books, is the ${\color{red}{\textbf{Polish space}}}$ $(\mathcal{X}, d)$ introduced by Villani the same as, or similar to, the ${\color{blue}{\textbf{measurable space}}}$ $\left(\mathcal{X},\mathcal{A}\right)$ introduced by Tsybakov?

Note. My question does not depend on the different notations used by Villani and Tsybakov for the probability measures ($\mu$ and $\nu$ in Villani correspond, respectively, to $P$ and $Q$ in Tsybakov)!

$\endgroup$
2
  • 1
    $\begingroup$ There is a typo in your Definition 2.4: $\mu$ is presumably $\nu$. For what it's worth, there is no relationship between the measurable space and the Polish space. However, you would then need to ask yourself which measurable space you are interested in that is not a Polish space. $\endgroup$
    – Andrew
    Commented Sep 15, 2023 at 15:59
  • $\begingroup$ Thanks a lot @Andrew! Typo corrected (I hope) :-) About the "type of space": I was just wondering why the same "total variation distance" would be introduced, through the same probability measures (just named differently by Villani and Tsybakov), using two different spaces (a Polish space on one side and a measurable space on the other). Why can distances be introduced on two different spaces? Btw, as a general guideline, I would use the most general space to introduce a distance and then, for any specific case, recall a suitable space for that case. $\endgroup$
    – Ommo
    Commented Sep 15, 2023 at 16:08

1 Answer

2
$\begingroup$

In stating his definitions, Villani always assumes that all measures are defined on the Borel $\sigma$-algebra of a Polish space. The topology and/or distance do not play any role in the definition of the total variation, since the latter only depends on the measurable structure.

Thus, Villani's definition is simply a particular case of other standard definitions in the literature. It is particular only in that he assumes the $\sigma$-algebra $\mathcal A$ to be the Borel $\sigma$-algebra of a given Polish topological space $X$, and the measurable space which Villani implicitly uses is the space $(X,\mathcal A)$.

This does not affect the definition of total variation in any other way.
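
As a sketch of why only the measurable structure matters, here is one standard way to write the total variation between two probability measures $\mu$ and $\nu$ on an arbitrary measurable space $(X,\mathcal A)$. The normalization is an assumption on my part, since conventions differ by a factor of $2$ between authors, and the dominating measure $\lambda$ is just an illustrative choice:

$$\sup_{A\in\mathcal A}\,\lvert \mu(A)-\nu(A)\rvert \;=\; \frac12\int_X \lvert p-q\rvert \, d\lambda, \qquad p=\frac{d\mu}{d\lambda},\quad q=\frac{d\nu}{d\lambda},\quad \lambda=\mu+\nu.$$

Only sets $A\in\mathcal A$ and integrals against measures on $(X,\mathcal A)$ appear here; no distance $d$ and no topology is used. By contrast, the Wasserstein distance $W_p$ is built from the cost $d(x,y)^p$, which is where the metric genuinely enters.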

$\endgroup$
3
  • $\begingroup$ Thanks a lot @AlephBeth! So, if I understood correctly, Villani implicitly uses a measurable space $\left(\mathcal{X},\mathcal{A}\right)$, i.e. the same $\left(\mathcal{X},\mathcal{A}\right)$ used by Tsybakov (where $\mathcal{X}$ is the sample space and $\mathcal{A}$ is the $\sigma$-algebra), right? $\endgroup$
    – Ommo
    Commented Sep 15, 2023 at 16:17
  • 1
    $\begingroup$ Yes. Specifying a topology (or even a distance) on some space is stronger than just specifying a $\sigma$-algebra. So, Villani specifies a topology (which he assumes to be Polish for reasons unrelated to defining the total variation), and it is implicitly understood that the relevant $\sigma$-algebra $\mathcal A$ making $X$ into a measurable space $(X,\mathcal A)$ is the Borel $\sigma$-algebra generated by the (once and for all given) Polish topology on $X$. $\endgroup$
    – AlephBeth
    Commented Sep 15, 2023 at 16:21
  • $\begingroup$ Thanks a lot, it is clearer now! :-) $\endgroup$
    – Ommo
    Commented Sep 15, 2023 at 16:23
