
Good afternoon.

Could you please suggest some books, or perhaps articles, where I can read about the intuition behind Kolmogorov's axiomatics? I know the axioms and can solve university problems, but I can't feel them. I know measure theory, but it doesn't help me understand questions like: "Why exactly a sigma-algebra?", "Why not a semi-algebra?", "Where did we get it from?", "Where's the logic here?", and so on. I can't feel the use of measure theory in probability theory.

  • I think the axioms of probability seem intuitive; the unintuitive part is that we don't assign a probability to every subset of the sample space. But there's a similar issue in measure theory, in that we can't assign a measure to every subset of $X$. Counterexamples like the Banach–Tarski paradox show that in some cases it's impossible to assign a measure to every subset of $X$ in a satisfactory way. So we have to compromise and only assign a measure or a probability to certain special subsets. It turns out we get a nice theory if we require the special subsets to form a sigma-algebra. (A sketch of such a counterexample follows this comment.)
    – littleO
    Commented Jul 10, 2014 at 8:27
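To make the compromise in the comment above concrete, here is the standard Vitali-type sketch (my illustration, not part of the original comment) of why no translation-invariant, countably additive measure can be defined on all subsets of $[0,1)$. Using the axiom of choice, pick a set $V\subset[0,1)$ containing exactly one representative of each equivalence class of the relation $x\sim y \iff x-y\in\mathbb Q$. Writing $\oplus$ for addition mod $1$,

$$[0,1)=\bigsqcup_{q\in\mathbb Q\cap[0,1)}(V\oplus q),$$

so countable additivity and translation invariance would force $1=\sum_{q}\mu(V)$, which fails whether $\mu(V)=0$ (the sum is $0$) or $\mu(V)>0$ (the sum is $\infty$). Hence some subsets must be left unmeasured.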

1 Answer


The general idea behind the measure-theoretic formulation of probability is that assigning probabilities to events is in many ways similar to assigning a measure to subsets of a measurable space.

This similarity is especially striking under the "sample space/event space" formalism: Given an experiment $E$ whose outcome is random or uncertain, one defines

  1. the sample space $\Omega$ as a set that contains every possible outcome of the experiment $E$;
  2. the event space $\mathscr F$ as a collection of subsets $A\subset\Omega$ (each of which is called an event); a concrete toy example follows this list.
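As a toy illustration of these two objects (my example, not part of the original formalism): for two tosses of a fair coin,

$$\Omega=\{HH,\,HT,\,TH,\,TT\},\qquad \mathscr F=2^{\Omega}\ (\text{all }16\text{ subsets}),\qquad A=\{HH,\,HT\},$$

where the event $A$ is "the first toss shows heads", and one would assign $\Pr(A)=\tfrac12$.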

Then, one defines a function $\Pr:\mathscr F\to[0,1]$ that assigns to every event $A$ the chance $\Pr(A)$ that it occurs. If you now think about the properties that $\Pr$ should satisfy (for example, the probability of $[A$ or $B]$, where $A$ and $B$ cannot both occur simultaneously, is $\Pr(A)+\Pr(B)$; or $\Pr(\Omega)=1$), you will notice that they are very similar to the defining properties of a measure.
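Spelling the comparison out (a standard statement of both definitions, added for reference): a measure $\mu$ and a probability $\Pr$ on $(\Omega,\mathscr F)$ obey the same countable-additivity axiom, and the only extra demand on $\Pr$ is the normalization $\Pr(\Omega)=1$:

$$\mu(\emptyset)=0,\qquad \mu\Big(\bigcup_{n=1}^{\infty}A_n\Big)=\sum_{n=1}^{\infty}\mu(A_n)\quad\text{for pairwise disjoint }A_n\in\mathscr F;$$

$$\Pr(\Omega)=1,\qquad \Pr\Big(\bigcup_{n=1}^{\infty}A_n\Big)=\sum_{n=1}^{\infty}\Pr(A_n)\quad\text{for pairwise disjoint }A_n\in\mathscr F.$$

In other words, a probability space is exactly a measure space of total mass $1$.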

As for the specific reasons why the mathematical objects studied in measure theory are defined the way they are, you can look at the math.stackexchange question "What is the Definition of a Measurable Set?". My answer to that question also includes a reference for further reading.
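One concrete reason the event space must be closed under countable (not merely finite) set operations — an illustration of my own, aimed at the "why a sigma-algebra, why not a semi-algebra?" question: given events $A_1,A_2,\dots$, the event "infinitely many of the $A_n$ occur" is

$$\limsup_{n\to\infty}A_n=\bigcap_{m=1}^{\infty}\bigcup_{n=m}^{\infty}A_n,$$

which is guaranteed to lie in $\mathscr F$ only when $\mathscr F$ is a $\sigma$-algebra; a semi-algebra or an algebra is closed under finite operations only. Statements such as the Borel–Cantelli lemmas and the strong law of large numbers are precisely about events of this kind.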

Also, if you have lots of time on your hands and you are interested in the history of probability and the role of measure theory in this history, you can read "The Sources of Kolmogorov’s Grundbegriffe" by Shafer and Vovk.

  • Thank you very much. This is a perfect answer. Thanks for the additional material. Commented Jul 29, 2014 at 10:59
  • I'm looking for information on why the measure-theoretic definition of a random variable does not require the σ-algebra to contain all the singletons of the sample space. Is there a reason for that? Also, how would you rigorously handle sequences of experiments, each of which may depend on the (random) results of the preceding ones? Once an experimental result is defined as a random variable on a certain sample space, it seems to me that we cannot easily define a co-varying random variable while easily extracting the desired properties.
    – user21820
    Commented Sep 10, 2020 at 7:01
  • @user21820 Re singletons Part 1: In practice there is usually no obstacle to having singletons in your measurable space. For example, if you're dealing with a discrete measurable space $\Omega=\{\omega_1,\omega_2,\omega_3,\cdots\}$, then there is typically no reason not to take your $\sigma$-algebra to be the power set of $\Omega$. More generally, if you're working with a topological space in which singletons are closed, then the Borel $\sigma$-algebra will contain all singletons; this is the case whenever your sample space is a Euclidean space. (A one-line justification follows this comment.)
    – user78270
    Commented Sep 11, 2020 at 20:04
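A one-line justification of the last claim (my addition, assuming a metric space such as $\mathbb R^n$): each singleton satisfies

$$\{x\}=\bigcap_{n=1}^{\infty}B\big(x,\tfrac1n\big),$$

a countable intersection of open balls, hence a Borel set.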
  • @user21820 Re singletons Part 2: If we want to do probability at the most fundamental or abstract level, however, this is not needed. Moreover, if you're at all interested in using continuous distributions as a useful approximation of random phenomena, then it's not clear that measurable singletons will be of much use in computations: e.g., if $X$ is a Gaussian random variable on $\mathbb R$, then $\Pr(X=x)=0$ for every $x\in\mathbb R$. (A short computation follows this comment.)
    – user78270
    Commented Sep 11, 2020 at 20:10
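For the record, the computation behind that last statement (standard, not part of the comment): the sets $(x-\tfrac1n,\,x]$ decrease to $\{x\}$, so by continuity of the measure from above,

$$\Pr(X=x)=\lim_{n\to\infty}\Pr\big(x-\tfrac1n<X\le x\big)=\lim_{n\to\infty}\int_{x-1/n}^{x}\frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt=0,$$

written here for a standard normal; the same argument works for any distribution with a density.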
  • @user21820 Re sequences Part 1: Here I'm afraid I don't fully understand what you're asking. From one point of view, what you're asking is straightforward: given a random variable $X:\Omega\to E$ from a sample space $\Omega$ to a measurable space $E$, and a measurable function $f:\Omega\times E\to E$, we can define a random variable $Y:\Omega\to E$ by $Y(\omega):=f(\omega,X(\omega))$. In principle, this $Y$ will "depend" on both the randomness in the sample space and the result of $X$. (A concrete instance follows this comment.)
    – user78270
    Commented Sep 11, 2020 at 20:15
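A minimal concrete instance of the construction $Y(\omega):=f(\omega,X(\omega))$ (my illustration, with hypothetical choices of $\Omega$, $E$, and $f$): take

$$\Omega=\{0,1\}^2,\qquad E=\{0,1\},\qquad X(\omega_1,\omega_2)=\omega_1,\qquad f\big((\omega_1,\omega_2),x\big)=x\,\omega_2,$$

so that $Y(\omega)=\omega_1\omega_2$. The second coordinate matters only when $X=1$, so $Y$ genuinely depends both on the result of $X$ and on additional randomness in the sample space.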
