
I am learning about the normal distribution and was watching this video.

At 6:28, the question posed is: what is the probability of an ice cream weighing exactly 120 grams (using the normal distribution)? She states that the answer is zero, as the probability of any exact value is zero in a normal distribution. She then states that there are infinitely many weights from 119.9 to 120.1, and that the probability of any specific weight is 1 over infinity, which is zero.


I am a bit confused about this. Why is the probability one over infinity for a specific value, like 120? She then states that an ice cream could weigh 120 grams or 120.000001 grams; how is that related to the probability of a specific point being zero?


12 Answers


The video suggests that $\mu=112$ g and $\sigma=9$ g in this particular normal distribution.

If that is the case, we can find the probability that the weight is in a given interval, in the video described as the area under the graph for that interval. For example, the probability it is between $119.5$ g and $120.5$ g is $$\Phi\left(\tfrac{120.5-112}{9}\right) - \Phi\left(\tfrac{119.5-112}{9}\right) = \Phi\left(\tfrac{17}{18}\right) - \Phi\left(\tfrac{15}{18}\right)\approx 0.82753- 0.79767=0.02986$$ which the video describes as about $0.03$

Similarly we can look at other intervals around $120$ g:

Lower     Upper     Probability
119       121       0.05969
119.5     120.5     0.02986
119.9     120.1     0.00597
119.99    120.01    0.00060
119.999   120.001   0.00006

and as we cut the width of the interval by a factor of $10$ each time, the probability of the weight being in that narrower interval also falls by roughly a factor of $10$. So as the width of the interval shrinks towards zero, the probability of being in that interval also falls towards zero.

In that sense the probability of being exactly $120$ must be smaller than any positive number and so must be $0$.
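The interval probabilities can be recomputed with a few lines of Python. This is only a sketch: it assumes the $\mu=112$ g and $\sigma=9$ g suggested by the video, and the helper names `phi` and `interval_probability` are made up for this illustration. $\Phi$ is evaluated through the standard library's `math.erf`.

```python
import math

MU = 112.0    # mean in grams (value suggested by the video)
SIGMA = 9.0   # standard deviation in grams (value suggested by the video)


def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interval_probability(lower, upper, mu=MU, sigma=SIGMA):
    """Pr(lower <= X <= upper) for X ~ Normal(mu, sigma^2)."""
    return phi((upper - mu) / sigma) - phi((lower - mu) / sigma)


# Half-widths of the intervals centred on 120 g.
half_widths = [1.0, 0.5, 0.1, 0.01, 0.001]
probabilities = [interval_probability(120 - h, 120 + h) for h in half_widths]

for h, prob in zip(half_widths, probabilities):
    print(f"{120 - h:<9g} {120 + h:<9g} {prob:.5f}")
```

Adding smaller half-widths to the list shows the same pattern continuing: each tenfold cut in width cuts the probability roughly tenfold.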

  • Sorry what does phi mean? Is it the same as a z-score? Commented Aug 7, 2020 at 3:38
  • @ChristopherU $\Phi$ is the cumulative distribution function of a standard normal distribution (i.e. with mean $0$ and variance $1$). So if $Z$ has a standard normal distribution then $\Pr(Z \le z)=\Phi(z)$ and for example $\Phi(1.96) \approx 0.975$
    – Henry
    Commented Aug 7, 2020 at 8:22
  • Thanks, where did you get Z≤z? Commented Aug 7, 2020 at 10:27
  • @ChristopherU $Z \le z$ is just the event that a random variable $Z$ is less than or equal to a specific value $z$, so $\Pr(Z \le z)$ is the probability of this event.
    – Henry
    Commented Aug 7, 2020 at 13:25

I guess the statement could be made more precise and then it could be easier to understand. First of all, $f(x) = \tfrac{1}{C}$, where $C$ is a constant chosen so that the density integrates to unity, is the probability density of a uniform distribution, which assigns the same probability density to each point. A normal distribution does not have this flat shape, so different probability densities apply to different values. In what follows, $\frac{1}{\infty}$ is just used as an example to show general ideas about probability densities.

But let's stick to the example. $\frac{1}{\infty}$ is not equal to zero (see Quora, or math.stackoverflow.com answers). You cannot divide by infinity, because it is not a number. What you can say is that the limit is zero

$$ \lim_{x\to\infty} \frac{1}{x} = 0 $$

so as $x$ increases, $\tfrac{1}{x}$ gets closer and closer to zero. This is why there is a convention to say that it "is" zero. In the case of continuous random variables, there are infinitely many values on the real line; hence even in the simplest case of a uniform distribution, we cannot assign a positive probability to any single value. In probability theory, we do not calculate the probabilities of individual values of continuous random variables, because they are infinitesimal, so we say they are zero.

See also the $P[X=x]=0$ when $X$ is continuous variable thread.

  • Good answer. Another way to look at this is by using the term "almost never" which means that an event has a probability arbitrarily close to zero, but the set of events which satisfy the criterion is non-empty (you could actually find a weight of exactly 120, but it's infinitesimally likely). This is in contrast to an "impossible" event, which truly has probability zero, and has no possible outcomes that satisfy the criterion (you could not find a negative weight, for example). Commented Aug 6, 2020 at 16:59
  • Thanks very much for the useful answer. I get the lim -> infinity part now, but I'm still a bit confused why the probability is "zero" at a certain point. Say if we're trying to find the probability at 23 on a normal distribution, why is that 0? Commented Aug 6, 2020 at 20:09
  • @ChristopherU because the points lie on a real line, so there are infinitely many such points. If there are infinitely many of them, then the chance of picking any one of them is infinitely small.
    – Tim
    Commented Aug 6, 2020 at 20:19

If you take a random person out of a country with a well-studied population distribution, what are the odds that they are 30 years old? Surely there's an answer to that question, if you consider someone born 30 years and 2 months ago to be 30 years old. But what if you are looking for monthly precision? Then only people born 30 years and 0 months ago would fit your criteria. What if you keep tightening your requirements: second precision, millisecond precision, picosecond precision, Planck-time precision? Eventually you'll find that no one fits your narrow criteria for being 30 years old, but it will still be possible that someone fits those criteria, and you can account for that probability with fractional numbers.

If you keep narrowing your age range so that you only consider people who are exactly 30 years old, then you have narrowed your range to its fullest: it is a range comprising exactly one number, whose upper bound is equal to its lower bound. As you can surmise from the progression from broad to narrow time ranges, the probability that someone will be exactly 30 years old tends towards 0.

This happens only if we consider our domain (time/age) to be continuous rather than discrete, so that there are infinitely many intermediate values between any two values.

If we consider time to be discrete, for example by considering a Planck time to be the shortest possible time span, then the probability of someone being exactly 30 years old can be expressed in the order of Planck time per year, which, albeit very small, is non-zero.


For continuous distributions, like the normal distribution, the probability of the random variable being equal to a specific value is $0$. Although it's not mathematically precise, the video is just trying to build some intuition. It's saying that if there were some non-zero probability $P(X=x)$ for each value, then, because there are uncountably many numbers between 119.9 and 120.1, the sum $\sum_x P(X=x)$ would go to $\infty$, which violates the axioms of probability.


Let's consider a slightly simpler example of generating a random number uniformly between 0 and 1.

Let's start with an even simpler problem of picking a random value that's just either 0 or 1. There are 2 possible values, so the chance of getting exactly 0 is $\frac{1}{2} = 0.5$.

Now consider if you have another point between those 2 so you have 0, 0.5 and 1. There are 3 possible values, so the chance of getting exactly 0 is $\frac{1}{3} \approx 0.33$.

Now put another point between each of those so you have 0, 0.25, 0.5, 0.75 and 1. There are 5 possible values, so the chance of getting exactly 0 is $\frac{1}{5} = 0.2$.

Now put another point between each of those so you have 0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875 and 1. There are 9 possible values, so the chance of getting exactly 0 is $\frac{1}{9} \approx 0.11$.

We're still between 0 and 1, so all these values would be possible values if we're picking a value between 0 and 1 and you can see that the probability is getting smaller.

Keep going like this and there will be more and more points and the probability of getting a specific one of them gets smaller and smaller, tending towards 0.

The same idea holds with a normal distribution: there are infinitely many points in any given range, so the probability of getting any specific one of them tends towards 0.

Whether it's actually strictly equal to 0 I'll leave for other people to argue about.
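The refinement steps above can be sketched in Python. The function name `grid_point_probability` is made up for this illustration, and it assumes, as in the steps above, that every grid point is equally likely.

```python
from fractions import Fraction


def grid_point_probability(refinements):
    """Chance of hitting one specific point on an equally likely grid over [0, 1].

    Zero refinements give the two points {0, 1}; every refinement inserts
    one new point between each pair of neighbouring points.
    """
    n_points = 2 ** refinements + 1
    return n_points, Fraction(1, n_points)


for k in range(0, 21, 5):
    n_points, prob = grid_point_probability(k)
    print(f"{k:2d} refinements: {n_points:8d} points, "
          f"P(exactly 0) = {float(prob):.8f}")
```

After 20 refinements there are already more than a million points, and the chance of any particular one is below one in a million.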



Let's try this out for a standard normal distribution.

x = rnorm(10^4)

Out of these $10^4$ values many will be close to the mean (that is, zero), but none of them is equal to zero.

    [1] -6.264538e-01  1.836433e-01 -8.356286e-01  1.595281e+00  3.295078e-01
    [6] -8.204684e-01  4.874291e-01  7.383247e-01  5.757814e-01 -3.053884e-01
   [11]  1.511781e+00  3.898432e-01 -6.212406e-01 -2.214700e+00  1.124931e+00
   [16] -4.493361e-02 -1.619026e-02  9.438362e-01  8.212212e-01  5.939013e-01
   [21]  9.189774e-01  7.821363e-01  7.456498e-02 -1.989352e+00  6.198257e-01
   [26] -5.612874e-02 -1.557955e-01 -1.470752e+00 -4.781501e-01  4.179416e-01
   [31]  1.358680e+00 -1.027877e-01  3.876716e-01 -5.380504e-02 -1.377060e+00
   [36] -4.149946e-01 -3.942900e-01 -5.931340e-02  1.100025e+00  7.631757e-01
   [41] -1.645236e-01 -2.533617e-01  6.969634e-01  5.566632e-01 -6.887557e-01
   [46] -7.074952e-01  3.645820e-01  7.685329e-01 -1.123462e-01  8.811077e-01
   [51]  3.981059e-01 -6.120264e-01  3.411197e-01 -1.129363e+00  1.433024e+00
   [56]  1.980400e+00 -3.672215e-01 -1.044135e+00  5.697196e-01 -1.350546e-01
   [61]  2.401618e+00 -3.924000e-02  6.897394e-01  2.800216e-02 -7.432732e-01
   [66]  1.887923e-01 -1.804959e+00  1.465555e+00  1.532533e-01  2.172612e+00
   [71]  4.755095e-01 -7.099464e-01  6.107264e-01 -9.340976e-01 -1.253633e+00
   [76]  2.914462e-01 -4.432919e-01  1.105352e-03  7.434132e-02 -5.895209e-01
   [81] -5.686687e-01 -1.351786e-01  1.178087e+00 -1.523567e+00  5.939462e-01
   [86]  3.329504e-01  1.063100e+00 -3.041839e-01  3.700188e-01  2.670988e-01
   [91] -5.425200e-01  1.207868e+00  1.160403e+00  7.002136e-01  1.586833e+00
   [96]  5.584864e-01 -1.276592e+00 -5.732654e-01 -1.224613e+00 -4.734006e-01
  [101] -6.203667e-01  4.211587e-02 -9.109216e-01  1.580288e-01 -6.545846e-01
  [106]  1.767287e+00  7.167075e-01  9.101742e-01  3.841854e-01  1.682176e+00
  [111] -6.357365e-01 -4.616447e-01  1.432282e+00 -6.506964e-01 -2.073807e-01
  [116] -3.928079e-01 -3.199929e-01 -2.791133e-01  4.941883e-01 -1.773305e-01
  [121] -5.059575e-01  1.343039e+00 -2.145794e-01 -1.795565e-01 -1.001907e-01
  [126]  7.126663e-01 -7.356440e-02 -3.763417e-02 -6.816605e-01 -3.242703e-01
  [131]  6.016044e-02 -5.888945e-01  5.314962e-01 -1.518394e+00  3.065579e-01
  [136] -1.536450e+00 -3.009761e-01 -5.282799e-01 -6.520948e-01 -5.689678e-02
  [141] -1.914359e+00  1.176583e+00 -1.664972e+00 -4.635304e-01 -1.115920e+00
  [146] -7.508190e-01  2.087167e+00  1.739562e-02 -1.286301e+00 -1.640606e+00
  [151]  4.501871e-01 -1.855983e-02 -3.180684e-01 -9.293621e-01 -1.487460e+00
  [156] -1.075192e+00  1.000029e+00 -6.212667e-01 -1.384427e+00  1.869291e+00
  [161]  4.251004e-01 -2.386471e-01  1.058483e+00  8.864227e-01 -6.192430e-01
  [166]  2.206102e+00 -2.550270e-01 -1.424495e+00 -1.443996e-01  2.075383e-01
  [171]  2.307978e+00  1.058024e-01  4.569988e-01 -7.715294e-02 -3.340008e-01
  [176] -3.472603e-02  7.876396e-01  2.075245e+00  1.027392e+00  1.207908e+00
  [181] -1.231323e+00  9.838956e-01  2.199248e-01 -1.467250e+00  5.210227e-01
  [186] -1.587546e-01  1.464587e+00 -7.660820e-01 -4.302118e-01 -9.261095e-01
  [191] -1.771040e-01  4.020118e-01 -7.317482e-01  8.303732e-01 -1.208083e+00
  [196] -1.047984e+00  1.441158e+00 -1.015847e+00  4.119747e-01 -3.810761e-01
  [201]  4.094018e-01  1.688873e+00  1.586588e+00 -3.309078e-01 -2.285236e+00
  [206]  2.497662e+00  6.670662e-01  5.413273e-01 -1.339952e-02  5.101084e-01
  [211] -1.643758e-01  4.206946e-01 -4.002467e-01 -1.370208e+00  9.878383e-01
  [216]  1.519745e+00 -3.087406e-01 -1.253290e+00  6.422413e-01 -4.470914e-02
  [221] -1.733218e+00  2.131860e-03 -6.303003e-01 -3.409686e-01 -1.156572e+00
  [226]  1.803142e+00 -3.311320e-01 -1.605513e+00  1.971934e-01  2.631756e-01
  [231] -9.858267e-01 -2.888921e+00 -6.404817e-01  5.705076e-01 -5.972328e-02
  [236] -9.817874e-02  5.608207e-01 -1.186459e+00  1.096777e+00 -5.344028e-03
  [241]  7.073107e-01  1.034108e+00  2.234804e-01 -8.787076e-01  1.162965e+00
  [246] -2.000165e+00 -5.447907e-01 -2.556707e-01 -1.661210e-01  1.020464e+00
  [251]  1.362219e-01  4.071676e-01 -6.965481e-02 -2.476643e-01  6.955508e-01
  [256]  1.146228e+00 -2.403096e+00  5.727396e-01  3.747244e-01 -4.252677e-01
  [261]  9.510128e-01 -3.892372e-01 -2.843307e-01  8.574098e-01  1.719627e+00
  [266]  2.700549e-01 -4.221840e-01 -1.189113e+00 -3.310330e-01 -9.398293e-01
  [271] -2.589326e-01  3.943792e-01 -8.518571e-01  2.649167e+00  1.560117e-01
  [276]  1.130207e+00 -2.289124e+00  7.410012e-01 -1.316245e+00  9.198037e-01
  [281]  3.981302e-01 -4.075286e-01  1.324259e+00 -7.012317e-01 -5.806143e-01
  [286] -1.001072e+00 -6.681786e-01  9.451850e-01  4.337021e-01  1.005159e+00
  [291] -3.901187e-01  3.763703e-01  2.441649e-01 -1.426257e+00  1.778429e+00
  [296]  1.344477e-01  7.655990e-01  9.551367e-01 -5.056570e-02 -3.058154e-01
  [301]  8.936737e-01 -1.047298e+00  1.971337e+00 -3.836321e-01  1.654145e+00
  [306]  1.512213e+00  8.296573e-02  5.672209e-01 -1.024548e+00  3.230065e-01
  [311]  1.043612e+00  9.907849e-02 -4.541369e-01 -6.557819e-01 -3.592242e-02
  [316]  1.069161e+00 -4.839749e-01 -1.210101e-01 -1.294140e+00  4.943128e-01
  [321]  1.307902e+00  1.497041e+00  8.147027e-01 -1.869789e+00  4.820295e-01
  [326]  4.561356e-01 -3.534003e-01  1.704895e-01 -8.640360e-01  6.792308e-01
  [331] -3.271010e-01 -1.569082e+00 -3.674508e-01  1.364435e+00 -3.342814e-01
  [336]  7.327500e-01  9.465856e-01  4.398704e-03 -3.523223e-01 -5.296955e-01
  [341]  7.395892e-01 -1.063457e+00  2.462108e-01 -2.894994e-01 -2.264889e+00
  [346] -1.408850e+00  9.160193e-01 -1.912790e-01  8.032832e-01  1.887474e+00
  [351]  1.473881e+00  6.772685e-01  3.799627e-01 -1.927984e-01  1.577892e+00
  [356]  5.962341e-01 -1.173577e+00 -1.556425e-01 -1.918910e+00 -1.952588e-01
  [361] -2.592328e+00  1.314002e+00 -6.355430e-01 -4.299788e-01 -1.693183e-01
  [366]  6.122182e-01  6.783402e-01  5.679520e-01 -5.725426e-01 -1.363291e+00
  [371] -3.887222e-01  2.779141e-01 -8.230811e-01 -6.884093e-02 -1.167662e+00
  [376] -8.309014e-03  1.288554e-01 -1.458756e-01 -1.639110e-01  1.763552e+00
  [381]  7.625865e-01  1.111431e+00 -9.232070e-01  1.643418e-01  1.154825e+00
  [386] -5.652142e-02 -2.129361e+00  3.448458e-01 -1.904955e+00 -8.111702e-01
  [391]  1.324004e+00  6.156368e-01  1.091669e+00  3.066049e-01 -1.101588e-01
  [396] -9.243128e-01  1.592914e+00  4.501060e-02 -7.151284e-01  8.652231e-01
  [401]  1.074441e+00  1.895655e+00 -6.029973e-01 -3.908678e-01 -4.162220e-01
  [406] -3.756574e-01 -3.666309e-01 -2.956775e-01  1.441820e+00 -6.975383e-01
  [411] -3.881675e-01  6.525365e-01  1.124772e+00 -7.721108e-01 -5.080862e-01
  [416]  5.236206e-01  1.017754e+00 -2.511646e-01 -1.429993e+00  1.709121e+00
  [421]  1.435070e+00 -7.103711e-01 -6.506757e-02 -1.759469e+00  5.697230e-01
  [426]  1.612347e+00 -1.637281e+00 -7.795685e-01 -6.411769e-01 -6.811314e-01
  [431] -2.033286e+00  5.009636e-01 -1.531798e+00 -2.499764e-02  5.929847e-01
  [436] -1.981954e-01  8.920084e-01 -2.571507e-02 -6.476605e-01  6.463594e-01


  and so on

(As Ben Bolker mentions in the comments, this exercise actually has a non-zero probability of giving exactly a particular number, but that is because computers have a finite, discrete set of numbers. The true normal distribution is a continuous distribution with infinitely many possible outcomes.)
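The same experiment can be sketched in Python, counting the hits instead of reading through the list. This is a stand-in for the R call above, not a reproduction of it: it uses the standard library's `random.gauss`, the seed is arbitrary, and the cutoff of 0.1 for "close to zero" is an assumption made for illustration.

```python
import random

random.seed(1)  # arbitrary seed, only so that the run is repeatable
draws = [random.gauss(0.0, 1.0) for _ in range(10**4)]

exact_zeros = sum(1 for x in draws if x == 0.0)   # draws equal to 0 exactly
near_zero = sum(1 for x in draws if abs(x) < 0.1)  # draws in the interval (-0.1, 0.1)

print("exactly zero:      ", exact_zeros)
print("within 0.1 of zero:", near_zero)
```

Hundreds of draws land in the small interval around zero, while the count of exact zeros stays at zero.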


If there are infinitely many possibilities then the probability of any one of them may be zero. To get a non-zero measure you need a range of values. For instance, you can speak about the probability of the ice cream being between 119.9 and 120.1 grams.


Intuition: Imagine you want to pick a real number between 0 and 1 by throwing a dart at a board, and wherever the dart ends up will be the number that you pick. Say the probability distribution of where the dart ends up is continuous and uniform. Then the probability for the dart to end up in a certain region can be determined by the size of the interval (the Lebesgue measure).

The probability for the dart to end in a particular interval will be equal to the size of the interval. For instance the probability for the dart to end up between 0 and 0.5 is 0.5, the probability for the dart to end up between 0.211 and 0.235 is 0.024, and so on.

But now imagine the "size" of the region for a single point... it is zero.
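A small simulation illustrates this. It is only a sketch: `random.random()` stands in for the dart, the number of throws and the seed are arbitrary, and `hit_fraction` is a name made up here.

```python
import random

random.seed(0)  # arbitrary seed for repeatability
n_throws = 200_000
throws = [random.random() for _ in range(n_throws)]


def hit_fraction(lower, upper):
    """Fraction of simulated throws that landed in [lower, upper]."""
    return sum(1 for t in throws if lower <= t <= upper) / n_throws


for lower, upper in [(0.0, 0.5), (0.211, 0.235), (0.2229, 0.2231)]:
    print(f"[{lower}, {upper}]: length {upper - lower:.4f}, "
          f"hit fraction {hit_fraction(lower, upper):.4f}")

# A single point is an interval of length zero.
print("hits on exactly 0.223:", sum(1 for t in throws if t == 0.223))
```

The hit fraction tracks the interval length, and as the interval shrinks to a single point the hits disappear.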

  • but eventually if you pick enough random values some of them will be zero to the precision that the computer uses (I think it will take something less than 10^16 samples ...)
    – Ben Bolker
    Commented Aug 6, 2020 at 21:17
  • @BenBolker, yes the "computer's normal distribution" is not a true normal distribution. But the big list of random numbers should show what is going on. Commented Aug 6, 2020 at 21:23
  • Some readers will find this example enlightening, some will find it confusing. I think it would be greatly improved with some discussion of what's going on, not just presented as a fact from which you expect the reader to draw the right conclusion ...
    – Ben Bolker
    Commented Aug 6, 2020 at 21:27
  • Yes, and for that reason the example the OP's instructor gave (an ice cream (scoop? cone? sample?) will never weigh exactly 120 g) is not true in the real world: the scale has finite precision, so the ice cream could weigh 120.000000000000 grams up to the precision of the scale. But it's exactly this distinction between math-world (working with real numbers, in the technical sense) and real-world that may help readers understand what math-world means. I repeat that your answer would be better with further explanatory text ...
    – Ben Bolker
    Commented Aug 6, 2020 at 21:30
  • @BenBolker thank you for pushing me to improve the answer. I sometimes like to give a cryptic answer, but I believe that it is better now with the additional text. Commented Aug 6, 2020 at 21:48

For a real-world analogy, imagine throwing a pencil in the air in such a way that it has an equal probability of landing at any angle, measured relative to north. What is the probability that it lands at exactly 120 degrees? It might get really close, and about 1 in 360 times it will be between 120.5 and 119.5 degrees, but it will never be at exactly 120, because if you can measure the angle a bit more precisely you'll find it's actually at 120.002, or 119.99999999999997, and so on, under the assumption that real space is actually continuous and you can measure an angle to an infinite number of digits.

The point is that because this probability distribution is continuous, there are infinitely many numbers right next to any number you can choose. The somewhat strange corollary is that events with probability zero happen all the time: before you throw the pencil, the probability of it landing at any specific angle is zero, but it will land at some specific angle.


TL;DR: Don't confuse the probability density with the probability. In the given example, the probability is zero: $\mathrm{Pr}(m=120\,\mathrm{g})=0$, but the probability density is non-zero: $p_M(m=120\,\mathrm{g}) \approx 0.0299\,\mathrm{g^{-1}}$.

There have already been quite a few answers, but I think that visualizing things might help understanding, here.

I agree with Itamar Mushkin's comments to the OP that there probably is some confusion of probability (let's write it as $\mathrm{Pr}(m)$) and probability density (let's write it as $p_M(m)$), which hasn't been properly addressed in any of the answers, yet.

Full answer

In the video a normal distribution with mean $\mu=112\,\mathrm{g}$ and standard deviation $\sigma=9\,\mathrm{g}$ is used as a probability density function (commonly abbreviated by "pdf"). Let's call $p_M(m)$ the pdf of the random variable $M$ (our ice cream mass), such that: $$ p_M(m) = \mathcal{N}(\mu=112\,\mathrm{g},\sigma=9\,\mathrm{g}) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{\frac{-(m-\mu)^2}{2\sigma^2}} $$

[Figure: normal distribution with mean 112 and standard deviation 9]

Note (and this is crucial!), how the probability density $p$ is not dimensionless, but has units of $\mathrm{g^{-1}}$, since it is a density, i.e. it gives the probability per mass interval. Note further, that the probability density is non-zero for any finite mass (probability density $p_M$, not probability $\mathrm{Pr}$!). When we commonly talk about densities, we typically refer to mass per volume, e.g. the density of a diamond is about $3.51\,\mathrm{g/cm^3}$. Here, when talking about probability density, the probability takes the role of the diamond mass and the ice cream mass interval takes the role of the diamond volume, giving units of probability per mass.

Now, to get to an actual probability, we basically need to multiply the probability density with some mass interval $\Delta m = m_2-m_1$ (in the same way that we would need to multiply the diamond density with the diamond volume to get the diamond's mass). I say basically, because the proper way of doing this is by integrating the pdf over that mass interval, giving you the area under the curve (and area under a curve is basically just multiplying x-interval times y-interval in fine strips):

$$ \begin{align} \mathrm{Pr}(M \in [m_1, m_2]) &= \int_{m_1}^{m_2} p_M(m) \, dm \tag{1}\\ &= P_M(m) |_{m_1}^{m_2} \\ &= P_M(m_2) - P_M(m_1) \tag{2} \end{align} $$

[Figure: probability from area under probability density]

In the above formula $P_M(m)$ is the cumulative distribution function (commonly abbreviated to cdf and which Henry called $\Phi$ in his answer), which is the integral of the pdf:

$$ \begin{align} P_M(m) &= \int_{-\infty}^m p_M(\tilde{m}) \, d\tilde{m} \\ &= \mathrm{Pr}(M \le m) \end{align} $$

Thus, the cdf would directly give you the answer to the question: "What is the probability that the ice cream has a mass of at most $m$?" And the answer would be non-zero.

The corresponding picture for $\mathrm{Pr}(M \in [m_1, m_2])$ in terms of the cdf is as follows:

[Figure: probability from cumulative distribution function]

So far so good, this is the starting point for most of the other answers, many of which give examples to intuitively understand why the probability that the mass takes a specific value goes to zero.

To answer that question, here, with the images and equations above: If you want to know the probability that the mass takes on some exact value, e.g. $m_\ast = 120\,\mathrm{g}$, you could take a look at equation (1) and the second image and realise that by looking at $\mathrm{Pr}(M = m_\ast)$ you are effectively sending both of your integration limits to the same mass $m_1, m_2 \rightarrow m_\ast$ which sends the mass interval to zero $\Delta m = m_2 - m_1 \rightarrow 0$, and thus the area under the curve will be zero, too: $\int_{m_1 \rightarrow m_\ast}^{m_2 \rightarrow m_\ast} p_M(m) \, dm \rightarrow 0$. Equivalently, you could look at equation (2) and see directly that: $P_M(m_2 \rightarrow m_\ast) - P_M(m_1 \rightarrow m_\ast) \rightarrow 0$.

Note that while the probability that the mass is exactly $m_\ast=120\,\mathrm{g}$ is zero, $\mathrm{Pr}(M=120\,\mathrm{g})=0$, the probability density at the mass $m_\ast=120\,\mathrm{g}$ is not zero: $p_M(m=120\,\mathrm{g}) \approx 0.0299\,\mathrm{g^{-1}}$.
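The distinction can be checked numerically with a short sketch, separate from the plotting code and using only the standard library. It assumes the same $\mu=112\,\mathrm{g}$ and $\sigma=9\,\mathrm{g}$; the function names `pdf` and `cdf` are chosen here for illustration.

```python
import math

MU, SIGMA = 112.0, 9.0  # grams, as in the video


def pdf(m):
    """Normal probability density p_M(m), in units of 1/g."""
    return math.exp(-((m - MU) ** 2) / (2 * SIGMA**2)) / math.sqrt(2 * math.pi * SIGMA**2)


def cdf(m):
    """Normal cumulative distribution function P_M(m) = Pr(M <= m)."""
    return 0.5 * (1.0 + math.erf((m - MU) / (SIGMA * math.sqrt(2.0))))


m_star = 120.0
print(f"density at {m_star} g: {pdf(m_star):.4f} per gram")

for width in [10.0, 1.0, 0.1, 0.01]:
    prob = cdf(m_star + width / 2) - cdf(m_star - width / 2)
    print(f"width {width:>5} g: probability {prob:.6f}, "
          f"probability / width {prob / width:.4f} per gram")
```

As the interval around $m_\ast$ narrows, the probability itself shrinks towards zero, while probability divided by width settles at the density.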


For those interested in the Python code that generated the above images:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from scipy.integrate import quad

mu = 112  # mean
sigma = 9  # standard deviation
dist = norm(loc=mu, scale=sigma)  # frozen normal distribution (does not shadow scipy's norm)
p = dist.pdf  # probability density function
P = dist.cdf  # cumulative distribution function
m = np.linspace(mu-5*sigma, mu+5*sigma, 10*sigma+1)  # ice cream mass range

# plot of probability density function (pdf)
fig = plt.figure()
plt.plot(m, p(m), lw=3)
plt.axvline(mu, color='C1', label=r"$\mu=%d\,\mathrm{g}$" % mu)
# horizontal bar of half-width sigma, drawn at the pdf height at mu - sigma
plt.hlines(p(dist.ppf((1-0.6827)/2)), xmin=mu-sigma, xmax=mu+sigma, color='C2',
           label=r"$\sigma=%d\,\mathrm{g}$" % sigma)
plt.legend(bbox_to_anchor=(1, 1), loc='upper left')
plt.xlabel(r"$m$   $\mathrm{[g]}$ " "\n ice cream mass   ")
plt.ylabel("probability density function \n " r"$p_M(m)$   $[\mathrm{g^{-1}}]$")

# plot showing area under pdf corresponding to Pr(m1 <= m <= m2)
m1 = 115  # lower mass limit
m2 = 125  # upper mass limit
Delta_m = np.linspace(m1, m2, int(m2 - m1))  # mass interval

fig = plt.figure()
plt.plot(m, p(m), lw=3)
plt.fill_between(Delta_m, 0, p(Delta_m), color='C3', alpha=0.7,
                 label=(r"$\mathrm{Pr}(%d \le m \le %d) "
                        r"= \int_{%d}^{%d} p_M(m) dm$ " "\n\n"
                        r".$\hphantom{\mathrm{Pr}(.5\le m\le125)} \approx %.3f$"
                        % (m1, m2, m1, m2, quad(p, m1, m2)[0])))
plt.legend(bbox_to_anchor=(1, 1), loc='upper left')
plt.xlabel(r"$m$   $\mathrm{[g]}$ " "\n ice cream mass   ")
plt.ylabel("probability density function \n " r"$p_M(m)$   $[\mathrm{g^{-1}}]$")

# plot of cumulative distribution function and highlighting values for m1 and m2
fig = plt.figure()
plt.plot(m, P(m), lw=3)
plt.hlines(P(m1), min(m), m1, color='C3')
plt.hlines(P(m2), min(m), m2, color='C3')
plt.vlines(m1, 0, P(m1), color='C3')
# the label reads Pr(m1 <= m <= m2) = P_M(m2) - P_M(m1), hence the argument order m2, m1
plt.vlines(m2, 0, P(m2), color='C3',
           label=(r"$\mathrm{Pr}(%d \le m \le %d) = P_M(%d) - P_M(%d)$ " "\n\n"
                  r".$\hphantom{\mathrm{Pr}(.5\le m\le125)} = %.3f - %.3f$ " "\n\n"
                  r".$\hphantom{\mathrm{Pr}(.5\le m\le125)} \approx %.3f$"
                  % (m1, m2, m2, m1, P(m2), P(m1), P(m2) - P(m1))))
plt.legend(bbox_to_anchor=(1, 1), loc='upper left')
plt.xlabel(r"$m$   $\mathrm{[g]}$ " "\n ice cream mass   ")
plt.ylabel("cumulative distribution function \n " r"$P_M(m)$")

plt.show()  # display the three figures


The normal distribution is a continuous probability distribution, and in a continuous probability distribution, probabilities are found by integrating the density over a range, i.e. as the area under the curve. When you want to find the probability of a single value, that range becomes a single vertical line in the graph of the pdf, which has no width and therefore no area. In other words, if the lower limit and the upper limit of an integral are the same value, the result of that integral is zero.


What we need is a bit of an intuition of continuous random variables:

Teacher: Let's say we tossed a fair coin 10 times. What outcome would you place your bet on?

(Naive) Student: (5H, 5T), as it is fair coin.

Teacher: So that's what you would expect but may not necessarily get. In fact, the probability of (5H, 5T) is ${10\choose5}\times(1/2)^5\times(1/2)^5 \approx 0.25$.

Student: I guess we are tossing too few times. A fair coin should give Heads half the time and Tails half the time if we tossed it enough times.

Teacher: Fair point. So let's say I give you $100 if you get an equal number of heads and tails, and you have to decide whether to toss the coin 10 times or 100 times. How many times would you toss the coin?

Student: 100 times.

Teacher: Interestingly enough, the probability of getting (50H, 50T) is actually smaller now: ${100\choose50}\times(1/2)^{50}\times(1/2)^{50} \approx 0.08$

In fact, the highest probability of getting an equal number of Heads and Tails occurs when you throw the coin just two times.

And if you throw the coin 2 million times, the probability of getting exactly a million each of Heads and Tails is almost zero.

Student: But then where is my intuition going wrong?

Teacher: Your intuition about choosing a larger number of tosses is correct, but what your intuition got wrong is that exactly half is not the same as almost half. As you increase the number of tosses, the probability that the proportion of Heads (equivalently, Tails) lies in the neighbourhood of $0.5$ increases. The probability of getting 40% to 60% Heads is about $0.66$ with 10 tosses and $0.96$ with 100 tosses.

So you see, as the number of possible outcomes tends to infinity, the probability of getting an exact outcome (even the expected outcome) shrinks to zero. This captures the essence of continuous random variables. For such cases, when there are just too many possibilities, we (intuitively) think about intervals and not exact outcomes.
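The numbers in this dialogue can be checked with exact binomial counts. The sketch below uses only `math.comb`; the function names are made up for this illustration, and $n=1000$ is an extra value added to show the trend.

```python
from math import comb


def prob_exactly_half(n):
    """Probability of exactly n/2 heads in n fair tosses (n even)."""
    return comb(n, n // 2) / 2**n


def prob_near_half(n):
    """Probability that the share of heads lies between 40% and 60%."""
    # 0.4*n <= k <= 0.6*n, written with integers to avoid rounding issues
    favourable = sum(comb(n, k) for k in range(n + 1) if 2 * n <= 5 * k <= 3 * n)
    return favourable / 2**n


for n in [2, 10, 100, 1000]:
    print(f"n = {n:4d}: exactly half {prob_exactly_half(n):.4f}, "
          f"40% to 60% heads {prob_near_half(n):.4f}")
```

The first column keeps falling as $n$ grows while the second keeps rising, which is the contrast between an exact outcome and an interval of outcomes.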


When I was teaching this exact concept, the following picture proved to be very intuitively understandable by the students.

We start from the fact that, as you probably know, the probability for the random variable $X$ to take a value between $x_0$ and $x_1$ is calculated as the area under your normal bell curve:

$$ P([x_0, x_1]) = \int_{x_0}^{x_1} f(x) \, dx = F(x_1) - F(x_0)$$

where $f$ is the probability density function and $F$ is the cumulative distribution function.

If this formula looks weird, just look at it this way: what should the probability be for $X$ to take any value at all? It will be $1$, so the whole area under the curve is $1$.

Now, if you want to calculate the probability for an ever more specific value, that means you're bringing the limits of integration closer and closer to each other. And when you have them at the same value (e.g. 120g ice cream), that's the same as writing

$$ P([120g, 120g]) = \int_{120g}^{120g} f(x) \, dx = F(120g) - F(120g) = 0$$


Suppose that we have a continuous random variable $X$ with distribution $\mathbf{P}_X$, so it can take uncountably many values. Suppose, for contradiction, that every possible value has positive probability, which is to say that each singleton (a set with just one element, of the form $\{x\}$) has probability greater than zero. Define a family of sets $\{A_n\}$ by saying that a point $x_0\in A_n$ when $\mathbf{P}_X(\{x_0\})> \frac{1}{n}$. Note that the set $A_n$ can contain at most $n$ values: otherwise the total probability would exceed $1$. If $\mathbf{P}_X(\{x_0\})>0$, then there exists some $n_0$ for which $x_0\in A_{n_0}$. If we take the union of these sets over all natural numbers, we obtain a countable union of finite sets, which is a countable set. It follows that only countably many singletons can have probability greater than zero, which contradicts our assumption.

