62

I was wondering if there are statistics functions built into the standard C++ math libraries, like <cmath>. If not, can you recommend a good stats library that has a cumulative normal distribution function?

More specifically, I am looking to use/create a cumulative distribution function.

  • If the CDF of the normal distribution is all you need, why not just implement it yourself? It contains no magic, so the implementation is straightforward. Commented Feb 24, 2010 at 21:56

8 Answers

55

There is no such function built in. But since the Gaussian error function and its complement are related to the normal cumulative distribution function (see here, or here), we can use the C library function erfc (the complementary error function):

#include <cmath> // erfc; M_SQRT1_2 may need _USE_MATH_DEFINES on MSVC

double normalCDF(double value)
{
    return 0.5 * erfc(-value * M_SQRT1_2);
}

This uses the relation erfc(x) = 1 - erf(x), with M_SQRT1_2 = √0.5.

I use it for statistical calculations and it works great. No need for hand-coded polynomial coefficients.
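As a quick sanity check, here is a minimal, self-contained sketch (the expected values are well-known quantiles of the standard normal):

#include <cmath>  // std::erfc; M_SQRT1_2 may need _USE_MATH_DEFINES on MSVC
#include <cstdio>

double normalCDF(double value)
{
    return 0.5 * std::erfc(-value * M_SQRT1_2);
}

int main()
{
    std::printf("%.6f\n", normalCDF(0.0));  // 0.500000
    std::printf("%.6f\n", normalCDF(1.96)); // ~0.975002
}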

39

Here's a stand-alone C++ implementation of the cumulative normal distribution in 14 lines of code.

http://www.johndcook.com/cpp_phi.html

#include <cmath>
#include <iostream> // for std::cout in testPhi below

double phi(double x)
{
    // constants
    double a1 =  0.254829592;
    double a2 = -0.284496736;
    double a3 =  1.421413741;
    double a4 = -1.453152027;
    double a5 =  1.061405429;
    double p  =  0.3275911;

    // Save the sign of x
    int sign = 1;
    if (x < 0)
        sign = -1;
    x = fabs(x)/sqrt(2.0);

    // A&S formula 7.1.26
    double t = 1.0/(1.0 + p*x);
    double y = 1.0 - (((((a5*t + a4)*t) + a3)*t + a2)*t + a1)*t*exp(-x*x);

    return 0.5*(1.0 + sign*y);
}

void testPhi()
{
    // Select a few input values
    double x[] = 
    {
        -3, 
        -1, 
        0.0, 
        0.5, 
        2.1 
    };

    // Output computed by Mathematica
    // y = Phi[x]
    double y[] = 
    { 
        0.00134989803163, 
        0.158655253931, 
        0.5, 
        0.691462461274, 
        0.982135579437 
    };

    int numTests = sizeof(x)/sizeof(double);

    double maxError = 0.0;
    for (int i = 0; i < numTests; ++i)
    {
        double error = fabs(y[i] - phi(x[i]));
        if (error > maxError)
            maxError = error;
    }

    std::cout << "Maximum error: " << maxError << "\n";
}
  • Note that this is a single-precision approximation, but there is a double-precision implementation below by thus spake a.k.: stackoverflow.com/a/23119456/190452
    – David
    Commented Mar 4, 2019 at 15:57
  • The function calculates pnorm (cosmosweb.champlain.edu/people/stevens/webtech/R/…) with mean=0 and sd=1. Is there a way I can specify a custom mean and sd? In that case, how would I modify the function?
    – Suman
    Commented Oct 12, 2019 at 3:54
  • @suman-khanal: You need a z-transformation. Just add the parameters mean and sd to the function header and x = (x - mean) / sd before the line int sign = 1 (see the sketch after these comments).
    – jwdietrich
    Commented Oct 27, 2020 at 17:04
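Following jwdietrich's comment, here is a minimal sketch of that generalization (the wrapper name phiGeneral is illustrative; phi is the function defined above):

double phiGeneral(double x, double mean, double sd)
{
    // z-transformation, then evaluate the standard normal CDF
    return phi((x - mean) / sd);
}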
13

I figured out how to do it using GSL, at the suggestion of the folks who answered before me, but then I found a non-library solution (hopefully this helps the many people out there who are looking for one, like I was):

#include <cmath> // fabs, sqrt, exp, pow

#ifndef Pi
#define Pi 3.141592653589793238462643
#endif

double cnd_manual(double x)
{
  double L, K, w ;
  /* constants */
  double const a1 = 0.31938153, a2 = -0.356563782, a3 = 1.781477937;
  double const a4 = -1.821255978, a5 = 1.330274429;

  L = fabs(x);
  K = 1.0 / (1.0 + 0.2316419 * L);
  w = 1.0 - 1.0 / sqrt(2 * Pi) * exp(-L *L / 2) * (a1 * K + a2 * K *K + a3 * pow(K,3) + a4 * pow(K,4) + a5 * pow(K,5));

  if (x < 0 ){
    w= 1.0 - w;
  }
  return w;
}
  • Ouch... don't use pow, use Horner's rule. I downvote until this is corrected (please notify me). Commented Mar 11, 2011 at 10:31
  • This code will lose precision. Horner's rule is stabler (and also faster). Commented May 27, 2011 at 19:04
  • Why not just use double pK3 = K*K*K and so on? Commented Jun 18, 2015 at 18:31
  • (As far as I know) pow may be defined as a macro, I think in order to allow good implementations to optimize common powers such as 2 and 3. So don't give up on pow too soon! Commented Dec 4, 2015 at 9:09
  • Thanks for the code! I assume this is a Taylor expansion of the normal distribution and then integration of the obtained polynomial. Anyway, I tested the code against the Boost libraries in the z range of -3.8 to +3.8 with an increment of 0.01, and the sum of the absolute differences abs(boost - cnd_manual) is on the order of 10^-6.
    – macroland
    Commented May 1, 2018 at 1:34
11

Boost is as good as the standard. :D Here you go: boost maths/statistical.

  • Is there a standard one built in? Commented Feb 24, 2010 at 18:05
  • No, the standard library does not yet have any.
    – dirkgently
    Commented Feb 24, 2010 at 18:07
  • Normal distribution, yes I think so. Or are you talking about built into the standard library -- in the latter case, no. Commented Feb 24, 2010 at 18:07
  • Three reasons for using Boost: 1) you can copy out libraries of your choosing using a utility that comes with Boost; 2) the libraries tend to be header-only; 3) most libraries work very well and work everywhere with a C++ compiler. Commented Feb 24, 2010 at 18:12
  • No... The Boost documentation page will never fade away.
    – SmallChess
    Commented Mar 18, 2015 at 6:18
11

The implementations of the normal CDF given here are single-precision approximations that have had float replaced with double, and hence are only accurate to 7 or 8 significant (decimal) figures.
For a VB implementation of Hart's double-precision approximation, see figure 2 of West's Better approximations to cumulative normal functions.

Edit: My translation of West's implementation into C++:

#include <cmath> // sqrt, acos, fabs, exp

double
phi(double x)
{
  static const double RT2PI = sqrt(4.0*acos(0.0));

  static const double SPLIT = 7.07106781186547;

  static const double N0 = 220.206867912376;
  static const double N1 = 221.213596169931;
  static const double N2 = 112.079291497871;
  static const double N3 = 33.912866078383;
  static const double N4 = 6.37396220353165;
  static const double N5 = 0.700383064443688;
  static const double N6 = 3.52624965998911e-02;
  static const double M0 = 440.413735824752;
  static const double M1 = 793.826512519948;
  static const double M2 = 637.333633378831;
  static const double M3 = 296.564248779674;
  static const double M4 = 86.7807322029461;
  static const double M5 = 16.064177579207;
  static const double M6 = 1.75566716318264;
  static const double M7 = 8.83883476483184e-02;

  const double z = fabs(x);
  double c = 0.0;

  if(z<=37.0)
  {
    const double e = exp(-z*z/2.0);
    if(z<SPLIT)
    {
      const double n = (((((N6*z + N5)*z + N4)*z + N3)*z + N2)*z + N1)*z + N0;
      const double d = ((((((M7*z + M6)*z + M5)*z + M4)*z + M3)*z + M2)*z + M1)*z + M0;
      c = e*n/d;
    }
    else
    {
      const double f = z + 1.0/(z + 2.0/(z + 3.0/(z + 4.0/(z + 13.0/20.0))));
      c = e/(RT2PI*f);
    }
  }
  return x<=0.0 ? c : 1-c;
}

Note that I have rearranged expressions into the more familiar forms for series and continued fraction approximations. The last magic number in West's code is the square root of 2π, which I've deferred to the compiler on the first line by exploiting the identity acos(0) = ½ π.
I've triple checked the magic numbers, but there's always the chance that I've mistyped something. If you spot a typo, please comment!

The results for the test data John Cook used in his answer are

 x               phi                Mathematica
-3     1.3498980316301150e-003    0.00134989803163
-1     1.5865525393145702e-001    0.158655253931
 0     5.0000000000000000e-001    0.5
0.5    6.9146246127401301e-001    0.691462461274
2.1    9.8213557943718344e-001    0.982135579437

I take some small comfort from the fact that they agree to all of the digits given for the Mathematica results.
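For reference, here is a minimal driver (a sketch; it assumes the phi above is in scope) that reproduces the middle column:

#include <cstdio>

int main()
{
    const double xs[] = { -3.0, -1.0, 0.0, 0.5, 2.1 };
    for (double x : xs)
        std::printf("%4.1f    %.16e\n", x, phi(x));
}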

  • How does this compare to erfc? Commented May 24, 2016 at 19:58
  • That would depend upon the precision guarantees of erfc. There's certainly going to be a slight rounding of the product of the argument and the square root of one half, which may propagate to the final value. Hart's algorithm is claimed to be accurate to double precision for every argument, although I've not independently verified that. In any event, both will be much, much better than single-precision approximations in which float is replaced with double! Commented May 25, 2016 at 20:59
  • Is there any easy way to modify this code to account for degrees of freedom? Like SciPy's t-test?
    – tantrev
    Commented May 21, 2017 at 4:37
  • @tantrev I very much doubt that there's a simple way to transform the series and continued fraction approximations of the normal CDF into that of the t distribution, I'm afraid. Commented May 27, 2017 at 0:39
  • @thusspakea.k. Thank you! I suppose my hope for quickly approximating ~1E6 p-values with the same N size was just foolhardy.
    – tantrev
    Commented May 27, 2017 at 1:48
10

From NVIDIA CUDA samples:

#include <cmath> // fabs, exp

static double CND(double d)
{
    const double       A1 = 0.31938153;
    const double       A2 = -0.356563782;
    const double       A3 = 1.781477937;
    const double       A4 = -1.821255978;
    const double       A5 = 1.330274429;
    const double RSQRT2PI = 0.39894228040143267793994605993438; // 1/sqrt(2*pi)

    // Abramowitz & Stegun 26.2.17 polynomial, evaluated via Horner's rule
    double K = 1.0 / (1.0 + 0.2316419 * fabs(d));

    double cnd = RSQRT2PI * exp(-0.5 * d * d) *
                 (K * (A1 + K * (A2 + K * (A3 + K * (A4 + K * A5)))));

    if (d > 0)
        cnd = 1.0 - cnd;

    return cnd;
}

Copyright 1993-2012 NVIDIA Corporation. All rights reserved.
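A quick usage sketch (a minimal, illustrative main; CND is declared static in the sample, so it must live in the same translation unit):

#include <cstdio>

int main()
{
    std::printf("%f\n", CND(1.96)); // expect ~0.975002 for the standard normal
}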

2

From https://en.cppreference.com/w/cpp/numeric/math/erfc

The normal CDF can be calculated as below:

#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;

double normalCDF(double x) // Phi(-∞, x) aka N(x)
{
    return erfc(-x / sqrt(2)) / 2;
}

int main()
{
    cout << setprecision(10) << normalCDF(1.96) << "\n"; // ~0.9750021049
}

Note that sqrt returns a double either way, so writing 2.0 instead of 2 in the denominator does not change the result; it merely avoids overload ambiguity with sqrt on pre-C++11 compilers.

Hope that helps.

1

Since this question was asked almost 13 years ago and the answers are quite outdated: as of now, to calculate the CDF of a normal distribution you can use the Boost library, which can be downloaded from https://www.boost.org/. Once you have installed the latest version, #include the distribution you need, for example #include "boost/math/distributions/normal.hpp", and you can use the cdf directly. Remember to use the namespace boost::math. You can refer to this link for further reference: https://www.boost.org/doc/libs/1_80_0/boost/math/distributions.hpp
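For illustration, a minimal sketch of what that looks like (it assumes Boost.Math is installed; the 1.96 test value is arbitrary):

#include <iostream>
#include "boost/math/distributions/normal.hpp"

int main()
{
    boost::math::normal dist(0.0, 1.0); // mean, standard deviation
    std::cout << boost::math::cdf(dist, 1.96) << "\n"; // ~0.975002
}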

