62

I was wondering if there are statistics functions built into the standard C++ math libraries, like <cmath>. If not, can you recommend a good stats library that has a cumulative normal distribution function?

More specifically, I am looking to use/create a cumulative distribution function.

  • If the CDF of the normal distribution is all you need, why not just implement it yourself? It contains no magic, so the implementation is straightforward. Commented Feb 24, 2010 at 21:56

8 Answers

55

There is no such function built in. But since the Gaussian error function and its complement are related to the normal cumulative distribution function (see here, or here), we can use the C library function erfc (the complementary error function):

#include <cmath> // erfc; M_SQRT1_2 may need _USE_MATH_DEFINES on MSVC

double normalCDF(double value)
{
    return 0.5 * erfc(-value * M_SQRT1_2);
}

This uses the relation erfc(x) = 1 - erf(x), with M_SQRT1_2 = √0.5.

I use it for statistical calculations and it works great. No need for hand-coded polynomial coefficients.
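As a quick sanity check, here is a minimal, self-contained sketch (the expected values are well-known quantiles of the standard normal):

#include <cmath>  // std::erfc; M_SQRT1_2 may need _USE_MATH_DEFINES on MSVC
#include <cstdio>

double normalCDF(double value)
{
    return 0.5 * std::erfc(-value * M_SQRT1_2);
}

int main()
{
    std::printf("%.6f\n", normalCDF(0.0));  // 0.500000
    std::printf("%.6f\n", normalCDF(1.96)); // ~0.975002
}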

39

Here's a stand-alone C++ implementation of the cumulative normal distribution in 14 lines of code.

http://www.johndcook.com/cpp_phi.html

#include <cmath>
#include <iostream> // for std::cout in testPhi below

double phi(double x)
{
    // constants
    double a1 =  0.254829592;
    double a2 = -0.284496736;
    double a3 =  1.421413741;
    double a4 = -1.453152027;
    double a5 =  1.061405429;
    double p  =  0.3275911;

    // Save the sign of x
    int sign = 1;
    if (x < 0)
        sign = -1;
    x = fabs(x)/sqrt(2.0);

    // A&S formula 7.1.26
    double t = 1.0/(1.0 + p*x);
    double y = 1.0 - (((((a5*t + a4)*t) + a3)*t + a2)*t + a1)*t*exp(-x*x);

    return 0.5*(1.0 + sign*y);
}

void testPhi()
{
    // Select a few input values
    double x[] = 
    {
        -3, 
        -1, 
        0.0, 
        0.5, 
        2.1 
    };

    // Output computed by Mathematica
    // y = Phi[x]
    double y[] = 
    { 
        0.00134989803163, 
        0.158655253931, 
        0.5, 
        0.691462461274, 
        0.982135579437 
    };

    int numTests = sizeof(x)/sizeof(double);

    double maxError = 0.0;
    for (int i = 0; i < numTests; ++i)
    {
        double error = fabs(y[i] - phi(x[i]));
        if (error > maxError)
            maxError = error;
    }

    std::cout << "Maximum error: " << maxError << "\n";
}
  • Note that this is a single-precision approximation, but there is a double-precision implementation below by thus spake a.k.: stackoverflow.com/a/23119456/190452
    – David
    Commented Mar 4, 2019 at 15:57
  • The function calculates pnorm (cosmosweb.champlain.edu/people/stevens/webtech/R/…) with mean=0 and sd=1. Is there a way I can specify a custom mean and sd? In that case, how would I modify the function?
    – Suman
    Commented Oct 12, 2019 at 3:54
  • @suman-khanal: You need a z-transformation. Just add the parameters mean and sd to the function header and x = (x - mean) / sd before the line int sign = 1 (see the sketch after these comments).
    – jwdietrich
    Commented Oct 27, 2020 at 17:04
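Following jwdietrich's comment, here is a minimal sketch of that generalization (the wrapper name phiGeneral is illustrative; phi is the function defined above):

double phiGeneral(double x, double mean, double sd)
{
    // z-transformation, then evaluate the standard normal CDF
    return phi((x - mean) / sd);
}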
13

I figured out how to do it using GSL, at the suggestion of the folks who answered before me, but then I found a non-library solution (hopefully this helps the many people out there who are looking for one, like I was):

#include <cmath> // fabs, sqrt, exp, pow

#ifndef Pi
#define Pi 3.141592653589793238462643
#endif

double cnd_manual(double x)
{
  double L, K, w ;
  /* constants */
  double const a1 = 0.31938153, a2 = -0.356563782, a3 = 1.781477937;
  double const a4 = -1.821255978, a5 = 1.330274429;

  L = fabs(x);
  K = 1.0 / (1.0 + 0.2316419 * L);
  w = 1.0 - 1.0 / sqrt(2 * Pi) * exp(-L *L / 2) * (a1 * K + a2 * K *K + a3 * pow(K,3) + a4 * pow(K,4) + a5 * pow(K,5));

  if (x < 0 ){
    w= 1.0 - w;
  }
  return w;
}
  • Ouch... don't use pow, use Horner's rule. I downvote until this is corrected (please notify me). Commented Mar 11, 2011 at 10:31
  • This code will lose precision. Horner's rule is stabler (and also faster). Commented May 27, 2011 at 19:04
  • Why not just use double pK3 = K*K*K and so on? Commented Jun 18, 2015 at 18:31
  • (As far as I know) pow may be defined as a macro, I think in order to allow good implementations to optimize common powers such as 2 and 3. So don't give up on pow too soon! Commented Dec 4, 2015 at 9:09
  • Thanks for the code! I assume this is a Taylor expansion of the normal distribution and then integration of the obtained polynomial. Anyway, I tested the code against the Boost libraries in the z range of -3.8 to +3.8 with an increment of 0.01, and the sum of the absolute differences abs(boost - cnd_manual) is on the order of 10^-6.
    – macroland
    Commented May 1, 2018 at 1:34
11

Boost is as good as the standard. :D Here you go: boost maths/statistical.

  • Is there a standard one built in? Commented Feb 24, 2010 at 18:05
  • No, the standard library does not yet have any.
    – dirkgently
    Commented Feb 24, 2010 at 18:07
  • Normal distribution, yes I think so. Or are you talking about built into the standard library -- in the latter case, no. Commented Feb 24, 2010 at 18:07
  • Three reasons for using Boost: 1) you can copy out libraries of your choosing using a utility that comes with Boost; 2) the libraries tend to be header-only; 3) most libraries work very well and work everywhere with a C++ compiler. Commented Feb 24, 2010 at 18:12
  • No... The Boost documentation page will never fade away.
    – SmallChess
    Commented Mar 18, 2015 at 6:18
11

The implementations of the normal CDF given here are single-precision approximations that have had float replaced with double, and hence are only accurate to 7 or 8 significant (decimal) figures.
For a VB implementation of Hart's double-precision approximation, see figure 2 of West's Better approximations to cumulative normal functions.

Edit: My translation of West's implementation into C++:

#include <cmath> // sqrt, acos, fabs, exp

double
phi(double x)
{
  static const double RT2PI = sqrt(4.0*acos(0.0));

  static const double SPLIT = 7.07106781186547;

  static const double N0 = 220.206867912376;
  static const double N1 = 221.213596169931;
  static const double N2 = 112.079291497871;
  static const double N3 = 33.912866078383;
  static const double N4 = 6.37396220353165;
  static const double N5 = 0.700383064443688;
  static const double N6 = 3.52624965998911e-02;
  static const double M0 = 440.413735824752;
  static const double M1 = 793.826512519948;
  static const double M2 = 637.333633378831;
  static const double M3 = 296.564248779674;
  static const double M4 = 86.7807322029461;
  static const double M5 = 16.064177579207;
  static const double M6 = 1.75566716318264;
  static const double M7 = 8.83883476483184e-02;

  const double z = fabs(x);
  double c = 0.0;

  if(z<=37.0)
  {
    const double e = exp(-z*z/2.0);
    if(z<SPLIT)
    {
      const double n = (((((N6*z + N5)*z + N4)*z + N3)*z + N2)*z + N1)*z + N0;
      const double d = ((((((M7*z + M6)*z + M5)*z + M4)*z + M3)*z + M2)*z + M1)*z + M0;
      c = e*n/d;
    }
    else
    {
      const double f = z + 1.0/(z + 2.0/(z + 3.0/(z + 4.0/(z + 13.0/20.0))));
      c = e/(RT2PI*f);
    }
  }
  return x<=0.0 ? c : 1-c;
}

Note that I have rearranged expressions into the more familiar forms for series and continued fraction approximations. The last magic number in West's code is the square root of 2π, which I've deferred to the compiler on the first line by exploiting the identity acos(0) = ½ π.
I've triple checked the magic numbers, but there's always the chance that I've mistyped something. If you spot a typo, please comment!

The results for the test data John Cook used in his answer are

 x               phi                Mathematica
-3     1.3498980316301150e-003    0.00134989803163
-1     1.5865525393145702e-001    0.158655253931
 0     5.0000000000000000e-001    0.5
0.5    6.9146246127401301e-001    0.691462461274
2.1    9.8213557943718344e-001    0.982135579437

I take some small comfort from the fact that they agree to all of the digits given for the Mathematica results.
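For reference, here is a minimal driver (a sketch; it assumes the phi above is in scope) that reproduces the middle column:

#include <cstdio>

int main()
{
    const double xs[] = { -3.0, -1.0, 0.0, 0.5, 2.1 };
    for (double x : xs)
        std::printf("%4.1f    %.16e\n", x, phi(x));
}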

  • How does this compare to erfc? Commented May 24, 2016 at 19:58
  • That would depend upon the precision guarantees of erfc. There's certainly going to be a slight rounding of the product of the argument and the square root of one half, which may propagate to the final value. Hart's algorithm is claimed to be accurate to double precision for every argument, although I've not independently verified that. In any event, both will be much, much better than single-precision approximations in which float is replaced with double! Commented May 25, 2016 at 20:59
  • Is there any easy way to modify this code to account for degrees of freedom? Like SciPy's t-test?
    – tantrev
    Commented May 21, 2017 at 4:37
  • @tantrev I very much doubt that there's a simple way to transform the series and continued fraction approximations of the normal CDF into that of the t distribution, I'm afraid. Commented May 27, 2017 at 0:39
  • @thusspakea.k. Thank you! I suppose my hope for quickly approximating ~1E6 p-values with the same N size was just foolhardy.
    – tantrev
    Commented May 27, 2017 at 1:48
10

From NVIDIA CUDA samples:

#include <cmath> // fabs, exp

static double CND(double d)
{
    const double       A1 = 0.31938153;
    const double       A2 = -0.356563782;
    const double       A3 = 1.781477937;
    const double       A4 = -1.821255978;
    const double       A5 = 1.330274429;
    const double RSQRT2PI = 0.39894228040143267793994605993438; // 1/sqrt(2*pi)

    // Abramowitz & Stegun 26.2.17 polynomial, evaluated via Horner's rule
    double K = 1.0 / (1.0 + 0.2316419 * fabs(d));

    double cnd = RSQRT2PI * exp(-0.5 * d * d) *
                 (K * (A1 + K * (A2 + K * (A3 + K * (A4 + K * A5)))));

    if (d > 0)
        cnd = 1.0 - cnd;

    return cnd;
}

Copyright 1993-2012 NVIDIA Corporation. All rights reserved.
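A quick usage sketch (a minimal, illustrative main; CND is declared static in the sample, so it must live in the same translation unit):

#include <cstdio>

int main()
{
    std::printf("%f\n", CND(1.96)); // expect ~0.975002 for the standard normal
}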

2

From https://en.cppreference.com/w/cpp/numeric/math/erfc

The normal CDF can be calculated as below:

#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;

double normalCDF(double x) // Phi(-∞, x) aka N(x)
{
    return erfc(-x / sqrt(2)) / 2;
}

int main()
{
    cout << setprecision(10) << normalCDF(1.96) << "\n"; // ~0.9750021049
}

Note that sqrt returns a double either way, so writing 2.0 instead of 2 in the denominator does not change the result; it merely avoids overload ambiguity with sqrt on pre-C++11 compilers.

Hope that helps.

1

Since this question was asked almost 13 years ago and the answers are quite outdated: as of now, to calculate the CDF of a normal distribution you can use the Boost library, which can be downloaded from https://www.boost.org/. Once you have installed the latest version, #include the distribution you need, for example #include "boost/math/distributions/normal.hpp", and you can use the cdf directly. Remember to use the namespace boost::math. You can refer to this link for further reference: https://www.boost.org/doc/libs/1_80_0/boost/math/distributions.hpp
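For illustration, a minimal sketch of what that looks like (it assumes Boost.Math is installed; the 1.96 test value is arbitrary):

#include <iostream>
#include "boost/math/distributions/normal.hpp"

int main()
{
    boost::math::normal dist(0.0, 1.0); // mean, standard deviation
    std::cout << boost::math::cdf(dist, 1.96) << "\n"; // ~0.975002
}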

