49
double r = 11.631;
double theta = 21.4;

In the debugger, these are shown as 11.631000000000000 and 21.399999618530273.

How can I avoid this?

14 Answers

57

These accuracy problems are due to the internal representation of floating point numbers and there's not much you can do to avoid it.

By the way, printing these values at run-time often still leads to the correct results, at least using modern C++ compilers. For most operations, this isn't much of an issue.
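To see why printed output usually looks right, here is a minimal sketch. The helper name `format_significant` is made up for illustration. The default stream precision is 6 significant digits, which hides the representation error; asking for 17 digits exposes it.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a double with the requested number of
// significant digits, using the stream's default (general) notation.
std::string format_significant(double value, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}
```

Calling `format_significant(21.4, 6)` gives the familiar `21.4`, while `format_significant(21.4, 17)` shows the stored approximation instead.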

4
  • It is something programmers should be aware of though, especially if they work with very large or very small numbers where accuracy may be important.
    – tloach
    Commented Oct 7, 2008 at 9:56
  • Not necessarily very large or very small--floating point precision is the same regardless of overall number size. The problem is when you mix very large and very small values, such as adding them together. Commented Oct 7, 2008 at 9:58
  • 4
    Dark -- that's not actually true. The space of representable values is much denser near 0, and much sparser as you go out to infinity (for example, 2^24+1 can't be represented exactly as a 32-bit IEEE float).
    – SquareCog
    Commented Oct 10, 2008 at 16:07
  • Exponentially sparser, in fact, because you're applying an exponent.
    – Peter Wone
    Commented Oct 12, 2008 at 2:31
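The density point in these comments can be checked directly. This is only a sketch, assuming a platform where `float` is a 32-bit IEEE 754 type; the function names are illustrative.

```cpp
#include <cmath>
#include <limits>

// True if n survives a round trip through a 32-bit float unchanged.
// Intended for magnitudes far below 2^63, so the cast back cannot overflow.
bool fits_in_float(long long n) {
    float f = static_cast<float>(n);
    return static_cast<long long>(f) == n;
}

// Distance from x to the next representable float above it.
float gap_above(float x) {
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}
```

Under that assumption, `fits_in_float(16777216)` (2^24) holds but `fits_in_float(16777217)` does not, and `gap_above` grows from about 1.2e-7 at 1.0f to 2.0 at 2^24.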
39

I liked Joel's explanation, which deals with a similar binary floating point precision issue in Excel 2007:

See how there's a lot of 0110 0110 0110 there at the end? That's because 0.1 has no exact representation in binary... it's a repeating binary number. It's sort of like how 1/3 has no representation in decimal. 1/3 is 0.33333333 and you have to keep writing 3's forever. If you lose patience, you get something inexact.

So you can imagine how, in decimal, if you tried to do 3*1/3, and you didn't have time to write 3's forever, the result you would get would be 0.99999999, not 1, and people would get angry with you for being wrong.

2
  • 7
    If you tried to do 3*1/3, you'd multiply the three by the one and have three. Then you'd divide three by three and no one should be mad. I'm assuming Joel meant to say 3*(1/3).
    – Nosredna
    Commented Jun 14, 2009 at 16:28
  • 2
    @Nosredna It depends whether the language you are using has a higher operator precedence for * or /. Commented Jun 10, 2011 at 6:05
13

If you have a value like:

double theta = 21.4;

And you want to do:

if (theta == 21.4)
{
}

You have to be a bit clever: instead of testing for exact equality, check whether the value of theta is really close to 21.4.

if (fabs(theta - 21.4) <= 1e-6)
{
}
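
A fixed tolerance like 1e-6 works for values around 21.4 but is too loose for very small numbers and too tight for very large ones. One common variation, sketched here with made-up names and default tolerances that are assumptions to tune per application, combines an absolute and a relative tolerance:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: true if a and b are within abs_tol of each other,
// or within rel_tol of the larger magnitude. Defaults are illustrative.
bool nearly_equal(double a, double b,
                  double abs_tol = 1e-12, double rel_tol = 1e-9) {
    double diff = std::fabs(a - b);
    if (diff <= abs_tol) {
        return true;  // handles values at or very near zero
    }
    return diff <= rel_tol * std::max(std::fabs(a), std::fabs(b));
}
```

With this, `if (nearly_equal(theta, 21.4))` replaces the hand-written `fabs` test.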
1
  • 1
    double theta = 21.4; bool b = theta == 21.4;// here b is always true Commented Nov 4, 2008 at 15:24
7

This is partly platform-specific - and we don't know what platform you're using.

It's also partly a case of knowing what you actually want to see. The debugger is showing you - to some extent, anyway - the precise value stored in your variable. In my article on binary floating point numbers in .NET, there's a C# class which lets you see the absolutely exact number stored in a double. The online version isn't working at the moment - I'll try to put one up on another site.

Given that the debugger sees the "actual" value, it's got to make a judgement call about what to display - it could show you the value rounded to a few decimal places, or a more precise value. Some debuggers do a better job than others at reading developers' minds, but it's a fundamental problem with binary floating point numbers.

1
  • 2
    Jon, the question was originally tagged as C++/VC6 so we actually knew the platform before someone decided that this information wasn't important and edited the tags. Commented Oct 7, 2008 at 11:00
5

Use the fixed-point decimal type if you want stability at the limits of precision. There are overheads, and you must explicitly cast if you wish to convert to floating point. If you do convert to floating point you will reintroduce the instabilities that seem to bother you.

Alternatively, you can get over it and learn to work with the limited precision of floating point arithmetic. For example, you can use rounding to make values converge, or you can use epsilon comparisons, where "epsilon" is a constant you choose that defines the tolerance. For example, you may choose to regard two values as being equal if they are within 0.0001 of each other.

It occurs to me that you could use operator overloading to make epsilon comparisons transparent. That would be very cool.


For mantissa-exponent representations EPSILON must be computed to remain within the representable precision. For a number N, Epsilon = N / 10E+14

System.Double.Epsilon is the smallest representable positive value for the Double type. It is too small for our purpose. Read Microsoft's advice on equality testing
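
In C++ terms, the operator overloading idea might look roughly like the sketch below. The `Approx` type and `scaled_epsilon` function are hypothetical names, and the tolerance follows the N / 10E+14 rule of thumb above (10E+14 is the same literal as 1e15).

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical wrapper: comparing two Approx values uses a tolerance
// instead of exact equality.
struct Approx {
    double value;
    explicit Approx(double v) : value(v) {}
};

// Tolerance scaled to the larger magnitude: N / 10E+14.
inline double scaled_epsilon(double a, double b) {
    return std::max(std::fabs(a), std::fabs(b)) / 10E+14;
}

inline bool operator==(const Approx& a, const Approx& b) {
    return std::fabs(a.value - b.value) <= scaled_epsilon(a.value, b.value);
}

inline bool operator!=(const Approx& a, const Approx& b) {
    return !(a == b);
}
```

Two caveats with this sketch: such an `==` is not transitive, and a purely relative tolerance is very strict near zero, so it is not a drop-in replacement for exact equality.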

1
  • Quick note (but not a contradiction) - if you use the System.Decimal type in .NET, be aware that that's still a floating point type. It's a floating decimal point, but still a floating point. Oh, and also beware of System.Double.Epsilon, as it's not what you might expect it to be :)
    – Jon Skeet
    Commented Oct 7, 2008 at 9:36
4

I've come across this before (on my blog) - I think the surprise tends to be that the 'irrational' numbers are different.

By 'irrational' here I'm just referring to the fact that they can't be accurately represented in this format. Real irrational numbers (like π - pi) can't be accurately represented at all.

Most people are familiar with 1/3 not working in decimal: 0.3333333333333...

The odd thing is that 1.1 doesn't work in floats. People expect decimal values to work in floating point numbers because of how they think of them:

1.1 is 11 x 10^-1

When actually they're in base-2

1.1 is 154811237190861 x 2^-47

You can't avoid it, you just have to get used to the fact that some floats are 'irrational', in the same way that 1/3 is.
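
If you want to see the base-2 form of a particular value yourself, here is a small sketch using `std::frexp` and `std::ldexp`. The `Decomposed` struct and function names are made up for illustration, and it assumes a finite, normal IEEE 754 double with a 53-bit significand.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical result type: x == mantissa * 2^exponent.
struct Decomposed {
    std::int64_t mantissa;
    int exponent;
};

Decomposed decompose(double x) {
    int exp = 0;
    double frac = std::frexp(x, &exp);  // x == frac * 2^exp, 0.5 <= |frac| < 1
    // Scaling by 2^53 turns the fraction into an exact integer.
    std::int64_t m = static_cast<std::int64_t>(std::ldexp(frac, 53));
    return {m, exp - 53};
}

double recompose(const Decomposed& d) {
    return std::ldexp(static_cast<double>(d.mantissa), d.exponent);
}
```

For the double nearest to 1.1 this yields 4953959590107546 x 2^-52, and `recompose` returns exactly the stored value.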

2
  • 1
    Keith, actually none of your examples are irrational. Sqrt(2) is irrational, PI is irrational, but any integer divided by an integer is, by definition, rational.
    – Sklivvz
    Commented Oct 10, 2008 at 14:25
  • You're quite right - hence the single quotes. In math-theory these are rational numbers, they just can't be expressed in the storage mechanism used.
    – Keith
    Commented Oct 10, 2008 at 15:55
3

One way you can avoid this is to use a library that uses an alternative method of representing decimal numbers, such as BCD

2
  • There are better techniques than BCD. Commented Oct 7, 2008 at 8:10
  • 1
    It would have been nice saying one or two of those techniques.
    – anon
    Commented Oct 7, 2008 at 9:32
3

If you are using Java and you need accuracy, use the BigDecimal class for floating point calculations. It is slower but safer.

0
3

Seems to me that 21.399999618530273 is the single precision (float) representation of 21.4. Looks like the debugger is casting down from double to float somewhere.
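
This is easy to check. A sketch, assuming 32-bit IEEE 754 `float` and a C library that prints exact decimal expansions (glibc does); `fixed15` is a made-up helper name:

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format a value with 15 digits after the decimal
// point, the same width as the debugger output in the question.
std::string fixed15(double value) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.15f", value);
    return buf;
}

// Round a double to float precision, then widen it back to double.
double through_float(double value) {
    return static_cast<double>(static_cast<float>(value));
}
```

Under those assumptions `fixed15(through_float(21.4))` reproduces 21.399999618530273 from the question, while `fixed15(11.631)` gives 11.631000000000000.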

2

You can't avoid this, as you're using floating point numbers with a fixed number of bytes. There's simply no isomorphism possible between the real numbers and their limited notation.

But most of the time you can simply ignore it. 21.4==21.4 would still be true because both sides are the same number with the same error. But 21.4f==21.4 may not be true because the errors for float and double are different.

If you need fixed precision, perhaps you should try fixed point numbers. Or even integers. I for example often use int(1000*x) for passing to debug pager.

1
  • One might actually prefer int(1000*x+.5) to make 21.4 appear as expected.
    – Reunanen
    Commented Nov 5, 2008 at 18:39
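
A sketch of the difference between truncating and rounding when scaling to an integer (function names are illustrative). `std::lround` is used here in place of adding .5 by hand, because it also behaves sensibly for negative values.

```cpp
#include <cmath>

// Scale and truncate toward zero, as int(1000*x) does.
long truncate_scaled(double x, double scale) {
    return static_cast<long>(x * scale);
}

// Scale and round to the nearest integer.
long round_scaled(double x, double scale) {
    return std::lround(x * scale);
}
```

On a typical IEEE 754 double platform, `truncate_scaled(0.29, 100)` yields 28 because the product is slightly below 29, while `round_scaled(0.29, 100)` yields 29.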
1

If it bothers you, you can customize the way some values are displayed during debug. Use it with care :-)

Enhancing Debugging with the Debugger Display Attributes

0

Refer to General Decimal Arithmetic

Also take note when comparing floats, see this answer for more information.

0

According to the Java Language Specification:

"If at least one of the operands to a numerical operator is of type double, then the
operation is carried out using 64-bit floating-point arithmetic, and the result of the
numerical operator is a value of type double. If the other operand is not a double, it is
first widened (§5.1.5) to type double by numeric promotion (§5.6)."

Here is the Source
