86

In every place I've looked, it says that double is superior to float in almost every way. float has been made obsolete by double in Java, so why is it still used?

I program a lot with Libgdx, and they force you to use float (deltaTime, etc.), but it seems to me that double is just easier to work with in terms of storage and memory.

I also read When do you use float and when do you use double, but if float is really only good for numbers with a lot of digits after the decimal point, then why can't we just use one of the many variations of double?

Is there any reason as to why people insist on using floats even though it doesn't really have any advantages anymore? Is it just too much work to change it all?

15
  • 8
    Possible duplicate of When do you use float and when do you use double Commented Apr 26, 2016 at 16:10
  • 59
    How in the world did you infer "float is really only good for numbers with a lot of digits after the decimal point" from the answers to that question?! They say the direct opposite!
    – Ordous
    Commented Apr 26, 2016 at 16:48
  • 20
    @Eames Note how it says "numbers", not "digits". Floats are worse when you need precision or range, they are better when you need lots and lots of not-so-precise data. That's what those answers say.
    – Ordous
    Commented Apr 26, 2016 at 17:22
  • 29
    Why do we have byte and short and int when there's long? Commented Apr 27, 2016 at 6:55
  • 15
    A much more fitting question is "why would you remove a keyword and primitive datatype from a language with decades of code that would just break for no reason"?
    – sara
    Commented Apr 27, 2016 at 18:02

6 Answers

170

LibGDX is a framework mostly used for game development.

In game development you usually have to do a whole lot of number crunching in real-time and any performance you can get matters. That's why game developers usually use float whenever float precision is good enough.

The size of the FPU registers in the CPU is not the only thing you need to consider in this case. In fact most of the heavy number crunching in game development is done by the GPU, and GPUs are usually optimized for floats, not doubles.

And then there is also:

  • memory bus bandwidth (how fast you can shovel data between RAM, CPU and GPU)
  • CPU cache (which reduces the pressure on the previous)
  • RAM
  • VRAM

all of which are precious resources, and you get twice as much out of each of them when you use 32-bit float instead of 64-bit double.
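The "twice as much" point is easy to make concrete. Here is a minimal sketch (the vertex count is an arbitrary example, not from LibGDX) comparing the raw storage cost of the same data held as float versus double:

```java
// Sketch: per-element storage cost of float vs double in Java.
public class FloatVsDoubleMemory {
    static long bytesForFloats(long count)  { return count * Float.BYTES;  } // 4 bytes each
    static long bytesForDoubles(long count) { return count * Double.BYTES; } // 8 bytes each

    public static void main(String[] args) {
        // e.g. a mesh of 1,000,000 vertices with 3 coordinates each:
        long elements = 1_000_000L * 3;
        System.out.println("float:  " + bytesForFloats(elements)  + " bytes"); // 12000000
        System.out.println("double: " + bytesForDoubles(elements) + " bytes"); // 24000000
    }
}
```

The same factor of two applies to bus bandwidth, cache lines, and VRAM uploads, since all of them move these bytes around.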

20
  • 2
    Thank you! This really helped because you went in depth on how the memory usage changes and why
    – Eames
    Commented Apr 26, 2016 at 21:06
  • 7
    Also, for SIMD operations, 32-bit values can have twice the throughput. As 8bittree's answer points out, GPUs have an even greater performance penalty with double precision.
    – user87195
    Commented Apr 27, 2016 at 2:20
  • 5
    Many graphic pipeline even support 16-bit half-floats to increase performance where precision is sufficient.
    – Adi Shavit
    Commented Apr 27, 2016 at 9:48
  • 22
    @phresnel All are. You have to move positions, update data and what not. And this is the simple part. Then you have to render (= read, rotate, scale and translate) the textures, distances, get it to the screens format ... There's a lot to do.
    – Sebb
    Commented Apr 27, 2016 at 13:05
  • 8
    @phresnel as a former VP Operations of a game development enterprise, I assure you almost every game there is a ton of number crunching. Note it's usually contained in libraries and 100% abstracted away from the engineer, I would hope they understand and respect that all that crunching is going on. Magic inverse square root, anyone?
    – corsiKa
    Commented Apr 27, 2016 at 16:41
58

Floats use half as much memory as doubles.

They may have less precision than doubles, but many applications don't require precision. They have a larger range than any similarly-sized fixed point format. Therefore, they fill a niche that needs wide ranges of numbers but does not need high precision, and where memory usage is important. I've used them for large neural network systems in the past, for example.

Moving outside of Java, they're also widely used in 3D graphics, because many GPUs use them as their primary format - outside of very expensive NVIDIA Tesla / AMD FirePro devices, double-precision floating point is very slow on GPUs.
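As a small illustration of what "less precision" means in practice (a sketch, not specific to any library): a float has a 24-bit significand, so above 2^24 it can no longer represent every integer, while a double still can:

```java
// Sketch: the integer 2^24 is the point where float starts skipping values.
public class PrecisionDemo {
    public static void main(String[] args) {
        float  f = 16_777_216f; // 2^24
        double d = 16_777_216d;

        // 2^24 + 1 is not representable as a float, so the addition rounds away:
        System.out.println(f + 1f == f); // true  (the +1 is lost)
        System.out.println(d + 1d == d); // false (double keeps the +1)
    }
}
```

For data like neural network weights or vertex positions, errors of that relative size are usually irrelevant, which is exactly the niche described above.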

2
  • 8
    Speaking of neural networks, CUDA currently has support for half-precision (16-bit) floating point variables, even less precise but with even lower memory footprints, due to the increased usage of accelerators for machine learning work.
    – JAB
    Commented Apr 26, 2016 at 22:08
  • And when you program FPGAs you tend to select the amount of bits for both mantissa and exponent manually every time :v
    – Sebi
    Commented Apr 28, 2016 at 13:04
48

Backwards Compatibility

This is the number one reason for keeping behavior in an already existing language/library/ISA/etc.

Consider what would happen if they took floats out of Java. Libgdx (and thousands of other libraries and programs) wouldn't work. Getting everything updated would take a lot of effort, quite possibly years for many projects (just look at the backwards-compatibility-breaking Python 2 to Python 3 transition). And not everything would be updated: some things would stay broken forever, either because their maintainers abandoned them, because updating would take more effort than it's worth, or because it's no longer possible to accomplish what the software was supposed to do.

Performance

64-bit doubles take twice the memory of 32-bit floats and are almost always slower to process (the very rare exceptions being hardware where 32-bit float capability was expected to be used so rarely, or not at all, that no effort was made to optimize for it; unless you're developing for specialized hardware, you won't run into this in the near future).

Especially relevant to you, Libgdx is a game library. Games have a tendency to be more performance-sensitive than most software. And gaming graphics cards (i.e. AMD Radeon and NVIDIA GeForce, not FirePro or Quadro) tend to have very weak 64-bit floating point performance. Courtesy of AnandTech, here's how double-precision performance compares to single-precision performance on some of AMD's and NVIDIA's top gaming cards available as of early 2016:

AMD
Card    R9 Fury X      R9 Fury       R9 290X    R9 290
FP64    1/16           1/16          1/8        1/8

NVIDIA
Card    GTX Titan X    GTX 980 Ti    GTX 980    GTX 780 Ti
FP64    1/32           1/32          1/32       1/24

Note that the R9 Fury and GTX 900 series are newer than the R9 200 and GTX 700 series, so the relative performance of 64-bit floating point is decreasing. Go back far enough and you'll find the GTX 580, which had a 1/8 ratio like the R9 200 series.

1/32 of the performance is a pretty big penalty to pay if you have a tight time constraint and don't gain much by using the larger double.

3
  • 1
    note that the performance for 64-bit floating point is decreasing relative to the 32-bit performance due to increasingly-highly optimized 32-bit instructions, not because the actual 64-bit performance is decreasing. it also depends on the actual benchmark used; I wonder if the 32-bit performance deficit highlighted in these benchmarks is due to memory bandwidth issues as well as actual computational speed
    – sig_seg_v
    Commented Apr 26, 2016 at 22:35
  • If you're going to talk about DP performance in graphics cards you should definitely mention the Titan/Titan Black. Both feature mods that allow the card to reach 1/3 performance, at the cost of single precision performance.
    – SGR
    Commented Apr 27, 2016 at 8:32
  • @sig_seg_v There are definitely at least some cases where the 64-bit performance decreases absolutely, not just relatively. See these results for a double precision Folding@Home benchmark, where a GTX 780 Ti beats both a GTX 1080 (another 1/32 ratio card) and a 980 Ti, and on AMD's side, the 7970 (a 1/4 ratio card), as well as the R9 290 and R9 290X all beat the R9 Fury series. Compare that to the single precision version of the benchmark, where the newer cards all handily outperform their predecessors.
    – 8bittree
    Commented Jul 26, 2016 at 0:12
37

Atomic operations

In addition to what others have already said, a Java-specific disadvantage of double (and long) is that assignments to 64-bit primitive types are not guaranteed to be atomic. From the Java Language Specification, Java SE 8 Edition, page 660 (emphasis added):

17.7 Non-atomic Treatment of double and long

For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.

Yuck.

To avoid this, you have to declare the 64-bit variable with the volatile keyword, or use some other form of synchronization around assignments.
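A minimal sketch of that fix (the class and field names here are just for illustration): marking the 64-bit field volatile makes each write atomic, at the cost of ruling out word tearing rather than providing any broader synchronization.

```java
// Sketch: guarding a 64-bit field against word tearing (JLS 17.7).
public class Position {
    // Without volatile, a concurrent reader on a 32-bit JVM could observe
    // the high half of one write combined with the low half of another.
    private volatile double x;

    public void setX(double value) { x = value; } // a single atomic write
    public double getX()           { return x; }
}
```

Note that volatile only prevents torn values; compound operations like `x += dx` still need `synchronized` or an atomic class, exactly as for int and float.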

3
  • 2
    Don't you need to synchronize concurrent access to ints and floats anyways to prevent lost updates and make them volatile to prevent overeager caching? Am I wrong in thinking that the only thing the int/float atomicity prevents is that they can never contain "mixed" values they weren't supposed to hold?
    – ASA
    Commented Apr 28, 2016 at 11:33
  • 3
    @Traubenfuchs That is, indeed what is guaranteed there. The term I have heard used for it is "tearing," and I think it captures the effect quite nicely. The Java programming language model guarantees that 32 bit values, when read, will have a value which was written to them at some point. That is a surprisingly valuable guarantee.
    – Cort Ammon
    Commented Apr 28, 2016 at 23:54
  • This point about atomicity is super-important. Wow, I'd forgotten about this important fact. Counter-intuitive as we may tend to think of primitives as being atomic by nature. But not atomic in this case. Commented Apr 30, 2016 at 3:14
3

It seems other answers missed one important point: SIMD architectures can process twice as many values per instruction when operating on float rather than double (for example, eight float values at a time instead of four double values).

Performance considerations summary

  • float may be faster on certain CPUs (for example, certain mobile devices).
  • float uses less memory so in huge data sets it may substantially reduce the total required memory (hard disk / RAM) and consumed bandwidth.
  • float may cause a CPU to consume less power for single-precision computations compared to double-precision computations (I cannot find a reference, but it at least seems plausible).
  • float consumes less bandwidth, and in some applications that matters.
  • SIMD architectures may process twice as many float values as double values per instruction.
  • float uses as much as half of cache memory compared to double.

Accuracy considerations summary

  • In many applications float is enough.
  • When it isn't, double provides much more precision.

Compatibility considerations

  • If your data has to be submitted to a GPU (for example, for a video game using OpenGL or any other rendering API), float is considerably faster than double. GPU manufacturers try to maximize the number of graphics cores, so they save as much circuitry as possible in each core; optimizing for float lets them build GPUs with more cores.
  • Old GPUs and some mobile devices just cannot accept double as the internal format (for 3D rendering operations)

General tips

  • On modern desktop processors (and probably a good amount of mobile processors) you can basically assume using temporary double variables on the stack gives extra precision for free (extra precision without performance penalty).
  • Never use more precision than you need (you may not know how much precision you really need).
  • Sometimes you are just forced by the range of values (some values overflow to infinity as a float but remain finite as a double).
  • Using only float or only double greatly helps the compiler to SIMD-ify the instructions.
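The range point above can be shown in a couple of lines (the constant here is just an arbitrary value near Float.MAX_VALUE, about 3.4e38):

```java
// Sketch: the same arithmetic overflows float but stays finite in double.
public class RangeDemo {
    public static void main(String[] args) {
        float big = 3.0e38f;             // near the top of float's range
        System.out.println(big * 2f);    // Infinity: overflows float
        System.out.println(3.0e38 * 2);  // 6.0E38: still finite as a double
    }
}
```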

See comments below from PeterCordes for more insights.

3
  • 1
    double temporaries is only free on x86 with the x87 FPU, not with SSE2. Auto-vectorizing a loop with double temporaries means unpacking float to double, which takes an extra instruction, and you process half as many elements per vector. Without auto-vectorization, the conversion can usually happen on the fly during a load or store, but it means extra instructions when you're mixing floats and doubles in expressions. Commented Apr 27, 2016 at 19:17
  • 1
    On modern x86 CPUs, div and sqrt are faster for float than double, but other things are the same speed (not counting the SIMD vector width issue, or memory bandwidth / cache footprint of course). Commented Apr 27, 2016 at 19:18
  • @PeterCordes thanks for expanding some points. I was not aware of the div and sqrt disparity Commented Apr 28, 2016 at 8:03
0

Apart from the other reasons already mentioned:

If you have measured data, be it pressures, flows, currents, voltages or whatever, it is often acquired with hardware containing an ADC (analog-to-digital converter).

An ADC typically has 10 or 12 bits; 14- or 16-bit ones are rarer. But let's stick with the 16-bit one: measuring near full scale, you have a resolution of 1/65535. That means a change from 65534/65535 to 65535/65535 is just this step, 1/65535, which is roughly 1.5E-05. The relative precision of a float is around 1E-07, so a lot better. That means you lose nothing by using float to store such data.

If you do extensive calculations with floats, you do slightly worse than with doubles in terms of accuracy, but often you don't need that accuracy: you usually don't care whether you measured a voltage of 2 V or 2.00002 V. Similarly, if you convert this voltage into a pressure, you don't care whether you have 3 bar or 3.00003 bar.
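A quick sanity check of that arithmetic (a sketch; the full-scale value 65535 comes from the 16-bit ADC above, and `Math.ulp` gives the spacing between adjacent float values near 1.0):

```java
// Sketch: a 16-bit ADC step is much coarser than float's resolution near 1.0.
public class AdcResolution {
    public static void main(String[] args) {
        double adcStep  = 1.0 / 65535.0;   // smallest ADC step, ~1.5e-5
        double floatUlp = Math.ulp(1.0f);  // float spacing near 1.0, ~1.2e-7

        System.out.println(adcStep);
        System.out.println(floatUlp);
        System.out.println(floatUlp < adcStep); // true: float out-resolves the ADC
    }
}
```

So the float representation is roughly two orders of magnitude finer than what the sensor can distinguish in the first place.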
