4

Is there an optimized, performant way to round a double to the exact value nearest multiple of a given power of two fraction?

In other words, round .44 to the nearest 1/16 (in other words, to a value that can be expressed as n/16 where n is an integer) would be .4375. Note: this is relevant because power of two fractions can be stored without rounding errors, e.g.

public class PowerOfTwo {
  public static void main(String... args) {
    double inexact = .44;
    double exact = .4375;

    System.out.println(inexact + ":   " + Long.toBinaryString(Double.doubleToLongBits(inexact)));
    System.out.println(exact + ": " + Long.toBinaryString(Double.doubleToLongBits(exact)));
  }
}

Output:

0.44:   11111111011100001010001111010111000010100011110101110000101001
0.4375: 11111111011100000000000000000000000000000000000000000000000000
5
  • How do you expect storing your result to differ from a normal double value? All doubles are multiples of 2^x simply because of the exponent part of the double. What do you hope to gain? Do you want your result to represent excatly a value of 2^x?
    – midor
    Commented Jan 30, 2015 at 14:49
  • 2
    The best bet here would be to go through a long, mask, and then convert back to a double again
    – fge
    Commented Jan 30, 2015 at 15:19
  • @fge I agree, but I'm not exactly sure how to do that. Especially if the integer part of the fraction is not zero e.g. 12345.4375
    – durron597
    Commented Jan 30, 2015 at 15:20
  • How is that a problem? You only alter the fraction, not the exponent
    – fge
    Commented Jan 30, 2015 at 15:26
  • @fge Because you have to mask fewer bits when the exponent is larger
    – durron597
    Commented Jan 30, 2015 at 15:27

6 Answers 6

3

If you want to chose the power of two, the simplest way is to multiply by e.g. 16, round to nearest integer, then divide by 16. Note that division by a power of two is exact if the result is a normal number. It can cause rounding error for subnormal numbers.

Here is a sample program using this technique:

public class Test {
  public static void main(String[] args) {
    System.out.println(roundToPowerOfTwo(0.44, 2));
    System.out.println(roundToPowerOfTwo(0.44, 3));
    System.out.println(roundToPowerOfTwo(0.44, 4));
    System.out.println(roundToPowerOfTwo(0.44, 5));
    System.out.println(roundToPowerOfTwo(0.44, 6));
    System.out.println(roundToPowerOfTwo(0.44, 7));
    System.out.println(roundToPowerOfTwo(0.44, 8));
  }

  public static double roundToPowerOfTwo(double in, int power) {
    double multiplier = 1 << power;
    return Math.rint(in * multiplier) / multiplier;
  }
}

Output:

0.5
0.5
0.4375
0.4375
0.4375
0.4375
0.44140625
7
  • The first paragraph of this response is a direct response to unclear wording in my original question. I have since fixed the question to clarify.
    – durron597
    Commented Jan 30, 2015 at 15:49
  • @durron597 I've edited my answer accordingly, and added a sample program. Commented Jan 30, 2015 at 15:56
  • I've upvoted you. I wonder if this answer is faster than my bitmask answer, and also if it works for large integer parts.
    – durron597
    Commented Jan 30, 2015 at 16:02
  • @durron597 Good point about large value - it would fail if the result of the multiply overflows long. Commented Jan 30, 2015 at 16:08
  • I've corrected the overflow issue @durron597 raised, by using Math.rint instead of Math.round. Now it will only overflow if the result of the multiply is greater than Double.MAX_VALUE. Commented Jan 30, 2015 at 16:13
1

If the question is about rounding any number to a pre-determined binary precision, what you need to do is this:

  1. Convert the value to long using 'Double.doubleToLongBits()`
  2. Examine the exponent: if it's too big (exponent+required precision>51, the number of bits in the significand), you won't be able to do any rounding but you won't have to: the number already satisfies your criteria.
  3. If on the other hand exponent+required precision<0, the result of the rounding is always 0.
  4. In any other case, look at the significand and blot out all the bits that are below the exponent+required precisionth significant bit.
  5. Convert the number back to double using Double.longBitsToDouble()
1
  • I attempted to implement this answer
    – durron597
    Commented Jan 30, 2015 at 15:44
1

Getting this right in all corner cases is a bit tricky. If I have to solve such a task, I'd usually start with a naive implementation that I can be pretty sure will be correct and only then start implementing an optimized version. While doing so, I can always compare against the naive approach to validate my results.

The naive approach is to start with 1 and multiply / divide it with / by 2 until we have bracketed the absolute value of the input. Then, we'll output the nearer of the boundaries. It's actually a bit more complicated: If the value is a NaN or infinity, it requires special treatment.

Here is the code:

public static double getClosestPowerOf2Loop(final double x) {
    final double absx = Math.abs(x);
    double prev = 1.0;
    double next = 1.0;
    if (Double.isInfinite(x) || Double.isNaN(x)) {
        return x;
    } else if (absx < 1.0) {
        do {
            prev = next;
            next /= 2.0;
        } while (next > absx);
    } else if (absx > 1.0) {
        do {
            prev = next;
            next *= 2.0;
        } while (next < absx);
    }
    if (x < 0.0) {
        prev = -prev;
        next = -next;
    }
    return (Math.abs(next - x) < Math.abs(prev - x)) ? next : prev;
}

I hope the code will be clear without further explanation. Since Java 8, you can use !Double.isFinite(x) as a replacement for Double.isInfinite(x) || Double.isNaN(x).

Let's see for an optimized version. As other answers have already suggested, we should probably look at the bit representation. Java requires floating point values to be represented using IEE 754. In that format, numbers in double (64 bit) precision are represented as

  • 1 bit sign,
  • 11 bits exponent and
  • 52 bits mantissa.

We will special-case NaNs and infinities (which are represented by special bit patterns) again. However, there is yet another exception: The most significant bit of the mantissa is implicitly 1 and not found in the bit pattern – except for very small numbers where a so-called subnormal representation us used where the most significant digit is not the most significant bit of the mantissa. Therefore, for normal numbers we will simply set the mantissa's bits to all 0 but for subnormals, we convert it to a number where none but the most significant 1 bit is preserved. This procedure always rounds towards zero so to get the other bound, we simply multiply by 2.

Let's see how this all works together:

public static double getClosestPowerOf2Bits(final double x) {
    if (Double.isInfinite(x) || Double.isNaN(x)) {
        return x;
    } else {
        final long bits = Double.doubleToLongBits(x);
        final long signexp = bits  & 0xfff0000000000000L;
        final long mantissa = bits & 0x000fffffffffffffL;
        final long mantissaPrev = Math.abs(x) < Double.MIN_NORMAL
            ? Long.highestOneBit(mantissa)
            : 0x0000000000000000L;
        final double prev = Double.longBitsToDouble(signexp | mantissaPrev);
        final double next = 2.0 * prev;
        return (Math.abs(next - x) < Math.abs(prev - x)) ? next : prev;
    }
}

I'm note entirely sure I have covered all corner cases but the following tests do run:

public static void main(final String[] args) {
    final double[] values = {
        5.0, 4.1, 3.9, 1.0, 0.0, -0.1, -8.0, -8.1, -7.9,
        0.9 * Double.MIN_NORMAL, -0.9 * Double.MIN_NORMAL,
        Double.NaN, Double.MAX_VALUE, Double.MIN_VALUE,
        Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY,
    };
    for (final double value : values) {
        final double powerL = getClosestPowerOf2Loop(value);
        final double powerB = getClosestPowerOf2Bits(value);
        System.out.printf("%17.10g  -->  %17.10g  %17.10g%n",
                          value, powerL, powerB);
        assert Double.doubleToLongBits(powerL) == Double.doubleToLongBits(powerB);
    }
}

Output:

      5.000000000  -->        4.000000000        4.000000000
      4.100000000  -->        4.000000000        4.000000000
      3.900000000  -->        4.000000000        4.000000000
      1.000000000  -->        1.000000000        1.000000000
      0.000000000  -->        0.000000000        0.000000000
    -0.1000000000  -->      -0.1250000000      -0.1250000000
     -8.000000000  -->       -8.000000000       -8.000000000
     -8.100000000  -->       -8.000000000       -8.000000000
     -7.900000000  -->       -8.000000000       -8.000000000
 2.002566473e-308  -->   2.225073859e-308   2.225073859e-308
-2.002566473e-308  -->  -2.225073859e-308  -2.225073859e-308
              NaN  -->                NaN                NaN
 1.797693135e+308  -->   8.988465674e+307   8.988465674e+307
 4.900000000e-324  -->   4.900000000e-324   4.900000000e-324
        -Infinity  -->          -Infinity          -Infinity
         Infinity  -->           Infinity           Infinity

How about performance?

I have run the following benchmark

public static void main(final String[] args) {
    final Random rand = new Random();
    for (int i = 0; i < 1000000; ++i) {
        final double value = Double.longBitsToDouble(rand.nextLong());
        final double power = getClosestPowerOf2(value);
    }
}

where getClosestPowerOf2 is to be replaced by either getClosestPowerOf2Loop or getClosestPowerOf2Bits. On my laptop, I get the following results:

  • getClosestPowerOf2Loop: 2.35 s
  • getClosestPowerOf2Bits: 1.80 s

Was that really worth the effort?

3
  • In my real world application, I know the input will never be Infinity, NaN, or negative, so I don't need to be quite this careful.
    – durron597
    Commented Jan 30, 2015 at 16:33
  • Also, a loop is never necessary here, both my answer and Patricia's answer don't use loops.
    – durron597
    Commented Jan 30, 2015 at 16:40
  • 1
    My second function doesn't use a loop either. I have shown the looping approach to give a simple solution that I can be reasonably sure will cover all cases correctly.
    – 5gon12eder
    Commented Jan 30, 2015 at 16:43
0

You are going to need some bit magic if you are going to round to arbitrary powers of 2.

You will need to inspect the exponent:

int exponent = Math.getExponent(inexact);

Then knowing that there are 53 bits in the mantissa can find the bit at which you need to round with.


Or just do:

Math.round(inexact* (1l<<exponent))/(1l<<exponent)

I use Math.round because I expect it to be optimal for the task as opposed to trying to implement it yourself.

0

Here is my first attempt at a solution, that doesn't handle all the cases in @biziclop's answer, and probably does "floor" instead of "round"

public static double round(double d, int precision) {
  double longPart = Math.rint(d);
  double decimalOnly = d - longPart;
  long bits = Double.doubleToLongBits(decimalOnly);

  long mask = -1l << (54 - precision);

  return Double.longBitsToDouble(bits & mask) + longPart;
}
1
  • It will round towards zero, which is floor for positive, ceil for negative. I suggest using Math.rint to store intPart as a double, rather than an int which would overflow for all except relatively small doubles. Commented Jan 30, 2015 at 16:26
0

I came across this post trying to solve a related problem: how to efficiently find the two powers of two that bracket any given regular real value. Since my program deals in many types beside doubles I needed a general solution. Someone wanting to round to the nearest power of two can get the bracketing values and choose the closest. In my case the general solution required BigDecimals. Here is the trick I used.

For numbers > 1:

            int exponent = myBigDecimal.toBigInteger.bitLength() - 1;
            BigDecimal lowerBound = TWO.pow(exponent);
            BigDecimal upperBound = TWO.pow(exponent+1);

For numbers > 0 and < 1:

            int exponent = -(BigDecimal.ONE.divide(myBigDecimal, myContext).toBigInteger().bitLength()-1); 
            BigDecimal lowerBound = TWO.pow(exponent-1);
            BigDecimal upperBound = TWO.pow(exponent);

I have only lined out the positive case. You generally take a number, and use this algorithm on it's absolute value. And then if in the original problem the number was negative you multiply the algorithm's result by -1. Finally the orignal num == 0 or num == 1 are trivial to handle outside this algorithm. That covers the whole real number line except infinties and nans which you deal with before calling this algorithm.

3
  • I should add if you are looking for the nearest power of two after the decimal (like your coordinate rounding you were doing) you take your number and only calc on the fraction value (dropping the integer part) using the > 0 and < 1 case above. Then add that solution back to your integer part.
    – Barry
    Commented Aug 8, 2021 at 10:45
  • Is this actually more performant than the accepted answer?
    – durron597
    Commented Aug 26, 2021 at 16:11
  • @durron597 I would assume no. It's a general solution that can be adapted to other byte sizes etc. Anyone can specialize it to their needs. I've left this as a pointer to people looking for solutions to similar problems.
    – Barry
    Commented Aug 27, 2021 at 17:24

Not the answer you're looking for? Browse other questions tagged or ask your own question.