Find XOR of all numbers in a given range

Question

You are given a large range [a,b] where 'a' and 'b' can be typically between 1 and 4,000,000,000 inclusive. You have to find out the XOR of all the numbers in the given range.

This problem was used in TopCoder SRM. I saw one of the solutions submitted in the match and I'm not able to figure out how its working.

Could someone help explain the winning solution:

long long f(long long a) {
     long long res[] = {a,1,a+1,0};
     return res[a%4];
}

long long getXor(long long a, long long b) {
     return f(b)^f(a-1);
}

Here, getXor() is the actual function to calculate the xor of all number in the passed range [a,b] and "f()" is a helper function.

I edited your question just a bit. We don't mind explaining the why of some code, but we don't need a new list of other ways to solve this. Leave that to TopCoder. — Kev, Commented May 21, 2012 at 0:00
@Kev No issues! I wrote that because some people love to give their own way rather than explaining the already written thing. And any new idea is never a waste... ;) — rajneesh2k10, Commented May 22, 2012 at 2:55
This has undefined behaviour for a<=0, or for b<0. long long is a signed type, so x%4 is negative (or 0) for negative inputs. Perhaps you want unsigned long long, and/or a & 3 to index the array? — Peter Cordes, Commented Jan 30, 2019 at 0:12

Shahbaz · Accepted Answer · 2014-01-23 10:51:55Z

166

This is a pretty clever solution -- it exploits the fact that there is a pattern of results in the running XORs. The f() function calculates the XOR total run from [0, a]. Take a look at this table for 4-bit numbers:

0000 <- 0  [a]
0001 <- 1  [1]
0010 <- 3  [a+1]
0011 <- 0  [0]
0100 <- 4  [a]
0101 <- 1  [1]
0110 <- 7  [a+1]
0111 <- 0  [0]
1000 <- 8  [a]
1001 <- 1  [1]
1010 <- 11 [a+1]
1011 <- 0  [0]
1100 <- 12 [a]
1101 <- 1  [1]
1110 <- 15 [a+1]
1111 <- 0  [0]

Where the first column is the binary representation and then the decimal result and its relation to its index (a) into the XOR list. This happens because all the upper bits cancel and the lowest two bits cycle every 4. So, that's how to arrive at that little lookup table.

Now, consider for a general range of [a,b]. We can use f() to find the XOR for [0,a-1] and [0,b]. Since any value XOR'd with itself is zero, the f(a-1) just cancels out all the values in the XOR run less than a, leaving you with the XOR of the range [a,b].

edited Jan 23, 2014 at 10:51

Shahbaz

47.2k19 gold badges117 silver badges186 bronze badges

answered May 20, 2012 at 3:13

FatalError

54k16 gold badges103 silver badges118 bronze badges

min range threshold is 1, not 0
– Pencho Ilchev
Commented May 20, 2012 at 3:39
3

@PenchoIlchev Whether or not it includes 0 is kind of moot -- (n^0)==n
– FatalError
Commented May 20, 2012 at 4:03
2

@rajneesh2k10 Well, in runs of 4 (starting at a multiple of 4), all the bits except the lowest are the same, so they alternate between canceling each other or having their original value. It's true that the lowest bit cycles every 2, but 0^1 == 1 (i.e. they don't cancel). The reason the lowest two is special is because (00 ^ 01 ^ 10 ^ 11) == 00. In other words, every 4 values you cycle through gets you back to 0, and so you can cancel out all such cycles, which is why a%4 is significant.
– FatalError
Commented May 20, 2012 at 4:45
3

@Pandrei the a there is 2, not 0.
– user555045
Commented Jan 23, 2014 at 11:38
1

That column is the running xor and 1 xor 2 is 3, so the current value in that row looks correct to me.
– FatalError
Commented Aug 7, 2016 at 14:27

| Show 9 more comments

Luke Briggs · Accepted Answer · 2017-01-12 21:21:18Z

Adding to FatalError's great answer, the line return f(b)^f(a-1); could be explained better. In short, it's because XOR has these wonderful properties:

It's associative - Place brackets wherever you want
It's commutative - that means you can move the operators around (they can "commute")

Here's both in action:

(a ^ b ^ c) ^ (d ^ e ^ f) = (f ^ e) ^ (d ^ a ^ b) ^ c

It reverses itself

Like this:

a ^ b = c
c ^ a = b

Add and multiply are two examples of other associative/ commutative operators, but they don't reverse themselves. Ok, so, why are these properties important? Well, a simple route is to expand it out into what it really is, and then you can see these properties at work.

First, let's define what we want and call it n:

n      = (a ^ a+1 ^ a+2 .. ^ b)

If it helps, think of XOR (^) as if it was an add.

Let's also define the function:

f(b)   = 0 ^ 1 ^ 2 ^ 3 ^ 4 .. ^ b

b is greater than a, so just by safely dropping in a few extra brackets (which we can because it's associative), we can also say this:

f(b)   = ( 0 ^ 1 ^ 2 ^ 3 ^ 4 .. ^ (a-1) ) ^ (a ^ a+1 ^ a+2 .. ^ b)

Which simplifies to:

f(b)   = f(a-1) ^ (a ^ a+1 ^ a+2 .. ^ b)

f(b)   = f(a-1) ^ n

Next, we use that reversal property and commutivity to give us the magic line:

n      = f(b) ^ f(a-1)

If you've been thinking of XOR like an add, you would've dropped in a subtract there. XOR is to XOR what add is to subtract!

How do I come up with this myself?

Remember the properties of logical operators. Work with them almost like an add or multiply if it helps. It feels unusual that and (&), xor (^) and or (|) are associative, but they are!

Run the naive implementation through first, look for patterns in the output, then start finding rules which confirm the pattern is true. Simplify your implementation even further and repeat. This is probably the route that the original creator took, highlighted by the fact that it's not completely optimal (i.e. use a switch statement rather than an array).

This reminds me of my Discrete Mathematics course I took last year in university. Fun days. What came to my mind immediately after reading this is this XKCD comic. — Sean Francis N. Ballais, Commented Jan 18, 2017 at 13:40
It reverses itself is technically "a is the inverse of itself", or really we can extend this discussion all the way to basic Group Theory... — iBug, Commented Dec 29, 2021 at 14:22

Nirav Kamani · Accepted Answer · 2017-02-07 09:08:44Z

I found out that the below code is also working like the solution given in the question.

May be this is little optimized but its just what I got from observing repetition like given in the accepted answer,

I would like to know / understand the mathematical proof behind the given code, like explained in the answer by @Luke Briggs

Here is that JAVA code

public int findXORofRange(int m, int n) {
    int[] patternTracker;

    if(m % 2 == 0)
        patternTracker = new int[] {n, 1, n^1, 0};
    else
        patternTracker = new int[] {m, m^n, m-1, (m-1)^n};

    return patternTracker[(n-m) % 4];
}

Abhijeet Sonawane · Accepted Answer · 2018-01-28 16:42:24Z

2

I have solved the problem with recursion. I simply divide the dataset into an almost equal part for every iteration.

public int recursion(int M, int N) {
    if (N - M == 1) {
        return M ^ N;
    } else {
        int pivot = this.calculatePivot(M, N);
        if (pivot + 1 == N) {
            return this.recursion(M, pivot) ^ N;
        } else {
            return this.recursion(M, pivot) ^ this.recursion(pivot + 1, N);
        }
    }
}
public int calculatePivot(int M, int N) {
    return (M + N) / 2;
}

Let me know your thoughts over the solution. Happy to get improvement feedbacks. The proposed solution calculates the XOR in 0(log N) complexity.

Thank you

answered Jan 28, 2018 at 16:42

Abhijeet Sonawane

6629 silver badges16 bronze badges

2

This one has same Computational complexity with normal m ^ (m+1) ^ ... ^ (n-1) ^ n calculation. This is 0(n).
– The Anh Nguyen
Commented Apr 23, 2018 at 9:49

Add a comment |

Mohammad Nazmul Hossain · Accepted Answer · 2018-12-26 20:13:00Z

0

To support XOR from 0 to N the code given needed to be modified as below,

int f(int a) {
    int []res = {a, 1, a+1, 0};
    return res[a % 4];
}

int getXor(int a, int b) {
    return f(b) ^ f(a);
}

answered Dec 26, 2018 at 20:13

Mohammad Nazmul Hossain

3744 silver badges14 bronze badges

Add a comment |

kcsquared · Accepted Answer · 2022-03-01 23:40:38Z

Adding on even further to FatalError's answer, it's possible to prove (by induction) that the observed pattern in f() will cycle for every 4 numbers.

We're trying to prove that for every integer k >= 0,

f(4k + 1) = 1
f(4k + 2) = 4k + 3
f(4k + 3) = 0
f(4k + 4) = 4k + 4

where f(n) is 1 ^ 2 ^ ... ^ n.

As our base case, we can work out by hand that

f(1) = 1
f(2) = 1 ^ 2 = 3
f(3) = 3 ^ 3 = 0
f(4) = 0 ^ 4 = 4

For our inductive step, assume that these equations are true up to a particular integer 4x (i.e. f(4x) = 4x). We want to show that our equations are true for 4x + 1, 4x + 2, 4x + 3 and 4x + 4.

To help write and visualize the proof, we can let b(x) denote the binary (base-2) string representation of x, for example
b(7) = '111', b(9) = '1001'.

and

b(4x)     = 'b(x)00'
b(4x + 1) = 'b(x)01'
b(4x + 2) = 'b(x)10'
b(4x + 3) = 'b(x)11'

Here is the inductive step:

Assume: f(4x) = 4x = 'b(x)00'
Then:

f(4x + 1) = f(4x)    ^ (4x + 1)  // by definition
          = f(4x)    ^ 'b(x)01'  // by definition
          = 'b(x)00' ^ 'b(x)01'  // from assumption
          = '01'                 // as b(x) ^ b(x) = 0

f(4x + 2) = f(4x + 1) ^ (4x + 2)
          = f(4x + 1) ^ 'b(x)10'       
          = '01'      ^ 'b(x)10'
          = 'b(x)11'             // this is 4x + 3

f(4x + 3) = f(4x + 2) ^ (4x + 3)
          = f(4x + 2) ^ 'b(x)11' 
          = 'b(x)11'  ^ 'b(x)11'
          = '00'

For the last case, we don't use binary strings, 
since we don't know what b(4x + 4) is.

f(4x + 4) = f(4x + 3) ^ (4x + 4)
          = 0         ^ (4x + 4)
          = 4x + 4

So the pattern holds for the next four numbers after 4x, completing the proof.

Collectives™ on Stack Overflow

Find XOR of all numbers in a given range

6 Answers 6

How do I come up with this myself?

Not the answer you're looking for? Browse other questions tagged
algorithm
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

6 Answers 6

How do I come up with this myself?

Not the answer you're looking for? Browse other questions tagged algorithm or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
algorithm
or ask your own question.