
I have seen various tweaks for quicksort, and to establish their usefulness I designed a program that randomly generates arrays and times how long quicksort takes to sort them. Right now I'm focusing on how the pivot is chosen. I'm comparing choosing the first element as the pivot against choosing the median of the first, middle and last elements. I came across an implementation that presorts the first, middle and last elements, and I have implemented it for my tests.

Here is the original code I got the idea from:

public static int medianOf3(int[] intArray, int left, int right) {
    int center = (left + right) / 2;

    if (intArray[left] > intArray[center])
      swap(intArray, left, center);

    if (intArray[left] > intArray[right])
      swap(intArray, left, right);

    if (intArray[center] > intArray[right])
      swap(intArray, center, right);

    swap(intArray, center, right - 1);
    return intArray[right - 1];
}
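
The swap helper isn't shown in the snippet; I assume it's the usual in-place exchange:

// Assumed implementation of the swap helper referenced above (not shown in the original).
public static void swap(int[] intArray, int i, int j) {
    int temp = intArray[i];
    intArray[i] = intArray[j];
    intArray[j] = temp;
}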

First of all, I want to make sure I understand it.

  • The index of the middle element is computed.
  • The 3 if statements rearrange the first, middle and last elements so they are in order relative only to each other (so for example 5,1,4,9,8 => 4,1,5,9,8). I am interested in the math behind how it is known that only 3 if statements are needed, since there are 3! (= 6) permutations of 3 elements.
  • The median is swapped so it's beside the largest-valued element of the 3. Later in the code I noticed partitionIt() has int rightPtr = right - 1; and I think the -1 is to avoid one extra iteration of the while loop, as it's known that [right-1] and [right] form a sorted subarray of size 2. Is this right? I don't really see how this benefits the algorithm, since quicksort works on the principle of finding the final location of a pivot and doesn't care about a sorted subarray.

/* Returns the value of the median of the beginning, middle and end elements.
   Sorts the beginning, middle and end elements relative to each other. */
private static int medianOf3(int[] arr, int beginning, int end) {
    int middle = (beginning + end) >>> 1; // >>> prevents the overflow error that / wouldn't
    /* the following 3 lines may cause side effects */
    if (arr[beginning] > arr[middle])
        swap(arr, beginning, middle);

    if (arr[beginning] > arr[end])
        swap(arr, beginning, end);

    if (arr[middle] > arr[end])
        swap(arr, middle, end);

    swap(arr, middle, end - 1);
    return arr[end - 1];
}

public static void quicksort(int[] arr, int beginning, int end) {
    if (end - beginning >= 1) {
        int partition = partition(arr, beginning, end);
        quicksort(arr, beginning, partition); // not sure if this should be partition-1
        quicksort(arr, partition + 1, end);
    }
}

private static int partition(int[] arr, int beginning, int end) {
    //int pivot = arr[beginning];
    int pivot = medianOf3(arr, beginning, end);
    int lftPtr = beginning - 1;
    int rhtPtr = end + 1 - 1; // -1 for the last swap in medianOf3()
    for (;;) {
        lftPtr = lftPtr + 1;
        while (arr[lftPtr] < pivot && lftPtr < end)
            lftPtr = lftPtr + 1;

        rhtPtr = rhtPtr - 1;
        while (arr[rhtPtr] > pivot && rhtPtr > beginning)
            rhtPtr = rhtPtr - 1;

        if (rhtPtr > lftPtr)
            swap(arr, lftPtr, rhtPtr);
        else
            return lftPtr;
    }
}

I'm not sure if I should be calling quicksort recursively on

quicksort(arr, beginning, partition);

or

quicksort(arr, beginning, partition-1);

I randomly generate arrays and call quicksort on them. The time is measured and summed, and at the end it is divided by the number of arrays that were tested (to give the average).

long startTime = System.nanoTime();
quicksort(randomArray, 0, randomArray.length-1);
long endTime = System.nanoTime();
totalTime += endTime-startTime; 
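
For context, the full driver looks roughly like this (a sketch reconstructed from the description; NUM_ARRAYS and ARRAY_LENGTH are my names for the parameters above):

java.util.Random rnd = new java.util.Random();
final int NUM_ARRAYS = 1_000_000;
final int ARRAY_LENGTH = 100;
long totalTime = 0;

for (int i = 0; i < NUM_ARRAYS; i++) {
    // fill with random values in [-100, 100]
    int[] randomArray = new int[ARRAY_LENGTH];
    for (int j = 0; j < ARRAY_LENGTH; j++)
        randomArray[j] = rnd.nextInt(201) - 100;

    long startTime = System.nanoTime();
    quicksort(randomArray, 0, randomArray.length - 1);
    long endTime = System.nanoTime();
    totalTime += endTime - startTime;
}
System.out.println("average ns per sort: " + totalTime / NUM_ARRAYS);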

For my tests I run quicksort on 1,000,000 arrays (of the same size) and record the average time taken. I reran the entire test 3 times (as in, started the program fresh) and noticed some variation (is this expected?). Here are my findings:

For the method of choosing the first element as the pivot and iterating through 1,000,000 arrays of length 100 with random values between [-100, 100] it took:

  • 9498ns
  • 9464ns
  • 9459ns

Doing this on arrays of length 10 gave the times:

  • 623ns
  • 670ns
  • 914ns
  • 838ns
  • 635ns

I ran extra tests as I was surprised to see such high variability. Why the variability?

For the tests run with the median with side effects implementation, on 1,000,000 randomly generated arrays of length 100 with values between [-100, 100] the results were:

  • 8590ns
  • 8697ns
  • 8586ns

For arrays of length 10 the results were:

  • 655ns
  • 679ns
  • 660ns

Does it make sense to rerun the program even though it essentially loops through itself 1,000,000 times to calculate the average?

It doesn't look like choosing the pivot as the median of 3, and presorting the 3, is much better than choosing the first element. In the future I'm going to write a pivot-choosing method that only takes the median of 3, without the presorting side effects, and see how fast it performs.


5 Answers


Sorting the Three

You only need at most three swaps to sort the three elements:

  • at most two swaps to place the smallest element into the first slot since it must be compared to the other two, and
  • at most one swap to order the remaining two elements.

How does this line up with there being 3! permutations? Consider that each permutation can be created with at most two swaps. The worst case is (3, 2, 1), which takes only one swap to create from (1, 2, 3) (swap 1 and 3), yet requires all three swaps to revert using the brute-force algorithm:

3 2 1    // swap 3 and 2
2 3 1    // swap 2 and 1
1 3 2    // swap 3 and 2
1 2 3

I'm sure someone here can provide a nice proof, but I'm satisfied by examination that it is correct and sufficient.
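
Short of a formal proof, an exhaustive check over all 3! permutations settles it; here is a quick sketch (swap is the same helper as in the question):

// Verifies that the three compare-and-swaps sort every permutation of three
// distinct values: exhaustive over all 3! = 6 cases.
public static void main(String[] args) {
    int[][] perms = {
        {1, 2, 3}, {1, 3, 2}, {2, 1, 3},
        {2, 3, 1}, {3, 1, 2}, {3, 2, 1}
    };
    for (int[] p : perms) {
        if (p[0] > p[1]) swap(p, 0, 1);
        if (p[0] > p[2]) swap(p, 0, 2);
        if (p[1] > p[2]) swap(p, 1, 2);
        System.out.println(java.util.Arrays.toString(p)
                + " sorted: " + (p[0] <= p[1] && p[1] <= p[2]));
    }
}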

Micro-Optimization

The final swap when picking the median is a micro-optimization that moves the pivot into the partition where it would eventually be placed, bypassing a needless comparison with itself.

Given the layout before the swap,

[ left . . . . . pivot . . . . . right ]

these facts,

  1. left < pivot
  2. pivot < right
  3. pivot = pivot

and this partitioning logic:

for every element
    if (element < pivot)
        move element to left half
    else
        move element to right half

we can make some minor optimizations by skipping the comparison for left, right, and pivot, since we know which partition each should land in. The first two are already in their correct partitions, so we simply shift the loop's endpoints inward by one slot. By performing a swap to place the pivot into the second-to-last slot, we can skip comparing it as well. Here it must be placed at the far right end of the partition, next to right.

swap(arr, middle, right - 1);

which leaves this new layout:

[ left . . . . . . . . . . pivot right ]
       ^ partition these ^

If the partition comparison used <= instead of <, you'd swap the pivot into the second slot instead.

swap(arr, middle, left + 1);

[ left pivot . . . . . . . . . . right ]
             ^ partition these ^
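
Putting that together, a partition step that exploits this layout could look like the following sketch (my own illustration, not the question's exact code): both scans start one slot in, and a final swap drops the pivot into its resting place.

// Sketch: partition for a median-of-3 scheme where medianOf3 has already
// arranged arr[left] <= pivot <= arr[right] and parked the pivot at right - 1.
// Assumes the sub-array has at least 3 elements (smaller ranges are usually
// handled separately).
private static int partition(int[] arr, int left, int right) {
    int pivot = arr[right - 1];
    int lo = left;        // arr[left] <= pivot, so it needs no comparison
    int hi = right - 1;   // arr[right] >= pivot and the pivot itself are skipped
    for (;;) {
        while (arr[++lo] < pivot) ;  // the pivot at right - 1 stops this scan
        while (arr[--hi] > pivot) ;  // arr[left] stops this scan
        if (lo >= hi)
            break;
        swap(arr, lo, hi);
    }
    swap(arr, lo, right - 1);        // move the pivot to its final position
    return lo;
}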

Timing Variations

The runtime of sorting algorithms can be greatly affected by the initial order of the elements. Each algorithm has a best- and worst-case initial ordering. For example, Bubble Sort requires $O(n^2)$ swaps when given the elements in reverse order and none when they are already sorted.

Even though you sort one million arrays, when you run the program again you start from a new random seed, which produces an entirely different set of random arrays. One seed may produce a higher percentage of worst cases. You can remove this source of variation by fixing the seed with Random.setSeed so that every run uses the same arrays.

In addition to that (and probably responsible for more of the outliers), you are measuring wall-clock time. If the system swaps your program out to do other work, the clock keeps running. You can mitigate this by timing at the system level (e.g. using time on Linux), which tracks wall-clock, CPU, and system time separately.

Finally, the smaller arrays involve a higher ratio of overhead to actual sorting work. I recommend fewer repetitions with larger arrays, say sorting 10,000 arrays of size 10,000. You may also want to create the initial arrays by shuffling the numbers $1..N$ to remove the unpredictable effects of equal elements.
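
A sketch of both suggestions combined (the seed value is arbitrary): fixing the seed makes every run see the same inputs, and shuffling 1..N guarantees distinct elements.

import java.util.Random;

// Reproducible input: the same seed yields the same arrays on every run,
// and a Fisher-Yates shuffle of 1..n avoids duplicate values entirely.
static int[] shuffledArray(int n, Random rnd) {
    int[] a = new int[n];
    for (int i = 0; i < n; i++)
        a[i] = i + 1;
    for (int i = n - 1; i > 0; i--) {
        int j = rnd.nextInt(i + 1);
        int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
    }
    return a;
}

// usage: Random rnd = new Random(12345L);
//        int[] arr = shuffledArray(10_000, rnd);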

  • Posted mine just to prove it was done! .... nice answer ;-)
    – rolfl, Jul 8, 2014 at 20:16
  • @rolfl Hehe, don't you hate that? You're almost done with your awesome answer when someone swoops in and submits theirs first.
    – Jul 8, 2014 at 20:20
  • I still don't understand the part about swap(arr, middle, end-1); in medianOf3?
    – Celeritas, Jul 8, 2014 at 21:35
  • @Celeritas It's a micro-optimization. Because the center (median) element becomes the pivot, it will always be moved to the second partition. This swap makes it explicit and avoids the comparison and one iteration during the partition step.
    – Jul 8, 2014 at 22:19
  • What do you mean the pivot is always moved to the second partition? For example (2,9,7,5,4)=>(2,9,4,5,7) with 4 as pivot. 9 and 4 get swapped => (2,4,9,5,7) so 4 is left of where it started and I call that the first partition.
    – Celeritas, Jul 8, 2014 at 22:42

Consistency of results

When you compile a Java program, you don't actually do much in the way of performance optimization of the code. Some basic things are done, but the real optimization is done by the JIT compiler at Java runtime.

The JIT compiler is able to take your class methods and analyze them, using statistics it gathers during your method runs, among other things, to make your code really efficient. It can then make bigger decisions about performance as well. JIT is the performance king.

The JIT process takes time, though, and the cost of the compile for code that runs once, or just a few times, is seldom worth it. So JIT does nothing until the code has run many times (hundreds). Then, even after the code has been JIT-compiled once, it may be determined that the code would benefit from a second, or even a third, compile with different characteristics.

It is only when you get to this final stage that it starts to make sense to benchmark the code. The JIT-compiled code can be thousands of times faster than the uncompiled code.

Consider this code here: https://softwareengineering.stackexchange.com/a/246535/109836

The first 1000 loops ran at an average of more than 7 ms.

The second 1000 loops ran at an average of 1.6 ms.

The 101st 1000 loops ran at an average of 1.2 ms.

The code was well compiled by then.

When benchmarking Java, you have to 'warm up' the code so that you are not running into compile times, or poorly compiled code, when you do your performance work. Warming up normally requires running the sensitive code thousands of times (one of the best optimizations available is inlining method calls, and that requires the calling method to be run a lot, not just the actual code doing the work).
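
A minimal warm-up pattern might look like this (a sketch only; for serious measurements use a harness such as JMH, which handles warm-up for you):

// Sketch: exercise the code-under-test enough for the JIT to compile it,
// then time only the measured phase.
static long averageSortNanos(int[][] testArrays) {
    // Warm-up: results discarded, this just triggers JIT compilation.
    for (int i = 0; i < 10_000; i++) {
        int[] copy = testArrays[i % testArrays.length].clone();
        quicksort(copy, 0, copy.length - 1);
    }
    // Measured phase.
    long total = 0;
    for (int[] arr : testArrays) {
        int[] copy = arr.clone();
        long start = System.nanoTime();
        quicksort(copy, 0, copy.length - 1);
        total += System.nanoTime() - start;
    }
    return total / testArrays.length;
}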

The median of three

The median of three logic is relatively simple, though it can be confusing.

If you have three values A, B, and C, what are the possible combinations?

A B C
A C B
B A C
B C A
C A B
C B A

Now, if A is less than B, and B is less than C, let's follow the logic through:

if(arr[beginning] > arr[middle])
    swap(arr, beginning, middle);

Is the first value bigger than the second? This is possible with the following combinations (marked with an asterisk *):

A B C
A C B
B A C *
B C A
C A B *
C B A *

So half the combinations could be like this. If so, swap the first two:

A B C
A C B
A B C *
B C A
A C B *
B C A *

OK, so, now for the second test:

if(arr[beginning] > arr[end])
    swap(arr, beginning, end);

I'll mark the instances where this is possible with a hash mark #

A B C
A C B
A B C *
B C A   #
A C B *
B C A * #

Swap them if they are:

A B C
A C B
A B C *
A C B   #
A C B *
A C B * #

The final test is:

if(arr[middle] > arr[end])
    swap(arr, middle, end);

I'll mark the instances where this swap applies with an @:

A B C
A C B     @
A B C *
A C B   # @
A C B *   @
A C B * # @

Swap them:

A B C
A B C     @
A B C *
A B C   # @
A B C *   @
A B C * # @
  • Great point about the JIT, and I do like how you listed out the permutations to make it explicit.
    – Jul 8, 2014 at 20:19
  • Thanks, but you forgot to explain why the need for swap(intArray, center, right - 1);?
    – Celeritas, Jul 8, 2014 at 20:45
  • @Celeritas - no, I did not forget. I don't have enough context to determine why that last swap is significant. You don't show any code that uses the mid-point, so there's no context for it, so nothing to say.
    – rolfl, Jul 8, 2014 at 20:50
  • I added more code. Also, you didn't say why doing the sorting in the medianOf3 method is better than leaving it for the main partitioning step?
    – Celeritas, Jul 8, 2014 at 21:34
  • @Celeritas - what is start in if(end-start >= 1) { ... ?
    – rolfl, Jul 8, 2014 at 21:37

Not a code review, but a comment that would not fit in the comment box.

Quicksort is more efficient when the pivot in a quicksort iteration is closer to the median of the sub-array in that iteration. If the pivot is close to the median at each iteration, you will get $\log n$ quicksort iterations. At the other extreme, if your pivot were the worst possible value (the min/max of each sub-array), you would get $n$ quicksort iterations.

When you take the median of 3 values (arr[0], arr[middle], arr[end]) to estimate the median, you should indeed get closer to the true median of that sub-array. (However, in some rare cases that estimated median will be farther from the true median than the original pivot, arr[0].)

You could also find the estimated median by taking the median of the first three elements in the sub-array (arr[0], arr[1], arr[2]). You will get the exact same result as using (arr[0], arr[middle], arr[end]) in the case where your array is truly randomly distributed. In the case where your arrays are partially ordered, it does make sense to use (arr[0], arr[middle], arr[end]), since the estimated median will be closer to the true median for an increasing/decreasing partially ordered array.

You could also benchmark the different methods using nearly ordered arrays. The method using (arr[0], arr[middle], arr[end]) should be much better in that case.

You could also use more than 3 values to estimate the median. However, at some point, you will spend more time estimating the median than you will save by having a better estimate of the median. The extreme case would be scanning the whole sub-array to find the true median of that sub-array.
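
For comparison, a side-effect-free median-of-3 could simply return the index of the median without touching the array (a sketch; medianIndexOf3 is my name for it):

// Sketch: index of the median of arr[lo], arr[mid], arr[hi], with no swaps.
static int medianIndexOf3(int[] arr, int lo, int hi) {
    int mid = (lo + hi) >>> 1;
    int a = arr[lo], b = arr[mid], c = arr[hi];
    if ((a <= b && b <= c) || (c <= b && b <= a)) return mid;
    if ((b <= a && a <= c) || (c <= a && a <= b)) return lo;
    return hi;
}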


Random arrays are actually not that common. Lots of times you will be sorting an array that is already sorted (because someone was careless), or one that is almost sorted, or sorted in reverse order, or is a sorted array with some elements added.

To handle these cases well, I'd try picking the median of three elements chosen at random reasonably close to the middle of the array. If the array is already sorted, sorted in reverse order, or has some random elements added at the end, that median will be very close to the median of the whole array. In an array that was sorted and then had random elements changed, it is still quite likely to be a good choice.
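
A sketch of that idea (the sampling window is arbitrary; here the three candidates come from the middle half of the sub-array):

import java.util.Random;

// Sketch: sample three indices from the middle half of [lo, hi] and return
// the index holding the median of the three sampled values.
static int randomMedianPivot(int[] arr, int lo, int hi, Random rnd) {
    int len = hi - lo + 1;
    int start = lo + len / 4;
    int span = Math.max(1, len / 2);
    int i = start + rnd.nextInt(span);
    int j = start + rnd.nextInt(span);
    int k = start + rnd.nextInt(span);
    // order the three indices by their values; j ends up holding the median
    if (arr[i] > arr[j]) { int t = i; i = j; j = t; }
    if (arr[j] > arr[k]) { int t = j; j = k; k = t; }
    if (arr[i] > arr[j]) { int t = i; i = j; j = t; }
    return j;
}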

An interesting variation happens when moving elements is much slower than comparing them, for example if you are sorting not numbers or pointers but large structs. If you always pick a pivot that is not the median but, say, at the 30th or 70th percentile, the number of comparisons grows, but the number of moves actually goes down! So instead of the median, you might take the second or third of four values in that situation.


If the medianOf3 method ends by swapping the pivot element into the second-to-last position:

//swap pivot element to the second-to-last position
swap(arr, middle, end-1);

then in the partition method, before returning leftPtr, it needs to swap the pivot from the second-to-last position back to the leftPtr position:

swap(arr, leftPtr, end-1);

After swapping it back, the pivot element at that position will not change any more; that is the final position for that element.

So quicksort(arr, beginning, partition) can be changed to quicksort(arr, beginning, partition - 1), because the element at the partition position will not change any more. There is no need to sort that element.
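
In code, that change could look like this (a sketch; partitionWithSwapBack is a placeholder name for a partition method that ends with swap(arr, leftPtr, end - 1) and returns leftPtr):

// Because the pivot is swapped into its final slot inside the partition
// method, index p is final and can be excluded from both recursive calls.
public static void quicksort(int[] arr, int beginning, int end) {
    if (end - beginning >= 1) {
        int p = partitionWithSwapBack(arr, beginning, end);
        quicksort(arr, beginning, p - 1); // arr[p] is already in place
        quicksort(arr, p + 1, end);
    }
}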

This method is explained in Robert Lafore's book Data Structures and Algorithms in Java.

  • You could add some formatting to your answer.
    – t3chb0t, Oct 6, 2016 at 6:20
