
Foreword:

  • In this post, I will deliberately commit the common confusion between O(n) and Theta(n) as complexity notations, writing O(n) even where Theta(n) would be more precise.
  • I will write pseudo-code to talk about algorithms, using whatever notation I find to my liking.

I know, yet another question about nested loops and complexity.

I recently wrote a nested algorithm that is the bottleneck of a larger application. It looked something like this:

while(firstLoopRunning) {
    foreach(element in myArray) {
        doSomeAction(element);
    }
}

My keen eye led me to the conclusion that, as doSomeAction is in O(1), the algorithm had to be in O(n²).

I know however that nested loops can have a different complexity. For instance the following is in O(n.log(n)):

for(i in 0..n) // n is constant in this loop
    for(j in 0..i) // j goes up to i and not to a constant
        print(j);

Then I thought, hey, I know that myArray has a maximum length. In this particular case, I know by design that there will never be more than 8 elements in myArray, so the algorithm could look like this:

while(firstLoopRunning) {
    // Let's do this one in Java, myArray would actually be an ArrayList
    try {
        doSomeAction(myArray.get(0));
        doSomeAction(myArray.get(1));
        doSomeAction(myArray.get(2));
        doSomeAction(myArray.get(3));
        doSomeAction(myArray.get(4));
        doSomeAction(myArray.get(5));
        doSomeAction(myArray.get(6));
        doSomeAction(myArray.get(7));
    } catch(IndexOutOfBoundsException ex) {
       // Just a lazy way for me to avoid checking if the index exists
    }
}

And voilà! Here it is in O(n)!

Moreover, I know for a fact that compilers usually transform fake loops such as this one:

for(i in 0..8) // not really a loop
    someCall(i);

into a sequence of calls.

Which brought me to a first conclusion:

An iteration over an array whose length has a known finite upper bound is in O(1).

I think I'm right about this one (aka: correct me if I'm wrong).
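To sanity-check this, here is a throwaway counter (hypothetical names, just counting operations instead of calling doSomeAction): with the inner array capped at 8 elements, the total work grows linearly with the number of outer iterations, the cap only contributing a constant factor.

```java
public class BoundedInnerLoop {
    static final int MAX_ELEMENTS = 8; // known upper bound by design

    // Returns the total number of doSomeAction calls performed.
    static long countCalls(int outerIterations, int arrayLength) {
        long calls = 0;
        for (int i = 0; i < outerIterations; i++) {                   // the while loop
            for (int j = 0; j < arrayLength && j < MAX_ELEMENTS; j++) {
                calls++;                                              // doSomeAction(element)
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        // Doubling the outer iterations doubles the work: linear in n.
        System.out.println(countCalls(1000, 8)); // 8000
        System.out.println(countCalls(2000, 8)); // 16000
    }
}
```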

On the other hand, we are working with finite data, and the whole point of complexity theory is to describe behaviour on arbitrarily large inputs.

So let's use another classical example: we are iterating over the squares in a grid (i.e. a two-dimensional array), so clearly an O(n²) algorithm. What if, however, we were to put all the squares in a single list and change this:

// O(n²) algorithm over myGrid[][]
for(i in myGrid)
    for(j in myGrid[i])
        yieldAction(myGrid[i][j])

into this

// O(n) algorithm over myFlatGrid[]
for(i in myFlatGrid)
    yieldAction(myFlatGrid[i])

Basically the same thing but a different complexity, yet no loop has actually been gained. Granted, the grid can grow in both dimensions, so the number of cells really is quadratic in the side length, but in a way it can definitely be treated linearly (even though that's not worth it).

But I must be missing something. What sense does it make that if I twist the data a little, I can change the complexity of an algorithm from a theoretical point of view, although the exact same number of operations is performed?


4 Answers


You seem to think that the complexity of an algorithm is linked to the number of nested loops. It is not the case. The following piece of code is O(1):

for i in [1.. 10^15]:
    for j in [1.. 10^15]:
        for k in [1.. 10^15]:
            dosomethingO1()

Complexity is related to the rate of growth of the number of operations. Unrolling your loop does not change that. So:

foreach(element in myArray) {
    doSomeAction(element);
}

is O(1) if the number of elements in myArray is bounded. The complexity is then decided by the number of iterations of the while loop.
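If you are not convinced, a quick counter (hypothetical names, counting operations instead of doing them) shows that the looped and hand-unrolled versions do exactly the same amount of work:

```java
public class Unrolling {
    // Looped version: m outer iterations over an 8-element array.
    static long looped(int m, int len) {
        long ops = 0;
        for (int i = 0; i < m; i++)
            for (int j = 0; j < len; j++)
                ops++; // doSomeAction(myArray.get(j))
        return ops;
    }

    // "Unrolled" version: the same 8 calls written out by hand.
    static long unrolled(int m) {
        long ops = 0;
        for (int i = 0; i < m; i++) {
            ops++; ops++; ops++; ops++;
            ops++; ops++; ops++; ops++;
        }
        return ops;
    }

    public static void main(String[] args) {
        System.out.println(looped(500, 8)); // 4000
        System.out.println(unrolled(500));  // 4000
    }
}
```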

About your grid example: if n is the number of cells and your grid is represented as a nested array, looping over those arrays still yields O(n) complexity.
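The same counting trick (made-up names, rows and columns picked arbitrarily) confirms that the nested and flattened traversals visit exactly the same number of cells:

```java
public class GridCount {
    // Visits when looping over a nested rows-by-cols grid.
    static long nestedVisits(int rows, int cols) {
        long visits = 0;
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                visits++; // yieldAction(myGrid[i][j])
        return visits;
    }

    // Visits when looping over the flattened grid of rows*cols cells.
    static long flatVisits(int rows, int cols) {
        long visits = 0;
        for (int k = 0; k < rows * cols; k++)
            visits++; // yieldAction(myFlatGrid[k])
        return visits;
    }

    public static void main(String[] args) {
        System.out.println(nestedVisits(30, 40)); // 1200
        System.out.println(flatVisits(30, 40));   // 1200
    }
}
```

Both are linear in n when n is the number of cells.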

  • Exactly. n is the input size, not the way the input is structured. For every Big-O analysis we first have to state what n is, else confusion as in OP's case arises (where he is happily comparing apples and oranges).
    – amon
    Commented Jan 2, 2014 at 11:23
  • @amon so what is the size of the input? The size of the flatArray? Or in the nested array case, the total size of the array? Like, is it in the first case O(n.m) and O(n') (with n'=n*m) in the second case? Commented Jan 2, 2014 at 12:20
  • @ArlaudPierre you need to decide yourself. In the grid case I would go for the number of cells (n') because it is simpler. In some cases it really depends on the problem at hand (for graphs it can be either the number of edges or the number of vertices). Commented Jan 2, 2014 at 12:24
  • @Simon I see. I did grasp the difference between the number of loops and the complexity, but a little clarification seemed required. Thanks for the nice explanation. Commented Jan 2, 2014 at 12:26

It seems you do not have a clear picture of what the n in O(n) is, which is why you are puzzled.

My keen-eye brought me to the conclusion that, as doSomeAction is in O(1), the algorithm had to be in O(n²).

This is only true if both loops are in O(n); if the loops have complexities O(p) and O(q), then the overall program has complexity O(pq), and so on. In your example, you have an outer loop running n times, each iteration triggering an inner loop of length not greater than 8, so the overall loop has complexity bounded by O(8n) = O(n).

If the complexity of the inner loop is not constant, you have to do some finer approximation.

I know however that nested loops can have a different complexity. For instance the following is in O(n.log(n))

It is not. Recall that

1 + 2 + ... + n = n(n+1)/2 = O(n²)

so that the complexity of your loop is O(n²) and not O(n.log(n)).
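You can see the quadratic growth by counting the print(j) calls directly (a throwaway sketch; with the half-open ranges as written, the count is n(n-1)/2, which grows like n²/2):

```java
public class TriangularLoop {
    // Counts the print(j) calls in:
    //   for(i in 0..n) for(j in 0..i) print(j);
    static long countPrints(int n) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < i; j++)
                count++; // print(j)
        return count;
    }

    public static void main(String[] args) {
        // n(n-1)/2: quadratic, far above n*log(n).
        System.out.println(countPrints(10));   // 45
        System.out.println(countPrints(1000)); // 499500
    }
}
```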

  • +1, also for pointing out that the second snippet has complexity O(n²).
    – Giorgio
    Commented Jan 2, 2014 at 12:50

Yes. But by changing the data, you are also changing the n. In your first example, the n is actually constant, so it makes sense that the whole algorithm is constant too.

In your second example, while you can turn the O(n^2) algorithm into an O(n) algorithm, the amount of data increases from n to n^2, so in the end the total complexity remains the same.

As Simon hinted, the O-notation is not about absolute values of n, but about how the running time changes as you change n. With an O(n) algorithm the time increases linearly, so if n increases 2x, the time increases 2x. With O(n^2) the time increases quadratically, so if n increases 2x, the time increases 4x, and so on.
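A small operation counter (hypothetical, with arbitrary input sizes) makes those growth rates concrete: doubling n doubles the linear count but quadruples the quadratic one.

```java
public class GrowthRate {
    // Operations performed by a linear-time pass over n items.
    static long linearOps(int n) {
        long ops = 0;
        for (int i = 0; i < n; i++)
            ops++;
        return ops;
    }

    // Operations performed by a quadratic-time pass (all n*n pairs).
    static long quadraticOps(int n) {
        long ops = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                ops++;
        return ops;
    }

    public static void main(String[] args) {
        System.out.println(linearOps(2000) / linearOps(1000));       // 2
        System.out.println(quadraticOps(2000) / quadraticOps(1000)); // 4
    }
}
```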

The point is that by changing the data, you also change what n measures. Take your grid example: if you double the side length, the algorithm is going to take 4x as long no matter what, because in the O(n^2) version the n is the side length, while in the O(n) version the n is the total number of items, which is the square of the side.


When picking the N that gives your O(N^2), you should talk about a grid with, for instance, N columns and N rows.

Now you will see that when N doubles, the number of tiles quadruples.

And no matter whether you put it in a list or change the notation, the problem will always have O(N^2) complexity.
