
Some context: I'm to write a program that sorts the lines of a file, in C, for Linux. Since I have to read all lines of the file anyway (with fgets(), for example), I'm thinking about inserting them into a tree-like structure in sorted order, using the so-called tree sort algorithm.

Looking for a self-balancing tree structure, I came across two that may be interesting: the red-black tree and the splay tree. According to Wikipedia, the red-black tree has an $O(\log n)$ worst case and the splay tree has an amortized $O(\log n)$ bound.

The actual question: I know how to roughly compare complexity classes in $O$ notation, but what is amortized time? Given two algorithms, one that runs in worst-case $O(\log n)$ and the other in amortized $O(\log n)$, which one would be preferable?
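
For concreteness, here is a rough sketch of what I have in mind, using a plain (unbalanced) binary search tree only to keep it short; whichever self-balancing tree I end up choosing would replace the naive insert, and error handling is omitted:

    /* Sketch of the tree-sort idea: read lines with fgets(), insert each into a
     * binary search tree, then print them with an in-order traversal.  A plain
     * (unbalanced) BST is used here only to keep the sketch short. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct node {
        char *line;
        struct node *left, *right;
    };

    /* Insert a copy of line into the tree (duplicates go to the right). */
    static void insert(struct node **root, const char *line)
    {
        if (*root == NULL) {
            struct node *n = malloc(sizeof *n);
            n->line = strdup(line);           /* strdup() is POSIX, fine on Linux */
            n->left = n->right = NULL;
            *root = n;
        } else if (strcmp(line, (*root)->line) < 0) {
            insert(&(*root)->left, line);
        } else {
            insert(&(*root)->right, line);
        }
    }

    /* In-order traversal prints the lines in sorted order. */
    static void print_sorted(const struct node *t)
    {
        if (t != NULL) {
            print_sorted(t->left);
            fputs(t->line, stdout);
            print_sorted(t->right);
        }
    }

    int main(void)
    {
        char buf[4096];                       /* assumed maximum line length */
        struct node *root = NULL;

        while (fgets(buf, sizeof buf, stdin) != NULL)
            insert(&root, buf);

        print_sorted(root);
        return 0;
    }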

  • There are two different questions being asked here. In the title, the question is "which algorithm complexity measure is better?", while the last line of the actual question is "given two algorithms with these two complexity measures, which one is preferred?". Jeremy's answer (below) addresses the first question (which is more appropriate for TCS.SE but still too basic IMO), while the answer to the second question should be "cannot be determined due to lack of information", since it depends on many more practical details. Commented Jun 17, 2013 at 13:23

3 Answers


$O(\log n)$ in the worst case implies $O(\log n)$ amortized.

Basically, given two data structures supporting the same operation, one in $O(\log n)$ worst-case time and the other in $O(\log n)$ amortized time, the first is considered asymptotically superior: a worst-case bound of $O(\log n)$ means that every single call of the operation completes within that time, while an amortized bound of $O(\log n)$ means that some (very few) calls can take up to $O(n)$ time, as long as the total cost of any sequence of $m$ calls stays in $O(m \log n)$. Usually, the concept of amortized analysis is used for data structures whose amortized complexity is better than their worst-case complexity.

As an illustration, consider a data structure for integers, storing each integer $x$ as a string of bits (e.g. $x=8$ represented by $(1,0,0,0)$), and the operator $x.inc()$ which increments $x$ by $1$.

  • In the worst case (e.g. on $x=7$, represented by $(1,1,1)$), the operator $x.inc()$ requires about $\log_2(x)+1$ bit writes (e.g. to write $(1,0,0,0)$, corresponding to $8$).
  • In the best case (e.g. on $x=8$), the operator $x.inc()$ requires exactly one bit write (e.g. to change the last bit of $(1,0,0,0)$ to $1$, giving $(1,0,0,1)$, i.e. $9$).

In a sequence of increments of the same integer object (e.g. enumerating all integers from $0$), the best case described above happens half of the time, and the worst case only about once every $n$ increments (the $i$-th bit is rewritten only once every $2^i$ increments). More generally, a case requiring $i$ bit writes happens roughly $1/2^i$ of the time, so summing all those costs gives an amortized cost of $\sum_i i/2^i \in O(1)$ per increment. Hence the operator $inc()$ has a worst-case complexity of $O(\log n)$ but an amortized complexity of $O(1)$.
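
To make this concrete, here is a small sketch of such a counter in C (the array representation, the fixed width and the write counting are just one possible choice): a single call of inc() can write up to NBITS bits, but the average over a long sequence of increments stays bounded by a small constant (here below $2$).

    /* Sketch of the binary counter: bits[0] is the least significant bit, and
     * inc() flips the trailing 1s to 0 and then writes a single 1.  The return
     * value counts the bit writes, i.e. the cost being amortized. */
    #include <stdio.h>

    #define NBITS 8

    static int inc(int bits[NBITS])
    {
        int writes = 0, i = 0;
        while (i < NBITS && bits[i] == 1) {   /* worst case: all low bits are 1 */
            bits[i++] = 0;
            writes++;
        }
        if (i < NBITS) {
            bits[i] = 1;
            writes++;
        }
        return writes;
    }

    int main(void)
    {
        int bits[NBITS] = {0};
        int total = 0, n = 100;

        for (int k = 0; k < n; k++)
            total += inc(bits);               /* one call may cost NBITS writes */

        /* total/n stays bounded by a small constant: amortized O(1). */
        printf("average writes per increment: %.2f\n", (double)total / n);
        return 0;
    }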

The notion of amortized analysis is well explained on Wikipedia; you might want to see the page on the potential method for more details.

Hope it helps!


A data structure with amortized guarantees can be very useful even if there exists a data structure with matching worst-case guarantees. Data structures with amortized guarantees are often simpler, and the asymptotic notation hides smaller constants. I think the key question is:

  • Are you using the data structure to support an interactive system, where a user interacts with it repeatedly and every individual response needs to be fast? Then you really need a worst-case guarantee.

  • Is the data structure a way to speed up the overall running time of an algorithm? Then an amortized guarantee is usually enough.

A classical example of the second kind is Fibonacci heaps in conjunction with Dijkstra's algorithm. A simpler example is the one Jeremy gave: if you want to implement a brute-force algorithm that tries all binary strings of length $n$ as possible solutions to some problem, the binary counter is one of the simplest ways to enumerate them. If your algorithm does $O(1)$ work per string, then the total running time is $O(2^n)$, which is as good as it can be, since you are enumerating $2^n$ strings. You can de-amortize the data structure (it's a very cute exercise), but why bother: you would only lose constant factors and in fact increase the total running time.
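
For concreteness, a rough sketch of that enumeration loop in C, where process() is only a placeholder for whatever $O(1)$ per-string work the algorithm does:

    /* Enumerate all 2^N binary strings of length N by repeatedly incrementing
     * a binary counter; the counter gives amortized O(1) work per string. */
    #include <stdio.h>
    #include <stdbool.h>

    #define N 4   /* string length; small here only for illustration */

    static void process(const int bits[N])    /* placeholder for the O(1) test */
    {
        for (int i = N - 1; i >= 0; i--)
            putchar(bits[i] ? '1' : '0');
        putchar('\n');
    }

    /* Increment; returns false once the counter wraps around to all zeros. */
    static bool inc(int bits[N])
    {
        int i = 0;
        while (i < N && bits[i] == 1)
            bits[i++] = 0;
        if (i == N)
            return false;                     /* every string has been visited */
        bits[i] = 1;
        return true;
    }

    int main(void)
    {
        int bits[N] = {0};
        do {
            process(bits);                    /* O(1) work per string */
        } while (inc(bits));                  /* O(2^N) increments in total */
        return 0;
    }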


There is another factor that distinguishes worst-case and amortized complexity. In some cases you need to "backtrack" a data structure, that is, restore it to its state at some prior time. There are a variety of reasons for this; one possibility is that you detect a contradiction, or receive new information, and want to go back and fix things by changing some earlier choices you made.

If a data structure takes worst-case time $T$ per step, then you can backtrack it at a cost of $T$ per step. However, if it only takes amortized time $T$, there is no such guarantee on the cost of backtracking: the steps you are undoing may turn out to be exactly the costly ones.
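
As a concrete illustration of the first point (my own example, not one from the answer above): a union-find structure with union by rank and without path compression does worst-case $O(\log n)$ work per operation, and each union can be undone in $O(1)$ by recording the single parent pointer and rank it changed. The variant with path compression is faster in the amortized sense precisely because a single find may rewrite many pointers, so a per-step rollback guarantee of this kind is lost. A sketch:

    /* Sketch of a backtrackable union-find: union by rank, no path compression.
     * Every unite() pushes an undo record, so backtrack() restores the previous
     * state in O(1); find() stays worst-case O(log N) thanks to union by rank. */
    #include <stdio.h>

    #define N 8

    static int parent[N], rnk[N];

    struct undo { int child, root, old_root_rank; };
    static struct undo trail[N];
    static int trail_top;

    static void init(void)
    {
        for (int i = 0; i < N; i++) { parent[i] = i; rnk[i] = 0; }
    }

    static int find(int x)                    /* worst-case O(log N), no path compression */
    {
        while (parent[x] != x)
            x = parent[x];
        return x;
    }

    static void unite(int a, int b)
    {
        int ra = find(a), rb = find(b);
        if (ra == rb) {                       /* record a no-op so it can be "undone" too */
            trail[trail_top++] = (struct undo){ -1, -1, 0 };
            return;
        }
        if (rnk[ra] < rnk[rb]) { int t = ra; ra = rb; rb = t; }
        trail[trail_top++] = (struct undo){ rb, ra, rnk[ra] };
        parent[rb] = ra;
        if (rnk[ra] == rnk[rb]) rnk[ra]++;
    }

    static void backtrack(void)               /* undo the most recent unite() in O(1) */
    {
        struct undo u = trail[--trail_top];
        if (u.child < 0) return;
        parent[u.child] = u.child;
        rnk[u.root] = u.old_root_rank;
    }

    int main(void)
    {
        init();
        unite(0, 1);
        unite(1, 2);
        printf("0 ~ 2 ? %d\n", find(0) == find(2));   /* prints 1 */
        backtrack();                                  /* undo unite(1, 2) */
        printf("0 ~ 2 ? %d\n", find(0) == find(2));   /* prints 0 */
        return 0;
    }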

