4
$\begingroup$

I've implemented a ranking system (based on a score), using a Map<Player,Score> and an improved BST<Score, Set<Player>>. So now I can compute "rank of a player" in O(log(n)) time, "top xxx players", insert/update/remove in O(log(n)) time.

Now I want to improve my data structure to remember the "best rank" of each player. So when a player's score is updated, it will affect this player's rank, but other players' ranks as well as this player's ranking will go up/down. I want, for each player, to always keep the best rank they ever had.

An easy way to do so would be to keep a new hashmap<Player,BetScore> and, when an update is done, to check all the affected players and potentially update their best rank. The issue with this is the O(n) running time per operation. I'm trying to figure out if there is a way to do this in O(log(n)) time per operation. Is it possible?

$\endgroup$
5
  • $\begingroup$ I'm quite confident that this could be done efficiently, but it sounds like a very hard puzzle, especially if you want to stay in $\tilde O(n)$ memory which would disqualify persistent data structure solutions. $\endgroup$
    – orlp
    Commented Feb 6 at 22:25
  • $\begingroup$ In the worst case, $\Omega(n)$ players could be affected per update operation. That is, the rank of $\Omega(n)$ players can increase if score of one player is updated. So, an $O(\log n)$ will not be possible. Am I missing something? $\endgroup$ Commented Feb 10 at 8:11
  • 1
    $\begingroup$ @InuyashaYagami, Yes, there is a gap in your reasoning: you have not explained why $O(\log n)$ time is not possible. Perhaps you are assuming that the implementation must iterate over all players whose "best rank" has improved and do something for each, but maybe there are other ways to implement all the desired operations. My presumption is that the data structure must provide an operation bestrank(Player) that, given a player, returns their best rank, and runs in $O(\log n)$ time. It seems plausible that this might be possible. $\endgroup$
    – D.W.
    Commented Feb 10 at 8:13
  • 1
    $\begingroup$ @InuyashaYagami Note that returning the current rank of any particular player is trivial to do in $O(log n)$ by augmenting a self-balancing tree with the size of the subtree in each node. Despite inserting a new node increases the rank of potentially $\Theta(n)$ players, you can still answer this in $O(\log n)$. The problem is maintaining the best rank ever achieved, not just the current rank. $\endgroup$
    – orlp
    Commented Feb 10 at 12:14
  • $\begingroup$ @orlp any thoughts on similarities with range sum query or range minimum query? or others you might have in mind? $\endgroup$ Commented Feb 12 at 11:39

1 Answer 1

6
+50
$\begingroup$

Each query can be implemented to run in $O(\log n)$ time by lazily propagating appropriate operators on the binary search tree. The lazy propagation technique *1, is that it is possible to perform range update on a BST as long as such update operations can be composed in $O(1)$ time (forms a monoid).

Let $x_i$ and $m_i$ be the current and the best ranks of the player $i$. Then, let $y_i := x_i - m_i$ be the difference between current and best ranks. We implicitly maintain $x_i$ and $y_i$ on a binary search tree ordered by the score (ties broken arbitrary). A score change can be represented by first removing a player from the BST and then re-inserting the player. Ranks change when:

  • When inserting a player, we must down-rank all players $j$ with score less than the inserted player: $x_j \leftarrow x_j + 1$ and $y_j \leftarrow y_j + 1$.
  • When removing a player, we must up-rank all players $j$ with score less than the removed player: $x_j \leftarrow x_j - 1$ and $y_j \leftarrow \max(y_j - 1, 0)$.

The operators on $y$ can be represented as a function of the form represented by two numbers $(a, b) \in \mathbb{Z} \times \mathbb{Z} \cup \{-\infty\}$:

$$ f(X) = \max(X + a, b) $$

and two such functions $f, g$ can be composed:

$$ \begin{align} (f \circ g)(X) &= \max(\max(X + a_1, b_1) + a_2, b_2) \\ &= \max(\max(X + a_1 + a_2, b_1 + a_2), b_2) \\ &= \max(X + (a_1 + a_2), \max(b_1 + a_2, b_2)) \end{align} $$

while still being represented by two numbers. That is, the composition is closed and forms a monoid with the identity $(0, -\infty)$. Independently, update on $x$ (range addition) can be represented by just one number.

To summarize, we can implement the required range update operations by storing three additional numbers on each node of the BST. Therefore, it takes $O(\log n)$ time to insert/delete/query a player.

  • *1 I don't know a proper reference, but there are many tutorials on the web as a popular competitive programming technique. It is usually written for segment trees (statically allocated full BST) but can adopted to any BST. Rotation/other tree change should trigger a "push" of update operator.
$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.