1
$\begingroup$

Given a rectangle X1,Y1 and X2,Y2 such that

 X1,Y1-----------------
 |                    |
 |                    |
 ----------------------X2, Y2

And given a million points ranked points P(x,y,r) where r is a random 32 bit integer...

What is the fastest way to determine the top X points inside the rectangle, based upon their ranking r.

For instance, if you have a 2d plane of size 10,000 by 10,000 and you have 1 million points scattered across it. How do I first find all the points inside the rectangle located at 200,200 to 400,400 and then how do I find the 5 points in that rect with the highest ranking.

Simple solutions, such as sort all 1 million points, then test all 1 million points to see if they are in the bounds of the rectangles (say there are 1000 of them to test), and find the top 5 points in the array, are very computationally intense.

What is the fastest way to do this, and what is the time complexity? Can this be done in less than O(N)?

Also, a quadtree approach would not work as well if the rect is the size of the entire plane... as it would just return all 1 million points that you would have to sort each time.

$\endgroup$

2 Answers 2

2
$\begingroup$

From what you're asking, I think you're asking to:

  1. Given $n$ points, find all $k \le n$ points $(x,y)$ located in a given rectangle.
  2. Out of those $k$ points, find the top $r$ ranked points.

Here's what I would do in this case:

  1. Sort the coordinates by $x$-coordinate. Depending on if your points are bounded or not, you can do an $O(n)$ non-comparison integer sort, or do an $O(n \log n)$ comparison sort.
  2. Do a binary search on $x_1$ and $x_2$ to find all points in the list with $x$-coordinates between $x_1$ and $x_2$. This way, you can eliminate points not in the rectangle. Eliminate them.
  3. Repeat 1-2 for the remaining points, except sort by $y$-coordinate, and compare $y$-coordinates via binary search this time.
  4. By this time, you should have all the points located in the rectangle. You can now use a selection algorithm to find the top $r$ points in the rectangle (See Wikipedia for more info). This should generally take $O(n)$ or $O(rn)$ time.

I'm not sure if this is the fastest way to do things, but at worst this algorithm runs in $O(n \log n)$ time.

$\endgroup$
2
  • $\begingroup$ The idea to sort by X axis first, is pretty spiffy, as opposed to sorting by ranking. I cannot upvote this until I get more ranking, but this answer is a good one and thank you for it! $\endgroup$
    – Harold M
    Commented Aug 27, 2014 at 23:58
  • $\begingroup$ This worked perfectly and drastically increased my performance. Thanks! $\endgroup$
    – Harold M
    Commented Aug 28, 2014 at 4:49
0
$\begingroup$

You don't actually need to sort anything if you only need the top 5 points

Initialize by storing an array of 5 "null" points with rank $r = 0$ and store $r_0 = 0$.

For each point $P(x,y,r)$, if $P$ is inside the rectangle (i.e. $X_1 < x < X_2$ and $Y_1 < y < Y_2$) and $r > r_0$, then replace a point in the array with the lowest rank with $P(x,y,r)$ and let $r_0$ be the new smallest rank in the list.

Checking if the new point is inside the rectangle and if the rank is larger than $r_0$ takes constant time. Determining which of $5$ points in the array has the lowest rank, and replacing that point also takes constant time. Updating $r_0$ with the new value of the lowest rank in the top 5 also takes constant time.

Thus, this algorithm runs in $O(n)$. If you need the top $k$ points instead of the top $5$ points, this algorithm takes $O(kn)$ time. Any algorithm will have to read in all $n$ points, so no algorithm can run faster than $O(n)$.

$\endgroup$
1
  • $\begingroup$ This is a solid optimization and I will incorporate it now with the answer above as well to see how fast I can get it to run. Thank you for this, and I wish I could upvote you but I need 15 rep first :( $\endgroup$
    – Harold M
    Commented Aug 28, 2014 at 0:00

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .