0
\$\begingroup\$

This function takes two inputs: A is 2D (N,5) while B is 1D (N).

It tries to find the smallest j such that A[i,j] > B[i] for all i.

Using numpy with argmax trick is slower than this version where I loop manually through all i and compile it with numba. I think the reason why is because as N gets much bigger (~60000), it's more expensive to index on the first dimension, rather than to just loop and hoping to stop at small j, e.g., j=2 or j=3.

Any ideas how to further speed things up?

@nb.jit(nb.int64[:](nb.float32[:,:],nb.float32[:]),nopython=True,cache=True)
def find_passed_hz(A, B):
    C = np.empty(len(B), dtype=np.int64)
    _, m = A.shape # m=5
    n = len(current_times)
    for i in range(n):
        for j in range(m):
            if A[i,j] > B[i]:
                C[i] = j
                break
    return C

NOTE: every row of A is sorted in ascending order, but because m=5 very small for numpy search_sorted or binary search to have any effect (or not? I tried but it seems to have no effect).

\$\endgroup\$
8
  • 1
    \$\begingroup\$ (Down-voters please comment what in particular makes a post not useful.) \$\endgroup\$
    – greybeard
    Commented Jul 29, 2023 at 10:15
  • \$\begingroup\$ smallest j such that… is such a j guaranteed to exist? \$\endgroup\$
    – greybeard
    Commented Jul 29, 2023 at 10:18
  • \$\begingroup\$ find the smallest j such that A[i,j] > B[i] for all i is not what I see the code do: return a NumPy array C of smallest elements of A[i,j] exceeding B[i] for each i. \$\endgroup\$
    – greybeard
    Commented Jul 29, 2023 at 10:24
  • \$\begingroup\$ j guaranteed to exist, because I set A[i,4] very big. for the second comment, yes, that's what I meant. can you suggest the better wordings to be more specific? or I simply insert your second comment in the question? Thank you. \$\endgroup\$
    – Sanyou
    Commented Jul 29, 2023 at 10:30
  • 3
    \$\begingroup\$ Code this terse is not ready for public consumption, those days are 20 years past. A good explanation would help getting reviews. \$\endgroup\$
    – Mast
    Commented Jul 29, 2023 at 10:57

1 Answer 1

0
\$\begingroup\$

If the dimension was (N, 4), I'd feel invited to open code a binary search.
(N, 5) is one of the "worst" cases: one more value to check than ideal for a binary search.
Just the same:

def find_passed_hz(A, thresholds):   # ToDo: check name is suggestive. Type hints?
    """ Return an array C of 
        smallest elements of A[i, j] exceeding thresholds[i] for each i.
        A[i, j] <= A[i, j + 1]; A[i, 4] is known to exceed thresholds[i].
    """
    
    C = np.empty(len(thresholds), dtype=np.int64)
    n = len(current_times)           # ToDo: explain current_times
    for i in range(n):               # i, threshold in enumerate(thresholds) 
        threshold = thresholds[i]
        if threshold < A[i, 2]:      # A[i, 2] would do: how about A[i, 0/1]?
            if threshold < A[i, 1]:  # A[i, 1] would do: how about A[i, 0]?
                C[i] = A[i, 0 if threshold < A[i, 0] else 1]
            else
                C[i] = A[i, 2]       # known good
        else:
            C[i] = A[i, 3 if threshold < A[i, 3] else 4]
    return C
\$\endgroup\$
1
  • \$\begingroup\$ Never used NumPy/SciPy, search for a helpful function without result. \$\endgroup\$
    – greybeard
    Commented Jul 30, 2023 at 8:34

Not the answer you're looking for? Browse other questions tagged or ask your own question.