Numpy: replace each element in a row by the maximum of other elements in the same row

Question

Let say we have a 2-D array like this:

>>> a
array([[1, 1, 2],
   [0, 2, 2],
   [2, 2, 0],
   [0, 2, 0]])

For each line I want to replace each element by the maximum of the 2 others in the same line.

I've found how to do it for each column separately, using numpy.amax and an identity array, like this:

>>> np.amax(a*(1-np.eye(3)[0]), axis=1)
array([ 2.,  2.,  2.,  2.])
>>> np.amax(a*(1-np.eye(3)[1]), axis=1)
array([ 2.,  2.,  2.,  0.])
>>> np.amax(a*(1-np.eye(3)[2]), axis=1)
array([ 1.,  2.,  2.,  2.])

But I would like to know if there is a way to avoid a for loop and get directly the result which in this case should look like this:

>>> numpy_magic(a)
array([[2, 2, 1],
   [2, 2, 2],
   [2, 2, 2],
   [2, 0, 2]])

Edit: after a few hours playing in the console, I've finally come up with the solution I was looking for. Be ready for some mind blowing one line code:

np.amax(a[[range(a.shape[0])]*a.shape[1],:][(np.eye(a.shape[1]) == 0)[:,[range(a.shape[1])*a.shape[0]]].reshape(a.shape[1],a.shape[0],a.shape[1])].reshape((a.shape[1],a.shape[0],a.shape[1]-1)),axis=2).transpose()
array([[2, 2, 1],
   [2, 2, 2],
   [2, 2, 2],
   [2, 0, 2]])

Edit2: Paul has suggested a much more readable and faster alternative which is:

np.max(a[:, np.where(~np.identity(a.shape[1], dtype=bool))[1].reshape(a.shape[1], -1)], axis=-1)

After timing these 3 alternatives, both Paul's solutions are 4 times faster in every contexts (I've benchmarked for 2, 3 and 4 columns with 200 rows). Congratulations for these amazing pieces of code!

Last Edit (sorry): after replacing np.identity with np.eye which is faster, we now have the fastest and most concise solution:

np.max(a[:, np.where(~np.eye(a.shape[1], dtype=bool))[1].reshape(a.shape[1], -1)], axis=-1)

I hate to be critical, but that is what you were looking for? Look, here is a readable version at less than half the length: np.max(a[:, np.where(~np.identity(a.shape[1], dtype=bool))[1].reshape(a.shape[1], -1)], axis=-1) — Paul Panzer, Commented Jan 23, 2018 at 15:48
Well done @PaulPanzer! I've edited the question and starred your answer, even if I prefer this last one which is also faster if you replace np.identity with np.eye. Many thanks to you! — fbparis, Commented Jan 24, 2018 at 0:57
Thanks. I'm a bit surprised this is fastest since in my experience advanced indexing is kind of slow. Maybe because it's just a handful of indices. Also, I used to think eye and identity are essentially the same. Lots of things to learn... — Paul Panzer, Commented Jan 24, 2018 at 6:51

Paul Panzer · Accepted Answer · 2018-01-23 04:28:08Z

Here are two solutions, one that is specifically designed for max and a more general one that works for other operations as well.

Using the fact that all except possibly one maximums in each row are the maximum of the entire row, we can use argpartition to cheaply find the indices of the largest two elements. Then in the position of the largest we put the value of the second largest and everywhere else the largest value. Works also for more than 3 columns.

>>> a
array([[6, 0, 8, 8, 0, 4, 4, 5],
       [3, 1, 5, 0, 9, 0, 3, 6],
       [1, 6, 8, 3, 4, 7, 3, 7],
       [2, 1, 6, 2, 9, 1, 8, 9],
       [7, 3, 9, 5, 3, 7, 4, 3],
       [3, 4, 3, 5, 8, 2, 2, 4],
       [4, 1, 7, 9, 2, 5, 9, 6],
       [5, 6, 8, 5, 5, 3, 3, 3]])
>>> 
>>> M, N = a.shape
>>> result = np.empty_like(a)
>>> largest_two = np.argpartition(a, N-2, axis=-1)
>>> rng = np.arange(M)
>>> result[...] = a[rng, largest_two[:, -1], None]
>>> result[rng, largest_two[:, -1]] = a[rng, largest_two[:, -2]]>>> 
>>> result
array([[8, 8, 8, 8, 8, 8, 8, 8],
       [9, 9, 9, 9, 6, 9, 9, 9],
       [8, 8, 7, 8, 8, 8, 8, 8],
       [9, 9, 9, 9, 9, 9, 9, 9],
       [9, 9, 7, 9, 9, 9, 9, 9],
       [8, 8, 8, 8, 5, 8, 8, 8],
       [9, 9, 9, 9, 9, 9, 9, 9],
       [8, 8, 6, 8, 8, 8, 8, 8]])

This solution depends on specific properties of max.

A more general solution that for example also works for sum instead of max would be. Glue two copies of a together (side-by-side, not on top of each other). So the rows are something like a0 a1 a2 a3 a0 a1 a2 a3. For an index x we can get all but ax by slicing [x+1:x+4]. To do this vectorized we use stride_tricks:

>>> a
array([[2, 6, 0],
       [5, 0, 0],
       [5, 0, 9],
       [6, 4, 4],
       [5, 0, 8],
       [1, 7, 5],
       [9, 7, 7],
       [4, 4, 3]])
>>> M, N = a.shape
>>> aa = np.c_[a, a]
>>> ast = np.lib.stride_tricks.as_strided(aa, (M, N+1, N-1), aa.strides + aa.strides[1:])
>>> result = np.max(ast[:, 1:, :], axis=-1)
>>> result
array([[6, 2, 6],
       [0, 5, 5],
       [9, 9, 5],
       [4, 6, 6],
       [8, 8, 5],
       [7, 5, 7],
       [7, 9, 9],
       [4, 4, 4]])

# use sum instead of max
>>> result = np.sum(ast[:, 1:, :], axis=-1)
>>> result
array([[ 6,  2,  8],
       [ 0,  5,  5],
       [ 9, 14,  5],
       [ 8, 10, 10],
       [ 8, 13,  5],
       [12,  6,  8],
       [14, 16, 16],
       [ 7,  7,  8]])

Ethan Henderson · Accepted Answer · 2018-01-23 03:25:02Z

1

List comprehension solution.

np.array([np.amax(a * (1 - np.eye(3)[j]), axis=1) for j in range(a.shape[1])]).T

answered Jan 23, 2018 at 3:25

Ethan Henderson

4284 silver badges11 bronze badges

Add a comment |

Brad Solomon · Accepted Answer · 2018-01-23 04:07:11Z

1

Similar to @Ethan's answer but with np.delete(), np.max(), and np.dstack():

np.dstack([np.max(np.delete(a, i, 1), axis=1) for i in range(a.shape[1])])

array([[2, 2, 1],
       [2, 2, 2],
       [2, 2, 2],
       [2, 0, 2]])

delete() "filters" out each column successively;
max() finds the row-wise maximum of the remaining two columns
dstack() stacks the resulting 1d arrays

If you have more than 3 columns, note that this will find the maximum of "all other" columns rather than the "2-greatest" columns per row. For example:

a2 = np.arange(25).reshape(5,5)
np.dstack([np.max(np.delete(a2, i, 1), axis=1) for i in range(a2.shape[1])])

array([[[ 4,  4,  4,  4,  3],
        [ 9,  9,  9,  9,  8],
        [14, 14, 14, 14, 13],
        [19, 19, 19, 19, 18],
        [24, 24, 24, 24, 23]]])

edited Jan 23, 2018 at 4:07

answered Jan 23, 2018 at 4:00

Brad Solomon

40.2k37 gold badges159 silver badges246 bronze badges

1

Nice one, but I'm afraid it will be too slow with np.delete, and also any list comprehension is a kind of for loop...
– fbparis
Commented Jan 23, 2018 at 4:58
1

Correct on both points @fbparis. Paul’s solution is pretty unbeatable here I believe
– Brad Solomon
Commented Jan 23, 2018 at 12:35

Add a comment |

Collectives™ on Stack Overflow

Numpy: replace each element in a row by the maximum of other elements in the same row

3 Answers 3

Not the answer you're looking for? Browse other questions tagged
python
numpy
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

Not the answer you're looking for? Browse other questions tagged pythonnumpy or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
python
numpy
or ask your own question.