Let's say we have a 2-D array like this:
>>> a
array([[1, 1, 2],
       [0, 2, 2],
       [2, 2, 0],
       [0, 2, 0]])
For each row, I want to replace each element with the maximum of the two other elements in the same row.
I've found how to do it for one column at a time, using numpy.amax and a mask built from an identity matrix, like this:
>>> np.amax(a*(1-np.eye(3)[0]), axis=1)
array([ 2., 2., 2., 2.])
>>> np.amax(a*(1-np.eye(3)[1]), axis=1)
array([ 2., 2., 2., 0.])
>>> np.amax(a*(1-np.eye(3)[2]), axis=1)
array([ 1., 2., 2., 2.])
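For reference, here is a minimal sketch of the loop I would like to avoid, wrapping the three calls above into one function. The name max_of_others_loop is just something I made up for illustration, and the zero mask only gives the right answer when all values are non-negative, as in my example:

```python
import numpy as np

a = np.array([[1, 1, 2],
              [0, 2, 2],
              [2, 2, 0],
              [0, 2, 0]])

def max_of_others_loop(arr):
    """Hypothetical helper: per-column masking, one pass per column.

    Assumes non-negative values, because the masked column is replaced by 0.
    """
    n_cols = arr.shape[1]
    # Row j of the mask is 0 at position j and 1 elsewhere.
    mask = 1 - np.eye(n_cols, dtype=arr.dtype)
    # Zero out column j, then take the row-wise maximum of what is left.
    cols = [np.amax(arr * mask[j], axis=1) for j in range(n_cols)]
    # Put the per-column results side by side again.
    return np.stack(cols, axis=1)

result = max_of_others_loop(a)
print(result)
```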
But I would like to know if there is a way to avoid a for loop and get the result directly, which in this case should look like this:
>>> numpy_magic(a)
array([[2, 2, 1],
       [2, 2, 2],
       [2, 2, 2],
       [2, 0, 2]])
Edit: after a few hours of playing in the console, I've finally come up with the solution I was looking for. Be ready for a mind-blowing one-liner:
np.amax(a[[range(a.shape[0])]*a.shape[1],:][(np.eye(a.shape[1]) == 0)[:,[list(range(a.shape[1]))*a.shape[0]]].reshape(a.shape[1],a.shape[0],a.shape[1])].reshape((a.shape[1],a.shape[0],a.shape[1]-1)),axis=2).transpose()
array([[2, 2, 1],
       [2, 2, 2],
       [2, 2, 2],
       [2, 0, 2]])
Edit 2: Paul has suggested a much more readable and faster alternative:
np.max(a[:, np.where(~np.identity(a.shape[1], dtype=bool))[1].reshape(a.shape[1], -1)], axis=-1)
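To see why this works, it helps to split the one-liner into steps. This is only my own breakdown of the expression above; the intermediate names (off_diag, other_cols, gathered) are made up for illustration:

```python
import numpy as np

a = np.array([[1, 1, 2],
              [0, 2, 2],
              [2, 2, 0],
              [0, 2, 0]])

n_cols = a.shape[1]

# True everywhere except on the diagonal.
off_diag = ~np.identity(n_cols, dtype=bool)

# np.where returns (row_indices, col_indices) in row-major order, so after
# reshaping, row j lists every column index except j.
other_cols = np.where(off_diag)[1].reshape(n_cols, -1)

# Fancy indexing: for each row of a and each column j, gather the values
# of the other columns. Shape is (n_rows, n_cols, n_cols - 1).
gathered = a[:, other_cols]

# Reduce over the last axis to get the maximum of the "other" elements.
result = gathered.max(axis=-1)
print(result)
```

Unlike the masking approach with zeros, nothing is multiplied here, so this should also behave correctly with negative values.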
After timing these 3 alternatives, both of Paul's solutions are 4 times faster in every context (I benchmarked with 2, 3 and 4 columns and 200 rows). Congratulations on these amazing pieces of code!
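For anyone who wants to repeat the comparison, here is a rough sketch of a timing harness using timeit. It is not the exact benchmark I ran: the function names, the random test data and the repeat count are assumptions made for illustration, and it only compares the per-column masking loop with the indexing approach. Actual numbers will depend on the machine and NumPy version.

```python
import timeit
import numpy as np

def masked_loop(arr):
    # Per-column masking with zeros (assumes non-negative values).
    n_cols = arr.shape[1]
    mask = 1 - np.eye(n_cols, dtype=arr.dtype)
    return np.stack([np.amax(arr * mask[j], axis=1) for j in range(n_cols)], axis=1)

def fancy_index(arr):
    # Gather the "other" columns with fancy indexing, then reduce.
    n_cols = arr.shape[1]
    idx = np.where(~np.identity(n_cols, dtype=bool))[1].reshape(n_cols, -1)
    return np.max(arr[:, idx], axis=-1)

rng = np.random.default_rng(0)
timings = {}
for n_cols in (2, 3, 4):
    # 200 rows of non-negative integers, so both approaches must agree.
    data = rng.integers(0, 10, size=(200, n_cols))
    assert np.array_equal(masked_loop(data), fancy_index(data))
    t_loop = timeit.timeit(lambda: masked_loop(data), number=200)
    t_fancy = timeit.timeit(lambda: fancy_index(data), number=200)
    timings[n_cols] = (t_loop, t_fancy)
    print(f"{n_cols} columns: loop {t_loop:.4f}s, fancy indexing {t_fancy:.4f}s")
```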
Last Edit (sorry): after replacing np.identity with np.eye which is faster, we now have the fastest and most concise solution:
np.max(a[:, np.where(~np.eye(a.shape[1], dtype=bool))[1].reshape(a.shape[1], -1)], axis=-1)
np.max(a[:, np.where(~np.identity(a.shape[1], dtype=bool))[1].reshape(a.shape[1], -1)], axis=-1)
eye and identity are essentially the same. Lots of things to learn...
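A quick check of that, as a small sketch: for a square matrix the two functions return the same array, and the practical difference is that np.eye also accepts a column count and a diagonal offset k. The variable names below are only for illustration.

```python
import numpy as np

n = 3

# Same square boolean matrix from both functions.
same = np.array_equal(np.eye(n, dtype=bool), np.identity(n, dtype=bool))
print(same)

# np.eye can additionally build rectangular matrices and shifted diagonals.
rect = np.eye(2, 3, k=1)
print(rect)
```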