
Here is some code that generates the zero element of an Abelian sandpile model (ASM) for any given model dimensions, and then plots the result as a color mesh. The Wikipedia article on the Abelian sandpile model explains the background. The code is extremely slow, especially for large arrays. How can I make it more efficient?

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

NaN = np.nan

def trough(N):
    # Surround the grid with a border of NaNs; grains toppled into the
    # border fall out of the model.
    (i, j) = np.shape(N)
    Nt = np.concatenate((np.ones((i, 1)) * NaN, N, np.ones((i, 1)) * NaN), axis=1)
    Nt = np.concatenate((np.ones((1, j + 2)) * NaN, Nt, np.ones((1, j + 2)) * NaN), axis=0)
    return Nt

def topple(N):
    # Repeatedly topple every unstable cell (height > 3), sending one
    # grain to each of its four neighbours, until the grid is stable.
    P = trough(N)
    sP = np.shape(P)
    while np.nanmax(P) > 3:
        for (i, j) in np.ndindex(sP):
            if P[i, j] > 3 and not np.isnan(P[i, j]):
                P[i, j] -= 4
                P[i + 1, j] += 1
                P[i - 1, j] += 1
                P[i, j + 1] += 1
                P[i, j - 1] += 1
    return P[1:-1, 1:-1]

def Picard(m, n):
    # Identity element: stabilize the all-6 grid, subtract the result
    # from all-6, then stabilize again.
    P1 = 6 * np.ones((m, n))
    P1 = topple(P1)
    P2 = 6 * np.ones((m, n))
    Pi = P2 - P1
    Pi = topple(Pi)
    return Pi

M = 500
N = 500
Pi = Picard(M,N)

cmap = cm.get_cmap('viridis_r')

plt.figure()
plt.axes(aspect='equal')
plt.axis('off')
plt.pcolormesh(Pi, cmap=cmap)

dim = str(N) + 'x' + str(M)
file_name = 'Picard_Identity-' + dim + '.png'

plt.savefig(file_name)

1 Answer

The code is extremely slow ...

Using numba helps a lot.

Using math helps a tiny bit.


I ran this variant:

from pathlib import Path
from time import time
from typing import Any

from numba import njit
import matplotlib as mp
import matplotlib.pyplot as plt
import numpy as np

NaN = np.nan


@njit
def trough(
    N: np.ndarray[Any, np.dtype[np.float64]]
) -> np.ndarray[Any, np.dtype[np.float64]]:
    w, h = np.shape(N)
    Nt = np.concatenate((np.ones((w, 1)) * NaN, N, np.ones((w, 1)) * NaN), axis=1)
    Nt = np.concatenate(
        (np.ones((1, h + 2)) * NaN, Nt, np.ones((1, h + 2)) * NaN), axis=0
    )
    return Nt


@njit
def topple(
    N: np.ndarray[Any, np.dtype[np.float64]]
) -> np.ndarray[Any, np.dtype[np.float64]]:
    P = trough(N)
    sP = np.shape(P)
    while np.nanmax(P) >= 4:
        for i, j in np.ndindex(sP):
            # if P[i, j] >= 8 and not np.isnan(P[i, j]):
            #     P[i, j] -= 8
            #     P[i + 1, j] += 2
            #     P[i - 1, j] += 2
            #     P[i, j + 1] += 2
            #     P[i, j - 1] += 2
            if P[i, j] >= 4 and not np.isnan(P[i, j]):
                P[i, j] -= 4
                P[i + 1, j] += 1
                P[i - 1, j] += 1
                P[i, j + 1] += 1
                P[i, j - 1] += 1
    return P[1:-1, 1:-1]


def picard(m: int, n: int) -> np.ndarray[Any, np.dtype[np.float64]]:
    P1 = 6 * np.ones((m, n))
    P1 = topple(P1)
    P2 = 6 * np.ones((m, n))
    Pi = P2 - P1
    Pi = topple(Pi)
    return Pi


def main(m: int = 300, n: int = 300) -> None:
    picard(4, 4)  # warmup

    t0 = time()
    Pi = picard(m, n)
    print(f"elapsed: {time() - t0:.3f} sec")

    plt.figure()
    plt.axes(aspect="equal")
    plt.axis("off")
    plt.pcolormesh(Pi, cmap=mp.colormaps["viridis_r"])

    dim = f"{n}x{m}"
    file_name = Path("/tmp") / f"Picard_Identity-{dim}.png"
    plt.savefig(file_name)


if __name__ == "__main__":
    main()

On CPython 3.11.4 my Mac laptop takes ~73.4 seconds. With numba that becomes 220 msec, a more than 300× speedup.

The initial conditions give us a uniform level of six. Uncommenting the "propagate big numbers faster!" clause shaves a bit more than 10% off the elapsed time; nothing to write home about. Some example problems put thousands of particles on the origin rather than beginning with a uniform level. Those problems may benefit more from using arithmetic to propagate lots of particles at once.
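
As an aside, and this is my reading of the code rather than anything stated in the question: picard implements the usual identity-element construction for the sandpile group, which is why the starting level is six, twice the maximal stable height of three:

$$ e = \bigl( 2\delta - (2\delta)^{\circ} \bigr)^{\circ} $$

where $\delta$ is the all-3 (maximal stable) configuration and ${}^{\circ}$ denotes stabilization, i.e. toppling until every cell is below four.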

Scaling up to 300 × 300, numba completes the task in ~16 seconds; 400 × 400 takes 50 seconds, and 500 × 500 takes 122 seconds.

EDIT

Yup, sure enough: if I deposit a quarter million particles on an otherwise zero 100 × 100 surface, this code gives a 4× speedup thanks to arithmetic.

    k = 12
    while np.nanmax(P) >= 4:
        for i, j in np.ndindex(sP):
            # Move a dozen grains in each direction when the pile is tall enough.
            if P[i, j] >= 4 * k and not np.isnan(P[i, j]):
                P[i, j] -= 4 * k
                P[i + 1, j] += k
                P[i - 1, j] += k
                P[i, j + 1] += k
                P[i, j - 1] += k

            if P[i, j] >= 4 and not np.isnan(P[i, j]):
                P[i, j] -= 4
                P[i + 1, j] += 1
                P[i - 1, j] += 1
                P[i, j + 1] += 1
                P[i, j - 1] += 1
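
For reference, the initial grid for that experiment looks something like this (dropping the grains at the centre is my assumption; the text above only says "an otherwise zero surface of 100 x 100"):

import numpy as np

# Hypothetical setup: a quarter million grains on the centre cell of an
# otherwise empty 100 x 100 grid, stabilized by the topple() defined above.
m = n = 100
P0 = np.zeros((m, n))
P0[m // 2, n // 2] = 250_000
Pi = topple(P0)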

Maintaining a priority queue of "big" cells (≥ 4) might help the algorithm focus its dispersal effort on the central cells, at least initially; a rough sketch follows.
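
Here is a minimal, untimed sketch of the worklist idea; it uses a plain FIFO deque rather than a true priority queue, and topple_worklist is my name, not anything from the code above:

from collections import deque

import numpy as np


def topple_worklist(N):
    # Pad with a NaN border; grains toppled into the border are lost.
    P = np.pad(N, 1, constant_values=np.nan)
    # Seed the worklist with every unstable cell (NaN compares as False).
    unstable = deque(zip(*np.where(P >= 4)))
    while unstable:
        i, j = unstable.popleft()
        if not P[i, j] >= 4:  # already stabilized in an earlier visit
            continue
        # Shed as many rounds of four as possible at once; toppling order
        # does not matter in an Abelian model, so batching is safe.
        q = int(P[i, j] // 4)
        P[i, j] -= 4 * q
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            P[ni, nj] += q  # a NaN border cell absorbs grains and stays NaN
            if P[ni, nj] >= 4:
                unstable.append((ni, nj))
    return P[1:-1, 1:-1]

A FIFO does not prioritize the tallest cells the way a heap would, but it already avoids rescanning the whole grid on every pass, which is where the original nested loop spends most of its time.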


The ⊔ trough function could perhaps be better named. Particles sit on top and fall into the abyss of NaNs at the sides, which seems more like a ⊓ table function, maybe? Not sure what the best name would be.

