11
\$\begingroup\$

I am a junior Software Engineer, C++ is usually my main jam but I started picking up Python for a research project I am doing in college. I am eager to learn as much Python syntax, tricks, best practices or software architecture in general as possible. So here is a portion of my project ~300 lines of code that simulate two types of protein movement through Brownian diffusion: diffusion in small circular domains (nanodomain simulation) and in a meshwork with compartments (hop diffusion). I hope I can get any tips and feedback on the code!

Project Structure

.
├── main.py
├── plotGenerator.py
├── requirements-dev.txt
├── simulations
│   ├── __init__.py
│   ├── hopDiffusionSimulation.py
│   ├── nanodomainSimulation.py
│   └── simulation.py
└── util.py

In Action Example

enter image description here

The Code

./main.py

from simulations.simulation import *
from simulations.nanodomainSimulation import *
from simulations.hopDiffusionSimulation import *    
    

from plotGenerator import *
from util import *

#nanoDomain = Nanodomain()
hopDiffusion = HopDiffusion();
plot(hopDiffusion, SimulationType.HOPDIFFUSION)

./simulations/simulation.py

from typing import List, Tuple
import random
import numpy as np
from enum import Enum

class SimulationType(Enum):
     BROWNIAN = 1
     NANODOMAIN = 2
     HOPDIFFUSION = 3     

PATH = Tuple[List[float]]
DPI = 100
RADIUS_PADDING = 10
RADIUS = 250
CORRECTED_CANVAS_RADIUS = RADIUS - RADIUS_PADDING

TIME_PER_FRAME: float = 0.02 # 20 ms
DIFFUSION_SPEED_CORRECTION: int = 35 # arbitrary
MEMBRANE_DIFFUSION_COEFFICIENT: float = 0.1 # micrometer^2 / s
MEMBRANE_DIFFUSION_FACTOR: float = 2 * np.sqrt(MEMBRANE_DIFFUSION_COEFFICIENT * TIME_PER_FRAME)
MEMBRANE_DIFFUSION_FATOR_CORRECTED: float = MEMBRANE_DIFFUSION_FACTOR * DIFFUSION_SPEED_CORRECTION

class Simulation:
    
    def __init__(self, n: int = 5):
        self.numberOfParticles: int = n
        self.particlesLocation: List[Tuple[int, int]] = [] 
        self.paths: List[List[Tuple[int, int]]] = []
        
        self.init_particles()
        self.init_paths()
        
    def init_paths(self):
        self.paths.extend([[coordinate] for coordinate in self.particlesLocation])
            
    def init_particles(self) -> None:
        mem: List[Tuple] = []
        
        def get_random_canvas_value(self) -> int:
            return int(random.randint(-(CORRECTED_CANVAS_RADIUS), CORRECTED_CANVAS_RADIUS))
        
        def rec(self, x: int = 0, y: int = 0) -> Tuple[int, int]:
            x, y = [get_random_canvas_value(self) for _ in range(2)]
            while (x, y) in mem:
                return rec(self, x, y)
            mem.append((x, y))
            return x,y
        
        self.particlesLocation.extend([rec(self) for _ in range(5)])

./simulations/hopDiffusionSimulation.py

from typing import List, Tuple
from simulations.simulation import *
from util import *
        
BOUNDARY_THICKNESS: int = 15
NUMBER_OF_COMPARTMENTS_PER_DIRECTION: int = 3
BOUNDARY_JUMP: int  = BOUNDARY_THICKNESS
BOUNDARY_OVERFLOW: int = 20
HOP_PROBABILITY_PERCENTAGE: float = 0.15

class HopDiffusion(Simulation):
    def __init__(self, n: int = 5):
        self.boundary_coordinates_for_plot: List[List] = []
        self.boundary_coordinates: List[Tuple[Tuple]] = []
        self.generate_boundaries()
        
        super().__init__(n)
        
    def generate_boundaries(self):
        step: int = int((RADIUS << 1) / NUMBER_OF_COMPARTMENTS_PER_DIRECTION)
        
        for i in range(6):
            if i % 3 == 0: continue 
            horizontal: bool = i < NUMBER_OF_COMPARTMENTS_PER_DIRECTION
            curr = i * step if horizontal else (i - NUMBER_OF_COMPARTMENTS_PER_DIRECTION) * step
            
            width = BOUNDARY_THICKNESS if horizontal else (RADIUS << 1) + (BOUNDARY_OVERFLOW << 1)
            height = BOUNDARY_THICKNESS if not horizontal else (RADIUS << 1) + (BOUNDARY_OVERFLOW << 1)
            x = curr - RADIUS - (BOUNDARY_THICKNESS >> 1) if horizontal else -RADIUS - BOUNDARY_OVERFLOW
            y = curr - RADIUS - (BOUNDARY_THICKNESS >> 1) if not horizontal else -RADIUS - BOUNDARY_OVERFLOW
            
            self.boundary_coordinates_for_plot.append(list([x, y, width, height]))
            self.boundary_coordinates.append(list([tuple((x, x + width)), tuple((y, y + height))]))
    
    @property
    def get_boundary_coordinates(self):
        return self.boundary_coordinates_for_plot

    def can_particle_hop_boundary_probability(self) -> bool:
        return random.random() < HOP_PROBABILITY_PERCENTAGE

    def is_particle_on_specific_boudnary(self, pos: Tuple, idx: int):
        return Util.is_point_within_bounds(pos, self.boundary_coordinates[idx])
    
    def is_particle_on_boundary(self, pos: Tuple):   
        return any(
            Util.is_point_within_bounds(pos, bounds_of_boundary)
            for bounds_of_boundary in self.boundary_coordinates
        ) 
    
    def is_particle_in_compartment(self, particle) -> bool:
        return not self.is_particle_on_boundary(particle)
    
    def get_surrounding_boundary_of_particle(self, pos: Tuple) -> int:
        for idx, bounds_of_boundary in enumerate(self.boundary_coordinates):
            if Util.is_point_within_bounds(pos, bounds_of_boundary):
                return idx
        return -1
    
    def make_particle_jump(self, newPos: Tuple, x_dir: int, y_dir: int):
        surrounding_boundary_idx = self.get_surrounding_boundary_of_particle(newPos)
        
        while (self.is_particle_on_specific_boudnary(newPos, surrounding_boundary_idx)):
            newPos = Util.increment_tuple_by_val(
                newPos, tuple((Util.sign(x_dir), Util.sign(y_dir)))
            )
            
        newPos = Util.increment_tuple_by_val(
            newPos, tuple(
                (Util.sign(x_dir) * BOUNDARY_JUMP, 
                 Util.sign(y_dir) * BOUNDARY_JUMP)
            )
        )
        # Special case: In some instances the jump may land the particle
        # on a subsequent boundary so we repeat the function. We decrement
        # the particle's coordinates until it is out.
        new_surrounding_boundary_idx = self.get_surrounding_boundary_of_particle(newPos)
        while (self.is_particle_on_boundary(newPos)):
            newPos = Util.increment_tuple_by_val(
                newPos, tuple((Util.sign(-x_dir), Util.sign(-y_dir)))
            )
        
        return newPos
    
    def update_path(self, idx: int):
        x, y = self.paths[idx][-1]
        assert(not self.is_particle_on_boundary(tuple((x, y))))
        
        diffusion_factor = MEMBRANE_DIFFUSION_FATOR_CORRECTED
        x_dir, y_dir = [Util.get_random_normal_direction() * diffusion_factor for _ in range(2)]
        newPos = tuple((x + x_dir, y + y_dir))
        
        if self.is_particle_on_boundary(newPos):
            if self.can_particle_hop_boundary_probability():
                newPos = self.make_particle_jump(newPos, x_dir, y_dir)
            else:
                newPos = Util.change_direction(tuple((x, y)), tuple((x_dir, y_dir)))
            
        self.paths[idx].append(newPos)
        
    def update(self):
        for i in range(self.numberOfParticles): self.update_path(i)
        
    def init_particles(self) -> None:
        mem: List[Tuple[int, int]] = []
        
        def get_random_canvas_value(self) -> int:
            return int(random.randint(-(CORRECTED_CANVAS_RADIUS), CORRECTED_CANVAS_RADIUS))
        
        def rec(self, x: int = 0, y: int = 0) -> Tuple[int, int]:
            x, y = [get_random_canvas_value(self) for _ in range(2)]
            while (x, y) in mem or self.is_particle_on_boundary(tuple((x, y))):
                return rec(self, x, y)
            mem.append((x, y))
            return x, y
        
        self.particlesLocation.extend([rec(self) for _ in range(5)])

./simulations/nanodomainDiffusionSimulation.py

from typing import List, Tuple
from simulations.simulation import *
from util import *
        
NANODOMAIN_DIFFUSION_FATOR_CORRECTED: float = MEMBRANE_DIFFUSION_FATOR_CORRECTED * 0.4 # type : ignore

class Nanodomain(Simulation):
    
    def __init__(self, n: int = 5):
        super().__init__(n)
        self.nanodomain_coordinates: List[Tuple[int, int]] = [ 
            (-100, 100), (0, 0), (150, -60), (-130, -160)
        ]
        self.nanodomain_radii: List[int] = [80, 20, 50, 140]
        
    @property
    def get_nanodomain_coordinates(self) -> List[Tuple[int, int]]:
        return self.nanodomain_coordinates
    
    @property
    def get_nanodomain_radii(self) -> List[int]:
        return self.nanodomain_radii
    
    def get_nanodomain_attributes(self) -> List[Tuple]:
        return list(map(
            lambda coord, radius: (coord, radius), 
            self.get_nanodomain_coordinates, 
            self.get_nanodomain_radii
        ))
    
    def is_particle_in_nanodomain(self, particle: Tuple) -> bool:
        return any(
            Util.compute_distance(particle, circle_center) <= radius 
            for circle_center, radius in 
            zip(self.get_nanodomain_coordinates, self.get_nanodomain_radii)
        )
    
    def update_path(self, idx):
        x, y = self.paths[idx][-1]
        diffusion_factor = NANODOMAIN_DIFFUSION_FATOR_CORRECTED if (self.is_particle_in_nanodomain((x, y))) else MEMBRANE_DIFFUSION_FATOR_CORRECTED
        x_dir, y_dir = [Util.get_random_normal_direction() * diffusion_factor for _ in range(2)]
        self.paths[idx].append((x + x_dir, y + y_dir))
        
    def update(self):
        [self.update_path(i) for i in range(self.numberOfParticles)]

./plotGenerator.py

from simulations.hopDiffusionSimulation import HopDiffusion
from simulations.nanodomainSimulation import Nanodomain
from simulations.simulation import *
from util import *

from matplotlib.animation import FuncAnimation # type: ignore
from matplotlib.pyplot import figure
import matplotlib.pyplot as plt
from typing import List, Tuple
import numpy as np
from matplotlib import rcParams # type: ignore

colors: List[str] = ['r', 'b', "orange", 'g', 'y', 'c']
markers: List[str] = ['o', 'v', '<', '>', 's', 'p']
                
def handle_nanodomain(ax, sim: Nanodomain):
    nanodomains = [
        plt.Circle( # type: ignore
            *param,
            color = 'black', 
            alpha = 0.2) 
        for param in sim.get_nanodomain_attributes()
    ]
    [ax.add_patch(nanodomain) for nanodomain in nanodomains]

def handle_hop_diffusion(ax, sim: HopDiffusion):
    compartments = [
        plt.Rectangle( # type: ignore
            tuple((param[0], param[1])),
            param[2], param[3],
            color = 'black',
            alpha = 0.7,
            clip_on = False)
        for param in sim.boundary_coordinates_for_plot
    ]
    [ax.add_patch(boundary) for boundary in compartments]

def get_coordinates_for_plot(sim, idx: int):
    return Util.get_x_coordinates(sim.paths[idx]), Util.get_y_coordinates(sim.paths[idx])

def get_coordinates_for_heads(sim, idx: int):
    return Util.get_last_point(sim.paths[idx])

def set_plot_parameters(ax):
    ax.tick_params(axis = 'y', direction = "in", right = True, labelsize = 16, pad = 20)
    ax.tick_params(axis = 'x', direction = "in", top = True, bottom = True, labelsize = 16, pad = 20)

    ## legends and utilities
    ax.set_xlabel(r"nm", fontsize=16)
    ax.set_ylabel(r"nm", fontsize=16)

    ## border colors
    ax.patch.set_edgecolor('black')  
    ax.patch.set_linewidth('2') 

    ax.set_xlim(-RADIUS, RADIUS)
    ax.set_ylim(-RADIUS, RADIUS)
    
def plot(sim: Simulation, type: SimulationType):
    fig, ax = plt.subplots(figsize = [5, 5], dpi = DPI) # type: ignore

    path_plots: List = [
        ax.plot(
            *get_coordinates_for_plot(sim, i), 
            markersize=15, color = colors[i])[0] 
        for i in range(5)
    ] 
    
    head_plots: List = [
        ax.plot(
            *get_coordinates_for_heads(sim, i), 
            markersize=7, color = colors[i], marker = markers[i], 
            markerfacecolor="white")[0] 
        for i in range(5)
    ]

    def initialize_animation():
        set_plot_parameters(ax)
        if type == SimulationType.NANODOMAIN: handle_nanodomain(ax, sim)
        elif type == SimulationType.HOPDIFFUSION: handle_hop_diffusion(ax, sim)
        return path_plots

    def update_animation(frame):
        sim.update()
        for i, plot in enumerate(path_plots):
            plot.set_data(*get_coordinates_for_plot(sim, i))
        for i, head_marker in enumerate(head_plots):
            head_marker.set_data(*get_coordinates_for_heads(sim, i))
        return path_plots

    animation = FuncAnimation(
        fig, 
        update_animation, 
        init_func = initialize_animation, 
        interval = 20
    )

    plt.show(block = True) # type: ignore
    fig.tight_layout()

rcParams.update({'figure.autolayout': True})

./util.py

from typing import List, Tuple
import numpy as np
import random

class Util:
    @staticmethod
    def get_bounds(lists) -> Tuple[int, ...]:
        x_min: int = min([min(elem[0]) for elem in lists])
        x_max: int = max([max(elem[0]) for elem in lists])
        y_min: int = min([min(elem[1]) for elem in lists])
        y_max: int = max([max(elem[1]) for elem in lists])
        return x_min, x_max, y_min, y_max

    @staticmethod
    def compute_distance(p1: Tuple, p2: Tuple) -> float:
        return np.sqrt((p2[0] - p1[0]) ** 2 + (p2[1] - p1[1]) ** 2)

    @staticmethod
    def get_last_point(path: List[Tuple]) -> Tuple[int, ...]:
        return path[-1][0], path[-1][1]

    @staticmethod
    def get_x_coordinates(path) -> List:
        return list(list(zip(*path))[0])

    @staticmethod
    def get_y_coordinates(path) -> List:
        return list(list(zip(*path))[1])

    @staticmethod
    def get_random_normal_direction():
        return np.random.normal() * np.random.choice([1, -1])

    @staticmethod
    def is_point_within_bounds(pos: Tuple, bounds: Tuple[Tuple, ...]):
        x, y = pos[0], pos[1]
        return x >= bounds[0][0] and x <= bounds[0][1] and y >=  bounds[1][0] and y <= bounds[1][1]
    
    @staticmethod
    def sign(x):
        return ((x >= 0) << 1) - 1
    
    @staticmethod
    def increment_tuple_by_val(tuple_object: Tuple, val):
        tuple_object = tuple((tuple_object[0] + val[0], tuple_object[1] + val[1]))
        return tuple_object
    
    @staticmethod
    def change_direction(tuple_object: Tuple, dir):
        tuple_object = tuple((tuple_object[0] - dir[0], tuple_object[1] - dir[1]))
        return tuple_object
\$\endgroup\$
1
  • \$\begingroup\$ np.random.normal() * np.random.choice([1, -1]) is suspicious. normal() carries defaults of loc=0.0, scale=1.0, so the choice of sign is already random. I think you can get away with dropping the second term, but I'd like to confirm with you what your intent was. \$\endgroup\$
    – Reinderien
    Commented Oct 2, 2022 at 19:14

3 Answers 3

7
\$\begingroup\$

Spelling: FATOR -> FACTOR, boudnary -> boundary.

Upgrade Python and then use list and tuple directly as typehints instead of the older List and Tuple.

Simplify your chained comparisons; the one for is_point_within_bounds can look like:

    return (
        bounds[0][0] <= x <= bounds[0][1] and
        bounds[1][0] <= y <= bounds[1][1]
    )

Having an argument called dir to change_direction shadows a built-in, dir. Name it something else, in this case direction.

You have incomplete type hints in many places. Don't leave list unspecified; fill its element type in []. Configure Mypy to disallow unspecified types.

Util should not be a class. Just put those functions in the global namespace, but then in dependent modules write import util without a from to create a containing namespace.

tuple_object is not a good name. Of course it's a tuple: you're doing the right thing in type-hinting it as such (though the type hint is incomplete). Name it according to the meaning of its contents and not its type.

You have a habit of casting sequences that don't need to be cast, as in

       self.boundary_coordinates.append(list([tuple((x, x + width)), tuple((y, y + height))]))

That's really just

self.boundary_coordinates.append(
    [
        (x, x + width), (y, y + height),
    ]
)

This:

def sign(x):
    return ((x >= 0) << 1) - 1

is tricky and unnecessary. Just use np.sign.

ax.patch.set_linewidth('2') will never work; that string needs to be a numeric literal.

-(CORRECTED_CANVAS_RADIUS) should not have parens.

rec exists in init_particles, so already has access to a self in closure scope; don't write it again in the signature to rec. x and y are never used in the signature either, so delete them too.

get_random_canvas_value should be moved out to a staticmethod of Simulation; and don't repeat it - stop copying and pasting code.

This loop makes no sense:

        while (x, y) in mem:
            return rec(self, x, y)

That's really just an if and not a while.

rec should be redesigned so that it does not recurse. This is a stack overflow waiting to happen.

This list comprehension:

        x, y = [self.get_random_canvas_value() for _ in range(2)]

should be left as a generator () and not a list []. Likewise, all of the inner list comprehensions in get_bounds should be converted to generators; basically delete the [].

colors and markers should be tuples.

numberOfParticles should be number_of_particles by PEP8.

get_bounds is unused so delete it.

You should use floor division here:

step: int = int((RADIUS << 1) // NUMBER_OF_COMPARTMENTS_PER_DIRECTION)

Your use of << and >> in an attempt to speed up multiplication and division by two is premature optimisation, and there are areas of your program that have comparatively gigantic inefficiencies. Just do the comprehensible thing and write 2. And anyway: you have a fixed framerate. So long as that framerate is being hit, there is no point whatsoever in attempting micro-optimisation.

It would be a good idea to make get_nanodomain_coordinates a property as you have, but in that case delete "get_" since its invocation is variable-like and not method-like. However. You have variables with those names already, so do the opposite: remove @property from those methods.

You mix built-in random with np.random. Don't do that. Probably use np.random only.

Don't write no-assignment comprehensions such as [self.update_path(i) for i in range(self.n_particles)]. Just write a loop.

Comprehensions are not helping in handle_hop_diffusion. Just expand it to a regular loop:

def handle_hop_diffusion(ax: plt.Axes, sim: HopDiffusion):
    for param in sim.boundary_coordinates_for_plot:
        boundary = plt.Rectangle(  # type: ignore
            tuple((param[0], param[1])),
            param[2], param[3],
            color='black',
            alpha=0.7,
            clip_on=False,
        )
        ax.add_patch(boundary)

type as a parameter needs to go away. You already know the type - use isinstance() on the sim variable.

Simulation is effectively an abstract class. You need an update stub on it:

    def update(self) -> None:
        raise NotImplementedError()

This statement has no effect, so delete it:

new_surrounding_boundary_idx = self.get_surrounding_boundary_of_particle(new_pos)

You have a latent bug. for i in range(6): hard-codes the number of compartments, when that should really be derived from NUMBER_OF_COMPARTMENTS_PER_DIRECTION. The moment that that constant changes, your code will not do what you want. Likewise, for _ in range(5): hard-codes the particle count instead of using the count parameter set in n_particles.

First pass

A first pass covering much of the above looks like:

import random

import matplotlib.pyplot as plt
import numpy as np
from matplotlib import rcParams  # type: ignore
from matplotlib.animation import FuncAnimation  # type: ignore


DPI = 100
RADIUS_PADDING = 10
RADIUS = 250
CORRECTED_CANVAS_RADIUS = RADIUS - RADIUS_PADDING

TIME_PER_FRAME: float = 0.02  # 20 ms
DIFFUSION_SPEED_CORRECTION: int = 35  # arbitrary
MEMBRANE_DIFFUSION_COEFFICIENT: float = 0.1  # micrometer^2 / s
MEMBRANE_DIFFUSION_FACTOR: float = 2 * np.sqrt(MEMBRANE_DIFFUSION_COEFFICIENT * TIME_PER_FRAME)
MEMBRANE_DIFFUSION_FACTOR_CORRECTED: float = MEMBRANE_DIFFUSION_FACTOR * DIFFUSION_SPEED_CORRECTION
NANODOMAIN_DIFFUSION_FACTOR_CORRECTED: float = MEMBRANE_DIFFUSION_FACTOR_CORRECTED * 0.4

BOUNDARY_THICKNESS: int = 15
NUMBER_OF_COMPARTMENTS_PER_DIRECTION: int = 3
BOUNDARY_JUMP: int = BOUNDARY_THICKNESS
BOUNDARY_OVERFLOW: int = 20
HOP_PROBABILITY_PERCENTAGE: float = 0.15

colors: tuple[str, ...] = ('r', 'b', 'orange', 'g', 'y', 'c')
markers: tuple[str, ...] = ('o', 'v', '<', '>', 's', 'p')


class util:  # convert this to an import of a module with global functions
    @staticmethod
    def compute_distance(p1: tuple[float, float], p2: tuple[float, float]) -> float:
        return np.sqrt((p2[0] - p1[0]) ** 2 + (p2[1] - p1[1]) ** 2)

    @staticmethod
    def get_last_point(path: list[tuple[float, float]]) -> tuple[float, float]:
        return path[-1][0], path[-1][1]

    @staticmethod
    def get_x_coordinates(path: list[tuple[float, float]]) -> list[float]:
        return list(list(zip(*path))[0])

    @staticmethod
    def get_y_coordinates(path: list[tuple[float, float]]) -> list[float]:
        return list(list(zip(*path))[1])

    @staticmethod
    def get_random_normal_direction() -> float:
        return np.random.normal() * np.random.choice([1, -1])

    @staticmethod
    def is_point_within_bounds(
        pos: tuple[float, float],
        bounds: tuple[tuple[float, float], tuple[float, float]],
    ) -> bool:
        x, y = pos[0], pos[1]
        return (
            bounds[0][0] <= x <= bounds[0][1] and
            bounds[1][0] <= y <= bounds[1][1]
        )

    @staticmethod
    def increment_tuple_by_val(tuple_object: tuple[float, float], val: tuple[float, float]) -> tuple[float, float]:
        return tuple_object[0] + val[0], tuple_object[1] + val[1]

    @staticmethod
    def change_direction(
        tuple_object: tuple[float, float],
        direction: tuple[float, float],
    ) -> tuple[float, float]:
        return tuple_object[0] - direction[0], tuple_object[1] - direction[1]


class Simulation:
    def __init__(self, n: int = 5) -> None:
        self.n_particles: int = n

        self.particle_locations: list[tuple[float, float]] = list(self.init_particles())

        self.paths: list[list[tuple[float, float]]] = [
            [coordinate] for coordinate in self.particle_locations
        ]

    @staticmethod
    def get_random_canvas_value() -> int:
        return int(random.randint(-CORRECTED_CANVAS_RADIUS, CORRECTED_CANVAS_RADIUS))

    def init_particles(self) -> set[tuple[float, float]]:
        mem: set[tuple[float, float]] = set()

        for _ in range(5):
            while True:
                pair = (self.get_random_canvas_value(), self.get_random_canvas_value())
                if pair not in mem:
                    break
            mem.add(pair)

        return mem

    def update(self) -> None:
        raise NotImplementedError()


class HopDiffusion(Simulation):
    def __init__(self, n: int = 5) -> None:
        self.boundary_coordinates_for_plot: list[tuple[int, int, int, int]] = []
        self.boundary_coordinates: list[tuple[tuple[float, float], tuple[float, float]]] = []
        self.generate_boundaries()

        super().__init__(n)

    def generate_boundaries(self) -> None:
        step: int = (RADIUS * 2) // NUMBER_OF_COMPARTMENTS_PER_DIRECTION

        for i in range(6):
            if i % 3 == 0:
                continue
            horizontal: bool = i < NUMBER_OF_COMPARTMENTS_PER_DIRECTION
            curr = i * step if horizontal else (i - NUMBER_OF_COMPARTMENTS_PER_DIRECTION) * step

            width = BOUNDARY_THICKNESS if horizontal else (RADIUS * 2) + (BOUNDARY_OVERFLOW * 2)
            height = BOUNDARY_THICKNESS if not horizontal else (RADIUS * 2) + (BOUNDARY_OVERFLOW * 2)
            x = curr - RADIUS - (BOUNDARY_THICKNESS // 2) if horizontal else -RADIUS - BOUNDARY_OVERFLOW
            y = curr - RADIUS - (BOUNDARY_THICKNESS // 2) if not horizontal else -RADIUS - BOUNDARY_OVERFLOW

            self.boundary_coordinates_for_plot.append((x, y, width, height))
            self.boundary_coordinates.append(
                ((x, x + width), (y, y + height))
            )

    @staticmethod
    def can_particle_hop_boundary_probability() -> bool:
        return random.random() < HOP_PROBABILITY_PERCENTAGE

    def is_particle_on_specific_boundary(self, pos: tuple[float, float], idx: int) -> bool:
        return util.is_point_within_bounds(pos, self.boundary_coordinates[idx])

    def is_particle_on_boundary(self, pos: tuple[float, float]) -> bool:
        return any(
            util.is_point_within_bounds(pos, bounds_of_boundary)
            for bounds_of_boundary in self.boundary_coordinates
        )

    def is_particle_in_compartment(self, particle: tuple[float, float]) -> bool:
        return not self.is_particle_on_boundary(particle)

    def get_surrounding_boundary_of_particle(self, pos: tuple[float, float]) -> int:
        for idx, bounds_of_boundary in enumerate(self.boundary_coordinates):
            if util.is_point_within_bounds(pos, bounds_of_boundary):
                return idx
        return -1

    def make_particle_jump(
        self,
        new_pos: tuple[float, float],
        x_dir: float, y_dir: float,
    ) -> tuple[float, float]:
        surrounding_boundary_idx = self.get_surrounding_boundary_of_particle(new_pos)

        while self.is_particle_on_specific_boundary(new_pos, surrounding_boundary_idx):
            new_pos = util.increment_tuple_by_val(
                new_pos, (np.sign(x_dir), np.sign(y_dir))
            )

        new_pos = util.increment_tuple_by_val(
            new_pos, (
                np.sign(x_dir) * BOUNDARY_JUMP,
                np.sign(y_dir) * BOUNDARY_JUMP,
            ),
        )

        # Special case: In some instances the jump may land the particle
        # on a subsequent boundary so we repeat the function. We decrement
        # the particle's coordinates until it is out.
        while self.is_particle_on_boundary(new_pos):
            new_pos = util.increment_tuple_by_val(
                new_pos, (np.sign(-x_dir), np.sign(-y_dir))
            )

        return new_pos

    def update_path(self, idx: int) -> None:
        x, y = self.paths[idx][-1]
        assert not self.is_particle_on_boundary((x, y))

        diffusion_factor = MEMBRANE_DIFFUSION_FACTOR_CORRECTED
        x_dir, y_dir = [util.get_random_normal_direction() * diffusion_factor for _ in range(2)]
        new_pos = x + x_dir, y + y_dir

        if self.is_particle_on_boundary(new_pos):
            if self.can_particle_hop_boundary_probability():
                new_pos = self.make_particle_jump(new_pos, x_dir, y_dir)
            else:
                new_pos = util.change_direction((x, y), (x_dir, y_dir))

        self.paths[idx].append(new_pos)

    def update(self) -> None:
        for i in range(self.n_particles):
            self.update_path(i)

    def init_particles(self) -> set[tuple[float, float]]:
        mem: set[tuple[float, float]] = set()

        for _ in range(5):
            while True:
                pair = (self.get_random_canvas_value(), self.get_random_canvas_value())
                if not (pair in mem or self.is_particle_on_boundary(pair)):
                    break
            mem.add(pair)

        return mem


class Nanodomain(Simulation):
    def __init__(self, n: int = 5):
        super().__init__(n)
        self.nanodomain_coordinates: list[tuple[float, float]] = [
            (-100, 100), (0, 0), (150, -60), (-130, -160)
        ]
        self.nanodomain_radii: list[int] = [80, 20, 50, 140]

    def get_nanodomain_attributes(self) -> list[tuple]:
        return list(map(
            lambda coord, radius: (coord, radius),
            self.nanodomain_coordinates,
            self.nanodomain_radii,
        ))

    def is_particle_in_nanodomain(self, particle: tuple) -> bool:
        return any(
            util.compute_distance(particle, circle_center) <= radius
            for circle_center, radius in
            zip(self.nanodomain_coordinates, self.nanodomain_radii)
        )

    def update_path(self, idx: int) -> None:
        x, y = self.paths[idx][-1]
        diffusion_factor = NANODOMAIN_DIFFUSION_FACTOR_CORRECTED if (
            self.is_particle_in_nanodomain((x, y))) else MEMBRANE_DIFFUSION_FACTOR_CORRECTED
        x_dir, y_dir = [util.get_random_normal_direction() * diffusion_factor for _ in range(2)]
        self.paths[idx].append((x + x_dir, y + y_dir))

    def update(self) -> None:
        for i in range(self.n_particles):
            self.update_path(i)


def handle_nanodomain(ax: plt.Axes, sim: Nanodomain) -> None:
    nanodomains = [
        plt.Circle(
            *param,
            color='black',
            alpha=0.2,
        )
        for param in sim.get_nanodomain_attributes()
    ]
    for nanodomain in nanodomains:
        ax.add_patch(nanodomain)


def handle_hop_diffusion(ax: plt.Axes, sim: HopDiffusion) -> None:
    for param in sim.boundary_coordinates_for_plot:
        boundary = plt.Rectangle(
            tuple((param[0], param[1])),
            param[2], param[3],
            color='black',
            alpha=0.7,
            clip_on=False,
        )
        ax.add_patch(boundary)


def get_coordinates_for_plot(sim: Simulation, idx: int):
    return util.get_x_coordinates(sim.paths[idx]), util.get_y_coordinates(sim.paths[idx])


def get_coordinates_for_heads(sim, idx: int):
    return util.get_last_point(sim.paths[idx])


def set_plot_parameters(ax):
    ax.tick_params(axis='y', direction='in', right=True, labelsize=16, pad=20)
    ax.tick_params(axis='x', direction='in', top=True, bottom=True, labelsize=16, pad=20)

    # legends and utilities
    ax.set_xlabel(r'nm', fontsize=16)
    ax.set_ylabel(r'nm', fontsize=16)

    # border colors
    ax.patch.set_edgecolor('black')
    ax.patch.set_linewidth(2)

    ax.set_xlim(-RADIUS, RADIUS)
    ax.set_ylim(-RADIUS, RADIUS)


def plot(sim: Simulation) -> None:
    fig, ax = plt.subplots(figsize=[5, 5], dpi=DPI)

    path_plots: list[plt.Line2D] = [
        ax.plot(
            *get_coordinates_for_plot(sim, i),
            markersize=15, color=colors[i],
        )[0]
        for i in range(5)
    ]

    head_plots: list[plt.Line2D] = [
        ax.plot(
            *get_coordinates_for_heads(sim, i),
            markersize=7, color=colors[i], marker=markers[i],
            markerfacecolor='white')[0]
        for i in range(5)
    ]

    def initialize_animation() -> list[plt.Artist]:
        set_plot_parameters(ax)
        if isinstance(sim, Nanodomain):
            handle_nanodomain(ax, sim)
        elif isinstance(sim, HopDiffusion):
            handle_hop_diffusion(ax, sim)
        return path_plots

    def update_animation(frame: int) -> list[plt.Artist]:
        sim.update()
        for i, axes in enumerate(path_plots):
            coords = get_coordinates_for_plot(sim, i)
            axes.set_data(*coords)
        for i, head_marker in enumerate(head_plots):
            coords = get_coordinates_for_heads(sim, i)
            head_marker.set_data(*coords)
        return path_plots

    animation = FuncAnimation(
        fig,
        update_animation,
        init_func=initialize_animation,
        interval=20,
    )

    fig.tight_layout()
    plt.show(block=True)


def main() -> None:
    rcParams.update({'figure.autolayout': True})

    sim = HopDiffusion()  # Nanodomain()
    plot(sim)


if __name__ == '__main__':
    main()

Second pass

For any kind of serious simulation, and especially if you ever need to convert this from an animated visualisation to a high-throughput headless simulation, you need to vectorise with Numpy. This is a long process and requires rewriting most of your code. I will show examples but not the complete program.

generate_boundaries() becomes a function that produces two three-dimensional arrays:

    @staticmethod
    def generate_boundaries() -> tuple[np.ndarray, np.ndarray]:
        stop = RADIUS + BOUNDARY_OVERFLOW
        inner_edges = np.linspace(
            start=-RADIUS, stop=RADIUS, num=NUMBER_OF_COMPARTMENTS_PER_DIRECTION + 1,
        )[1: -1]
        n_edges = len(inner_edges)

        # edges by x/y by start/stop
        bound_coords = np.empty((2*n_edges, 2, 2))
        bound_coords[:n_edges, 0, 0] = inner_edges - BOUNDARY_THICKNESS/2
        bound_coords[:n_edges, 0, 1] = inner_edges + BOUNDARY_THICKNESS/2
        bound_coords[:n_edges, 1, :] = -stop, stop
        bound_coords[n_edges:, ...] = bound_coords[:n_edges, ::-1, :]

        # edges by x/y by start/size
        plot_coords = bound_coords.copy()
        plot_coords[..., 1] = bound_coords[..., 1] - bound_coords[..., 0]

        return plot_coords, bound_coords

Your random generator becomes

from numpy.random import default_rng
from numpy.random._generator import Generator


rand: Generator = default_rng(seed=0)

and then normally-distributed Brownian offset generation becomes

def get_random_normal_direction() -> float:
    return rand.normal(scale=MEMBRANE_DIFFUSION_FACTOR_CORRECTED, size=2)

Bounds-checking becomes

def is_point_within_bounds(
    pos: np.ndarray,
    bounds: np.ndarray,
) -> bool:
    return (
        (bounds[:, 0] <= pos) &
        (bounds[:, 1] >= pos)
    ).all()

Your initial particle state loses all of its uniqueness concerns, operates on floats only, and becomes

    def init_particles(self) -> np.ndarray:
        return rand.uniform(
            low=-CORRECTED_CANVAS_RADIUS,
            high=CORRECTED_CANVAS_RADIUS,
            size=(self.n_particles, 2),
        )

etc.

\$\endgroup\$
3
  • \$\begingroup\$ "rec should be redesigned so that it does not recurse" because the recursion is unbounded. Recursion in general is fun! In this case, just use something off the shelf. There's copypasta in itertools, or import more-itertools.unique_everseen so you can write self.particlesLocation = list(itertools.islice(unique_everseen(more-itertools.repeatfunc(get_random_canvas_value)), 5)), or something like that... \$\endgroup\$ Commented Oct 2, 2022 at 16:35
  • 1
    \$\begingroup\$ @ShapeOfMatter Recursion in general is indeed fun, but should be avoided like the plague in Python especially when iterative alternatives are available. With no tail optimisation and a very shallow stack, Python is uniquely ill-suited to recursion. \$\endgroup\$
    – Reinderien
    Commented Oct 2, 2022 at 16:37
  • 2
    \$\begingroup\$ This is amazing, thank you for your time and wisdom! I learned A LOT through this comment. \$\endgroup\$
    – hexaquark
    Commented Oct 2, 2022 at 18:57
5
\$\begingroup\$

and welcome to Code Review.

With apologies, I should probably start with a bit of a disclaimer: I'm just on the way out of an infection and my brain is not working at full capacity. Obviously you should always take things you read on the internet with a grain of salt, but here especially feel free to question anything that doesn't make sense. I also won't be able to do a comprehensive review, so if I don't talk about something it means nothing more or less than that I didn't talk about it. In particular, I'm not going to review the actual correctness of the diffusion logic. That out of the way, here's some of my impressions.

  • Units: Units are really important in any sort of scientific programming. I appreciate the fact that you have comments attached to constant definitions clarifying the units of things like TIME_PER_FRAME. If your code gets much more complicated and you start tripping over them, it may be worth adding even more structured handling to your units such as runtime enforcement with Pint.
  • Type annotations: Python's support for type annotations is one of the most significant changes to the language in the last decade, and it's good that you're making use of it. I imagine it comes fairly naturally if you have a more C++ background. I would urge you to put a little more time into using them comprehensively to get the most use out of them. I assume that you're using a tool such as MyPy to validate your code. The way that such tools work, any function which doesn't have annotations just doesn't get checked. And point the tool loses track of what a variable is, it defaults to Any which basically means "Assume the programmer is right." While gratifying, that's not helpful! So it's well worth your time filling in annotations across the board. Additionally, if you annotate things like x: Tuple[int, int] instead of x: Tuple, you allow MyPy to warn you if you did a: str = x[0] or a, b, c = x.
  • Code separation: It is good that you've split up the code into different files responsible for different things. There are points where I'm not clear why something was grouped as it was. For example, DPI features in simulation.py, but it only gets used (and only really makes sense) in plotGenerator.py as the graphical output bit of your code. One particular tip that relates to this is to avoid the use of the form from X import * in production code. That helps slow you down enough to ask "Wait, why am I getting this from here?" as well as avoiding any nasty surprises when two different modules both claim the same symbol name. Commercial style guides vary, but most favour either the form from X import Y or import X ... X.Y.
  • staticmethod has its uses for functionality that it tightly linked to a class, but for something like your Utils class Python is actually perfectly comfortable with free floating functions at a module level. Object Oriented Programming is a tool like any other; use it when it makes sense (i.e. if you want to pass around an object instance which knows its own state and how to manipulate it) but if you couldn't imagine yourself creating a utils instance it probably
  • Prefer return types to side effects, and prefer fuller creation at definition time. These two sit together as a more general programming principle of "Minimise the possibility of invalid state". For example, in init_particles you have a method which runs exactly once and serves only to fill in the data for particlesLocation. You could instead have it return the data to fill in, and in __init__ would just write something like self.particlesLocation: List[Tuple[int, int]] = self.init_particles() That way there would never be a point where the Simulator object doesn't actually have particles. To be clear, what you have isn't bad because you're still getting it tidied up before the end of __init__ and therefore no external code will see it in that invalid state. The difference I'm suggesting just moves from code that is correct (in this regard) but I have to consciously stop and verify it to code that is obviously correct (in this regard.) A lot of coding best practice is this sort of thing, looking to guide your readers (colleagues, paper reviewers, your future self) through an argument without them having to go "Wait, what about..."
  • Random numbers. You use random numbers, which makes sense for code about diffusion. Especially for scientific programming, if you're using random numbers you should always ensure that they are reproducibly random. Yes, that's a bit of a contradiction on the definition of random; we actually have pseudorandom number generators so there is no (obvious or impactful) link between different numbers in the sequence but if you start over you'll get the same sequence again. There are two ways to ensure this. The easier is to call random.seed(31415926) at the start of your program. The slightly harder but perhaps better practice especially for multithreading, is to define a new object rng = random.Random(271828) and use rng thereafter. (Using a well recognised value like the digits of e or pi is called a "nothing up my sleeve number". It helps convince reviewers that you didn't just play with random seeds until you got one you liked!)
  • Numpy. This isn't something you've touched, and that is absolutely fine. If you get into scientific programming in Python, you will eventually come up against numpy, which is a python library for doing large array operations. Instead of looping through a list and adding something to each, in numpy you'd add two lists. I just mention it because, if you keep writing this code and get to the point where it's frustrating you waiting for the computer to do the calculations, it may be worth exploring that as an avenue.

I am not affiliated in any way with any of the packages I mention, but I have used them all.

Hope this helps. As above, if anything is unclear do ask. With any luck my brain will have booted up enough by then to see what went wrong.

\$\endgroup\$
3
  • 1
    \$\begingroup\$ Quite a good review for an infected brain! \$\endgroup\$ Commented Oct 2, 2022 at 15:47
  • \$\begingroup\$ @Josiah Thank you very much for all these well-detailed tips, I appreciate the help despite the infection! \$\endgroup\$
    – hexaquark
    Commented Oct 2, 2022 at 18:59
  • \$\begingroup\$ Whelp. Seems I recommended a whole library you were already using, and just forgot to finish some of my sentences. But I'm glad you found it useful. \$\endgroup\$
    – Josiah
    Commented Oct 3, 2022 at 8:47
4
\$\begingroup\$

These are kinda nit-picky; i noticed one or two things, so I ran pylint against the code and I'll summarize it here, leaving out stuff Josiah already mentioned.

  • main.py: Unnecessary semicolon
  • Lots of trailing whitespace. Python is whitespace sensitive, so while I don't think trailing whitespace can ever cause an error, it's a good habit to keep your whitespace tidy. There was also a place where you indented five spaces, which could cause an error in some cases. (and of course never use tabs)
  • Use snake_case instead of camelCase for most stuff. PascalCase is used for classes though. Making the transition between conventions is a pain, but it's less of a pain than fighting convention :(
  • Don't use a list comprehension for a side-effectful for loop. (plotGenerator.py:24 and others)
    It's poor form to have side-effects in a comprehension at all.
  • Avoid shadowing names, especially built-ins. (e.g. your argument type at plotGenerator.py:59)
  • Put if/else bodies on new lines.
  • Don't leave unused stuff lying around your code-base. Obviously all these points can be raised by a linter, but this is one you kinda need a linter for in practice. (e.g. plotGenerator.py:83: Unused argument frame)
  • There are conventions about how to organize your imports, whatever.
  • I'm quite fond of generator comprehensions, so it was a pleasant surprise that the linter suggested changing util.py:8 to ... = min(min(elem[0]) for elem in lists).

So, find a linter you can stand and configure it for your use. Complying with every formatting convention can be a chore, but it's valuable to any co-workers, and there are some real(ish) problems that a linter can find, on top of what your type-checker will find.

One of the first things I noticed was your declaration in simulation.py of PATH = Tuple[List[float]]. MyPy doesn't complain about it because it's never used, and the linter didn't complain that it's never used because (i think) it's a global definition in a module (so it might be useful to consumers of the module). The problem with that type expression is that it's specifically a tuple of length one, which is unlikely to be what you actually want (ignoring that you apparently don't want PATH at all). Perhaps you meant `Tuple[List[float], ...]

\$\endgroup\$
1
  • \$\begingroup\$ Cheers! I will look into pylint, haven't heard of it before. \$\endgroup\$
    – hexaquark
    Commented Oct 2, 2022 at 18:59

Not the answer you're looking for? Browse other questions tagged or ask your own question.