Strength and weakness of disease-induced herd immunity in networks

Takayuki Hiraoka takayuki.hiraoka@aalto.fi Zahra Ghadiri Abbas K. Rizi Mikko Kivelä Jari Saramäki Department of Computer Science, Aalto University, 00076 Espoo, Finland

Abstract

When a fraction of a population becomes immune to an infectious disease, the population-wide infection risk decreases nonlinearly due to collective protection known as herd immunity. Studies based on mean-field models suggest that natural infection in a heterogeneous population may induce herd immunity more efficiently than homogeneous immunization. Here, we use network epidemic models to show that the opposite can also be the case. We identify two competing mechanisms driving disease-induced herd immunity in networks: the high density of immunity among socially active individuals enhances the herd immunity effect, while the topological localization of immune individuals weakens it. The effect of localization is stronger in networks embedded in low-dimensional space, which can make disease-induced immunity less effective than random immunization. Our results highlight the role of networks in shaping herd immunity and call for careful examination of model predictions that inform public health policies.

I Introduction

A key challenge in infectious disease control is to protect a population by conferring immunity [1, 2, 3, 4, 5, 6]. When some individuals gain immunity and become no longer susceptible to a disease, their presence has a nonlinear impact on the level of protection for the population as a whole. Individuals without immunity also benefit from the reduced risk of infection because they are less likely to come into contact with others who could transmit the disease. In particular, there is a critical proportion of immune individuals above which the chain of transmission cannot be sustained; hence, the disease cannot invade the population. This concept is known in epidemiology as herd immunity [7, 8, 9, 10, 11].

The theoretical understanding of herd immunity through mathematical modeling emerged in the 1970s [11]. The main focus then was on estimating the vaccine coverage necessary for disease elimination. A simple expression for estimating the herd immunity threshold was obtained using a basic model that assumes random immunization of a homogeneous, fully mixed population [12]. At the same time, much theoretical effort has been devoted to bridging the gap between such simplifying assumptions and the heterogeneity of real-world populations [13, 12, 14, 15, 16, 8, 17]. Such population heterogeneity can be leveraged to design efficient, targeted vaccination strategies.

More recently, the notion of disease-induced herd immunity has gained attention, particularly in the context of the COVID-19 pandemic [18, 19, 20, 21], for which no vaccine was available at the early stage. Here, immunity is acquired through natural infection instead of vaccination. If one assumes a homogeneous and fully mixed population, disease-induced and vaccination-induced herd immunity are mathematically equivalent. However, this equivalence breaks down when variation in contact patterns is introduced. Britton et al. [18] showed that the threshold for disease-induced herd immunity in a heterogeneous population is considerably lower than expected in the homogeneous case; similar conclusions were drawn by several studies that adopted data-driven modeling approaches [19, 20, 21]. The essential reason for the lower threshold is that individuals with more contacts are more likely to get infected and become immune in an outbreak; epidemic spread thus effectively acts as targeted immunization.

However, these results are derived using either differential equation models or network models where edges are randomly drawn at every time step. In such modeling approaches, correlations at small scales are disregarded even if population heterogeneity is considered, and interactions essentially take place in a mean-field manner. While these assumptions provide a convenient starting point, they are seldom met in real-world populations. In reality, interactions often occur repeatedly between the same pairs of individuals and are heavily influenced by social and spatial constraints. Such characteristics are better captured by modeling the contact structure as a static network that encodes these constraints [22, 23].

In this study, we use network epidemic models to reexamine how immunity induced by a past epidemic affects the outcome of future epidemics. Building on earlier studies that demonstrate the role of network structure in disease-induced herd immunity [24, 25, 26, 27, 28], we aim to unpack the mechanisms shaping the herd immunity effect induced by natural infection in comparison to random immunization and clarify the difference between network epidemic models and mean-field models, particularly focusing on the role of spatial embeddedness.

Our key discovery is that in network models, disease-induced immunity is driven not only by preferential immunity among socially active individuals—as already identified in mean-field models—but also by another, counteracting mechanism inherent to network models. In an outbreak originating from a single source (an initially infected individual), the set of individuals who become infected is necessarily topologically contiguous in the network; see Fig. 1A for a schematic illustration. This localization of immunity gives rise to mixing heterogeneities akin to those discussed in the context of vaccination and other interventions [29, 30, 31, 32, 33]. We find that these correlations between the epidemic states of individuals and their network positions have a major impact on the herd immunity effect and can, in fact, be strong enough to offset the advantage of disease-induced immunity over random immunization for homogeneous and/or spatially embedded networks. However, notably, they cannot be accounted for by mean-field models as typically used in the literature.

Refer to caption — Figure 1: Models. (A) Comparison between the distributions of the same number of immune nodes induced by an epidemic and by a random immunization in a random geometric graph. Immune nodes are colored red, while susceptible nodes are light blue if they belong to the largest component of the residual subgraph and gray otherwise. The edges not included in the residual graph are represented by dashed lines, with light red edges connecting two immune nodes and purple edges connecting an immune node and a residual node. (B) Schematic illustration of the quantitative definition of herd immunity using the relationship between the immune fraction $\pi_{0}$ and the giant residual component size $\pi_{1}$ . (C) Network models used in this study, positioned according to their level of degree heterogeneity (vertical axis) and spatiality (horizontal axis).

We first introduce the model of epidemic dynamics and the network structures and formalize how we quantify the effect of herd immunity in the network setting. We then show that the net effectiveness of disease-induced herd immunity is determined by the competition between the two competing mechanisms: high density of immunity among highly connected nodes and localization of immunity within the network. After illustrating the strength of each effect in configuration model networks with different levels of degree heterogeneity, we quantify the effect of localization as a function of the level of spatial embedding. Finally, using real-world contact data, we show that the mean-field model can underestimate the herd immunity threshold of a networked population. We finish by discussing the implications of our findings.

II Methods

II.1 Epidemic dynamics and immunity

We use the canonical susceptible-infected-recovered (SIR) model to describe the dynamics of non-recurrent epidemics. Individuals and the contacts between them are represented as nodes and edges in an undirected contact network of size $N$ . We assume that the contact network is quenched, i.e., remains static throughout the epidemic timescale. The dynamical state of each node is either susceptible, infected, or removed, and this state is updated in continuous time. Transmission occurs between each connected pair of an infected node and a susceptible node independently at rate $\beta$ , after which the susceptible node becomes infected, while an infected node recovers at rate $\gamma$ . Once recovered, the nodes are fully immune, i.e., they can no longer become infected or transmit the disease, and are thus effectively removed from the system. In what follows, we set $\gamma=1$ unless specified otherwise.

We are interested in comparing two different scenarios in which immunity is introduced into a fully susceptible population. In the first scenario, individuals gain immunity through contracting the disease. An epidemic spreads from a randomly selected seed node until it eventually dies out, making the nodes that have experienced infection immune to future epidemics. In the second scenario, we model homogeneous immunization by randomly selecting susceptible individuals to be permanently immunized and moving them directly to the recovered compartment.

Regardless of how immunity is induced, its effectiveness at the population level is defined by its ability to prevent future outbreaks from spreading in the population or to reduce their size. In general, subsequent outbreaks can be caused by a variant of the original pathogen that is more contagious than the original. Consider an infection that cannot infect the immune individuals but spreads among the susceptible ones with a transmission rate greater than $\beta$ . If a fraction $\pi_{0}$ of the population has already acquired immunity through natural infection or immunization, this post-immunity epidemic cannot be larger than $1-\pi_{0}$ due to direct protection. This is the size of the residual subgraph [24] induced by the still susceptible nodes.

The actual size of the post-immunity epidemic depends on the contagiousness of the disease. Even if the disease is contagious enough to be always transmitted from an infected individual to a susceptible neighbor, the epidemic still cannot grow larger than the largest connected component of the residual subgraph, which we call the giant residual component for short and denote its relative size by $\pi_{1}$ . In the limit of large population size, every connected component in the residual graph except the giant one, if it exists at all, covers a vanishing fraction of the population. The nodes in one of the small components of the residual graph are, therefore, almost certainly protected from future epidemics of arbitrary contagiousness, even though they are not themselves immune. This represents the herd immunity effect, which can be quantified by the sum of the sizes of the small components or, equivalently, the difference in size between the residual subgraph and its giant component (see Fig. 1B).

The herd immunity effect can vary depending on the specific distribution of immune nodes on the network even if their relative size $\pi_{0}$ is the same. In particular, we focus on two aspects of immunity distribution: its bias towards high-degree nodes and its topological localization. We quantify the first with the ratio between the mean degree of immune nodes to the mean degree of the entire network, denoted by $\langle k\rangle_{\mathrm{R}}$ , and the second with the share of edges residing between the immune and residual subgraphs, denoted by $\rho_{\mathrm{SR}}$ . When immune nodes are more likely to be adjacent to each other in the network, there are fewer edges at the interface between the two subgraphs. Conversely, the number of edges at the interface grows as the mixing between immune and susceptible nodes becomes more heterophilic.

II.2 Network structures

Epidemic dynamics is largely influenced by the individual variance in contact and transmission patterns [34, 35, 36], which translates to the heterogeneity of node degrees in the contact network. The paradigmatic network model used to express degree heterogeneity is called the configuration model, where the distribution of node degrees solely determines the network structure [37]. In a configuration model network, the structure is locally tree-like, meaning that the likelihood of a node being part of a finite-length cycle diminishes as the network size increases. This feature often simplifies analytical calculations and makes the model more tractable.

However, real-world contact networks through which diseases are transmitted are hardly tree-like; rather, they are characterized by the abundance of short cycles. This is because contact and transmission between individuals can only occur when individuals are physically close to each other. If two individuals have a common neighbor in the contact network, they are likely to be in proximity in space, which implies a high probability that they are also connected to each other. As a result, many triangles and short cycles are formed.

Here, we explore a wide range of network structures, focusing on degree heterogeneity and spatial embeddedness. Figure 1C illustrates these two features. Degree heterogeneity refers to the variance in node connectivity. Node degrees are as homogeneous as possible when all nodes have the same number of neighbors, i.e., the degree distribution is degenerate, as in random regular graphs (RRGs) and lattices. In Erdős-Rényi graphs (ERGs) and random geometric graphs (RGGs), the degrees are moderately heterogeneous and follow Poisson distributions, where the variance is equal to the mean. At the more heterogeneous end of the spectrum, the network is characterized by the presence of nodes with significantly more connections than average. We use negative binomial distributions (parameterized by dispersion parameter $r$ ; smaller $r$ represents higher degree heterogeneity) to represent such high heterogeneity. For all the network models, we fix the mean degree $\langle k\rangle=6$ .

Spatial embeddedness, or spatiality, refers to the extent to which the underlying geometric configuration of nodes within a metric space determines the connectivity between them. Here, each node occupies a position in the underlying space. In a highly spatial network, nodes are more likely to be linked to each other if they are closer in the space. The least spatial networks belong to the family of configuration network models, such as RRGs and ERGs, which represent maximum randomness under the given degree distribution. Lattices and RGGs are at the other end of the spectrum where spatiality is highest because nodes are connected if and only if their positions are within a certain distance.

We adopt two different approaches to interpolate continuously between spatial and random networks. We use an edge randomization scheme to tune the spatiality for relatively homogeneous networks with degenerate and Poisson degree distributions. Starting with an instance of a network model with the highest spatiality, i.e., a lattice or an RGG, we rewire fraction $\phi$ of all edges by exchanging the endpoints of two randomly selected edges [38]. This process, commonly known as the double edge swap, preserves the degree of each node and allows us to adjust the spatiality without altering the original degree distribution. By completely randomizing edges (i.e., $\phi=1$ ), the double edge swap operation transforms a lattice into an RRG and an RGG into an ERG.

To explore the spatial network topologies with higher levels of degree heterogeneity, we adopt the heterogeneous random geometric graph (HRGG) model proposed by Boguñá et al. [39], which allows us to control the degree distribution and spatiality independently. In the HRGG model, each node is assigned an expected degree and a position in a metric space, which is here assumed to be a two-dimensional unit square with periodic boundaries. Given the expected degrees and positions, the model generates random networks that satisfy the following properties: (i) the degree of each node is a Poisson random variable with expectation equal to the expected degree assigned to the node; (ii) the spatiality of the network is expressed as the propensity of nodes to form local connections in the underlying space, which is governed by an independent parameter called the temperature $\tau>0$ . Low values of $\tau$ imply that nodes in proximity are more likely to be linked, and thus, the generated network is more strongly embedded in the space. On the other hand, in the limit of $\tau\to\infty$ , the edges are agnostic to the positions of nodes in the underlying space, making the model equivalent to the configuration model. We use this model to generate networks with Poisson and negative binomial degree distributions with varying levels of spatiality. Note that this model cannot generate networks with a degree distribution more homogeneous than Poisson. See Supplementary Text B for details about the HRGG model.

The two methods both induce randomness in the connection patterns between nodes embedded in space, but in different ways. We use the mean length of edges to evaluate the spatiality of the generated networks in a unified manner. The length $d_{ij}$ of the edge between nodes $i$ and $j$ is the Euclidean distance between the positions of $i$ and $j$ in the underlying space, and $\langle d\rangle$ denotes the mean length of all edges. A completely rewired network with $\phi=1$ and a “hot” HRGG with $\tau\to\infty$ are equivalent to the configuration model; in such cases, the mean edge length is equal to the expected distance $d^{*}$ between two random points:

	$\displaystyle d^{*}$	$\displaystyle=4\int_{0}^{\frac{1}{2}}\int_{0}^{\frac{1}{2}}\sqrt{x^{2}+y^{2}}% \,dxdy$		(1)
		$\displaystyle=\frac{\sqrt{2}+\ln(\sqrt{2}+1)}{6}\simeq 0.3826\dots.$		(1)

On the other hand, when $\phi=0$ or $\tau\to 0$ , each node is connected exclusively to other nodes in their proximity; therefore $\langle d\rangle\to 0$ in the large system size limit. Between these two extremes, $\langle d\rangle$ responds monotonically as a function of $\phi$ or $\tau$ . We adopt $\langle d\rangle/d^{*}$ as the normalized measure that represents the randomness of a network with respect to the underlying space.

II.3 Numerical and analytical methods

In our numerical simulations, we compute the outcome of the SIR model by mapping it to an epidemic percolation network [40], a directed network that encodes the stochastic epidemic dynamics on the original network. This mapping significantly simplifies the numerical analysis; for instance, all the nodes that will be infected in an outbreak can be obtained as the descendants of the initially infected nodes in the epidemic percolation network.

We derive the message-passing formalism for bond percolation [41] to analytically calculate relevant quantities, such as $\pi_{1}$ , $\langle k\rangle_{\mathrm{R}}$ , and $\rho_{\mathrm{SR}}$ . For configuration model networks, this method is equivalent to the probability generating function method for solving bond percolation [42, 24]. We discuss the details of the message-passing formalism in Supplementary Text C.

III Results

III.1 Localization of disease-induced immunity significantly weakens herd immunity

We first focus on the herd immunity effect in configuration model networks, where the degree distribution is the only defining feature. We study, in increasing order of degree heterogeneity, random regular graphs (RRGs), Erdős-Rényi random graphs (ERGs), and the configuration model networks with negative binomial degree distributions. For each of the network ensembles, we first calculate the expected size $\pi_{0}$ of a large epidemic (an outbreak that infects a finite fraction of the population) as a function of transmission rate $\beta$ . Then, we compute the size of the giant residual component, $\pi_{1}$ , for three cases: after removing the nodes that are infected in the epidemic, after removing the same number of nodes but randomly, and after removing the same number of nodes with the same degrees but randomly. We refer to the third case as the proportional immunization. The results of analytical and numerical calculations are summarized in the left columns of Fig. 2.

For RRGs, we observe that, strikingly, the herd immunity effect induced by natural infection is weaker than that of random immunization; the residual subgraph is always larger in the case of disease-induced immunity (Fig. 2A). As RRGs have no degree heterogeneity and their structure is entirely random, the herd immunity effect is entirely dictated by the localization of immunity in the wake of the outbreak. The contiguous nature of the subgraph covered by disease-induced immunity is clearly visible in the smaller size $\rho_{\mathrm{SR}}$ of the interface between the immune and residual subgraph.

When the contact network is an ERG in which node degrees are moderately heterogeneous, disease-induced immunity and random immunization result in a giant residual component of the same size. Although the herd immunity effect is strengthened by the fact that the outbreak preferentially leaves high-degree nodes immune, it is weakened to an equal extent by the topological contiguity of the outbreak. Both effects are visible in Fig. 2C: the average degree ${\langle k\rangle}_{\mathrm{R}}$ of immune nodes is higher for disease-induced immunity, but the size of the susceptible-immune interface $\rho_{\mathrm{SR}}$ is smaller. The equality of the two herd immunity effects can be shown analytically; we prove in Supplementary Text D that the Poisson distribution is, in fact, the unique degree distribution of the configuration model in which the two effects exactly offset each other. The giant residual component size under proportional immunization implies that the herd immunity effect would be stronger if only the effect of preference for high-degree nodes is taken into account. This leads to the important conclusion that neglecting the localization effect overestimates the strength of disease-induced herd immunity.

As the degree heterogeneity of the contact network increases, the advantage of disease-induced immunity in being able to exploit the heterogeneity and to reside preferentially among high-degree nodes outweighs the localization effect, as seen in the results for configuration model networks with a negative binomial degree distribution with dispersion parameter $r=3$ (Fig. 2E). Importantly, disease-induced herd immunity is still significantly weaker than what would be expected from the degrees of immune nodes alone. This is clearly visible in the difference in the size of the giant residual component for disease-induced immunity and proportional immunization, as well as in the average degrees of immunized nodes and interface sizes. Qualitatively similar results are obtained for scale-free networks, corroborating our findings (Fig. S1A).

To conclude, the net effect of disease-induced herd immunity is determined by the competition between preferential immunity among high-degree nodes and immunity localization. We observe that localization significantly weakens herd immunity for all the studied networks; the fact that proportional immunization, which represents the pure effect of preference for high-degree nodes of disease-induced immunity without dynamical correlations, always results in smaller giant residual component underpins that localization is another factor that defines the strength of disease-induced herd immunity.

III.2 Spatiality further weakens disease-induced herd immunity by enhancing localization

Unlike the configuration model networks studied so far, real-world contact networks are constrained by physical space and are therefore effectively lower dimensional. As the ratio of surface area to volume is smaller in lower dimensions, one can expect that the susceptible-immune interface under disease-induced immunity is smaller and the effect of localization is more pronounced in networks embedded in low-dimensional space. Accordingly, we expect disease-induced herd immunity to be weaker in spatial networks. To study this, we numerically investigate the herd immunity effect in regular lattices, random geometric graphs (RGGs), and heterogeneous random geometric graphs (HRGGs).

We observe that for regular lattices, disease-induced immunity leads to a larger giant residual component than random or proportional immunization (Fig. 2B), similarly to the case for RRGs. The smaller size of the interface, $\rho_{\mathrm{SR}}$ , implies that the effect of localization is stronger in a regular lattice than in an RRG due to its lower dimensionality.

For RGGs, the gap between disease-induced immunity and random immunization is even larger, confirming the above hypothesis and implying a greater advantage of random immunization over disease-induced immunity in efficiently shrinking the giant residual component (Fig. 2D). In an RGG, the immune nodes under disease-induced immunity have very small interface with the residual nodes, indicating that they are highly localized. Although disease-induced immunity can exploit the modest heterogeneity of the Poisson degree distribution, the impact of localization is much more pronounced, overriding the effect of preference for high-degree nodes. This is in contrast to the case of ERGs, the configuration model counterpart of RGGs, where the effects of the two mechanisms exactly cancel each other out. Note that in RGGs, the two mechanisms are correlated; preference for high-degree nodes amplifies localization because of degree correlations.

The impact of spatial embeddedness is particularly evident for networks with higher degree heterogeneity. For HRGGs with negative binomial degree distribution with dispersion parameter $r=3$ and temperature $\tau=0.05$ , which amounts to $\langle d\rangle/d^{*}=0.01$ , the herd immunity effect induced by natural infection is weaker than that achieved by random immunization (Fig. 2F). Juxtaposed against the configuration model counterpart, where the opposite result is found, this highlights that the spatiality of the contact network can boost the effect of localization to the extent that it overcomes the counteracting effects of preferential immunity among high-degree nodes, thus reversing the outcome. In scale-free HRGGs, characterized by even higher degree heterogeneity, disease-induced immunity still proves more efficient than random immunization in dismantling the giant residual component (Fig. S1B). Even then, there is a significant gap between $\pi_{1}$ for disease-induced immunity and proportional immunization, highlighting the impact of immunity localization.

We have so far established that the low-dimensional topology of the contact network diminishes the disease-induced herd immunity effect by amplifying immunity localization. We next ask: How strongly does the network need to be embedded in space for the effect of preferential immunity among high-degree nodes to be outweighed by the effect of immunity localization? To this end, we compare the effectiveness of disease-induced immunity and random immunization across the spatiality spectrum. We use the edge rewiring process and the HRGG model with varying temperatures to continuously cover the spectrum of spatiality. As a common measure of spatiality across two schemes, we use the normalized mean edge length $\langle d\rangle/d^{*}$ as described in Methods. We represent the effectiveness of herd immunity by the minimum fraction of nodes that need to be immune to eliminate the giant residual component: $\pi_{\mathrm{c}}=\min\{\pi_{0}\mid\pi_{1}=0\}$ . In other words, even a disease with an infinitely large transmission rate cannot invade the population if $\pi_{0}\geq\pi_{\mathrm{c}}$ . Thus, $\pi_{\mathrm{c}}$ represents the worst-case bound for the herd immunity threshold. Here, we numerically identify $\pi_{\mathrm{c}}$ as the minimum value of $\pi_{0}$ that leads to $\pi_{1}\leq 0.01$ . We let $\pi_{\mathrm{c}}^{\mathrm{(d)}}$ and $\pi_{\mathrm{c}}^{\mathrm{(r)}}$ denote the threshold under disease-induced immunity and random immunization, respectively.

Figure 3A shows the difference between the herd immunity thresholds of random immunization and disease-induced immunity, $\pi_{\mathrm{c}}^{\mathrm{(r)}}-\pi_{\mathrm{c}}^{\mathrm{(d)}}$ , as a function of normalized mean edge length $\langle d\rangle/d^{*}$ for different degree distributions. Positive values indicate that $\pi_{\mathrm{c}}^{\mathrm{(d)}}$ is smaller, therefore natural infection induces a stronger herd immunity effect compared to random immunization, while negative values imply the opposite. For every degree distribution, the difference between the thresholds increases with spatiality. In other words, the spatiality of the contact network decreases the relative advantage (or increases the relative disadvantage) of disease-induced immunity. Fig. 3B shows that the distribution of disease-induced immunity becomes increasingly localized with spatiality, implying that the negative influence of low dimensionality on the disease-induced herd immunity effect is mediated by immunity localization.

As we have already seen, for Poisson degree distributions, the two immunity scenarios induce an equal herd immunity effect without spatiality (i.e., at $\langle d\rangle/d^{*}=1$ ). For more heterogeneous networks, there is a crossover from the regime where disease-induced immunity is more effective ( $\pi_{\mathrm{c}}^{\mathrm{(d)}}<\pi_{\mathrm{c}}^{\mathrm{(r)}}$ ) to the regime where random immunization becomes more advantageous ( $\pi_{\mathrm{c}}^{\mathrm{(d)}}>\pi_{\mathrm{c}}^{\mathrm{(r)}}$ ) as the network becomes more spatial. As the degree distribution becomes broader, as in negative binomial distributions with increasing variance (i.e., decreasing dispersion parameter), the crossover point shifts toward the spatial end, indicating that a higher level of spatiality is required to counterbalance the effect of degree heterogeneity (see also Fig. S1B). This highlights the competition between degree heterogeneity and spatiality of the contact network in determining the strength of disease-induced herd immunity.

III.3 Herd immunity threshold in empirical population

To underscore the difference between the mean-field and network models in estimating the herd immunity threshold, we compare two age-structured models informed by empirical data: a mean-field model and a network model of SIR dynamics, both with the same age structure and age-specific contact patterns. The age structure is captured by the proportion $p_{i}$ of the population in each age group $i$ , while the contact patterns are represented by the contact matrix $M$ , of which element $M_{ij}$ represents the average number of contacts an individual in age group $i$ makes with individuals in age group $j$ . For the contacts between groups to be symmetric, we impose on the contact matrix $p_{i}M_{ij}=p_{j}M_{ji}$ .

The mean-field model is defined by the following set of differential equations:

$\displaystyle\dot{s}_{i}$	$\displaystyle=-\lambda_{i}s_{i},$	(2)
$\displaystyle\dot{y}_{i}$	$\displaystyle=\lambda_{i}s_{i}-\gamma y_{i},$
$\displaystyle\dot{r}_{i}$	$\displaystyle=\gamma y_{i}.$

Here, $s_{i},y_{i},r_{i}$ denotes the proportion of susceptible, infected, and recovered individuals in age group $i$ , respectively, and satisfies $s_{i}+y_{i}+r_{i}=1$ ; $\lambda_{i}$ is the force of infection to which an individual in age group $i$ is subject, which is calculated as

\lambda_{i}=\beta_{\mathrm{MF}}\sum_{j}M_{ij}y_{j},

(3)

where $\beta_{\mathrm{MF}}$ denotes the transmission rate of the mean-field model. In this model, the basic reproduction number can be computed as [43]

R_{0}=\frac{\beta_{\mathrm{MF}}}{\gamma}\,\Lambda(M),

(4)

where $\Lambda(\cdot)$ denotes the spectral radius. By linearizing Eq. (2) around $s_{i}$ and $y_{i}\to 0$ , we obtain the condition that disease with transmission rate $\beta_{\mathrm{MF}}$ cannot invade the population:

\frac{\beta_{\mathrm{MF}}}{\gamma}\,\Lambda\big{(}\mathrm{diag}(\mathbf{s})M% \big{)}\leq 1,

(5)

where $\mathbf{s}$ denotes the vector of which $i$ th element is $s_{i}$ . Thus, this condition represents the herd immunity threshold, with the left-hand side representing the effective reproduction number.

Figure 4A shows the mean-field model simulation of an epidemic with $R_{0}=3$ in the population of Finland in 2007. The age structure [44] and contact matrix [35, 45] are stratified into 16 age groups. By the time the population reaches the herd immunity threshold defined by Eq. (5), $54.1\%$ of the total population has contracted the disease. However, due to the heterogeneity of contact patterns within the population, there is a large variance in attack rate between age groups, ranging from $10.4\%$ among those over 75 years of age to $73.6\%$ among 25–29 year olds.

We formulate the corresponding network model as follows. The population of $N$ individuals is divided into age groups, each with $n_{i}\simeq Np_{i}$ individuals. The contact network is generated by a stochastic block model (SBM). The standard SBM assumes that the probability that individuals are linked to each other depends only on the groups they belong to. It is defined by the number of individuals $n_{i}$ in each group $i$ and the edge probability $B_{ij}$ between an individual in group $i$ and an individual in group $j$ . The edge probability is related to the contact matrix as

B_{ij}=aM_{ij}/n_{j},

(6)

where $a$ is the coefficient that controls the total number of edges in the network. In addition to the age-stratified contact patterns, we further incorporate spatiality by considering a mixture of the SBM and the HRGG model parameterized by temperature $\tau$ ; see Supplementary Text E for details.

Using the same age structure and contact matrix of Finland, we construct the contact network with $N=2\times 10^{4}$ and $a=1$ . For this setup, the normalized mean edge length $\langle d\rangle/d^{*}$ decreases nonlinearly with inverse temperature from $\langle d\rangle/d^{*}=1$ for $\tau^{-1}=0$ to $\langle d\rangle/d^{*}=0.31$ for $\tau^{-1}=3$ and $\langle d\rangle/d^{*}=0.16$ for $\tau^{-1}=10$ . We let an epidemic of $R_{0}=3$ evolve in the network until it reaches the herd immunity threshold defined by Eq. (5), at which point we halt the transmission process and let every infected individual recover. Once the population is free of infection, we reintroduce the disease into the population by infecting a random individual who remained susceptible during the first epidemic. If the population has already reached herd immunity during the first outbreak, the disease will be quickly eliminated, and there will be no epidemic resurgence that infects a finite fraction of the population.

Figure 4B shows the cumulative incidence during the first epidemic and after reintroduction of the disease according to the network model. The size of the first epidemic is hardly affected by spatiality, varying from an average of $54.2\%$ for $\tau^{-1}=0$ to $55.2\%$ for $\tau^{-1}=10$ , which is also consistent with the results of the mean-field model. Upon reintroduction of the disease, simulations for standard SBM networks ( $\tau^{-1}=0$ ) indicate the occurrence of a relatively small epidemic, affecting an additional $2.8\%$ on average. When the network is spatially embedded, we observe a larger resurgence, infecting up to $13.8\%$ of the population. This suggests that the population is still in the epidemic phase, even though the number of immune individuals should be sufficient to achieve herd immunity according to the mean-field model. In other words, the mean-field model underestimates the herd immunity threshold of networked populations in such a scenario, especially when they are spatially embedded.

IV Conclusion

We studied the effectiveness of disease-induced immunity compared to random immunization. First, we compared the herd immunity effect in configuration model networks under the two scenarios, showing that disease-induced immunity provides weaker protection than random immunization in the absence of degree heterogeneity, while they induce equally effective herd immunity effect in Erdős-Rényi graphs. This is due to the “tug-of-war” between two antagonistic mechanisms that shape disease-induced herd immunity in networks. While an epidemic preferentially infects and removes high-degree nodes, which enhances the herd immunity effect, it can only spread over a topologically contiguous part of the network, which reduces the herd immunity effect. Our results indicate that the effect of immunity localization is significant enough to outweigh the preferential immunity among high-degree nodes in homogeneous networks, while in heterogeneous networks, it attenuates collective protection to a lower level than what is expected from the degree distribution of immune nodes alone.

The effect of localization was generally larger for spatially embedded networks. Notably, even if the degree follows a heterogeneous distribution for which disease-induced immunity induces a stronger herd immunity effect in the configuration model, this advantage may be overturned, and disease-induced immunity may prove less effective. Using the edge rewiring model and the heterogeneous random geometric graph model to interpolate the spectrum of spatiality, we found that increasing spatiality generally makes disease-induced herd immunity harder to achieve compared to random immunization. The competition between spatiality and degree heterogeneity of the network is evident from the fact that the former needs to be compensated for by the latter for disease-induced immunity to be more effective than random immunization. In sum, our results suggest that the two underlying mechanisms of disease-induced herd immunity are each influenced by one of the two features of network topology: while degree heterogeneity promotes the effect of preferential immunity among high-degree nodes, spatial embeddedness enhances the effect of immunity localization.

Finally, the comparison between the mean-field model and the network model, both informed by empirical contact patterns, shows the discrepancy in the estimate of the herd immunity threshold. Even when the mean-field model deems that the population has reached herd immunity through natural infection, the network model may predict that the disease can reinvade the population. In other words, the mean-field model estimates a lower threshold and a stronger herd immunity effect than the network model.

The tension between the two models stems from the fact that the mean-field model does not account for dynamical correlations between the epidemic states of nodes. It implicitly assumes that the population is mixed much faster than the epidemic dynamics. The network model, on the other hand, represents the slow-mixing regime; the contact partner of each individual remains the same throughout the epidemic. This gives rise to dynamical correlations that manifest themselves in the localized distribution of immune nodes after the disease has spread.

We emphasize that our aim is not to claim that the network model better reflects reality. Rather, we believe that the real-world human population lies somewhere in between; most people would have repeated contact with the slowly changing set of others (family, friends, colleagues) but would also encounter random strangers in public places. Our work demonstrates that model-based estimates of the herd immunity effect induced by disease spreading will vary depending on whether or not the dynamical correlations are taken into account. When population heterogeneity is relatively low or when contacts are mostly made between those who are spatially close to each other, disease-induced immunity may offer no advantage over random immunization. Our work sheds light on the role of contact networks and calls for careful examination of the herd immunity effect obtained from models that implicitly or explicitly disregard dynamical correlations, particularly because these assumptions can result in overestimating the strength of herd immunity.

References

Anderson and May [1991] R. M. Anderson and R. M. May, Infectious Diseases of Humans: Dynamics and Control (Oxford University Press, 1991).
Orenstein et al. [2000] W. A. Orenstein, P. M. Strebel, M. Papania, R. W. Sutter, W. J. Bellini, and S. L. Cochi, Measles eradication: is it in our future?, American Journal of Public Health 90, 1521 (2000).
Reichert et al. [2001] T. A. Reichert, N. Sugaya, D. S. Fedson, W. P. Glezen, L. Simonsen, and M. Tashiro, The Japanese Experience with Vaccinating Schoolchildren against Influenza, New England Journal of Medicine 344, 889 (2001).
Drolet et al. [2015] M. Drolet, É. Bénard, M.-C. Boily, H. Ali, L. Baandrup, H. Bauer, S. Beddows, J. Brisson, J. M. L. Brotherton, T. Cummings, B. Donovan, C. K. Fairley, E. W. Flagg, A. M. Johnson, J. A. Kahn, K. Kavanagh, S. K. Kjaer, E. V. Kliewer, P. Lemieux-Mellouki, L. Markowitz, A. Mboup, D. Mesher, L. Niccolai, J. Oliphant, K. G. Pollock, K. Soldan, P. Sonnenberg, S. N. Tabrizi, C. Tanton, and M. Brisson, Population-level impact and herd effects following human papillomavirus vaccination programmes: a systematic review and meta-analysis, The Lancet Infectious Diseases 15, 565 (2015).
Toor et al. [2021] J. Toor, S. Echeverria-Londono, X. Li, K. Abbas, E. D. Carter, H. E. Clapham, A. Clark, M. J. De Villiers, K. Eilertson, M. Ferrari, I. Gamkrelidze, T. B. Hallett, W. R. Hinsley, D. Hogan, J. H. Huber, M. L. Jackson, K. Jean, M. Jit, A. Karachaliou, P. Klepac, A. Kraay, J. Lessler, X. Li, B. A. Lopman, T. Mengistu, C. J. E. Metcalf, S. M. Moore, S. Nayagam, T. Papadopoulos, T. A. Perkins, A. Portnoy, H. Razavi, D. Razavi-Shearer, S. Resch, C. Sanderson, S. Sweet, Y. Tam, H. Tanvir, Q. Tran Minh, C. L. Trotter, S. A. Truelove, E. Vynnycky, N. Walker, A. Winter, K. Woodruff, N. M. Ferguson, and K. A. Gaythorpe, Lives saved with vaccination for 10 pathogens across 112 countries in a pre-COVID-19 world, eLife 10, e67635 (2021).
Shattock et al. [2024] A. J. Shattock, H. C. Johnson, S. Y. Sim, A. Carter, P. Lambach, R. C. W. Hutubessy, K. M. Thompson, K. Badizadegan, B. Lambert, M. J. Ferrari, M. Jit, H. Fu, S. P. Silal, R. A. Hounsell, R. G. White, J. F. Mosser, K. A. M. Gaythorpe, C. L. Trotter, A. Lindstrand, K. L. O’Brien, and N. Bar-Zeev, Contribution of vaccination to improved survival and health: modelling 50 years of the Expanded Programme on Immunization, The Lancet , S014067362400850X (2024).
Fox et al. [1971a] J. Fox, L. Elveback, W. Scott, L. Gatewood, and E. Ackerman, Herd immunity: basic concept and relevance to public health immunization practices, American Journal of Epidemiology 94, 179 (1971a).
Anderson and May [1985] R. M. Anderson and R. M. May, Vaccination and herd immunity to infectious diseases, Nature 318, 323 (1985).
Anderson and May [1990] R. Anderson and R. May, Immunisation and herd immunity, The Lancet 335, 641 (1990).
Anderson [1992] R. M. Anderson, The concept of herd immunity and the design of community-based immunization programmes, Vaccine 10, 928 (1992).
Fine [1993] P. E. M. Fine, Herd Immunity: History, Theory, Practice, Epidemiologic Reviews 15, 265 (1993).
Anderson and May [1982] R. M. Anderson and R. M. May, Directly Transmitted Infections Diseases: Control by Vaccination, Science 215, 1053 (1982).
Fox et al. [1971b] J. P. Fox, L. Elveback, W. Scott, L. Gatewood, and E. Ackerman, Herd Immunity: Basic Concept and Relevance to Public Health Immunization Practices, American Journal of Epidemiology 94, 179 (1971b).
Schenzle [1984] D. Schenzle, An Age-Structured Model of Pre- and Post-Vaccination Measles Transmission, Mathematical Medicine and Biology 1, 169 (1984).
May and Anderson [1984] R. M. May and R. M. Anderson, Spatial heterogeneity and the design of immunization programs, Mathematical Biosciences 72, 83 (1984).
Anderson and May [1984] R. M. Anderson and R. M. May, Spatial, Temporal, and Genetic Heterogeneity in Host Populations And the Design of Immunization Programmes, Mathematical Medicine and Biology 1, 233 (1984).
Hethcote and Van Ark [1987] H. W. Hethcote and J. W. Van Ark, Epidemiological models for heterogeneous populations: proportionate mixing, parameter estimation, and immunization programs, Mathematical Biosciences 84, 85 (1987).
Britton et al. [2020] T. Britton, F. Ball, and P. Trapman, A mathematical model reveals the influence of population heterogeneity on herd immunity to SARS-CoV-2, Science 369, 846 (2020).
Lu et al. [2021] D. Lu, A. Aleta, M. Ajelli, R. Pastor-Satorras, A. Vespignani, and Y. Moreno, Data-driven estimate of SARS-CoV-2 herd immunity threshold in populations with individual contact pattern variations, medRxiv 10.1101/2021.03.19.21253974 (2021).
Gomes et al. [2022] M. G. M. Gomes, M. U. Ferreira, R. M. Corder, J. G. King, C. Souto-Maior, C. Penha-Gonçalves, G. Gonçalves, M. Chikina, W. Pegden, and R. Aguas, Individual variation in susceptibility or exposure to SARS-CoV-2 lowers the herd immunity threshold, Journal of Theoretical Biology 540, 111063 (2022).
Aguas et al. [2022] R. Aguas, G. Gonçalves, M. U. Ferreira, and M. G. M. Gomes, Herd immunity thresholds for SARS-CoV-2 estimated from unfolding epidemics, medRxiv 10.1101/2020.07.23.20160762 (2022).
Keeling and Eames [2005] M. J. Keeling and K. T. Eames, Networks and epidemic models, Journal of The Royal Society Interface 2, 295 (2005).
Pastor-Satorras et al. [2015] R. Pastor-Satorras, C. Castellano, P. Van Mieghem, and A. Vespignani, Epidemic processes in complex networks, Reviews of Modern Physics 87, 925 (2015).
Newman [2005] M. E. J. Newman, Threshold Effects for Two Pathogens Spreading on a Network, Physical Review Letters 95, 108701 (2005).
Ferrari et al. [2006] M. J. Ferrari, S. Bansal, L. A. Meyers, and O. N. Bjørnstad, Network frailty and the geometry of herd immunity, Proceedings of the Royal Society B: Biological Sciences 273, 2743 (2006).
Bansal and Meyers [2012] S. Bansal and L. A. Meyers, The impact of past epidemics on future disease dynamics, Journal of Theoretical Biology 309, 176 (2012).
Mann et al. [2021] P. Mann, V. A. Smith, J. B. O. Mitchell, and S. Dobson, Two-pathogen model with competition on clustered networks, Physical Review E 103, 062308 (2021).
Di Lauro et al. [2021] F. Di Lauro, L. Berthouze, M. D. Dorey, J. C. Miller, and I. Z. Kiss, The Impact of Contact Structure and Mixing on Control Measures and Disease-Induced Herd Immunity in Epidemic Models: A Mean-Field Model Perspective, Bulletin of Mathematical Biology 83, 117 (2021).
Burgio et al. [2021] G. Burgio, B. Steinegger, G. Rapisardi, and A. Arenas, Homophily in the adoption of digital proximity tracing apps shapes the evolution of epidemics, Physical Review Research 3, 033128 (2021).
Rizi et al. [2022] A. K. Rizi, A. Faqeeh, A. Badie-Modiri, and M. Kivelä, Epidemic spreading and digital contact tracing: Effects of heterogeneous mixing and quarantine failures, Physical Review E 105, 044313 (2022).
Burgio et al. [2022] G. Burgio, B. Steinegger, and A. Arenas, Homophily impacts the success of vaccine roll-outs, Communications Physics 5, 70 (2022).
Hiraoka et al. [2022] T. Hiraoka, A. K. Rizi, M. Kivelä, and J. Saramäki, Herd immunity and epidemic size in networks with vaccination homophily, Physical Review E 105, L052301 (2022).
Watanabe and Hasegawa [2022] H. Watanabe and T. Hasegawa, Impact of assortative mixing by mask-wearing on the propagation of epidemics in networks, Physica A: Statistical Mechanics and its Applications 603, 127760 (2022).
Lloyd-Smith et al. [2005] J. O. Lloyd-Smith, S. J. Schreiber, P. E. Kopp, and W. M. Getz, Superspreading and the effect of individual variation on disease emergence, Nature 438, 355 (2005).
Mossong et al. [2008] J. Mossong, N. Hens, M. Jit, P. Beutels, K. Auranen, R. Mikolajczyk, M. Massari, S. Salmaso, G. S. Tomba, J. Wallinga, J. Heijne, M. Sadkowska-Todys, M. Rosinska, and W. J. Edmunds, Social Contacts and Mixing Patterns Relevant to the Spread of Infectious Diseases, PLoS Medicine 5, e74 (2008).
Hébert-Dufresne et al. [2020] L. Hébert-Dufresne, B. M. Althouse, S. V. Scarpino, and A. Allard, Beyond R₀ : heterogeneity in secondary infections and probabilistic epidemic forecasting, Journal of The Royal Society Interface 17, 20200393 (2020).
Newman [2018] M. Newman, Networks (Oxford University Press, 2018).
Fosdick et al. [2018] B. K. Fosdick, D. B. Larremore, J. Nishimura, and J. Ugander, Configuring Random Graph Models with Fixed Degree Sequences, SIAM Review 60, 315 (2018).
Boguñá et al. [2020] M. Boguñá, D. Krioukov, P. Almagro, and M. Á. Serrano, Small worlds and clustering in spatial networks, Physical Review Research 2, 023040 (2020).
Kenah and Robins [2007] E. Kenah and J. M. Robins, Second look at the spread of epidemics on networks, Physical Review E 76, 036113 (2007).
Newman [2023] M. E. J. Newman, Message passing methods on complex networks, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 479, 20220774 (2023).
Newman [2002] M. E. J. Newman, Spread of epidemic disease on networks, Physical Review E 66, 016128 (2002).
Diekmann et al. [1990] O. Diekmann, J. Heesterbeek, and J. Metz, On the definition and the computation of the basic reproduction ratio R₀ in models for infectious diseases in heterogeneous populations, Journal of Mathematical Biology 28, 10.1007/BF00178324 (1990).
[44] United Nations Statistics Division, Population by age, sex and urban/rural residence, retrieved from Demographic Statistics Database https://unstats.un.org/unsd/demographic-social/; Data accessed on 2024-03-15.
Prem et al. [2017] K. Prem, A. R. Cook, and M. Jit, Projecting social contact matrices in 152 countries using contact surveys and demographic data, PLOS Computational Biology 13, e1005697 (2017).

Acknowledgements.

The authors wish to acknowledge Aalto University Science-IT project for generous computational resources. Funding: MK acknowledges support from project 105572 NordicMathCovid, which is part of the Nordic Programme on Health and Welfare funded by NordForsk, and Research Council of Finland project 349366. Author contributions: T.H., A.K.R., M.K., and J.S. conceived the research. All authors contributed in developing the models and writing the manuscript. T.H. and Z.G. implemented the code and performed simulations. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data are available in the main text or the supplementary materials.