Exploiting decoupled discretization in meshless lattice Boltzmann method for high-Reynolds number flows

Dawid Strzelczyk, Maciej Matyka

Institute of Theoretical Physics, Faculty of Physics and Astronomy, University of Wrocław, pl. M. Borna 9, 50-204, Wrocław, Poland

dawid.strzelczyk@uwr.edu.pl

Abstract One of the most severe limitations of the Lattice Boltzmann Method in the context of simulating inertial flows is the coupling of the discretization of space to the velocity discretization. It causes the need to increase the size of computational meshes whenever one wants to increase the Reynolds number in the system for a fixed velocity and viscosity. This work aims at adopting the recently proposed meshless LBM formulation to the problem of high Reynolds number flows by using its internal property of decoupled space and velocity discretizations. In meshless formulation, one can change the reference length of the problem by scaling the distances between points that discretize the domain while the streaming distance remains fixed. By doing that, we increase the Reynolds number at no cost: additional nodes need not be inserted in the discretization. We measure the accuracy and efficiency of this approach in the flow around a circular obstacle at Re=84𝑅𝑒84Re=84italic_R italic_e = 84168168168168, the lid-driven cavity flow at Re=100𝑅𝑒100Re=100italic_R italic_e = 1005000500050005000, and the flow through a porous sample at Re=0.8𝑅𝑒0.8Re=0.8italic_R italic_e = 0.81500150015001500. Additionally, we apply the meshless streaming step to the recently proposed fixed relaxation time τ=1𝜏1\tau=1italic_τ = 1 LBM to extend its applicability to model inertial flows.

Keywords: Lattice Boltzmann Method, meshless methods, inertial flows, Reynolds number

1 Introduction

The transport of fluids through complex, porous structures is ubiquitous with many technological applications (CO2 storage, oil and shale gas extraction, electric battery design). It is of great interest for the study of human health, e.g. blood and cerebrospinal fluid flow, gas transport in the lungs[1, 2].

Experimental investigation of fluid flow phenomena in porous media is costly and requires sophisticated setups. Measuring velocity fields in creeping flows using Particle Image Velocimetry (PIV) methods has become popular [3, 4]. However, beyond Darcy Law (inertial regime) [5], PIV tracers used to track the flow face several problems. They may, for instance, damage the porous sample, and some of the flow regions may get excluded from the measurement due to the lack of tracers therein. Also, the tracers’ size must be chosen carefully to reconcile the accuracy of the scattered signal acquisition and the minimal interaction between the tracers, flow, and the pore space [6]. The remedy for this is to use numerical methods to solve fluid flow transport equations directly in pore space. Real samples, however, are relatively big compared to pore size, and using standard numerical methods based on computational grids may become impractical. This becomes most problematic at large Reynolds numbers, where fine flow structures appear.

Recently, the Lattice Boltzmann Method (LBM) became one of the most popular numerical tools used to solve the Navier-Stokes equations at the pore scale [7, 8]. It was applied to the simulation of fluid transport in various contexts of porous media [9], multiphase flows [10], semiclassical fluids flow [11], particulate suspensions [12] and especially inertial flows, and turbulent flows [13, 14]. Its popularity arises from the stable and local treatment of non-linear effects, a huge potential for parallelization of calculations, and simple implementation of basic no-slip boundary conditions on a regular grid [15, 16, 17]. The latter is especially attractive in complex porous media geometries, i.e. those arising from tomography scans in the form of binary images. Unfortunately, the memory overhead associated with storing the distribution function in LBM, and the need to regularly discretize the entire simulation space, limits the utility of this method for real-life, large samples with relatively small pore sizes.

One way to overcome the mentioned problem of excessive memory usage is to resort to LBM models with the fixed relaxation time τ=1𝜏1\tau=1italic_τ = 1 (LBM1) [18, 19] where the distribution function is not explicitly stored in the memory. As shown in our previous work, LBM1 simplifies the main algorithm and reduces the memory requirement up to 86%percent8686\%86 % compared to the standard LBM solver [18]. Savings may also come from the use of the meshless discretization of space in LBM (meshless LBM, MLBM) [20]. This offers a significant advantage regarding the flexibility of space discretization, with LBM grid points distributed, i.e., in the pore-space only. We recently investigated its convergence in several scenarios, including low-Reynolds number flow through porous samples consisting of a network of two-dimensional channels and a three-dimensional periodic array of spheres [21].

By default, in both mentioned approaches, increasing the Reynolds number means either a decrease in viscosity, an increase in the flow velocity, or an increase in the size of the system. The first approach is limited by the minimum theoretical viscosity due to stability issues. Moreover, it becomes inaccessible in the τ=1𝜏1\tau=1italic_τ = 1 formulation [18], where, by definition, we restrict the model to a fixed relaxation time, which involves fixing viscosity. Thus, considering the maximum velocity limit due to Mach number restrictions, increasing the Reynolds number by increasing the size of the system is the only way to obtain stable, high-Reynolds number fluid flow solutions in both MLBM and LBM1.

The aim of this work is to investigate the procedure of the domain reference length scaling in the meshless LBM (MLBM) to change the Reynolds number at zero computational and algorithmic cost [22] and to implement the meshless streaming step in the LBM1 model to achieve higher Reynolds numbers without additional memory overhead. We apply this scaling method to classic fluid flow problems: Kármán vortex street, lid-driven cavity and in the inertial regime of a flow through a porous medium.

2 Methods

2.1 Lattice Boltzmann Method

Lattice Boltzmann Method [15, 16] is a numerical tool for solving the discrete Boltzmann equation. During each timestep, two steps run - the streaming and the collision of the velocity distributions. The streaming is defined as:

fk(t+1,𝒙)=fkpost(t,𝒙+𝒆k)subscript𝑓𝑘𝑡1𝒙superscriptsubscript𝑓𝑘post𝑡𝒙subscript𝒆superscript𝑘f_{k}(t+1,\boldsymbol{x})=f_{k}^{\text{post}}(t,\boldsymbol{x}+\boldsymbol{e}_% {k^{\prime}})italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_t + 1 , bold_italic_x ) = italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT post end_POSTSUPERSCRIPT ( italic_t , bold_italic_x + bold_italic_e start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) (1)

where fksubscript𝑓𝑘f_{k}italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT is the k𝑘kitalic_k-th distribution function with its streaming vector 𝒆ksubscript𝒆𝑘\boldsymbol{e}_{k}bold_italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT, ksuperscript𝑘k^{\prime}italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT denotes the vector antiparallel to 𝒆ksubscript𝒆𝑘\boldsymbol{e}_{k}bold_italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT (𝒆k=𝒆ksubscript𝒆𝑘subscript𝒆superscript𝑘\boldsymbol{e}_{k}=-\boldsymbol{e}_{k^{\prime}}bold_italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = - bold_italic_e start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT) and the superscript ’post’ denotes the post-collision distribution function. According to this notation, 𝒙+𝒆k𝒙subscript𝒆superscript𝑘\boldsymbol{x}+\boldsymbol{e}_{k^{\prime}}bold_italic_x + bold_italic_e start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT denotes the departure node of the k𝑘kitalic_k-th population being streamed to 𝒙𝒙\boldsymbol{x}bold_italic_x. Eq. (1) is written in a non-dimensionalized form with the timestep length and the lattice spacing equal to 1.

In this work, we use D2Q9 BGK model with 9 streaming directions (k=0,1,,8𝑘018k=0,1,...,8italic_k = 0 , 1 , … , 8):

𝒆k((0,0),(1,0),(0,1),(1,0),(0,1),(1,1),(1,1),(1,1),(1,1)).subscript𝒆𝑘001001100111111111\boldsymbol{e}_{k}\in\left((0,0),(1,0),(0,1),(-1,0),(0,-1),(1,1),(-1,1),(-1,-1% ),(1,-1)\right).bold_italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∈ ( ( 0 , 0 ) , ( 1 , 0 ) , ( 0 , 1 ) , ( - 1 , 0 ) , ( 0 , - 1 ) , ( 1 , 1 ) , ( - 1 , 1 ) , ( - 1 , - 1 ) , ( 1 , - 1 ) ) . (2)

Due to the Lagrangian nature of the streaming step, such a choice of the streaming vectors imposes the use of a square lattice (refer to Fig. 1 for a graphical representation of the LBM lattice and streaming directions). The collision step calculates fkpostsubscriptsuperscript𝑓post𝑘f^{\text{post}}_{k}italic_f start_POSTSUPERSCRIPT post end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT from the current distributions and macroscopic fields. This work uses two relaxation time collision term [23]:

fkpost=fk(t,𝒙)1τ+[fk+(t,𝒙)fkeq+(t,𝒙)]1τ[fk(t,𝒙)fkeq(t,𝒙)]subscriptsuperscript𝑓post𝑘subscript𝑓𝑘𝑡𝒙1superscript𝜏delimited-[]superscriptsubscript𝑓𝑘𝑡𝒙subscriptsuperscript𝑓limit-fromeq𝑘𝑡𝒙1superscript𝜏delimited-[]superscriptsubscript𝑓𝑘𝑡𝒙subscriptsuperscript𝑓limit-fromeq𝑘𝑡𝒙f^{\text{post}}_{k}=f_{k}(t,\boldsymbol{x})-\frac{1}{\tau^{+}}\left[f_{k}^{+}(% t,\boldsymbol{x})-f^{\text{eq}+}_{k}(t,\boldsymbol{x})\right]-\frac{1}{\tau^{-% }}\left[f_{k}^{-}(t,\boldsymbol{x})-f^{\text{eq}-}_{k}(t,\boldsymbol{x})\right]italic_f start_POSTSUPERSCRIPT post end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_t , bold_italic_x ) - divide start_ARG 1 end_ARG start_ARG italic_τ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT end_ARG [ italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ( italic_t , bold_italic_x ) - italic_f start_POSTSUPERSCRIPT eq + end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_t , bold_italic_x ) ] - divide start_ARG 1 end_ARG start_ARG italic_τ start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT end_ARG [ italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT ( italic_t , bold_italic_x ) - italic_f start_POSTSUPERSCRIPT eq - end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_t , bold_italic_x ) ] (3)

where τ±superscript𝜏plus-or-minus\tau^{\pm}italic_τ start_POSTSUPERSCRIPT ± end_POSTSUPERSCRIPT, are relaxation times for the symmetric and anti-symmetric parts of the distributions, which are defined as:

fk+=fk+fk2,fk=fkfk2fkeq+=fkeq+fkeq2,fkeq=fkeqfkeq2.superscriptsubscript𝑓𝑘subscript𝑓𝑘subscript𝑓superscript𝑘2missing-subexpressionsuperscriptsubscript𝑓𝑘subscript𝑓𝑘subscript𝑓superscript𝑘2superscriptsubscript𝑓𝑘limit-fromeqsuperscriptsubscript𝑓𝑘eqsuperscriptsubscript𝑓superscript𝑘eq2missing-subexpressionsuperscriptsubscript𝑓𝑘limit-fromeqsuperscriptsubscript𝑓𝑘eqsuperscriptsubscript𝑓superscript𝑘eq2\begin{array}[]{c}\begin{aligned} f_{k}^{+}=\frac{f_{k}+f_{k^{\prime}}}{2},&% \quad&f_{k}^{-}=\frac{f_{k}-f_{k^{\prime}}}{2}\\[2.15277pt] f_{k}^{\text{eq}+}=\frac{f_{k}^{\text{eq}}+f_{k^{\prime}}^{\text{eq}}}{2},&% \quad&f_{k}^{\text{eq}-}=\frac{f_{k}^{\text{eq}}-f_{k^{\prime}}^{\text{eq}}}{2% }.\\ \end{aligned}\end{array}start_ARRAY start_ROW start_CELL start_ROW start_CELL italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT = divide start_ARG italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT + italic_f start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT end_ARG start_ARG 2 end_ARG , end_CELL start_CELL end_CELL start_CELL italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT = divide start_ARG italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT - italic_f start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT end_ARG start_ARG 2 end_ARG end_CELL end_ROW start_ROW start_CELL italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT eq + end_POSTSUPERSCRIPT = divide start_ARG italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT + italic_f start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT end_ARG start_ARG 2 end_ARG , end_CELL start_CELL end_CELL start_CELL italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT eq - end_POSTSUPERSCRIPT = divide start_ARG italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT - italic_f start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT end_ARG start_ARG 2 end_ARG . end_CELL end_ROW end_CELL end_ROW end_ARRAY (4)

The relaxation times are related to each other by:

Λ=(τ+12)(τ12).Λsuperscript𝜏12superscript𝜏12\Lambda=\left(\tau^{+}-\frac{1}{2}\right)\left(\tau^{-}-\frac{1}{2}\right).roman_Λ = ( italic_τ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ) ( italic_τ start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ) . (5)

For the non-fixed relaxation time LBM and MLBM we use the value of Λ=3/16Λ316\Lambda=3/16roman_Λ = 3 / 16 [7]. The kinematic viscosity in the lattice units is then:

νlb=cs2(τ+12)subscript𝜈𝑙𝑏superscriptsubscript𝑐𝑠2superscript𝜏12\nu_{lb}=c_{s}^{2}\left(\tau^{+}-\frac{1}{2}\right)italic_ν start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT = italic_c start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_τ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG 2 end_ARG ) (6)

where cs=1/3subscript𝑐𝑠13c_{s}=1/\sqrt{3}italic_c start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT = 1 / square-root start_ARG 3 end_ARG denotes the lattice speed of sound. The equilibrium distributions fkeqsuperscriptsubscript𝑓𝑘eqf_{k}^{\text{eq}}italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT are expressed as:

fkeq(t,𝒙)=ωkρlb[1+𝒆k𝒖lbcs2+(𝒆k𝒖lb)22cs4𝒖lb22cs2]subscriptsuperscript𝑓eq𝑘𝑡𝒙subscript𝜔𝑘subscript𝜌𝑙𝑏delimited-[]1subscript𝒆𝑘subscript𝒖𝑙𝑏superscriptsubscript𝑐𝑠2superscriptsubscript𝒆𝑘subscript𝒖𝑙𝑏22superscriptsubscript𝑐𝑠4superscriptsubscript𝒖𝑙𝑏22superscriptsubscript𝑐𝑠2f^{\text{eq}}_{k}(t,\boldsymbol{x})=\omega_{k}\rho_{lb}\left[1+\frac{% \boldsymbol{e}_{k}\cdot\boldsymbol{u}_{lb}}{c_{s}^{2}}+\frac{(\boldsymbol{e}_{% k}\cdot\boldsymbol{u}_{lb})^{2}}{2c_{s}^{4}}-\frac{\boldsymbol{u}_{lb}^{2}}{2c% _{s}^{2}}\right]italic_f start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_t , bold_italic_x ) = italic_ω start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_ρ start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT [ 1 + divide start_ARG bold_italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ⋅ bold_italic_u start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT end_ARG start_ARG italic_c start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG + divide start_ARG ( bold_italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ⋅ bold_italic_u start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_c start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT end_ARG - divide start_ARG bold_italic_u start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 italic_c start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ] (7)

where ωksubscript𝜔𝑘\omega_{k}italic_ω start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT is the weight specific to the k𝑘kitalic_k-th streaming direction:

ω=(49,19,19,19,19,136,136,136,136).𝜔4919191919136136136136\omega=\left(\frac{4}{9},\frac{1}{9},\frac{1}{9},\frac{1}{9},\frac{1}{9},\frac% {1}{36},\frac{1}{36},\frac{1}{36},\frac{1}{36}\right).italic_ω = ( divide start_ARG 4 end_ARG start_ARG 9 end_ARG , divide start_ARG 1 end_ARG start_ARG 9 end_ARG , divide start_ARG 1 end_ARG start_ARG 9 end_ARG , divide start_ARG 1 end_ARG start_ARG 9 end_ARG , divide start_ARG 1 end_ARG start_ARG 9 end_ARG , divide start_ARG 1 end_ARG start_ARG 36 end_ARG , divide start_ARG 1 end_ARG start_ARG 36 end_ARG , divide start_ARG 1 end_ARG start_ARG 36 end_ARG , divide start_ARG 1 end_ARG start_ARG 36 end_ARG ) . (8)

The macroscopic density and velocity in lattice units, ρlb=ρlb(t,𝒙)subscript𝜌𝑙𝑏subscript𝜌𝑙𝑏𝑡𝒙\rho_{lb}\!=\!\rho_{lb}(t,\boldsymbol{x})italic_ρ start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT = italic_ρ start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT ( italic_t , bold_italic_x ) and 𝒖lb=𝒖lb(t,𝒙)subscript𝒖𝑙𝑏subscript𝒖𝑙𝑏𝑡𝒙\boldsymbol{u}_{lb}\!=\!\boldsymbol{u}_{lb}(t,\boldsymbol{x})bold_italic_u start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT = bold_italic_u start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT ( italic_t , bold_italic_x ) respectively, at time t𝑡titalic_t and point 𝒙𝒙\boldsymbol{x}bold_italic_x, are obtained from the discrete populations:

ρlb=subscript𝜌𝑙𝑏absent\displaystyle\rho_{lb}=italic_ρ start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT = k=1qfksuperscriptsubscript𝑘1𝑞subscript𝑓𝑘\displaystyle\sum\limits_{k=1}^{q}f_{k}∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_q end_POSTSUPERSCRIPT italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT (9)
𝒖lb=subscript𝒖𝑙𝑏absent\displaystyle\boldsymbol{u}_{lb}=bold_italic_u start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT = 1ρlbk=1qfk𝒆k1subscript𝜌𝑙𝑏superscriptsubscript𝑘1𝑞subscript𝑓𝑘subscript𝒆𝑘\displaystyle\frac{1}{\rho_{lb}}\sum\limits_{k=1}^{q}f_{k}\boldsymbol{e}_{k}divide start_ARG 1 end_ARG start_ARG italic_ρ start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT end_ARG ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_q end_POSTSUPERSCRIPT italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT bold_italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT

The pressure is related to the density via the lattice speed of sound: plb=ρlbcs2subscript𝑝𝑙𝑏subscript𝜌𝑙𝑏superscriptsubscript𝑐𝑠2p_{lb}=\rho_{lb}c_{s}^{2}italic_p start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT = italic_ρ start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT.

In the numerical implementations, at each time step, Eq.  (3) is first calculated to obtain the values of the post-collision distribution function (collision step). Then,  (1) advects post-collision distributions to neighboring nodes (streaming step). Because lattice nodes 𝒙𝒙\boldsymbol{x}bold_italic_x coincide with the departure/arrival nodes of the streaming step, transport is purely Lagrangian and amounts to an index shift in the distribution function array.

The fixed-relaxation time variant of LBM assumes τ+=1superscript𝜏1\tau^{+}=1italic_τ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT = 1 and Λ=1/4Λ14\Lambda=1/4roman_Λ = 1 / 4 [18, 19], which by Eq. (5) sets τ=1superscript𝜏1\tau^{-}=1italic_τ start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT = 1. In such case, Eq.(4) simplifies the streaming and collision procedures from Eqs. (1) and (3) to the same form as if Bhatnagar-Gross-Krook collision [24] was used with the unit relaxation time:

fk(t+1,𝒙)=fkeq(t,𝒙+𝒆k).subscript𝑓𝑘𝑡1𝒙subscriptsuperscript𝑓eq𝑘𝑡𝒙subscript𝒆superscript𝑘f_{k}(t+1,\boldsymbol{x})=f^{\text{eq}}_{k}(t,\boldsymbol{x}+\boldsymbol{e}_{k% ^{\prime}}).italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_t + 1 , bold_italic_x ) = italic_f start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_t , bold_italic_x + bold_italic_e start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ) . (10)

This substitution alleviates the need to store the distribution function explicitly. In each timestep, the equilibrium distributions are calculated directly from the current macroscopic fields according to Eq. (7) and summed as in Eq. (9) to obtain the macroscopic fields in the next time step.

2.2 Formulation of meshless LBM

The meshless LBM algorithm considered in this work belongs to the family of off-grid LBM [25, 26, 27, 28, 20]. Those methods trade the Lagrangian approach to solving the streaming step for the semi-Lagrangian or Eulerian method, such that the space and velocity discretizations get decoupled, and nodes no longer need to be arranged in a regular grid. In the implementations taking the semi-Lagrangian approach to solve the streaming step, like the one discussed here, the discretization points (so-called Eulerian nodes) need not coincide with the departure points of the distribution function (so-called Lagrangian nodes). Instead, with each Eulerian point 𝒙isubscript𝒙𝑖\boldsymbol{x}_{i}bold_italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT we identify q𝑞qitalic_q Lagrangian points to which the corresponding distributions are interpolated and from which they are advected during the streaming step (see Fig. 1). To relate the Eulerian and Lagrangian nodes’ positions to one another, in MLBM the streaming vectors need to be expressed in physical units: δ𝒙k=𝒆kδx𝛿subscript𝒙𝑘subscript𝒆𝑘𝛿𝑥\delta\boldsymbol{x}_{k}=\boldsymbol{e}_{k}\delta xitalic_δ bold_italic_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = bold_italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_δ italic_x where δx𝛿𝑥\delta xitalic_δ italic_x is the streaming distance. In this manner, the positions of the departure nodes of the k𝑘kitalic_k-th population are related to the positions of their Eulerian nodes as 𝒙+δ𝒙k𝒙𝛿subscript𝒙superscript𝑘\boldsymbol{x}+\delta\boldsymbol{x}_{k^{\prime}}bold_italic_x + italic_δ bold_italic_x start_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT. We introduce the interpolation step between the collision and the streaming to approximate the post-collision distributions in the appropriate Lagrangian points. Once the approximation is done, the streaming and the collision step are performed in the same way as in LBM (see. Fig. 2). We note that in the models with zero-velocity population discretized (i.e. 𝒆0subscript𝒆0\boldsymbol{e}_{0}bold_italic_e start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT in Eq. (2)), one does not perform the streaming for this population explicitly.

Among many interpolation methods available to solve the streaming step on irregular discretizations, meshless methods [29, 30, 31] are less computationally demanding and error-prone in the discretization process, they achieve boundary-compliant discretizations, offer high order approximations as well as feasible local refinement and adaptivity. Such approaches have been frequently used to solve transport partial differential equations [32, 33, 34, 35, 36, 37].

Refer to caption
Figure 1: Comparison of space discretizations in the standard (left) and meshless (right) LBM. The standard LBM requires a square grid; the meshless LBM can operate on scattered, boundary-compliant node sets. Note that in MLBM, velocity is still discretized using a lattice, resulting in several departure points (so-called Lagrangian nodes, filled squares) assigned to each discretization point (so-called Eulerian nodes, filled circles).
Refer to caption
Figure 2: The procedure performed at a single timestep of the meshless LBM algorithm. Circles represent Eulerian points; squares denote Lagrangian points. A dashed loop encloses the stencil of a Lagrangian node, which consists of the nine closest neighbors of the Lagrangian node (blue circles). The interpolated distribution function is streamed to the Eulerian point at the center of the presented square lattice.

The semi-Lagrangian streaming step in the discussed MLBM uses meshless interpolation based on radial basis functions-generated finite differences (RBF-FD)[29]. This allows one to approximate the value of a linear operator \mathcal{L}caligraphic_L applied to the underlying function in an arbitrary point as a linear combination of the known function values stored in the neighboring nodes:

(f)|𝒙0i=1NLf(𝒙i)wievaluated-at𝑓subscript𝒙0superscriptsubscript𝑖1subscript𝑁𝐿𝑓subscript𝒙𝑖subscript𝑤𝑖\mathcal{L}(f)|_{\boldsymbol{x}_{0}}\approx\sum\limits_{i=1}^{N_{L}}f(% \boldsymbol{x}_{i})w_{i}caligraphic_L ( italic_f ) | start_POSTSUBSCRIPT bold_italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ≈ ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_f ( bold_italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) italic_w start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (11)

where 𝒙0subscript𝒙0\boldsymbol{x}_{0}bold_italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT denotes the approximation point, 𝒙isubscript𝒙𝑖\boldsymbol{x}_{i}bold_italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT denote the NLsubscript𝑁𝐿N_{L}italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT approximation stencil members and wisubscript𝑤𝑖w_{i}italic_w start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT are interpolation weights. In particular one can speak of the unit operator \mathcal{L}caligraphic_L ((f)f𝑓𝑓\mathcal{L}(f)\equiv fcaligraphic_L ( italic_f ) ≡ italic_f) which results in the interpolation formula in Eq. (11). We use local approximation supports for each Lagrangian node consisting of its NL=25subscript𝑁𝐿25N_{L}=25italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT = 25 closest Eulerian neighbors, unless otherwise stated. To achieve high-order approximation we use polynomial augmentation [38] of order two, unless otherwise stated. For a detailed description of the algorithm, we refer the reader to our previous work [20].

The application of the above formalism to LBM1 is the following. The macroscopic density and velocities are interpolated to each of the Lagrangian points. Then, the equilibrium distributions are calculated in the Lagrangian nodes, Eq.(7), and their values are summed according to Eq.(9) to obtain the next-timestep macroscopic fields in the Eulerian nodes. The interpolation of the macroscopic fields, rather than the equilibrium distributions, frees one from the need to store fkeqsuperscriptsubscript𝑓𝑘eqf_{k}^{\text{eq}}italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT in the memory.

2.3 Increasing Reynolds number in meshless LBM

To assess the ratio between inertial and viscous forces in a flow, the dimensionless Reynolds number is used:

Re=ULν𝑅𝑒𝑈𝐿𝜈Re=\frac{UL}{\nu}italic_R italic_e = divide start_ARG italic_U italic_L end_ARG start_ARG italic_ν end_ARG (12)

arising naturally from non-dimensionalization of incompressible Navier-Stokes momentum equation:

𝒖t+(𝒖)𝒖=pρ+𝒈+ν2𝒖𝒖t+(𝒖)𝒖p+𝒈+1Re2𝒖where 𝒙=𝒙/L,𝒖=𝒖/U,t=tUL=tT,p=p/ρU2,𝒈=𝒈U2L𝒖𝑡𝒖𝒖𝑝𝜌𝒈𝜈superscript2𝒖superscript𝒖superscript𝑡superscript𝒖superscriptsuperscript𝒖superscript𝑝superscript𝒈1𝑅𝑒superscriptabsent2superscript𝒖formulae-sequenceformulae-sequencewhere superscript𝒙𝒙𝐿formulae-sequencesuperscript𝒖𝒖𝑈superscript𝑡𝑡𝑈𝐿𝑡𝑇formulae-sequencesuperscript𝑝𝑝𝜌superscript𝑈2superscript𝒈𝒈superscript𝑈2𝐿\begin{array}[]{l}\dfrac{\partial\boldsymbol{u}}{\partial t}+(\boldsymbol{u}% \cdot\nabla)\boldsymbol{u}=-\dfrac{\nabla p}{\rho}+\boldsymbol{g}+\nu\nabla^{2% }\boldsymbol{u}\\ \dfrac{\partial\boldsymbol{u^{*}}}{\partial t^{*}}+(\boldsymbol{u^{*}}\cdot% \nabla^{*})\boldsymbol{u^{*}}-\nabla^{*}p+\boldsymbol{g}^{*}+\dfrac{1}{Re}% \nabla^{*2}\boldsymbol{u^{*}}\\ \text{where }\boldsymbol{x}^{*}=\boldsymbol{x}/L,\quad\boldsymbol{u}^{*}=% \boldsymbol{u}/U,\quad t^{*}=t\dfrac{U}{L}=tT,\quad p^{*}=p/\rho U^{2},\quad% \boldsymbol{g}^{*}=\boldsymbol{g}\dfrac{U^{2}}{L}\\ \end{array}start_ARRAY start_ROW start_CELL divide start_ARG ∂ bold_italic_u end_ARG start_ARG ∂ italic_t end_ARG + ( bold_italic_u ⋅ ∇ ) bold_italic_u = - divide start_ARG ∇ italic_p end_ARG start_ARG italic_ρ end_ARG + bold_italic_g + italic_ν ∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT bold_italic_u end_CELL end_ROW start_ROW start_CELL divide start_ARG ∂ bold_italic_u start_POSTSUPERSCRIPT bold_∗ end_POSTSUPERSCRIPT end_ARG start_ARG ∂ italic_t start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT end_ARG + ( bold_italic_u start_POSTSUPERSCRIPT bold_∗ end_POSTSUPERSCRIPT ⋅ ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) bold_italic_u start_POSTSUPERSCRIPT bold_∗ end_POSTSUPERSCRIPT - ∇ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_p + bold_italic_g start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT + divide start_ARG 1 end_ARG start_ARG italic_R italic_e end_ARG ∇ start_POSTSUPERSCRIPT ∗ 2 end_POSTSUPERSCRIPT bold_italic_u start_POSTSUPERSCRIPT bold_∗ end_POSTSUPERSCRIPT end_CELL end_ROW start_ROW start_CELL where bold_italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = bold_italic_x / italic_L , bold_italic_u start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = bold_italic_u / italic_U , italic_t start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = italic_t divide start_ARG italic_U end_ARG start_ARG italic_L end_ARG = italic_t italic_T , italic_p start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = italic_p / italic_ρ italic_U start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , bold_italic_g start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = bold_italic_g divide start_ARG italic_U start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_L end_ARG end_CELL end_ROW end_ARRAY (13)

where asterisk denotes non-dimensionalized quantities, ν𝜈\nuitalic_ν is the fluid’s kinematic viscosity, 𝒈𝒈\boldsymbol{g}bold_italic_g is the acceleration, and L𝐿Litalic_L, U𝑈Uitalic_U, T𝑇Titalic_T denotes the reference length, velocity, and time, respectively. Conventionally, when inertial forces dominate the system, the Reynolds number is larger than unity Re>1𝑅𝑒1Re>1italic_R italic_e > 1. The practical utility of the Reynolds number lies in its ability to relate flows of different length and time scales to each other, e.g., flow past a real object versus its scaled model used in an experiment. The dependence of hydrodynamic coefficients, like drag, lift, or permeability, is often presented in relation to Re𝑅𝑒Reitalic_R italic_e, thus making the results more universal. Reynolds number also allows to estimate the relative importance of non-linearity, turbulence, and momentum diffusion phenomena in a flow when choosing an appropriate physical model to describe the system.

In MLBM, the streaming distance length δx𝛿𝑥\delta xitalic_δ italic_x can be varied independently from the positions of the nodes discretizing the domain. It means that the whole space spanned by the streaming distance δx𝛿𝑥\delta xitalic_δ italic_x and the Eulerian discretization parameter hhitalic_h is available in MLBM, in contrast to traveling along the δx=h𝛿𝑥\delta x=hitalic_δ italic_x = italic_h line in the standard LBM (see Fig. 3). For instance, consider a flow domain of size L=100𝐿100L=100italic_L = 100. Let us assume that the streaming distance is δx=0.1𝛿𝑥0.1\delta x=0.1italic_δ italic_x = 0.1 (we do this by specifying the positions of Lagrangian points to differ from their Eulerian lattice centers by δ𝒙k𝛿subscript𝒙𝑘\delta\boldsymbol{x}_{k}italic_δ bold_italic_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT vectors). With such choice of L𝐿Litalic_L and δx𝛿𝑥\delta xitalic_δ italic_x we arrive at the domain size in lattice units equal to Llb=L/δx=1000subscript𝐿𝑙𝑏𝐿𝛿𝑥1000L_{lb}=L/\delta x=1000italic_L start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT = italic_L / italic_δ italic_x = 1000, no matter what Eulerian discretization parameter hhitalic_h we choose. Now, increasing the size of the domain L𝐿Litalic_L, e.g., by multiplying the position vectors of the Eulerian points 𝒙isubscript𝒙𝑖\boldsymbol{x}_{i}bold_italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT by some constant αEsubscript𝛼𝐸\alpha_{E}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT (with δx𝛿𝑥\delta xitalic_δ italic_x unaltered) changes the domain size in lattice units αEsubscript𝛼𝐸\alpha_{E}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT times. The same effect can be achieved by multiplying the streaming distance by αδ=1/αEsubscript𝛼𝛿1subscript𝛼𝐸\alpha_{\delta}=1/\alpha_{E}italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT = 1 / italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT and keeping the positions of Eulerian points constant (we use the notation αδsubscript𝛼𝛿\alpha_{\delta}italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT to distinguish the factor for the scaling of the streaming distance from the one for the scaling of the Eulerian points positions). This is because the approximation coefficients wisubscript𝑤𝑖w_{i}italic_w start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT from Eq. (11) are invariant under translation, rotation, and scaling of the stencil nodes and the interpolation point, and the presented MLBM algorithm uses the non-dimensionalized formulation of LBM (δx=1𝛿𝑥1\delta x=1italic_δ italic_x = 1 and δt=1𝛿𝑡1\delta t=1italic_δ italic_t = 1 in lattice units).

Both strategies (depending on scaling the domain size in physical units or the streaming distance length) alter the flow’s Reynolds number in the following ways. Considering the Reynolds number calculated from the quantities in lattice units, both strategies change the size of the domain, changing the Llbsubscript𝐿𝑙𝑏L_{lb}italic_L start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT value from the previous paragraph. The same happens with Re𝑅𝑒Reitalic_R italic_e calculated using the quantities in physical units when the αEsubscript𝛼𝐸\alpha_{E}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT scaling is used. Finally, when the scaling is applied to the streaming distance (i.e. δxαδδx𝛿𝑥subscript𝛼𝛿𝛿𝑥\delta x\rightarrow\alpha_{\delta}\delta xitalic_δ italic_x → italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT italic_δ italic_x), to keep the physical viscosity unchanged, timestep length must scale with the square of αδsubscript𝛼𝛿\alpha_{\delta}italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT, since ν=cs2(τ1/2)δx2/δt𝜈superscriptsubscript𝑐𝑠2𝜏12𝛿superscript𝑥2𝛿𝑡\nu=c_{s}^{2}(\tau-1/2)\delta x^{2}/\delta titalic_ν = italic_c start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_τ - 1 / 2 ) italic_δ italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT / italic_δ italic_t. When this happens, the physical velocity U𝑈Uitalic_U changes into its counterpart in the scaled setup Uδsuperscript𝑈𝛿U^{\delta}italic_U start_POSTSUPERSCRIPT italic_δ end_POSTSUPERSCRIPT as

U=UlbδxδtUδ=Ulbδxαδδtαδ2=U1αδ𝑈subscript𝑈𝑙𝑏𝛿𝑥𝛿𝑡superscript𝑈𝛿subscript𝑈𝑙𝑏𝛿𝑥subscript𝛼𝛿𝛿𝑡superscriptsubscript𝛼𝛿2𝑈1subscript𝛼𝛿U=U_{lb}\frac{\delta x}{\delta t}\>\rightarrow\>U^{\delta}=U_{lb}\frac{\delta x% \alpha_{\delta}}{\delta t\alpha_{\delta}^{2}}=U\>\frac{1}{\alpha_{\delta}}italic_U = italic_U start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT divide start_ARG italic_δ italic_x end_ARG start_ARG italic_δ italic_t end_ARG → italic_U start_POSTSUPERSCRIPT italic_δ end_POSTSUPERSCRIPT = italic_U start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT divide start_ARG italic_δ italic_x italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT end_ARG start_ARG italic_δ italic_t italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG = italic_U divide start_ARG 1 end_ARG start_ARG italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT end_ARG (14)

which in the case of decreasing the streaming distance (αδ<1subscript𝛼𝛿1\alpha_{\delta}<1italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT < 1) means increasing the velocity U𝑈Uitalic_U in physical units.

Refer to caption
Figure 3: The δxh𝛿𝑥\delta x-hitalic_δ italic_x - italic_h discretization parameters space and its three discussed directions (denoted by gray dashed arrows). Blue circles denote Eulerian points, and small dark gray squares denote Lagrangian nodes. Moving along each of the shown directions results in different scaling of the streaming length δx𝛿𝑥\delta xitalic_δ italic_x and Eulerian discretization parameter hhitalic_h. In particular, in LBM, moving only along the constant δx/h𝛿𝑥\delta x/hitalic_δ italic_x / italic_h direction is possible.

In general, the scaling of the streaming distance is followed by the scaling of the conversion factors from lattice units to physical units:

CL=δxCLδ=δxαδ=CLαδCt=δtCtδ=δtαδ2=Ctαδ2Cu=δx/δtCuδ=δxαδ/δtαδ2=Cu/αδCν=δx2/δtCνδ=CνCg=δx/δt2Cgδ=δxαδ/δt2αδ4=Cg/αδ3subscript𝐶𝐿𝛿𝑥superscriptsubscript𝐶𝐿𝛿𝛿𝑥subscript𝛼𝛿subscript𝐶𝐿subscript𝛼𝛿subscript𝐶𝑡𝛿𝑡superscriptsubscript𝐶𝑡𝛿𝛿𝑡superscriptsubscript𝛼𝛿2subscript𝐶𝑡superscriptsubscript𝛼𝛿2subscript𝐶𝑢𝛿𝑥𝛿𝑡superscriptsubscript𝐶𝑢𝛿𝛿𝑥subscript𝛼𝛿𝛿𝑡superscriptsubscript𝛼𝛿2subscript𝐶𝑢subscript𝛼𝛿subscript𝐶𝜈𝛿superscript𝑥2𝛿𝑡superscriptsubscript𝐶𝜈𝛿subscript𝐶𝜈subscript𝐶𝑔𝛿𝑥𝛿superscript𝑡2superscriptsubscript𝐶𝑔𝛿𝛿𝑥subscript𝛼𝛿𝛿superscript𝑡2superscriptsubscript𝛼𝛿4subscript𝐶𝑔superscriptsubscript𝛼𝛿3\begin{array}[]{lcr}C_{L}=\delta x&\rightarrow&C_{L}^{\delta}=\delta x\alpha_{% \delta}=C_{L}\alpha_{\delta}\\ C_{t}=\delta t&\rightarrow&C_{t}^{\delta}=\delta t\alpha_{\delta}^{2}=C_{t}% \alpha_{\delta}^{2}\\ C_{u}=\delta x/\delta t&\rightarrow&C_{u}^{\delta}=\delta x\alpha_{\delta}/% \delta t\alpha_{\delta}^{2}=C_{u}/\alpha_{\delta}\\ C_{\nu}=\delta x^{2}/\delta t&\rightarrow&C_{\nu}^{\delta}=C_{\nu}\\ C_{g}=\delta x/\delta t^{2}&\rightarrow&C_{g}^{\delta}=\delta x\alpha_{\delta}% /\delta t^{2}\alpha_{\delta}^{4}=C_{g}/\alpha_{\delta}^{3}\end{array}start_ARRAY start_ROW start_CELL italic_C start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT = italic_δ italic_x end_CELL start_CELL → end_CELL start_CELL italic_C start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_δ end_POSTSUPERSCRIPT = italic_δ italic_x italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT = italic_C start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_δ italic_t end_CELL start_CELL → end_CELL start_CELL italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_δ end_POSTSUPERSCRIPT = italic_δ italic_t italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_CELL end_ROW start_ROW start_CELL italic_C start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT = italic_δ italic_x / italic_δ italic_t end_CELL start_CELL → end_CELL start_CELL italic_C start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_δ end_POSTSUPERSCRIPT = italic_δ italic_x italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT / italic_δ italic_t italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_C start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT / italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL italic_C start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT = italic_δ italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT / italic_δ italic_t end_CELL start_CELL → end_CELL start_CELL italic_C start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_δ end_POSTSUPERSCRIPT = italic_C start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL italic_C start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT = italic_δ italic_x / italic_δ italic_t start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_CELL start_CELL → end_CELL start_CELL italic_C start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_δ end_POSTSUPERSCRIPT = italic_δ italic_x italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT / italic_δ italic_t start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT = italic_C start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT / italic_α start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT end_CELL end_ROW end_ARRAY (15)

We note that the discussed method of increasing the Reynolds number in MLBM should apply when Eulerian schemes are used for solving the streaming step[39]. The scaled streaming distance would then be equivalent to the scaled streaming velocity in the term 𝒄𝒄\boldsymbol{c}\cdot\nablabold_italic_c ⋅ ∇ of the Boltzmann equation, where 𝒄𝒄\boldsymbol{c}bold_italic_c is the microscopic population velocity.

3 Results

3.1 Validation and execution time analysis

We investigate the flow over a cylindrical obstacle in a plane channel, Fig. 4. The length of the channel is 3333, the height of the channel is 1111 and the cylindrical obstacle of diameter d=0.14𝑑0.14d=0.14italic_d = 0.14 is placed on the channel’s axis, at a distance 0.50.50.50.5 from its inlet.

For MLBM simulations we use the scattered discretization near the cylinder and in its wake and the regular discretization elsewhere. We use the meshless RBF-FD approximation for the Lagrangian points whose closest Eulerian neighbors do not form a regular lattice. In the remaining points we use biquadratic Lagrange interpolation. The distance between the nodes vary from hmin=1/200subscriptmin1200h_{\text{min}}=1/200italic_h start_POSTSUBSCRIPT min end_POSTSUBSCRIPT = 1 / 200 on the cylinder’s surface to hmax=1/25subscriptmax125h_{\text{max}}=1/25italic_h start_POSTSUBSCRIPT max end_POSTSUBSCRIPT = 1 / 25 away from the cylinder. The set of points used for the discretization is symmetric about the channel’s axis to achieve a symmetric lift force profile in time. The boundary conditions on the obstacle’s surface is the no-slip wall. It is implemented using a simple bounceback rule for the populations for which the Lagrangian points lie inside the cylinder. The inlet and the outlet has the prescribed parabolic velocity profile with the maximum value of U0,lb=102subscript𝑈0𝑙𝑏superscript102U_{0,lb}=10^{-2}italic_U start_POSTSUBSCRIPT 0 , italic_l italic_b end_POSTSUBSCRIPT = 10 start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT and unit density ρ=1𝜌1\rho=1italic_ρ = 1. Zero macroscopic velocity 𝒖=𝟎𝒖0\boldsymbol{u}=\boldsymbol{0}bold_italic_u = bold_0 and unit density ρ=1𝜌1\rho=1italic_ρ = 1 is imposed on the top and bottom boundaries. The former four boundaries are implemented by overwriting all populations in their nodes with the equilibrium distributions parametrized with the desired macroscopic density and velocity. The initial condition are the equilibrium populations parametrized with the unit density ρ=1𝜌1\rho=1italic_ρ = 1 and zero macroscopic velocity 𝒖=𝟎𝒖0\boldsymbol{u}=\boldsymbol{0}bold_italic_u = bold_0.

Refer to caption
Refer to caption
Figure 4: The hybrid regular-irregular discretization of the flow over a cylindrical obstacle used in the simulations (left) and a zoom in at the discretization around the obstacle (right).

The Reynolds number of each case was calculated based on the inlet velocity U0subscript𝑈0U_{0}italic_U start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and the cylinder’s diameter d𝑑ditalic_d:

Re=U0dν.𝑅𝑒subscript𝑈0𝑑𝜈Re=\frac{U_{0}d}{\nu}.italic_R italic_e = divide start_ARG italic_U start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_d end_ARG start_ARG italic_ν end_ARG . (16)

The values considered in MLBM simulations are Re=105,126,147,168𝑅𝑒105126147168Re=105,126,147,168italic_R italic_e = 105 , 126 , 147 , 168. We increase Re𝑅𝑒Reitalic_R italic_e by decreasing the streaming distance length from δx=1.33103𝛿𝑥1.33superscript103\delta x=1.33\cdot 10^{-3}italic_δ italic_x = 1.33 ⋅ 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT for Re=105𝑅𝑒105Re=105italic_R italic_e = 105 to δx=8.33104𝛿𝑥8.33superscript104\delta x=8.33\cdot 10^{-4}italic_δ italic_x = 8.33 ⋅ 10 start_POSTSUPERSCRIPT - 4 end_POSTSUPERSCRIPT for Re=168𝑅𝑒168Re=168italic_R italic_e = 168. The relaxation time is τ+=0.53superscript𝜏0.53\tau^{+}=0.53italic_τ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT = 0.53 and the diffusive scaling of the timestep is used.

For the reference in timings analysis we perform a series of standard LBM simulations with the same initial and boundary conditions as in the MLBM setup. The Reynolds numbers considered in LBM simulations are Re=84,105,126,147,168𝑅𝑒84105126147168Re=84,105,126,147,168italic_R italic_e = 84 , 105 , 126 , 147 , 168. The lattice parameter for Re=84𝑅𝑒84Re=84italic_R italic_e = 84 is δx=1/200𝛿𝑥1200\delta x=1/200italic_δ italic_x = 1 / 200 and the increase in Re𝑅𝑒Reitalic_R italic_e is achieved by the appropriate decrease in δx𝛿𝑥\delta xitalic_δ italic_x, such that Re=168𝑅𝑒168Re=168italic_R italic_e = 168 yields the value δx=1/400𝛿𝑥1400\delta x=1/400italic_δ italic_x = 1 / 400. The relaxation time is τ+=0.51superscript𝜏0.51\tau^{+}=0.51italic_τ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT = 0.51 which is close to the stability limit. Also here we use the diffusive scaling of the timestep. We note that MLBM setups use higher value of τ+superscript𝜏\tau^{+}italic_τ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT due to stability issues.

We investigate the onset of inertial effects in the flow qualitatively by observing the appearance of the Kármán vortex street in the obstacle’s wake and quantitatively via the increase of the Strouhal number St𝑆𝑡Stitalic_S italic_t:

St=fdU0,𝑆𝑡𝑓𝑑subscript𝑈0St=\frac{fd}{U_{0}},italic_S italic_t = divide start_ARG italic_f italic_d end_ARG start_ARG italic_U start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG , (17)

where f𝑓fitalic_f is the frequency of the wake vortices shedding.

Refer to caption
Refer to caption
Figure 5: Streamlines of the velocity fields of the flow over a cylindrical obstacle for Reynolds numbers 84 (LBM results) and 168 (MLBM result).

Fig. 5 shows velocity streamlines of the channel flow for two extreme Reynolds numbers: Re=84𝑅𝑒84Re=84italic_R italic_e = 84 obtained with LBM and Re=168𝑅𝑒168Re=168italic_R italic_e = 168 obtained with MLBM. The onset of inertial effects is visible as the distance between the repeating velocity field patterns in the obstacle’s wake decreases.

Refer to caption
Figure 6: Strouhal number values achieved in flows around a cylindrical obstacle as a function of the flow’s Reynolds number. The dashed part of the reference line for Norberg et al. [40] indicates the region where irregular shedding took place.

Fig. 6 shows the values of the Strouhal number St𝑆𝑡Stitalic_S italic_t against the Reynolds number of the flows. The present results are in good agreement with a fit to the experimental data for the cylinder’s diameter-to-length ratio equal to 2000 [40] and earlier experimental works [41, 42, 43].

Let us now move to the timings analysis. For this comparison, LBM and MLBM codes are executed single-threaded on 2x Intel Xeon E5520 with 12GB RAM machine. Fig. 7 shows the CPU time needed to simulate 1s of vortex shedding. For both methods, the execution time increases with Re𝑅𝑒Reitalic_R italic_e. This is caused by the diffusive scaling of the timestep with the streaming distance δx𝛿𝑥\delta xitalic_δ italic_x and the decreasing value of δx𝛿𝑥\delta xitalic_δ italic_x with the increasing Re𝑅𝑒Reitalic_R italic_e. This means that more timesteps is needed to reach 1s in the simulation for larger Re𝑅𝑒Reitalic_R italic_e. Nevertheless, for MLBM the execution time rises slower than for LBM (𝒪(Re2)𝒪𝑅superscript𝑒2\mathcal{O}(Re^{2})caligraphic_O ( italic_R italic_e start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) vs. 𝒪(Re4)𝒪𝑅superscript𝑒4\mathcal{O}(Re^{4})caligraphic_O ( italic_R italic_e start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT )), since in the former it is only the timestep number that increases with Re𝑅𝑒Reitalic_R italic_e. For LBM, on the other hand, not only the timestep number increases with Re𝑅𝑒Reitalic_R italic_e, but also the total number of nodes in the domain does. For the highest value of Re𝑅𝑒Reitalic_R italic_e considered here, MLBM performs 3similar-toabsent3\sim 3∼ 3 times faster than LBM.

Refer to caption
Figure 7: CPU time needed to execute 1s of the vortex shedding for LBM and MLBM in a single threaded run for various Reynolds numbers. Both axes use logarithmic scale. The dashed lines denote the 2nd and the 4th order slopes.

To analyze the sources of MLBM speedup, we consider the CPU time needed to perform 103superscript10310^{3}10 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT iterations in a single-threaded run on the same machine as before, shown in Fig. 8. The data are presented for LBM and for three scenarios of MLBM. The first scenario uses fully scattered discretization with RBF-FD approximation at each Lagrangian node, with uniform in space internodal distance equal to 1/20012001/2001 / 200. In the second scenario we exchange the scattered discretization away from the cylinder and its wake for a regular points arrangement. We also use biquadratic Lagrange interpolation for the Lagrangian nodes whose closest Eulerian neighbors form a regular lattice. The third scenario additionally features coarser discretization away from the cylinder and is the one used in the calculation of the Strouhal number. The number of nodes in the LBM and MLBM setups is summarized in Table 1. The presented execution time increases with Re𝑅𝑒Reitalic_R italic_e for LBM because of the increasing number of nodes in the domain when the streaming distance δx𝛿𝑥\delta xitalic_δ italic_x is decreased. This is in contrast to MLBM, when the increase of Re𝑅𝑒Reitalic_R italic_e is achieved only by decreasing the value of δx𝛿𝑥\delta xitalic_δ italic_x, with no additional nodes inserted in the domain. Due to this, the execution time stays constant for each scenario across the considered Reynolds numbers. Further, the execution times differ between the MLBM scenarios. About 40%percent4040\%40 % speedup between scenario 1 and 2 is achieved by switching from RBF-FD approximation with NL=25subscript𝑁𝐿25N_{L}=25italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT = 25 nodes to the computationally cheaper biquadratic Lagrange interpolation away from the cylinder. It is achieved in spite of the increase of the total number of nodes by about 10%percent1010\%10 %. Decreasing the discretization density away from the cylinder accounts for the subsequent 86%percent8686\%86 % speedup between scenario 2 and 3.

Refer to caption
Figure 8: CPU time needed to execute 103superscript10310^{3}10 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT timesteps of LBM and MLBM in a single threaded run for various Reynolds numbers. The visualizations of the discretizations of the three MLBM scenarios are shown on the right hand side. Note that two uniform visualizations show the discretizations coarser than h=1/2001200h=1/200italic_h = 1 / 200 for clarity.
Table 1: Number of nodes N𝑁Nitalic_N discretizing the domain in the LBM and MLBM simulations of the Kármán vortex street.
LBM case N𝑁Nitalic_N
Re=84𝑅𝑒84Re=84italic_R italic_e = 84 119988119988119988119988
Re=105𝑅𝑒105Re=105italic_R italic_e = 105 187290187290187290187290
Re=126𝑅𝑒126Re=126italic_R italic_e = 126 269522269522269522269522
Re=147𝑅𝑒147Re=147italic_R italic_e = 147 366668366668366668366668
Re=168𝑅𝑒168Re=168italic_R italic_e = 168 478740478740478740478740
MLBM case N𝑁Nitalic_N
scattered uniform 104825104825104825104825
hybrid uniform 115185115185115185115185
hybrid coarsened 10117101171011710117

3.2 Application of MLBM1 in the lid-driven cavity

The flow domain is a [0,1]×[0,1]0101[0,1]\times[0,1][ 0 , 1 ] × [ 0 , 1 ] square (Fig. 9). We use irregular discretizations with the Eulerian discretization parameter h=1/1601160h=1/160italic_h = 1 / 160 uniform in space. We set the top lid velocity to ulb(x,1)=U0,lb(1(2x1)24)subscript𝑢𝑙𝑏𝑥1subscript𝑈0𝑙𝑏1superscript2superscript𝑥124u_{lb}(x,1)=U_{0,{lb}}(1-(2x^{*}-1)^{24})italic_u start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT ( italic_x , 1 ) = italic_U start_POSTSUBSCRIPT 0 , italic_l italic_b end_POSTSUBSCRIPT ( 1 - ( 2 italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - 1 ) start_POSTSUPERSCRIPT 24 end_POSTSUPERSCRIPT ) where U0,lb=0.1subscript𝑈0𝑙𝑏0.1U_{0,{lb}}=0.1italic_U start_POSTSUBSCRIPT 0 , italic_l italic_b end_POSTSUBSCRIPT = 0.1 is the reference velocity in LB units and x=min(max(0,x),1)superscript𝑥0𝑥1x^{*}=\min(\max(0,x),1)italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = roman_min ( roman_max ( 0 , italic_x ) , 1 ) is the truncated x𝑥xitalic_x-coordinate. The no-slip boundary condition on the left, right, and bottom walls is implemented via multireflection bounceback [44]. The unknown populations at the top boundary are equilibrium distributions parametrized with ρlb=1subscript𝜌𝑙𝑏1\rho_{lb}=1italic_ρ start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT = 1 and velocities: ulb(x,1)subscript𝑢𝑙𝑏𝑥1u_{lb}(x,1)italic_u start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT ( italic_x , 1 ) and vlb(x,1)=0subscript𝑣𝑙𝑏𝑥10v_{lb}(x,1)=0italic_v start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT ( italic_x , 1 ) = 0. We introduce xsuperscript𝑥x^{*}italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT to be able to calculate fkeqsubscriptsuperscript𝑓eq𝑘f^{\text{eq}}_{k}italic_f start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT streamed from x<0𝑥0x<0italic_x < 0 or x>1𝑥1x>1italic_x > 1. To increase the Reynolds number, we scale the positions of Eulerian points by the factor of αE=1,4,10,32,50subscript𝛼𝐸14103250\alpha_{E}=1,4,10,32,50italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 1 , 4 , 10 , 32 , 50. Along with the streaming distance δx=6103𝛿𝑥6superscript103\delta x=6\cdot 10^{-3}italic_δ italic_x = 6 ⋅ 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT and physical viscosity ν=1𝜈1\nu=1italic_ν = 1, this gives Reynolds numbers based on the square side length and top lid velocity equal to 100,400,1000,320010040010003200100,400,1000,3200100 , 400 , 1000 , 3200 and 5000500050005000. We initialize the simulation with equilibrium distributions parametrized with zero velocity and unit density and iterate the simulation until the relative change of the x𝑥xitalic_x-component of the velocity between two timesteps defined as:

Δu=maxi(|uin+1uin+1|)U0Δ𝑢subscript𝑖superscriptsubscript𝑢𝑖𝑛1superscriptsubscript𝑢𝑖𝑛1subscript𝑈0\Delta u=\frac{\max\limits_{i}\left(\left|u_{i}^{n+1}-u_{i}^{n+1}\right|\right% )}{U_{0}}roman_Δ italic_u = divide start_ARG roman_max start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( | italic_u start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n + 1 end_POSTSUPERSCRIPT - italic_u start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n + 1 end_POSTSUPERSCRIPT | ) end_ARG start_ARG italic_U start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_ARG (18)

falls below 1012superscript101210^{-12}10 start_POSTSUPERSCRIPT - 12 end_POSTSUPERSCRIPT or up to 61066superscript1066\cdot 10^{6}6 ⋅ 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT (for αE=32subscript𝛼𝐸32\alpha_{E}=32italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 32) or 1210612superscript10612\cdot 10^{6}12 ⋅ 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT (for αE=50subscript𝛼𝐸50\alpha_{E}=50italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 50) iterations. By uinsuperscriptsubscript𝑢𝑖𝑛u_{i}^{n}italic_u start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT we mean the value of the x𝑥xitalic_x-component of the velocity at the i𝑖iitalic_i-th Eulerian node at the n𝑛nitalic_n-th timestep.

Refer to caption
Figure 9: The irregular discretization used in the simulations of the flow in a driven cavity.
Refer to caption
Figure 10: Profiles of u𝑢uitalic_u and v𝑣vitalic_v components of velocity along the lines x=0.5𝑥0.5x=0.5italic_x = 0.5 and y=0.5𝑦0.5y=0.5italic_y = 0.5 for the driven cavity flow. Small symbols denote MLBM1 results, larger squares denote the reference solution by Ghia [45]. Colors indicate the Reynolds number.
Refer to caption
Figure 11: Positions of the centers of the vortices in the driven cavity. Left: the primary vortex, center: the secondary bottom left vortex and right: the secondary bottom right vortex. The dotted line denotes the sequence of increasing Re𝑅𝑒Reitalic_R italic_e: 100,400,1000,320010040010003200100,400,1000,3200100 , 400 , 1000 , 3200. The MLBM1 data are compared with the results of Ghia [45].
Refer to caption
Refer to caption
Figure 12: Paths traversed by the centers of the primary vortex (left) and the secondary bottom left vortex (right), Re=5000𝑅𝑒5000Re=5000italic_R italic_e = 5000. The MLBM1 data are compared with the vortices centers reported by Ghia [45].

Figure 10 shows the normalized u𝑢uitalic_u and v𝑣vitalic_v component of velocity profiles along the centerlines of the cavity (x=0.5𝑥0.5x=0.5italic_x = 0.5, y=0.5𝑦0.5y=0.5italic_y = 0.5 after rescaling the domain by 1/αE1subscript𝛼𝐸1/\alpha_{E}1 / italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT). Small circles denote MLBM1 results and are compared with the results of Ghia [45] denoted by larger squares. A good compliance between the two methods is visible, especially in the middle of the primary vortex. The difference in the velocity profiles increases closer to the boundaries, possibly due to large velocity gradients and shear stresses. Fig. 11 shows the centers of the primary vortex and the secondary vortices in the bottom left and bottom right corners of the cavity. The compliance between MLBM1 results and Ghia’s solution [45] is within about 1% of the domain side length. As the Re=5000𝑅𝑒5000Re=5000italic_R italic_e = 5000 case exhibited an oscillatory behavior of slowly decreasing amplitude, we plot the path of the primary vortex and the secondary vortex in the bottom left corner of the cavity in Fig. 12. After 1210612superscript10612\cdot 10^{6}12 ⋅ 10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT iterations, corresponding to t=72𝑡72t=72italic_t = 72, the centers of the vortices oscillate around points close to those reported by Ghia [45]. We note that to obtain a stable solution in Re=5000𝑅𝑒5000Re=5000italic_R italic_e = 5000 case, we needed to increase the order of augmentation in the meshless approximation to the fourth and the number of stencil members to NL=70subscript𝑁𝐿70N_{L}=70italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT = 70. Despite the stability, spurious shockwaves propagating from the right boundary could be seen throughout the simulation.

3.3 Inertial effects in a flow through a porous sample

Refer to caption
Refer to caption
Figure 13: The point cloud used for the simulations of a flow through a porous sample: the whole discretized domain and a zoom at the discretization detail (right).

We investigate the onset of inertial effects in a flow through a porous medium (see Fig. 13) of porosity ϕ=0.64italic-ϕ0.64\phi=0.64italic_ϕ = 0.64 with MLBM1. We use irregular discretization with the Eulerian discretization parameter h=51035superscript103h=5\cdot 10^{-3}italic_h = 5 ⋅ 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT. The no-slip boundary condition is imposed on the top and bottom boundary and the circular obstacles’ surfaces using the multireflection bounce back rule [44]. The periodic boundary condition is imposed on the left and right boundary by periodic search of the stencil members. The flow is forced by a constant acceleration of value glb=107subscript𝑔𝑙𝑏superscript107g_{lb}=10^{-7}italic_g start_POSTSUBSCRIPT italic_l italic_b end_POSTSUBSCRIPT = 10 start_POSTSUPERSCRIPT - 7 end_POSTSUPERSCRIPT along the x𝑥xitalic_x direction. The meshless discretization of the domain corresponds to its size of 1×1111\times 11 × 1. The investigated values of the scaling parameter αEsubscript𝛼𝐸\alpha_{E}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT range from 1111 to 17171717. We note that for αE=18subscript𝛼𝐸18\alpha_{E}=18italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 18, we could not obtain a stable solution despite increasing the stencil size to NL=70subscript𝑁𝐿70N_{L}=70italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT = 70 and using fourth-order polynomial augmentation. The streaming distance length is chosen to be δx=2.5103𝛿𝑥2.5superscript103\delta x=2.5\cdot 10^{-3}italic_δ italic_x = 2.5 ⋅ 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT, timestep length is δt=1.04106𝛿𝑡1.04superscript106\delta t=1.04\cdot 10^{-6}italic_δ italic_t = 1.04 ⋅ 10 start_POSTSUPERSCRIPT - 6 end_POSTSUPERSCRIPT and the kinematic viscosity is ν=1𝜈1\nu=1italic_ν = 1. The Reynolds number for each case was calculated based on the mean flow velocity in the x𝑥xitalic_x-direction udelimited-⟨⟩𝑢\langle u\rangle⟨ italic_u ⟩:

u=1ϕD𝑑Ωu(𝒙),delimited-⟨⟩𝑢1italic-ϕsubscript𝐷differential-dΩ𝑢𝒙\langle u\rangle=\frac{1}{\phi}\int\limits_{D}d\Omega\>u(\boldsymbol{x}),⟨ italic_u ⟩ = divide start_ARG 1 end_ARG start_ARG italic_ϕ end_ARG ∫ start_POSTSUBSCRIPT italic_D end_POSTSUBSCRIPT italic_d roman_Ω italic_u ( bold_italic_x ) , (19)

sample’s side length equal to αEsubscript𝛼𝐸\alpha_{E}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT and kinematic viscosity ν𝜈\nuitalic_ν.

We show the visualization of the velocity field for three chosen values of αEsubscript𝛼𝐸\alpha_{E}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT in Fig. 14. As the Reynolds number increases, recirculation zones grow causing the streamlines to separate from the grains’ wakes.

Refer to caption
Refer to caption
Refer to caption
Figure 14: Streamlines of the velocity field of flows through the porous domain scaled with the factors αE=1,8,17subscript𝛼𝐸1817\alpha_{E}=1,8,17italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 1 , 8 , 17 (Re=0.76𝑅𝑒0.76Re=0.76italic_R italic_e = 0.76, 305305305305, and 1511151115111511, respectively). The points coordinates were rescaled by a factor of 1/αE1subscript𝛼𝐸1/\alpha_{E}1 / italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT prior to plotting.

The inertial effects appearance can be quantified in terms of kinetic energy distribution in the system [5]. The left plot in Fig. 15 shows the fraction of the kinetic energy defined as ek(𝒙)=𝒖2(𝒙)subscript𝑒𝑘𝒙superscript𝒖2𝒙e_{k}(\boldsymbol{x})=\boldsymbol{u}^{2}(\boldsymbol{x})italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( bold_italic_x ) = bold_italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_italic_x ) confined in vortices to the total kinetic energy in the system:

𝒙:Q<0ek(𝒙)𝑑Ωek(𝒙)𝑑Ω.subscript:𝒙𝑄0subscript𝑒𝑘𝒙differential-dΩsubscript𝑒𝑘𝒙differential-dΩ\frac{\displaystyle\int\limits_{\boldsymbol{x}\>:\>Q<0}e_{k}(\boldsymbol{x})\>% d\Omega}{\displaystyle\int e_{k}(\boldsymbol{x})\>d\Omega}\>.divide start_ARG ∫ start_POSTSUBSCRIPT bold_italic_x : italic_Q < 0 end_POSTSUBSCRIPT italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( bold_italic_x ) italic_d roman_Ω end_ARG start_ARG ∫ italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( bold_italic_x ) italic_d roman_Ω end_ARG . (20)

We use the Q-criterion to define a vortex as a volume where:

Q=12(|S|2|Ω|2)<0, where S=12(uixj+ujxi),Ω=12(uixjujxi)formulae-sequence𝑄12superscript𝑆2superscriptΩ20formulae-sequence where 𝑆12subscript𝑢𝑖subscript𝑥𝑗subscript𝑢𝑗subscript𝑥𝑖Ω12subscript𝑢𝑖subscript𝑥𝑗subscript𝑢𝑗subscript𝑥𝑖Q=\frac{1}{2}\left(|S|^{2}-|\Omega|^{2}\right)<0,\text{ where }S=\frac{1}{2}% \left(\frac{\partial u_{i}}{\partial x_{j}}+\frac{\partial u_{j}}{\partial x_{% i}}\right),\>\Omega=\frac{1}{2}\left(\frac{\partial u_{i}}{\partial x_{j}}-% \frac{\partial u_{j}}{\partial x_{i}}\right)italic_Q = divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( | italic_S | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - | roman_Ω | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) < 0 , where italic_S = divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( divide start_ARG ∂ italic_u start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_ARG ∂ italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG + divide start_ARG ∂ italic_u start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG ∂ italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG ) , roman_Ω = divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( divide start_ARG ∂ italic_u start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_ARG ∂ italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG - divide start_ARG ∂ italic_u start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG ∂ italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG ) (21)

and |A|2=i,jAij2superscript𝐴2subscript𝑖𝑗superscriptsubscript𝐴𝑖𝑗2|A|^{2}=\sum\limits_{i,j}A_{ij}^{2}| italic_A | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ∑ start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. By uisubscript𝑢𝑖u_{i}italic_u start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT we mean the velocity components (i.e. u0usubscript𝑢0𝑢u_{0}\equiv uitalic_u start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ≡ italic_u and u1vsubscript𝑢1𝑣u_{1}\equiv vitalic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ≡ italic_v). In the range of Re=100𝑅𝑒100Re=100italic_R italic_e = 100700700700700 a rapid increase in the value of the fraction from Eq. (20) occurs.

We can relate this to the transition in the behavior of the udelimited-⟨⟩𝑢\langle u\rangle⟨ italic_u ⟩. In the Darcy regime (inertial effects negligible), udelimited-⟨⟩𝑢\langle u\rangle⟨ italic_u ⟩ is proportional to the pressure loss between the inlet and the outlet of the sample:

u=kΔPLμ=kgνdelimited-⟨⟩𝑢𝑘Δ𝑃𝐿𝜇𝑘𝑔𝜈\langle u\rangle=-k\frac{\Delta P}{L\mu}=-k\frac{g}{\nu}⟨ italic_u ⟩ = - italic_k divide start_ARG roman_Δ italic_P end_ARG start_ARG italic_L italic_μ end_ARG = - italic_k divide start_ARG italic_g end_ARG start_ARG italic_ν end_ARG (22)

where k𝑘kitalic_k is the sample’s permeability, L𝐿Litalic_L is the sample length, and the last equality holds if we relate the acceleration g𝑔gitalic_g forcing the flow to the mean pressure drop via ρg=ΔP/L𝜌𝑔Δ𝑃𝐿\rho g=\Delta P/Litalic_ρ italic_g = roman_Δ italic_P / italic_L. Next, Eq. (22) needs to be related to the changes in the size of the sample since this is the way of increasing Re𝑅𝑒Reitalic_R italic_e in our study. If in Eq. (13) we omit the pressure gradient term and change the reference length from L𝐿Litalic_L to αELsubscript𝛼𝐸𝐿\alpha_{E}Litalic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT italic_L, we find that the only difference between the two systems (the one rescaled by αEsubscript𝛼𝐸\alpha_{E}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT and the non-rescaled one) is in the term Re=UL/ν=L2/Tν𝑅𝑒𝑈𝐿𝜈superscript𝐿2𝑇𝜈Re=UL/\nu=L^{2}/T\nuitalic_R italic_e = italic_U italic_L / italic_ν = italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT / italic_T italic_ν. This means that a system scaled by a factor of αEsubscript𝛼𝐸\alpha_{E}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT behaves as if it was not scaled, but the viscosity was changed by a factor of αE2superscriptsubscript𝛼𝐸2\alpha_{E}^{-2}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT (due to the term L2superscript𝐿2L^{2}italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT in the denominator of Re𝑅𝑒Reitalic_R italic_e). Inserting this relation into Eq. (22) gives:

ukgναE2αE2proportional-todelimited-⟨⟩𝑢𝑘𝑔𝜈superscriptsubscript𝛼𝐸2proportional-tosuperscriptsubscript𝛼𝐸2\langle u\rangle\propto k\frac{g}{\nu\alpha_{E}^{-2}}\propto\alpha_{E}^{2}⟨ italic_u ⟩ ∝ italic_k divide start_ARG italic_g end_ARG start_ARG italic_ν italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT end_ARG ∝ italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT (23)

A similar procedure applied in the Forchheimer regime (leaving only the second order term u2superscriptdelimited-⟨⟩𝑢2\langle u\rangle^{2}⟨ italic_u ⟩ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT) yields:

u2αE2.proportional-tosuperscriptdelimited-⟨⟩𝑢2superscriptsubscript𝛼𝐸2\langle u\rangle^{2}\propto\alpha_{E}^{2}.⟨ italic_u ⟩ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ∝ italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT . (24)

Presenting Eqs. (23) and (24) in logarithmic scale gives:

logu{2logαE,Darcy regimelogαE,Forchheimer regimeproportional-to𝑢cases2subscript𝛼𝐸Darcy regimesubscript𝛼𝐸Forchheimer regime\log\langle u\rangle\propto\begin{cases}2\log\alpha_{E},&\text{Darcy regime}\\ \log\alpha_{E},&\text{Forchheimer regime}\\ \end{cases}roman_log ⟨ italic_u ⟩ ∝ { start_ROW start_CELL 2 roman_log italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT , end_CELL start_CELL Darcy regime end_CELL end_ROW start_ROW start_CELL roman_log italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT , end_CELL start_CELL Forchheimer regime end_CELL end_ROW (25)

which is visible in the right plot of Fig. 15. The transition between the second and the first order relations, denoted by dashed grey lines in Fig. 15, occurs at the scaling factor αE7subscript𝛼𝐸7\alpha_{E}\approx 7italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT ≈ 7 for which Re25𝑅𝑒25Re\approx 25italic_R italic_e ≈ 25 (see the inset in the left subplot of Fig. 15) and corresponds well with the predictions of the onset of inertial effects from the vortices’ kinetic energy approach.

Refer to caption
Refer to caption
Figure 15: Left: the fraction of the kinetic energy confined in the Q<0𝑄0Q<0italic_Q < 0 volumes as a function of the Reynolds number. The inset shows the relation between the Reynolds number and the scaling factor αEsubscript𝛼𝐸\alpha_{E}italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT. Right: mean x𝑥xitalic_x-component velocity defined by Eq.(19) as a function of the scaling factor. The dashed lines denote the 1st and the 2nd order slopes.

Computational complexity and memory demands of the MLBM1 algorithm

Matyka and Dzikowski [18] derived the expression for the memory demands of their algorithm, which yields:

MLBM1=NLBM12Nfcsubscript𝑀LBM1subscript𝑁LBM12subscript𝑁𝑓𝑐M_{\text{LBM1}}=N_{\text{LBM1}}\cdot 2N_{f}citalic_M start_POSTSUBSCRIPT LBM1 end_POSTSUBSCRIPT = italic_N start_POSTSUBSCRIPT LBM1 end_POSTSUBSCRIPT ⋅ 2 italic_N start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT italic_c (26)

where NLBM1subscript𝑁LBM1N_{\text{LBM1}}italic_N start_POSTSUBSCRIPT LBM1 end_POSTSUBSCRIPT is the number of nodes in the discretization, Nfsubscript𝑁𝑓N_{f}italic_N start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT is the number of macroscopic fields (in our case 3 - ρ𝜌\rhoitalic_ρ,u𝑢uitalic_u and v𝑣vitalic_v), and c𝑐citalic_c is the number of bytes per single number (e.g. 8 for double). MLBM1 uses additional memory to store the interpolation weight vectors for each of the N(q1)𝑁𝑞1N(q-1)italic_N ( italic_q - 1 ) Lagrangian nodes (the term 11-1- 1 appears because no interpolation is needed for the populations with zero microscopic velocity). Assuming that each interpolation stencil has the same number of nodes NLsubscript𝑁𝐿N_{L}italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT, this results in the memory demands for MLBM1:

MMLBM1=NMLBM1(2Nf+NL(q1))csubscript𝑀MLBM1subscript𝑁MLBM12subscript𝑁𝑓subscript𝑁𝐿𝑞1𝑐M_{\text{MLBM1}}=N_{\text{MLBM1}}\cdot(2N_{f}+N_{L}(q-1))citalic_M start_POSTSUBSCRIPT MLBM1 end_POSTSUBSCRIPT = italic_N start_POSTSUBSCRIPT MLBM1 end_POSTSUBSCRIPT ⋅ ( 2 italic_N start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT + italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ( italic_q - 1 ) ) italic_c (27)

where NMLBM1subscript𝑁MLBM1N_{\text{MLBM1}}italic_N start_POSTSUBSCRIPT MLBM1 end_POSTSUBSCRIPT is the number of Eulerian nodes. Taking the ratio of the two gives:

rM=MMLBM1MLBM1=NMLBM1NLBM12Nf+NL(q1)2Nf=NMLBM1NLBM1(1+NL(q1)2Nf)subscript𝑟𝑀subscript𝑀MLBM1subscript𝑀LBM1subscript𝑁MLBM1subscript𝑁LBM12subscript𝑁𝑓subscript𝑁𝐿𝑞12subscript𝑁𝑓subscript𝑁MLBM1subscript𝑁LBM11subscript𝑁𝐿𝑞12subscript𝑁𝑓r_{M}=\frac{M_{\text{MLBM1}}}{M_{\text{LBM1}}}=\frac{N_{\text{MLBM1}}}{N_{% \text{LBM1}}}\cdot\frac{2N_{f}+N_{L}(q-1)}{2N_{f}}=\frac{N_{\text{MLBM1}}}{N_{% \text{LBM1}}}\cdot\left(1+\frac{N_{L}(q-1)}{2N_{f}}\right)italic_r start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT = divide start_ARG italic_M start_POSTSUBSCRIPT MLBM1 end_POSTSUBSCRIPT end_ARG start_ARG italic_M start_POSTSUBSCRIPT LBM1 end_POSTSUBSCRIPT end_ARG = divide start_ARG italic_N start_POSTSUBSCRIPT MLBM1 end_POSTSUBSCRIPT end_ARG start_ARG italic_N start_POSTSUBSCRIPT LBM1 end_POSTSUBSCRIPT end_ARG ⋅ divide start_ARG 2 italic_N start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT + italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ( italic_q - 1 ) end_ARG start_ARG 2 italic_N start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT end_ARG = divide start_ARG italic_N start_POSTSUBSCRIPT MLBM1 end_POSTSUBSCRIPT end_ARG start_ARG italic_N start_POSTSUBSCRIPT LBM1 end_POSTSUBSCRIPT end_ARG ⋅ ( 1 + divide start_ARG italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ( italic_q - 1 ) end_ARG start_ARG 2 italic_N start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT end_ARG ) (28)

In our case, Nf=3subscript𝑁𝑓3N_{f}=3italic_N start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT = 3, q=9𝑞9q=9italic_q = 9, and for most of the presented cases, NL=25subscript𝑁𝐿25N_{L}=25italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT = 25. Considering that sparser discretizations can be used in the meshless algorithm, it turns out that MLBM1 can outperform LBM1 regarding memory demands for the same streaming distance value. For example, in the driven cavity flow, we used δx=6103𝛿𝑥6superscript103\delta x=6\cdot 10^{-3}italic_δ italic_x = 6 ⋅ 10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT, which would give the number of nodes in the LBM1 setup ranging from 2.81042.8superscript1042.8\cdot 10^{4}2.8 ⋅ 10 start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT for αE=1subscript𝛼𝐸1\alpha_{E}=1italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 1 to 71077superscript1077\cdot 10^{7}7 ⋅ 10 start_POSTSUPERSCRIPT 7 end_POSTSUPERSCRIPT for αE=50subscript𝛼𝐸50\alpha_{E}=50italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 50. At the same time the term in brackets in Eq. (28) is equal to 34343434 for αE<50subscript𝛼𝐸50\alpha_{E}<50italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT < 50 and 94949494 for αE=50subscript𝛼𝐸50\alpha_{E}=50italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 50 (due to the change of NLsubscript𝑁𝐿N_{L}italic_N start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT for αE=50subscript𝛼𝐸50\alpha_{E}=50italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 50). This means that for the MLBM1 domain with N=22615𝑁22615N=22615italic_N = 22615 Eulerian nodes, the ratio rMsubscript𝑟𝑀r_{M}italic_r start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ranges from 27272727 times more memory in the case of MLBM1 for αE=1subscript𝛼𝐸1\alpha_{E}=1italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 1 to 32323232 times more memory in the case of LBM1 for αE=50subscript𝛼𝐸50\alpha_{E}=50italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 50. This shows the possibility of considerably reducing the number of nodes in LBM1 models with an off-grid (especially meshless) streaming procedure. When a proper local refinement strategy is used and less computationally intensive approximation scheme is employed e.g. in the bulk of the flow, as shown in Section 3.1, the memory savings may come at no cost to the solution’s accuracy.

The computational complexity of MLBM1 carries the overhead of interpolation during the streaming step. To exploit the absence of explicit storage of the distribution function in the memory, the collision (i.e., the calculation of fkeqsubscriptsuperscript𝑓eq𝑘f^{\text{eq}}_{k}italic_f start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT) must take place in Lagrangian nodes, and the macroscopic fields must be interpolated to those points. Otherwise (if the collision took place in Eulerian nodes), fkeqsubscriptsuperscript𝑓eq𝑘f^{\text{eq}}_{k}italic_f start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT would have to be stored in memory between its calculation and interpolation. This means that for each Eulerian point, one needs qOcol+(q1)NfOint𝑞subscript𝑂col𝑞1subscript𝑁𝑓subscript𝑂intqO_{\text{col}}+(q-1)N_{f}O_{\text{int}}italic_q italic_O start_POSTSUBSCRIPT col end_POSTSUBSCRIPT + ( italic_q - 1 ) italic_N start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT italic_O start_POSTSUBSCRIPT int end_POSTSUBSCRIPT operations per timestep, where Ocolsubscript𝑂colO_{\text{col}}italic_O start_POSTSUBSCRIPT col end_POSTSUBSCRIPT is the number of operations needed to calculate fkeqsubscriptsuperscript𝑓eq𝑘f^{\text{eq}}_{k}italic_f start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT and Ointsubscript𝑂intO_{\text{int}}italic_O start_POSTSUBSCRIPT int end_POSTSUBSCRIPT is the number of operations needed to interpolate a scalar within a stencil (note that for k=0𝑘0k=0italic_k = 0 one does not need to interpolate, but still the calculation of f0eqsubscriptsuperscript𝑓eq0f^{\text{eq}}_{0}italic_f start_POSTSUPERSCRIPT eq end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is necessary). This shows that the savings in memory for q𝑞qitalic_q distributions come at the cost of Nfsubscript𝑁𝑓N_{f}italic_N start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT times more interpolations which have to be done for each Lagrangian node, compared to MLBM (in MLBM, one interpolates only one population to each Lagrangian point).

4 Discussion

The decoupled space and velocity discretizations in MLBM change the paradigm of increasing the model’s size typical for LBM and make it more similar to the one of the Navier-Stokes solvers. In LBM, the information about the characteristic size of the problem is encoded in the number of lattice links/nodes that discretize this size. So is the accuracy of the approximation in space. In MLBM, the information about the system’s size is stored in the ratio between the streaming distance and the distance between Eulerian points representing the characteristic length. The approximation accuracy in space depends on the distances between the approximation stencil members. It is invariant under their translation, rotation, and scaling by a constant as long as the approximated function is devoid of discontinuities. Consequently, in MLBM, the velocity discretization parameters are determined by the desired compressibility errors. In contrast, the space discretization parameters are determined by the flow domain’s geometry and the desired approximation accuracy in space. This is interpreted as hypervolumes in the space of those two groups of discretization parameters accessible to MLBM, where each point might be assigned its Reynolds number, in contrast to hypersurfaces in the case of LBM.

We chose the presented way of increasing the Reynolds number in MLBM to make the procedure as similar to the LBM approach as possible and highlight how the two methods relate to each other. This allows us to state the following. First, the qualitative results show that the described approach causes the Reynolds number to increase, in line with the findings of previous authors [22]. It also suggests that the propagation of information in the model based on the approximation, rather than the exact streaming, does not impair the physical mechanism of mass and momentum transfer in the discrete Boltzmann equation. Large streaming distances compared to the spacings between the Eulerian points correspond to fast momentum diffusion, small δx/h𝛿𝑥\delta x/hitalic_δ italic_x / italic_h ratios - on the contrary. Second, quantitative analysis underlines the importance of the approximation accuracy for the solution. At the same time, the presented discretization parameters may serve as a crude estimation of the method’s stability limits along with the aids to mitigate the divergence.

The accuracy and convergence of MLBM and other off-grid LBM methods were investigated in numerous works [27, 46, 20], especially in application to inertial flows in works by Lin and others [47] or Musavi and Ashrafizaadeh [48, 28]. According to the findings by, e.g., Krämer and others [46], the decreasing ratio of δx/h𝛿𝑥\delta x/hitalic_δ italic_x / italic_h introduces larger approximation errors in OLBM. When semi-Lagrangian streaming is used, this has directly to do with the error term 𝒪(hp+1/δx)𝒪superscript𝑝1𝛿𝑥\mathcal{O}\left(h^{p+1}/\delta x\right)caligraphic_O ( italic_h start_POSTSUPERSCRIPT italic_p + 1 end_POSTSUPERSCRIPT / italic_δ italic_x ) characteristic for semi-Lagrangian methods, as suggested in our previous work [20]. To prevent the divergence of this term, for the growing hhitalic_h, one would need to either increase δx𝛿𝑥\delta xitalic_δ italic_x, which would again decrease the size of the domain, or increase the accuracy of the approximation. We presented this procedure in two tests. First, we used regular space discretization in the far-field subdomain of the flow past a cylinder, in contrast to an irregular discretization around the cylinder. Second, in the driven cavity test for the highest Reynolds number (Re=5000𝑅𝑒5000Re=5000italic_R italic_e = 5000), the augmentation order and the number of stencil members were increased compared to the cases with lower Re𝑅𝑒Reitalic_R italic_e to mitigate the divergence of the simulation. However, the mentioned flow through a porous medium for αE=18subscript𝛼𝐸18\alpha_{E}=18italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT = 18 shows that increasing the approximation order or the stencil size does not always guarantee stability retrieval. In such cases, the Eulerian discretization refinement should be used. On the other hand, from the LBM error point of view, decreasing the δx/h𝛿𝑥\delta x/hitalic_δ italic_x / italic_h ratio acts in favor of the compressibility error, which would increase when higher Re𝑅𝑒Reitalic_R italic_e would be achieved by larger inlet velocity or acceleration. Finally, the error of the solution obtained with MLBM on rescaled domains may also come from the under-resolution of physical phenomena, such as small vortices typical for inertial flows. However, this under-resolution might be exploited to the method’s advantage, as suggested by Chen [49].

The solution’s accuracy considerations must always be counterbalanced by preserving the simulation’s efficiency and robustness. In MLBM, one can look for memory and computational savings in discretizing space and velocity independently. First, the feasibility of the local static and dynamic refinement in meshless approximation methods offers great promise to reduce the number of Eulerian points in MLBM compared to LBM. Apart from an arbitrary refinement, as shown in the present work in the case of the flow around a cylinder, the choice of a proper error indicator, such as quantities based on vorticity suggested by Fakhari and Lee [50] or on the velocity field divergence [15], can allow for dynamic hhitalic_h- or p𝑝pitalic_p-refinement during the simulation run. In turn, the error of the solution becomes more uniform in space granting a better accuracy at a little increase of the computational expense. Second, switching to regular discretizations with the approximation schemes computationally less intensive than the scattered ones, can considerably decrease the computational burden in the parts of the domain devoid of geometrical intricacies. The research on this approach is given attention in the field of meshless analysis [51, 52, 53]. Finally, LBM1 models [18, 19] reduce the memory overhead by explicitly storing just a few macroscopic fields rather than a large number of velocity distributions. We show that by introducing meshless space discretization, it is possible to achieve a several-fold memory advantage in comparison to the lattice-based LBM1, which is already estimated to use about 76%percent7676\%76 % less memory than its arbitrary relaxation time counterpart [18].

5 Conclusions

We investigate the procedure to increase the Reynolds number of a flow from laminar to the transitional regime in meshless LBM simulations by increasing the system’s domain size. We achieve this solely by multiplying the coordinates of Eulerian points by some constant αE>1subscript𝛼𝐸1\alpha_{E}>1italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT > 1, or equivalently - the streaming distance by its inverse 1/αE1subscript𝛼𝐸1/\alpha_{E}1 / italic_α start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT, thus no additional nodes are inserted into the domain. We apply this method of the Reynolds number increase to a general meshless LBM scheme and to the meshless implementation of a memory-efficient LBM model with the relaxation time τ=1𝜏1\tau=1italic_τ = 1 (LBM1). We show that the procedure works by numerically investigating three test cases in a range of Reynolds numbers - von Kârmn̂ vortex street behind a cylindrical obstacle of infinite length, driven cavity flow, and a flow through a porous sample. The qualitative observation of the velocity fields indicates the onset of inertial effects in the flows. Also, the obtained values of hydrodynamic parameters agree well with the numerical and experimental references in a wide range of Reynolds numbers. We highlight the computational speedup of MLBM compared to the standard LBM when a proper choice of the nodes arrangement and approximation scheme is made. From the performed simulations, we estimate the possible memory savings of the meshless LBM1 compared to the standard (lattice-based) LBM1 to be as high as 32-fold and pinpoint the source of the computational overhead of the meshless LBM1.

Our findings open new perspectives for further studies on the development and applications of MLBM, and its competitiveness with LBM. Questions yet to be answered address the issues of the possibility of simultaneous scaling of space and velocity discretizations, or the detailed mechanism of the information propagation when the non-purely Lagrangian streaming step is used.

6 Acknowledgments

Funded by National Science Centre, Poland under the OPUS call in the Weave programme 2021/43/I/ST3/00228. This research was funded in whole or in part by National Science Centre (2021/43/I/ST3/00228). For the purpose of Open Access, the author has applied a CC-BY public copyright licence to any Author Accepted Manuscript (AAM) version arising from this submission.

References