Turbulence Scaling from Deep Learning Diffusion Generative Models

Tim Whittaker Département des sciences de la Terre et de l’atmosphère, Université du Québec à Montréal, Montréal, QC H3C 3P8, Canada    Romuald A. Janik Institute of Theoretical Physics and Mark Kac Center for Complex Systems Research, Jagiellonian University, ul. Łojasiewicza 11, 30-348 Kraków, Poland    Yaron Oz School of Physics and Astronomy, Tel-Aviv University, Tel-Aviv 69978, Israel.
(July 5, 2024)
Abstract

Complex spatial and temporal structures are inherent characteristics of turbulent fluid flows and comprehending them poses a major challenge. This comprehesion necessitates an understanding of the space of turbulent fluid flow configurations. We employ a diffusion-based generative model to learn the distribution of turbulent vorticity profiles and generate snapshots of turbulent solutions to the incompressible Navier-Stokes equations. We consider the inverse cascade in two spatial dimensions and generate diverse turbulent solutions that differ from those in the training dataset. We analyze the statistical scaling properties of the new turbulent profiles, calculate their structure functions, energy power spectrum, velocity probability distribution function and moments of local energy dissipation. All the learnt scaling exponents are consistent with the expected Kolmogorov scaling. This agreement with established turbulence characteristics provides strong evidence of the model’s capability to capture essential features of real-world turbulence.

I Introduction

Fluid turbulence stands as a profound, unsolved challenge in physics Frisch . It manifests as a complex emergent phenomenon, arising from the application of Newton’s second law to fluid elements. Extensive research spanning centuries has been dedicated to unraveling the structure of turbulent flows, which encompasses most fluid behaviors in nature across all scales. However, our comprehension of fluid flows in the nonlinear regime remains incomplete. The study of turbulence holds the promise of shedding light on the principles and dynamics of nonlinear systems characterized by a multitude of strongly interacting degrees of freedom in a far from equilibrium state. An intriguing characteristic of turbulence is the phenomenon of scaling, encapsulating the statistical properties and structural complexity of turbulence. Despite significant experimental Benzi1995OnTS and numerical chen_dhruva_kurien_sreenivasan_taylor_2005 ; Biferale_2019 progress, the precision of available data remains inadequate to definitively distinguish among the various models proposed, e.g. for the anomalous scaling in three-dimensional incompressible fluid turbulence PhysRevLett.72.336 ; PhysRevE.63.026307 ; Eling2015TheAS ; Oz:2017ihc . This emphasizes the need to transition towards an era characterized by “precision turbulence.”

Learning capabilities of deep learning algorithms have revolutionized diverse fields, providing a new lens to explore complex systems. Among various applications, deep learning methods have been increasingly utilized to generate turbulent flows, with several different approaches showing promise. Generative Adversarial Networks (GANs) have been employed to model turbulence in Drygala_2022 ; tretiak2022physicsconstrained ; Li2023 . The use of Physics-Informed Neural Networks (PINNs), which incorporate physical laws into the learning process, thereby allowing for more accurate predictions in scenarios where the data is sparse or noisy (for a review see Shu_2023 ). In yang2023denoising it was proposed to use denoising diffusion models and it was shown to be capable of generating fluid fields from either low resolution or even irregular samples. In a similar vein, super-resolution models, which generate high-resolution output from low-resolution input, have been used for turbulent flow data fukami_fukagata_taira_2019 ; ZHOU2022105382 . Super-resolution models effectively bridge the gap between low-resolution measurements and the need for high-resolution reconstructions, making them an ideal tool for the study of turbulence, where fine-scale details can be critical. Another line of research uses diffusion models for generating single particle trajectories in three-dimensional turbulence li2023synthetic . Deep learning approaches for modeling the temporal evolution in turbulent flows were developed mohan2019compressed ; king2018deep ; doi:10.1080/14685248.2020.1757685 ; moghaddam2018deep ; li_yang_zhang_he_deng_shen_2020 ; Buzzicotti2022 ; PhysRevFluids3104604 ; lellep_prexl_eckhardt_linkmann_2022 , complexity of turbulent versus chaotic snapshots was estimated whittaker2022neural , and other recent use of generative models can be found in kohl2023turbulent ; apte2023diffusion ; lienen2023zero .

Motivated by the promise of deep learning we aim in this work to harness the potential of denoising diffusion probabilistic model (DDPM)s to learn statistical turbulence. The question that we will address is whether deep learning can comprehend the properties of turbulence, and whether it can decrease the errors of the training data and make more accurate statistical predictions. We will employ a diffusion-based generative model to learn the distribution of turbulent velocity and vorticity profiles and generate snapshots of turbulent solutions to the incompressible Navier-Stokes (NS) equations. We will consider the inverse cascade in two spatial dimensions, generate diverse new turbulent solutions and analyze the statistical scaling properties of these turbulent profiles. We will calculate their structure functions, energy power spectrum, velocity probability distribution function and moments of local energy dissipation and show that the learnt scaling exponents are consistent with the expected Kolmogorov scaling. The paper is structured as follows: in Sect. II we provide a background overview of fluid turbulence scaling, as well as introduce the DDPM. Subsequently, in Sect. III we cover our methodology, the specifics of the DDPM, the dataset used, and the approach to model training and evaluation. We then present and discuss the results of our learning experiments in Sect. IV. We conclude with a discussion of the research findings, their potential applications, and avenues for future research.

II Background

II.1 2D Fluid Turbulence

The incompressible NS equations provide a mathematical formulation of the fluid flow evolution in d𝑑ditalic_d spatial dimensions at velocities much smaller than the speed of sound:

tvi+vjjvi=ip+νjjvi+fi,ivi=0,formulae-sequencesubscript𝑡superscript𝑣𝑖superscript𝑣𝑗subscript𝑗superscript𝑣𝑖superscript𝑖𝑝𝜈subscript𝑗𝑗superscript𝑣𝑖superscript𝑓𝑖subscript𝑖superscript𝑣𝑖0\partial_{t}v^{i}+v^{j}\partial_{j}v^{i}=-\partial^{i}p+\nu\partial_{jj}v^{i}+% f^{i},~{}~{}~{}~{}~{}~{}\partial_{i}v^{i}=0\ ,∂ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT + italic_v start_POSTSUPERSCRIPT italic_j end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = - ∂ start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT italic_p + italic_ν ∂ start_POSTSUBSCRIPT italic_j italic_j end_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT + italic_f start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , ∂ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = 0 , (1)

where vi,i=1dsuperscript𝑣𝑖𝑖1𝑑v^{i},i=1...ditalic_v start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , italic_i = 1 … italic_d is the fluid velocity, p𝑝pitalic_p is the fluid pressure, ν𝜈\nuitalic_ν is the kinematic viscosity, and fisuperscript𝑓𝑖f^{i}italic_f start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT is a random forcing. In the two-dimensional case, it is useful to work with the pseudo-scalar vorticity variable ω=ϵijivj𝜔subscriptitalic-ϵ𝑖𝑗superscript𝑖superscript𝑣𝑗\omega=\epsilon_{ij}\partial^{i}v^{j}italic_ω = italic_ϵ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ∂ start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT italic_v start_POSTSUPERSCRIPT italic_j end_POSTSUPERSCRIPT.

An important dimensionless parameter in the study of fluid flows is the Reynolds number e=lvνsubscript𝑒𝑙𝑣𝜈{\cal R}_{e}=\frac{lv}{\nu}caligraphic_R start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = divide start_ARG italic_l italic_v end_ARG start_ARG italic_ν end_ARG, where l𝑙litalic_l is a characteristic length scale, v𝑣vitalic_v is the velocity difference at that scale, and ν𝜈\nuitalic_ν is the kinematic viscosity. The Reynolds number quantifies the relative strength of the non-linear interaction compared to the viscous term in (1). When the Reynolds number is of order 1010210superscript10210-10^{2}10 - 10 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT one observes a chaotic fluid flow, while when it is 103superscript10310^{3}10 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT or higher, one observes a fully developed turbulent structure of the flow. The turbulent velocity field exhibits highly complex spatial and temporal structures and appears to be a random process. A single realization of a turbulent solution to the NS equations is unpredictable even in the absence of a random force. However, the study of statistical averages reveals a hidden scaling structure. Indeed, experimental and numerical data suggest that turbulent fluid flows exhibit a statistically homogeneous and isotropic steady state at the inertial range of scales lvrlfmuch-less-thansubscript𝑙𝑣𝑟much-less-thansubscript𝑙𝑓l_{v}\ll r\ll l_{f}italic_l start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT ≪ italic_r ≪ italic_l start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT, where the distance scales lvsubscript𝑙𝑣l_{v}italic_l start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT and lfsubscript𝑙𝑓l_{f}italic_l start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT are determined by the viscosity and driving force, respectively.

The properties of this statistical structure can be quantified by studying statistical averages of fluid observables. For instance, if we denote the velocity of the fluid by v(t,r)𝑣𝑡𝑟\vec{v}(t,\vec{r})over→ start_ARG italic_v end_ARG ( italic_t , over→ start_ARG italic_r end_ARG ) then the turbulent behavior can be characterized by the longitudinal structure functions Sn(r)=(δv(r))nsubscript𝑆𝑛𝑟delimited-⟨⟩superscript𝛿𝑣𝑟𝑛S_{n}(r)=\langle(\delta v(r))^{n}\rangleitalic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_r ) = ⟨ ( italic_δ italic_v ( italic_r ) ) start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ⟩ of velocity differences δv(r)=(v(r)v(0))r|r|𝛿𝑣𝑟𝑣𝑟𝑣0𝑟𝑟\delta v(r)=(\vec{v}(\vec{r})-\vec{v}(0))\cdot\frac{\vec{r}}{|\vec{r}|}italic_δ italic_v ( italic_r ) = ( over→ start_ARG italic_v end_ARG ( over→ start_ARG italic_r end_ARG ) - over→ start_ARG italic_v end_ARG ( 0 ) ) ⋅ divide start_ARG over→ start_ARG italic_r end_ARG end_ARG start_ARG | over→ start_ARG italic_r end_ARG | end_ARG between points separated by a fixed distance r𝑟ritalic_r. In the inertial range of scales these correlation functions exhibit a universal scaling law Sn(r)rξnsimilar-tosubscript𝑆𝑛𝑟superscript𝑟subscript𝜉𝑛S_{n}(r)\sim r^{\xi_{n}}italic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_r ) ∼ italic_r start_POSTSUPERSCRIPT italic_ξ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUPERSCRIPT, where the exponents ξnsubscript𝜉𝑛\xi_{n}italic_ξ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT are independent of the fluid details and depend only on the number of spatial dimensions.

In a seminal work Kolmogorov , Kolmogorov used the inertial range cascade-like behavior (introduced by Richardson) of incompressible non-relativistic fluids, where large eddies break into smaller eddies in a process where energy is transferred without dissipation. Assuming scale invariant statistics for this direct cascade (from large to small length scales), he deduced that ξn=n/3subscript𝜉𝑛𝑛3\xi_{n}=n/3italic_ξ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT = italic_n / 3. Thus, for instance, the Fourier transform of S2subscript𝑆2S_{2}italic_S start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT gives the energy power spectrum that exhibits Kolmogorov scaling:

E(k)k53.similar-to𝐸𝑘superscript𝑘53E(k)\sim k^{-\frac{5}{3}}\ .italic_E ( italic_k ) ∼ italic_k start_POSTSUPERSCRIPT - divide start_ARG 5 end_ARG start_ARG 3 end_ARG end_POSTSUPERSCRIPT . (2)

It is established numerically and experimentally that Kolmogorov linear scaling is corrected by intermittency in the direct cascades. Kolomogorov scaling seems to hold in the two-dimensional inverse cascade, where the energy flows from the UV to the IR and the inertial range holds for rlfmuch-greater-than𝑟subscript𝑙𝑓r\gg l_{f}italic_r ≫ italic_l start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT.

A correspondence can be established between two-dimensional scaling and the local energy dissipation, ϵ(x)=ν2(ivj+jvi)2italic-ϵ𝑥𝜈2superscriptsubscript𝑖superscript𝑣𝑗subscript𝑗superscript𝑣𝑖2\epsilon(x)=\frac{\nu}{2}\left(\partial_{i}v^{j}+\partial_{j}v^{i}\right)^{2}italic_ϵ ( italic_x ) = divide start_ARG italic_ν end_ARG start_ARG 2 end_ARG ( ∂ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT italic_j end_POSTSUPERSCRIPT + ∂ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. Taking the normalized local spatial average of the energy dissipation over a ball with a d𝑑ditalic_d-dimensional radius, r𝑟ritalic_r, denoted as Bd(r)subscript𝐵𝑑𝑟B_{d}(r)italic_B start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT ( italic_r ), and centered around a point x𝑥xitalic_x:

ϵr(x)=1Vol(Bd(r))|xx|rddxϵ(x),subscriptitalic-ϵ𝑟𝑥1𝑉𝑜𝑙subscript𝐵𝑑𝑟subscriptsuperscript𝑥𝑥𝑟superscript𝑑𝑑superscript𝑥italic-ϵsuperscript𝑥\epsilon_{r}(x)=\frac{1}{Vol(B_{d}(r))}\int_{|x^{\prime}-x|\leq r}d^{d}x^{% \prime}\epsilon(x^{\prime})\ ,italic_ϵ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( italic_x ) = divide start_ARG 1 end_ARG start_ARG italic_V italic_o italic_l ( italic_B start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT ( italic_r ) ) end_ARG ∫ start_POSTSUBSCRIPT | italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_x | ≤ italic_r end_POSTSUBSCRIPT italic_d start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT italic_ϵ ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) , (3)

the ensemble averages according to the K41 theory satisfy:

ϵrnrτn,τn3=(ξnn3).formulae-sequencesimilar-todelimited-⟨⟩superscriptsubscriptitalic-ϵ𝑟𝑛superscript𝑟subscript𝜏𝑛subscript𝜏𝑛3subscript𝜉𝑛𝑛3\langle\epsilon_{r}^{n}\rangle\sim r^{\tau_{n}},~{}~{}~{}\tau_{\frac{n}{3}}=% \left(\xi_{n}-\frac{n}{3}\right)\ .⟨ italic_ϵ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ⟩ ∼ italic_r start_POSTSUPERSCRIPT italic_τ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUPERSCRIPT , italic_τ start_POSTSUBSCRIPT divide start_ARG italic_n end_ARG start_ARG 3 end_ARG end_POSTSUBSCRIPT = ( italic_ξ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT - divide start_ARG italic_n end_ARG start_ARG 3 end_ARG ) . (4)

In the two-dimensional inverse cascade case studied in the present paper we expect τn=0subscript𝜏𝑛0\tau_{n}=0italic_τ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT = 0.

II.2 Diffusion Generative Models

A DDPM ho2020denoising is a powerful probabilistic generative framework that has shown success in transforming and generating images (see yang2023diffusion for a review), by progressively injecting noise into the original data and subsequently reversing the process during sample generation. The process begins with the original data and perturbs it leading to noisy data. The goal is to transform the data distribution into a simple prior distribution. Given a data distribution 𝐱𝟎q(𝐱𝟎)similar-tosubscript𝐱0𝑞subscript𝐱0{\bf x_{0}}\sim q({\bf x_{0}})bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT ∼ italic_q ( bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT ), a sequence of random variables (RV), 𝐱𝟎,𝐱𝟏,𝐱𝐓subscript𝐱0subscript𝐱1subscript𝐱𝐓{\bf x_{0}},{\bf x_{1}},...{\bf x_{T}}bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT bold_1 end_POSTSUBSCRIPT , … bold_x start_POSTSUBSCRIPT bold_T end_POSTSUBSCRIPT, are generated from a Markov process with transition kernel q(𝐱𝐭|𝐱𝐭𝟏)𝑞conditionalsubscript𝐱𝐭subscript𝐱𝐭1q({\bf x_{t}}|{\bf x_{t-1}})italic_q ( bold_x start_POSTSUBSCRIPT bold_t end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT bold_t - bold_1 end_POSTSUBSCRIPT ). In a DDPM, the kernel is designed to transform the distribution q(𝐱𝟎)𝑞subscript𝐱0q({\bf x_{0}})italic_q ( bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT ) into a Normal distribution,

q(𝐱𝐭|𝐱𝐭𝟏)=𝒩(𝐱𝐭;1βt𝐱𝐭𝟏,βt𝟏),𝑞conditionalsubscript𝐱𝐭subscript𝐱𝐭1𝒩subscript𝐱𝐭1subscript𝛽𝑡subscript𝐱𝐭1subscript𝛽𝑡1q({\bf x_{t}}|{\bf x_{t-1}})=\mathcal{N}({\bf x_{t}};\sqrt{1-\beta_{t}}{\bf x_% {t-1}},\beta_{t}{\bf 1})\ ,italic_q ( bold_x start_POSTSUBSCRIPT bold_t end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT bold_t - bold_1 end_POSTSUBSCRIPT ) = caligraphic_N ( bold_x start_POSTSUBSCRIPT bold_t end_POSTSUBSCRIPT ; square-root start_ARG 1 - italic_β start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_ARG bold_x start_POSTSUBSCRIPT bold_t - bold_1 end_POSTSUBSCRIPT , italic_β start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT bold_1 ) , (5)

where βt(0,1)subscript𝛽𝑡01\beta_{t}\in(0,1)italic_β start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ ( 0 , 1 ) is a hyper-parameter. The joint distribution of the RVs can be factorized using the Markov property and the chain rule to get,

q(𝐱𝟏,𝐱𝐓|𝐱𝟎)=t=1Tq(𝐱𝐭|𝐱𝐭𝟏).𝑞subscript𝐱1conditionalsubscript𝐱𝐓subscript𝐱0superscriptsubscriptproduct𝑡1𝑇𝑞conditionalsubscript𝐱𝐭subscript𝐱𝐭1q({\bf x_{1}},...{\bf x_{T}}|{\bf x_{0}})=\prod_{t=1}^{T}q({\bf x_{t}}|{\bf x_% {t-1}})\ .italic_q ( bold_x start_POSTSUBSCRIPT bold_1 end_POSTSUBSCRIPT , … bold_x start_POSTSUBSCRIPT bold_T end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT ) = ∏ start_POSTSUBSCRIPT italic_t = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_q ( bold_x start_POSTSUBSCRIPT bold_t end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT bold_t - bold_1 end_POSTSUBSCRIPT ) . (6)

Now using the Gaussian kernel we can marginalize the joint distribution, we get

q(𝐱𝐭|𝐱𝟎)=𝒩(𝐱𝐭;μt𝐱𝟎,σt𝟏),𝑞conditionalsubscript𝐱𝐭subscript𝐱0𝒩subscript𝐱𝐭subscript𝜇𝑡subscript𝐱0subscript𝜎𝑡1q({\bf x_{t}}|{\bf x_{0}})=\mathcal{N}({\bf x_{t}};\mu_{t}{\bf x_{0}},\sigma_{% t}{\bf 1})\ ,italic_q ( bold_x start_POSTSUBSCRIPT bold_t end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT ) = caligraphic_N ( bold_x start_POSTSUBSCRIPT bold_t end_POSTSUBSCRIPT ; italic_μ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT , italic_σ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT bold_1 ) , (7)

where μt=1βtsubscript𝜇𝑡1subscript𝛽𝑡\mu_{t}=\sqrt[]{1-\beta_{t}}italic_μ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = square-root start_ARG 1 - italic_β start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_ARG, and σt=1s=0tμs2subscript𝜎𝑡1superscriptsubscriptproduct𝑠0𝑡superscriptsubscript𝜇𝑠2\sigma_{t}=1-\prod_{s=0}^{t}\mu_{s}^{2}italic_σ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = 1 - ∏ start_POSTSUBSCRIPT italic_s = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_μ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. In this sense the forward process transforms the data distribution into a Normal distribution.

When generating new data samples using DDPMs, an unstructured noise vector is generated from the prior distribution. Since the prior distribution is typically chosen as a simple Gaussian distribution, obtaining this noise vector is straightforward. To gradually remove the noise from this noise vector and generate meaningful data, a learnable Markov chain operates in the reverse time direction. The reverse Markov chain consists of transition kernels parameterized by deep neural networks (a U-net in our case). These transition kernels are designed to undo the perturbations caused by the forward process and recover the original data

pθ(𝐱𝐭𝟏|𝐱𝐭)=𝒩(𝐱𝐭𝟏;μθ(𝐱𝐭,t),σθ(𝐱𝐭,t)),subscript𝑝𝜃conditionalsubscript𝐱𝐭1subscript𝐱𝐭𝒩subscript𝐱𝐭1subscript𝜇𝜃subscript𝐱𝐭𝑡subscript𝜎𝜃subscript𝐱𝐭𝑡p_{\theta}({\bf x_{t-1}}|{\bf x_{t}})=\mathcal{N}({\bf x_{t-1}};\mu_{\theta}({% \bf x_{t}},t),\sigma_{\theta}({\bf x_{t}},t))\ ,italic_p start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT bold_t - bold_1 end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT bold_t end_POSTSUBSCRIPT ) = caligraphic_N ( bold_x start_POSTSUBSCRIPT bold_t - bold_1 end_POSTSUBSCRIPT ; italic_μ start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT bold_t end_POSTSUBSCRIPT , italic_t ) , italic_σ start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT bold_t end_POSTSUBSCRIPT , italic_t ) ) , (8)

where θ𝜃\thetaitalic_θ are the model parameters tuned during the training process. The training process consists of minimizing the distance between the reverse process joint distribution pθ(𝐱𝟎,𝐱𝟏,,𝐱𝐓)subscript𝑝𝜃subscript𝐱0subscript𝐱1subscript𝐱𝐓p_{\theta}({\bf x_{0}},{\bf x_{1}},...,{\bf x_{T}})italic_p start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT bold_1 end_POSTSUBSCRIPT , … , bold_x start_POSTSUBSCRIPT bold_T end_POSTSUBSCRIPT ) and the forward process q(𝐱𝟎,𝐱𝟏,,𝐱𝐓)𝑞subscript𝐱0subscript𝐱1subscript𝐱𝐓q({\bf x_{0}},{\bf x_{1}},...,{\bf x_{T}})italic_q ( bold_x start_POSTSUBSCRIPT bold_0 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT bold_1 end_POSTSUBSCRIPT , … , bold_x start_POSTSUBSCRIPT bold_T end_POSTSUBSCRIPT ). To this end, the usual variational bound on negative log likelihood is optimized:

𝔼[logpθ(x0)]𝔼q[logpθ(x0:T)q(x1:T|x0)]𝔼delimited-[]subscript𝑝𝜃subscript𝑥0subscript𝔼𝑞delimited-[]subscript𝑝𝜃subscript𝑥:0𝑇𝑞conditionalsubscript𝑥:1𝑇subscript𝑥0\displaystyle\mathbb{E}[-\log p_{\theta}(x_{0})]\leq\mathbb{E}_{q}\left[-\log% \frac{p_{\theta}(x_{0:T})}{q(x_{1:T}|x_{0})}\right]blackboard_E [ - roman_log italic_p start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ] ≤ blackboard_E start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT [ - roman_log divide start_ARG italic_p start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT 0 : italic_T end_POSTSUBSCRIPT ) end_ARG start_ARG italic_q ( italic_x start_POSTSUBSCRIPT 1 : italic_T end_POSTSUBSCRIPT | italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) end_ARG ]
=𝔼q[logp(xT)t1logpθ(xt1|xt)q(xt|xt1)]absentsubscript𝔼𝑞delimited-[]𝑝subscript𝑥𝑇subscript𝑡1subscript𝑝𝜃conditionalsubscript𝑥𝑡1subscript𝑥𝑡𝑞conditionalsubscript𝑥𝑡subscript𝑥𝑡1\displaystyle=\mathbb{E}_{q}\left[-\log p(x_{T})-\sum_{t\geq 1}\log\frac{p_{% \theta}(x_{t-1}|x_{t})}{q(x_{t}|x_{t-1})}\right]= blackboard_E start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT [ - roman_log italic_p ( italic_x start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ) - ∑ start_POSTSUBSCRIPT italic_t ≥ 1 end_POSTSUBSCRIPT roman_log divide start_ARG italic_p start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT | italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) end_ARG start_ARG italic_q ( italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | italic_x start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT ) end_ARG ]
=𝔼q[logpθ(𝐱0)+t=1TKL(q(𝐱t|𝐱t1)pθ(𝐱t1|𝐱t))]\displaystyle=\mathbb{E}_{q}\left[-\log p_{\theta}({\bf x}_{0})+\sum_{t=1}^{T}% \text{KL}(q({\bf x}_{t}|{\bf x}_{t-1})\|p_{\theta}({\bf x}_{t-1}|{\bf x}_{t}))\right]= blackboard_E start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT [ - roman_log italic_p start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) + ∑ start_POSTSUBSCRIPT italic_t = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT KL ( italic_q ( bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT ) ∥ italic_p start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) ]
=:\displaystyle=:\mathcal{L}= : caligraphic_L

For normal Gaussian distributions, the KL divergence:

KL(q(𝐱t|𝐱t1)pθ(𝐱t1|𝐱t))=12(tr(σθ2σq,t2𝐈)+(μθμq,t)σθ2(μθμq,t)d+log|σθ2𝐈||σq,t2𝐈|),\text{KL}(q({\bf x}_{t}|{\bf x}_{t-1})\|p_{\theta}({\bf x}_{t-1}|{\bf x}_{t}))% =\frac{1}{2}\left(\text{tr}(\sigma_{\theta}^{-2}\sigma_{q,t}^{2}{\bf I})+(\mu_% {\theta}-\mu_{q,t})^{\top}\sigma_{\theta}^{-2}(\mu_{\theta}-\mu_{q,t})-d+\log% \frac{|\sigma_{\theta}^{2}{\bf I}|}{|\sigma_{q,t}^{2}{\bf I}|}\right),KL ( italic_q ( bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT ) ∥ italic_p start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT | bold_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ) = divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( tr ( italic_σ start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT italic_σ start_POSTSUBSCRIPT italic_q , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT bold_I ) + ( italic_μ start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT - italic_μ start_POSTSUBSCRIPT italic_q , italic_t end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT italic_σ start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT ( italic_μ start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT - italic_μ start_POSTSUBSCRIPT italic_q , italic_t end_POSTSUBSCRIPT ) - italic_d + roman_log divide start_ARG | italic_σ start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT bold_I | end_ARG start_ARG | italic_σ start_POSTSUBSCRIPT italic_q , italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT bold_I | end_ARG ) , (9)

where d𝑑ditalic_d is the dimensionality of the Gaussian distributions, and for each time step t𝑡titalic_t. The loss function (θ)𝜃\mathcal{L}(\theta)caligraphic_L ( italic_θ ) is thus composed of the expected negative log likelihood of the data under the model and the sum of KL divergences across all timesteps, which measures the discrepancy between the forward and reverse transition probabilities.

III Methodology

III.1 Model and Architecture

We adopt a fairly standard diffusion model architecture based on the U-Net with components from the hugginface diffusers library von-platen-etal-2022-diffusers . Refer to Figure 1 for a diagram of the Markov chain model, where the U-Net architecture is employed to parameterize the p kernel. The input/output image size is 256×256256256256\times 256256 × 256 with 7 downsampling and upsampling blocks. The 6th downsampling and the corresponding 2nd upsampling block have in addition spatial self-attention. The respective number of channels is 128,128,256,256,512,512,10241281282562565125121024128,128,256,256,512,512,1024128 , 128 , 256 , 256 , 512 , 512 , 1024, which is comparable to diffusion models generating real-world images. The overall number of trainable parameters is 28.22×10728.22superscript10728.22\times 10^{7}28.22 × 10 start_POSTSUPERSCRIPT 7 end_POSTSUPERSCRIPT.

X0subscript𝑋0X_{0}italic_X start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT...Xt1subscript𝑋𝑡1X_{t-1}italic_X start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPTXtsubscript𝑋𝑡X_{t}italic_X start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPTq𝑞qitalic_qp𝑝pitalic_pq𝑞qitalic_qp𝑝pitalic_pq𝑞qitalic_qp𝑝pitalic_p
Figure 1: Forward noising and backward denoising Markov chains.

III.2 Training Datasets

We generated turbulent data by solving NS equations on a uniform spatial grid spanning a domain of Lx=Ly=2πsubscript𝐿𝑥subscript𝐿𝑦2𝜋L_{x}=L_{y}=2\piitalic_L start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT = italic_L start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT = 2 italic_π as in whittaker2022neural . Initialized with v=(0,0)𝑣00v=(0,0)italic_v = ( 0 , 0 ), this system underwent numerical evolution with periodic boundary conditions. To drive turbulence, we applied a divergence-free, statistically homogeneous, and isotropic Gaussian random forcing function within an annulus in Fourier space centered at kfsubscript𝑘𝑓k_{f}italic_k start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT. For numerical reasons, the second-order viscous term in eq. (1) was replaced with a hyperviscous term, as discussed in 2dTurbReview . We use a dealiased spectral method code with Crank-Nicolson time stepping specmeth .

Our dataset consists of 5000500050005000 snapshots from an ensemble of ten simulations, each with a resolution of 512×512512512512\times 512512 × 512 pixels and a forcing parameter of kf40similar-tosubscript𝑘𝑓40k_{f}\sim 40italic_k start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT ∼ 40. Upon evolution, the system reached a steady state, as shown in the left panel of Fig. 2. At this state, we observed the 5/353-5/3- 5 / 3 scaling of the energy power spectrum, marking the system’s transition to turbulence. This observation is illustrated in the right panel of Fig. 2. This quantity is computed as the mean over the ensemble and time slices. We note that 2D turbulence has the possibility of producing double cascades as shown in doi:10.1063/1.1762301 and numerically produced in PhysRevLett.81.2244 ; PhysRevE.82.016307 , though we do not generate the direct cascade in our simulations.

The inertial range of the simulations was determined by initially calculating the third-order structure function, which is expected to exhibit a scaling behavior such that S3(r)rsimilar-tosubscript𝑆3𝑟𝑟S_{3}(r)\sim ritalic_S start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT ( italic_r ) ∼ italic_r. We designate the inertial range as the interval where this scaling relation fits optimally.

Due to the DDPM’s memory constraints, which restricts it to displaying and generating images of 256×256256256256\times 256256 × 256 resolution, we downscaled our simulation data. Additionally, we converted the data values from floating-point numbers to integers in the 0-255 range. This resizing was performed using bilinear interpolation onto a 256×256256256256\times 256256 × 256 grid.

Refer to caption
Refer to caption
Figure 2: Left: Evolution in time of the fluid energy, highlighting the attainment of steady state. Right: Energy power spectrum showcasing the 5/353-5/3- 5 / 3 scaling with standard deviation in the shaded regions, indicative of turbulence state in the inertial range.

III.3 Training Procedure

The diffusion model is trained for 50 epochs with a batch size of 16, an AdamW optimizer loshchilov2019decoupled , base learning rate 1e41𝑒41e-41 italic_e - 4 and cosine learning rate scheduler with a warm-up of 500 steps. Gradient norm is clipped to 1.0. Automatic mixed precision is employed.

The training data comprises 5000500050005000 256×256256256256\times 256256 × 256 vorticity images. During training, each image is rotated by mutliples of 90 degrees and/or mirror reflected. For the investigation of memorization presented in section IV.2, this augmentation procedure is turned off and the image is used without any rotation or reflection.

IV Results

IV.1 Generated Samples

Refer to caption
Refer to caption
Figure 3: Sample images from the training set (left), and generated by the diffusion model (right).

In Fig. 3, we show a sample 256×256256256256\times 256256 × 256 image from the vorticity profiles generated using the numerical simulations described in section III.2 and an image generated by the trained diffusion model. The generated images look very similar to the real ones, so as to be basically indistinguishable by eye. We observe only a relatively large variation in overall lightness of the generated images. This might be due to accumulating overall systematic shifts in the generation procedure (which requires iterative evaluation of the neural network). However, as the precise linear mapping between pixel intensities and values of vorticity is not essential for extracting the statistics of turbulence (and also varies in our training set), we did not attempt to ameliorate this behaviour. Indeed, as we show in subsequent analysis, various normalization independent quantitative characteristics of turbulent vorticity profiles are very well reproduced in the images generated by the diffusion model.

IV.2 A Test for Memorization

Refer to caption

Refer to caption

Figure 4: The histogram of cosine distances between 16 generated samples and the 5000 training images (left) and a pair of the most similar sample and training image (right).

An important requirement for the application of neural network models for generating new samples of turbulence profiles is that the generated samples are genuinely new and not just memorized images from the training data. Judging by experience with generative neural network and real world images we do not expect this to be a problem. Nevertheless, we perform a quantitative test, as it is much more difficult to judge by eye the similarity of turbulence profiles in contrast to e.g. celebrity faces.

In order to measure the similarity of generated images to the training data we use the standard cosine distance between the vectors of pixel intensities obtained by flattening the 2D images:

cosinedistance(v1,v2)=1v1v2|v1||v2|𝑐𝑜𝑠𝑖𝑛𝑒𝑑𝑖𝑠𝑡𝑎𝑛𝑐𝑒subscript𝑣1subscript𝑣21subscript𝑣1subscript𝑣2subscript𝑣1subscript𝑣2cosine\ distance(v_{1},v_{2})=1-\frac{v_{1}\cdot v_{2}}{|v_{1}||v_{2}|}italic_c italic_o italic_s italic_i italic_n italic_e italic_d italic_i italic_s italic_t italic_a italic_n italic_c italic_e ( italic_v start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_v start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = 1 - divide start_ARG italic_v start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ⋅ italic_v start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_ARG start_ARG | italic_v start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT | | italic_v start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT | end_ARG (10)

Such a pixelwise comparison is appropriate as a test of memorization. To perform this experiment we turned off image augmentation and trained the diffusion model without any subsequent reflection or rotation. This somewhat reduced the diversity of the training dataset, making the danger of overfitting (or memorization) more acute.

We generated a batch of 16 images and evaluated their cosine distance to all the 5000 training images. The histogram is shown in Fig. 4 (left) and we see that all the distances are centered around 1 – which is the extreme distance in this metric. We also identified the most similar pair of generated and training images which we show in Fig. 4 (right). The two images are globally clearly different. Hence, the diffusion model indeed generates genuinely novel samples.

IV.3 Inverse Cascade

We begin by assessing the capability of the DDPM in replicating characteristics of the energy cascade, using numerical simulations as a benchmark. The dataset features an inverse cascade, characterized by a 5/353-5/3- 5 / 3 scaling within the inertial range. In Figure 5(a), both the mean and variance of the energy spectrum of the ensemble derived from the DDPM closely match those of the numerical simulation, particularly evident in the 5/353-5/3- 5 / 3 scaling within the inertial range. Notwithstanding this agreement, discrepancies are observed at the lower wavenumbers.

For a more comprehensive analysis, slopes within the inertial range were examined across varying sample set sizes to quantify deviation from the 5/353-5/3- 5 / 3 scaling. The scaling is measured within the range highlighted in Fig. 5(a) by the orange slope. The measured error for the numerical simulation and the DDPM is depicted in Fig. 5(b) with the standard error of the linear fits. We employed bootstrapping to gain a more granular understanding of the errors. This involved 5000 iterations, each with sample sizes of 1000. The findings from this exercise are depicted in Fig. 5(c). We find good agreement between the DDPM and numerical distribution indicating the machine has successfully learnt the distribution.

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Figure 5: (a) Energy cascade from the numerical simulation and the DDPM, emphasizing the 5/353-5/3- 5 / 3 slope in the inertial range with a noted discrepancy at lower wavenumbers. The standard deviation is shown in the shaded regions. The vertical line indicates the forcing scale. (b) Measured slope error with standard errors of the fits across varying set sizes, revealing the pronounced influence of the selected range. (c) Measured slope errors using bootstraping.

IV.4 Structure Function

To delve deeper into the statistics of the turbulent flows, we turn our attention to velocity structure functions, defined as:

Sn(r)=(δv(r))n,subscript𝑆𝑛𝑟delimited-⟨⟩superscript𝛿𝑣𝑟𝑛S_{n}(r)=\langle(\delta v(r))^{n}\rangle\ ,italic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_r ) = ⟨ ( italic_δ italic_v ( italic_r ) ) start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ⟩ , (11)

where averaging occurs over all positions x𝑥xitalic_x within a specific velocity profile image and then extends to various turbulence realizations. The energy power spectrum is the Fourier transform of S2subscript𝑆2S_{2}italic_S start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, and as discussed in the background section, we expect Sn(r)rn/3similar-tosubscript𝑆𝑛𝑟superscript𝑟𝑛3S_{n}(r)\sim r^{n/3}italic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_r ) ∼ italic_r start_POSTSUPERSCRIPT italic_n / 3 end_POSTSUPERSCRIPT.

Our primary data sources, namely the images from the DDPM and the numerical simulations, provides vorticity. To correlate this with our structure functions, we first derive the velocities from these vorticity profiles, as elaborated in App. A. When deriving this observable from images, whether they are machine-generated or sourced from the diffusion model, it’s crucial to acknowledge a linear transformation between pixel intensities and vorticity:

intensity(x)=αv(x)+β.𝑖𝑛𝑡𝑒𝑛𝑠𝑖𝑡𝑦𝑥𝛼𝑣𝑥𝛽intensity(x)=\alpha\cdot v(x)+\beta\ .italic_i italic_n italic_t italic_e italic_n italic_s italic_i italic_t italic_y ( italic_x ) = italic_α ⋅ italic_v ( italic_x ) + italic_β . (12)

This transformation might differ across images. As such, to ensure consistency, we normalize the above correlator by Sn(r=1px)subscript𝑆𝑛𝑟1𝑝𝑥S_{n}(r=1px)italic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_r = 1 italic_p italic_x ) where r=1𝑟1r=1italic_r = 1 pixel.

In Fig. 6, we present a comparative analysis of:

1nlogSn(r)Sn(1px),1𝑛subscript𝑆𝑛𝑟subscript𝑆𝑛1𝑝𝑥\frac{1}{n}\log\left\langle\frac{S_{n}(r)}{S_{n}(1{px})}\right\rangle,divide start_ARG 1 end_ARG start_ARG italic_n end_ARG roman_log ⟨ divide start_ARG italic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_r ) end_ARG start_ARG italic_S start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( 1 italic_p italic_x ) end_ARG ⟩ , (13)

for n=2,3𝑛23n=2,3italic_n = 2 , 3. We see that both the numerical and DDPM results agree fairly well. We evaluate the intermittency parameter, defined as S4(r)/(S2(r))2subscript𝑆4𝑟superscriptsubscript𝑆2𝑟2S_{4}(r)/(S_{2}(r))^{2}italic_S start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT ( italic_r ) / ( italic_S start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_r ) ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, and observe a minor slope, suggesting the anticipated independence from r𝑟ritalic_r. The standard deviation for n=2,3,4𝑛234n=2,3,4italic_n = 2 , 3 , 4 is also computed, demonstrating agreement between the numerical and DDPM ensemble results. Additionally, we compare the probability distribution functions of δv𝛿𝑣\delta vitalic_δ italic_v (refer to Fig. 7) across two distances and find that the resulting distributions are nearly indistinguishable.

Refer to caption
Refer to caption
Refer to caption
Figure 6: Second (top left) and third (top right) order structure function for both simulations and the machines results with standard deviation in the shaded regions. The dashed orange line shows the 2/3 slope and linear slope for the second and third moments respectively. Bottom left: S4(r)/(S2(r))2subscript𝑆4𝑟superscriptsubscript𝑆2𝑟2S_{4}(r)/(S_{2}(r))^{2}italic_S start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT ( italic_r ) / ( italic_S start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_r ) ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT is consistent with no intermittency at the inertial range of scales, as evidenced by a nearly flat slope, 0.0003similar-toabsent0.0003\sim 0.0003∼ 0.0003, in this range.
Refer to caption
Refer to caption
Figure 7: The distribution of δv𝛿𝑣\delta vitalic_δ italic_v at r=2𝑟2r=2italic_r = 2 (left) and r=4𝑟4r=4italic_r = 4 (right). We compute the KL divergence between the distributions and find for r=2𝑟2r=2italic_r = 2, DKL=0.033subscript𝐷𝐾𝐿0.033D_{KL}=0.033italic_D start_POSTSUBSCRIPT italic_K italic_L end_POSTSUBSCRIPT = 0.033 while for r=4𝑟4r=4italic_r = 4, DKL=0.007subscript𝐷𝐾𝐿0.007D_{KL}=0.007italic_D start_POSTSUBSCRIPT italic_K italic_L end_POSTSUBSCRIPT = 0.007

IV.5 Local Energy Dissipation

Finally, we assessed the local energy dissipation as detailed in eq. 3 for n=1𝑛1n=1italic_n = 1. This assessment was conducted over a sample of 150 images with random x𝑥xitalic_x sampling. In Fig. 8, a comparison of the numerical simulation and the DDPM outcomes is presented. Notably, there’s a consistent agreement between both datasets within the inertial range, aligning with theory (eq. 4). We observe a significant variance stemming from the limited number of samples available. However, the variance gradually diminishes as we approach the inertial range.

Refer to caption
Figure 8: Energy dissipation. Eq. 3 estimated with Monte Carlo over 150 snapshots with the standard deviation of the ensemble is shown in the shaded region. We observe that both datasets produce a similar result in the inertial range. The local energy dissipation is independent of r𝑟ritalic_r as expected from eq. 4.

V Discussion

In this paper we have employed a generative diffusion neural network model (DDPM) to generate snapshots of 2D turbulence. The DDPM was trained on vorticity profiles extracted from direct numerical simulations of the incompressible NS equations in two spatial dimensions. We verified that the generated samples exhibit the key statistical properties of turbulence, namely the -5/3 scaling of the energy cascade (Fig. 5), the behaviour of the second, third and fourth structure function (Fig. 6) as well as supporting the conjectured behaviour of energy dissipation (Fig. 8). In all cases, the statistics of the generated data follow rather closely the properties extracted from direct numerical simulations.

These results indicate that deep learning diffusion generative models may serve as a useful tool for learning statistics of turbulence and creating proxy independent flow profiles, which can be used to increase statistics for analyzing turbulence. We note that the diffusion models essentially work “out of the box” and can generate very realistic turbulent profiles. This is in contrast to other standard deep learning generative approaches such as Generative Adversarial Networks (GAN) and Variational Auto-Encoders (VAE). Prior to this work we tested a state of the art GAN network (StyleGAN) and some variants of variational autoencoders but could not attain sufficiently realistic vorticity profiles. The diffusion models seem to work much better in this respect. We have to emphasize, however, that whether a particular generative model works better or worse is really an empirical question given our current knowledge of deep learning, and may depend on the details of the specific use case. Indeed there are examples where GANs seem to work quite well for convective turbulence heyder2023generative .

We note that the flexibility of the diffusion models in mimicking the training data may be sometimes a two-edged sword. The statistics of the generated data mimic, by construction, the statistics of the training data including all non-universal particularities of the data, like behaviour away from the inertial range etc. or any deviations like non-fully developed turbulence. Therefore, the generative model may not be able to cure possible systematic deficiencies of the data used for training. This can set a high bar for the quality of the simulation data used for training the generative diffusion model w.r.t to the physical properties that we are interested in studying.

Our application of the diffusion model to the study of statistical turbulence is limited by the dataset of solutions to the Navier-Stokes equations. In particular, it would be desirable to include turbulent fluid solutions at higher Reynolds numbers which will enlarge the inertial range of scales.

Acknowledgments. This research was enabled in part by support provided by Calcul Québec (calculquebec.ca) and the Digital Research Alliance of Canada (alliancecan.ca). R.J. was supported by the research project Bio-inspired artificial neural networks (grant no. POIR.04.04.00-00-14DE/18-00) within the Team-Net program of the Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund and by a grant from the Priority Research Area DigiWorld under the Strategic Programme Excellence Initiative at Jagiellonian University. The work of Y.O is supported by the ISF center of excellence and the U.S-Israel Binational Science Foundation.

References

  • (1) U. Frisch, Turbulence: The Legacy of A. N. Kolmogorov, Cambridge University Press (1995), 10.1017/CBO9781139170666.
  • (2) R. Benzi, S. Ciliberto, C. Baudet and G.R. Chavarria, On the scaling of three-dimensional homogeneous and isotropic turbulence, Physica D: Nonlinear Phenomena 80 (1995) 385.
  • (3) S.Y. Chen, B. Dhruva, S. Kurien, K.R. Sreenivasan and M.A. Taylor, Anomalous scaling of low-order structure functions of turbulent velocity, Journal of Fluid Mechanics 533 (2005) 183–192.
  • (4) L. Biferale, F. Bonaccorso, M. Buzzicotti and K.P. Iyer, Self-similar subgrid-scale models for inertial range turbulence and accurate measurements of intermittency, Physical Review Letters 123 (2019) .
  • (5) Z.-S. She and E. Leveque, Universal scaling laws in fully developed turbulence, Phys. Rev. Lett. 72 (1994) 336.
  • (6) V. Yakhot, Mean-field approximation and a small parameter in turbulence theory, Phys. Rev. E 63 (2001) 026307.
  • (7) C. Eling and Y. Oz, The anomalous scaling exponents of turbulence in general dimension from random geometry, Journal of High Energy Physics 2015 (2015) 1.
  • (8) Y. Oz, Spontaneous Symmetry Breaking, Conformal Anomaly and Incompressible Fluid Turbulence, JHEP 11 (2017) 040 [1707.07855].
  • (9) C. Drygala, B. Winhart, F. di Mare and H. Gottschalk, Generative modeling of turbulence, Physics of Fluids 34 (2022) 035114.
  • (10) D. Tretiak, A.T. Mohan and D. Livescu, Physics-constrained generative adversarial networks for 3d turbulence, 2022.
  • (11) T. Li, M. Buzzicotti, L. Biferale and F. Bonaccorso, Generative adversarial networks to infer velocity components in rotating turbulent flows, The European Physical Journal E 46 (2023) 31.
  • (12) D. Shu, Z. Li and A.B. Farimani, A physics-informed diffusion model for high-fidelity flow field reconstruction, Journal of Computational Physics 478 (2023) 111972.
  • (13) G. Yang and S. Sommer, A denoising diffusion model for fluid field prediction, 2023.
  • (14) K. Fukami, K. Fukagata and K. Taira, Super-resolution reconstruction of turbulent flows with machine learning, Journal of Fluid Mechanics 870 (2019) 106–120.
  • (15) Z. Zhou, B. Li, X. Yang and Z. Yang, A robust super-resolution reconstruction model of turbulent flow data based on deep learning, Computers & Fluids 239 (2022) 105382.
  • (16) T. Li, L. Biferale, F. Bonaccorso, M.A. Scarpolini and M. Buzzicotti, Synthetic lagrangian turbulence by generative diffusion models, 2023.
  • (17) A. Mohan, D. Daniel, M. Chertkov and D. Livescu, Compressed convolutional lstm: An efficient deep learning framework to model high fidelity 3d turbulence, 2019.
  • (18) R. King, O. Hennigh, A. Mohan and M. Chertkov, From deep to physics-informed learning of turbulence: Diagnostics, 2018.
  • (19) S. Pandey, J. Schumacher and K.R. Sreenivasan, A perspective on machine learning in turbulent flows, Journal of Turbulence 21 (2020) 567 [https://doi.org/10.1080/14685248.2020.1757685].
  • (20) A.A. Moghaddam and A. Sadaghiyani, A deep learning framework for turbulence modeling using data assimilation and feature extraction, 2018.
  • (21) B. Li, Z. Yang, X. Zhang, G. He, B.-Q. Deng and L. Shen, Using machine learning to detect the turbulent region in flow past a circular cylinder, Journal of Fluid Mechanics 905 (2020) A10.
  • (22) M. Buzzicotti and F. Bonaccorso, Inferring turbulent environments via machine learning, The European Physical Journal E 45 (2022) 102.
  • (23) P. Clark Di Leoni, A. Mazzino and L. Biferale, Inferring flow parameters and turbulent configuration with physics-informed data assimilation and spectral nudging, Phys. Rev. Fluids 3 (2018) 104604.
  • (24) M. Lellep, J. Prexl, B. Eckhardt and M. Linkmann, Interpreted machine learning in fluid dynamics: explaining relaminarisation events in wall-bounded shear flows, Journal of Fluid Mechanics 942 (2022) A2.
  • (25) T. Whittaker, R.A. Janik and Y. Oz, Neural network complexity of chaos and turbulence, The European Physical Journal E 46 (2023) 57.
  • (26) G. Kohl, L.-W. Chen and N. Thuerey, Turbulent flow simulation using autoregressive conditional diffusion models, 2023.
  • (27) R. Apte, S. Nidhan, R. Ranade and J. Pathak, Diffusion model based data generation for partial differential equations, 2023.
  • (28) M. Lienen, D. Lüdke, J. Hansen-Palmus and S. Günnemann, From zero to turbulence: Generative modeling for 3d flow simulation, 2023.
  • (29) A.N. Kolmogorov, The local structure of turbulence in incompressible viscous fluid for very large reynolds numbers, Cr Acad. Sci. URSS 30 (1941) 301.
  • (30) J. Ho, A. Jain and P. Abbeel, Denoising diffusion probabilistic models, Advances in neural information processing systems 33 (2020) 6840.
  • (31) L. Yang, Z. Zhang, Y. Song, S. Hong, R. Xu, Y. Zhao et al., Diffusion models: A comprehensive survey of methods and applications, ACM Comput. Surv. (2023) .
  • (32) P. von Platen, S. Patil, A. Lozhkov, P. Cuenca, N. Lambert, K. Rasul et al., “Diffusers: State-of-the-art diffusion models.” https://github.com/huggingface/diffusers, 2022.
  • (33) G. Boffetta and R.E. Ecke, Two-dimensional turbulence, Annual Review of Fluid Mechanics 44 (2012) 427 [https://doi.org/10.1146/annurev-fluid-120710-101240].
  • (34) C. Canuto, M. Hussaini, A. Quarteroni and T. Zang, Spectral methods. evolution to complex geometries and applications to fluid dynamics, .
  • (35) R.H. Kraichnan, Inertial ranges in two‐dimensional turbulence, The Physics of Fluids 10 (1967) 1417 [https://aip.scitation.org/doi/pdf/10.1063/1.1762301].
  • (36) M.A. Rutgers, Forced 2d turbulence: Experimental evidence of simultaneous inverse energy and forward enstrophy cascades, Phys. Rev. Lett. 81 (1998) 2244.
  • (37) G. Boffetta and S. Musacchio, Evidence for the double cascade scenario in two-dimensional turbulence, Phys. Rev. E 82 (2010) 016307.
  • (38) I. Loshchilov and F. Hutter, Decoupled weight decay regularization, in International Conference on Learning Representations, 2019, https://openreview.net/forum?id=Bkg6RiCqY7.
  • (39) F. Heyder, J.P. Mellado and J. Schumacher, Generative convective parametrization of dry atmospheric boundary layer, 2023.

Appendix A Vorticity Notation

In a two-dimensional, incompressible flow, the velocity components vxsubscript𝑣𝑥v_{x}italic_v start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT and vysubscript𝑣𝑦v_{y}italic_v start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT can be described using a streamfunction ψ𝜓\psiitalic_ψ as:

vx=ψy,subscript𝑣𝑥𝜓𝑦\displaystyle v_{x}=\frac{\partial\psi}{\partial y},\quaditalic_v start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT = divide start_ARG ∂ italic_ψ end_ARG start_ARG ∂ italic_y end_ARG , vy=ψx,subscript𝑣𝑦𝜓𝑥\displaystyle v_{y}=-\frac{\partial\psi}{\partial x}\ ,italic_v start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT = - divide start_ARG ∂ italic_ψ end_ARG start_ARG ∂ italic_x end_ARG , (14)

and the vorticity, denoted by ω𝜔\omegaitalic_ω, is given by:

ω=vyxvxy=2ψ.𝜔subscript𝑣𝑦𝑥subscript𝑣𝑥𝑦superscript2𝜓\omega=\frac{\partial v_{y}}{\partial x}-\frac{\partial v_{x}}{\partial y}=-% \nabla^{2}\psi.italic_ω = divide start_ARG ∂ italic_v start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT end_ARG start_ARG ∂ italic_x end_ARG - divide start_ARG ∂ italic_v start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG start_ARG ∂ italic_y end_ARG = - ∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_ψ . (15)

The kinetic energy per unit mass for an incompressible flow is given by:

E=12(vx2+vy2).𝐸12superscriptsubscript𝑣𝑥2superscriptsubscript𝑣𝑦2E=\frac{1}{2}(v_{x}^{2}+v_{y}^{2})\ .italic_E = divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( italic_v start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_v start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) . (16)

Using the expressions for vxsubscript𝑣𝑥v_{x}italic_v start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT and vysubscript𝑣𝑦v_{y}italic_v start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT from above:

E𝐸\displaystyle Eitalic_E =12((ψy)2+(ψx)2)absent12superscript𝜓𝑦2superscript𝜓𝑥2\displaystyle=\frac{1}{2}\left(\left(\frac{\partial\psi}{\partial y}\right)^{2% }+\left(-\frac{\partial\psi}{\partial x}\right)^{2}\right)= divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( ( divide start_ARG ∂ italic_ψ end_ARG start_ARG ∂ italic_y end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ( - divide start_ARG ∂ italic_ψ end_ARG start_ARG ∂ italic_x end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )
=12((ψy)2+(ψx)2).absent12superscript𝜓𝑦2superscript𝜓𝑥2\displaystyle=\frac{1}{2}\left(\left(\frac{\partial\psi}{\partial y}\right)^{2% }+\left(\frac{\partial\psi}{\partial x}\right)^{2}\right).= divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( ( divide start_ARG ∂ italic_ψ end_ARG start_ARG ∂ italic_y end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ( divide start_ARG ∂ italic_ψ end_ARG start_ARG ∂ italic_x end_ARG ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) .

Using the definition of the Laplacian operator, 2ψ=2ψx2+2ψy2superscript2𝜓superscript2𝜓superscript𝑥2superscript2𝜓superscript𝑦2\nabla^{2}\psi=\frac{\partial^{2}\psi}{\partial x^{2}}+\frac{\partial^{2}\psi}% {\partial y^{2}}∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_ψ = divide start_ARG ∂ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_ψ end_ARG start_ARG ∂ italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG + divide start_ARG ∂ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_ψ end_ARG start_ARG ∂ italic_y start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG, and the expression for vorticity, we can express the energy spectrum E(k)𝐸𝑘E(k)italic_E ( italic_k ) in the wave number space as:

E(k)=12k2|ω(k)|2.𝐸𝑘12superscript𝑘2superscript𝜔𝑘2E(k)=\frac{1}{2k^{2}}|\omega(k)|^{2}.italic_E ( italic_k ) = divide start_ARG 1 end_ARG start_ARG 2 italic_k start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG | italic_ω ( italic_k ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT . (17)