[1]\fnmIacopo \surTirelli

[1]\orgdivDepartment of Aerospace Engineering, \orgnameUniversidad Carlos III de Madrid, \orgaddress\streetAvda. Universidad 30, \cityLeganés, \postcode28911, \stateMadrid, \countrySpain

2]\orgdivEnvironmental and Applied Fluid Dynamics, \orgnamevon Karman Institute for Fluid Dynamics, \orgaddress\streetWaterloosesteenweg 72, \citySint-Genesius-Rode, \postcode1640, \stateBruxelles, \countryBelgium

A meshless method to compute the proper orthogonal decomposition and its variants from scattered data

iacopo.tirelli@uc3m.es \scalerel*    \fnmMiguel Alfonso \surMendez mendez@vki.ac.be \scalerel*    \fnmAndrea \surIaniro aianiro@ing.uc3m.es \scalerel*    \fnmStefano \surDiscetti sdiscett@ing.uc3m.es \scalerel* * [
Abstract

Complex phenomena can be better understood when broken down into a limited number of simpler “components”. Linear statistical methods such as the principal component analysis and its variants are widely used across various fields of applied science to identify and rank these components based on the variance they represent in the data. These methods can be seen as factorizations of the matrix collecting all the data, which are assumed to be a collection of time series sampled from fixed points in space. However, when data sampling locations vary over time, as with mobile monitoring stations in meteorology and oceanography or with particle tracking velocimetry in experimental fluid dynamics, advanced interpolation techniques are required to project the data onto a fixed grid before carrying out the factorization. This interpolation is often expensive and inaccurate. This work proposes a method to decompose scattered data without interpolating. The approach is based on physics-constrained radial basis function regression to compute inner products in space and time. The method provides an analytical and mesh-independent decomposition in space and time, demonstrating higher accuracy than the traditional approach. Our results show that it is possible to distill the most relevant “components” even for measurements whose natural output is a distribution of data scattered in space and time, maintaining high accuracy and mesh independence.

keywords:
POD, RBF, meshless algorithm, PCA, EOF, KL transform, modal decomposition

1 Introduction

Methods from dimensionality reduction and pattern identification have become ubiquitous in many applied sciences, allowing streamlining data analysis, simplifying complex datasets, highlighting essential features and uncovering critical patterns and relationships [1, 2].

The simplest and fundamental approach to linear dimensionality reduction is the Principal Component Analysis (PCA), originally formulated in multivariate statistics [3, 4] and known as Karhunen-Loève transform (KL) in mathematics [5], Empirical Orthogonal Functions (EOF) in meteorology and climatology [6, 7, 8] and Proper Orthogonal Decomposition (POD) in fluid mechanics [9, 10, 11].

The PCA simplifies large datasets by decomposing them into a set of linearly uncorrelated variables known as principal components, ordered by the amount of variance they capture from the data. Its simplicity, versatility and effectiveness have made the PCA one of the crucial first steps in many data analysis workflows, and many extensions and variants have been proposed [12, 13, 14, 15, 16].

Beyond multivariate statistical analysis, the principal components are often expected to provide information about the spatial correlation of time series. In fluid mechanics, for example, the extension of PCA onto POD aimed at using such dimensionality reduction methods to identify coherent structures in turbulent flows [9, 1, 17]. In computational physics, the Galerkin projection of a PDE onto these modes is the cornerstone to reduce the computational cost of large simulations [18], to build a reduced order model that enables model-based control [19, 20] and to derive more efficient Large Eddy Simulation formulations [21]. In climatology, the EOF [22, 23, 2] has been developed to identify and analyze dominant patterns of variability in climate data and to reduce the dimensionality of large climate datasets.

In all these applications, the leading patterns are identified as eigenvectors of the correlation matrices, computed in space or in time. However, the computation of these matrices implies the definition of a grid, to approximate the integral in the continuous formulation of the problems (see [24, 25]). A major research effort in both fluid mechanics and climatology is the problem of handling missing data from these grids [26, 27, 28, 29]. Common approaches to handle these cases are the Gappy POD, initially introduced in [26], Data Interpolating EOF (DINEOF, [30]), or probabilistic PCA (PPCA, [31]). An extensive overview of gap-filling methods is provided in [32, 33].

When the level of missing data is extreme, the concept of a grid becomes irrelevant and data could be treated as randomly scattered. This is the natural settings in image velocimetry measurements carried out with particle tracking methods [34, 35, 36], in the measurement of wind or concentration fields from limited stations [32, 33] or in the interpolation of geological data from limited locations [37].

This article proposes a grid-less approach to the computation of POD/EOF, removing the need for a grid. The idea consists of computing correlation matrices of scattered data using constrained Radial Basis Functions (RBFs) as introduced in [38], together with a Gauss-Legendre quadrature for the integration. We show that the approach provides higher accuracy than the usual interpolation-based approach and provides an analytical (mesh-independent) representation of the spatial structures. The theoretical background of the methodology is detailed in §2, while the validation is presented in §3 on two different synthetic datasets: an analytical sinusoidal test case and the wake of a fluidic pinball. An experimental validation is carried out using the precipitation data collected from satellite microwave observations.

2 Methodology

The proposed methodology follows the classic snapshot-based POD by Sirovich [11], but replaces all inner products with an RBF formulation that requires no grid and thus no interpolation. The data can be scattered both in space and time. The final result is a linear decomposition of the data in the form:

u(x,ti)=r=1rcσrϕr(x)ψr(t),𝑢xsubscript𝑡𝑖subscriptsuperscriptsubscript𝑟𝑐𝑟1subscript𝜎𝑟subscriptitalic-ϕ𝑟xsubscript𝜓𝑟𝑡u(\textbf{x},t_{i})=\sum^{r_{c}}_{r=1}\sigma_{r}\phi_{r}(\textbf{x})\psi_{r}(t% )\,,italic_u ( x , italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) = ∑ start_POSTSUPERSCRIPT italic_r start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT italic_σ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( x ) italic_ψ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( italic_t ) , (1)

where rcsubscript𝑟𝑐r_{c}italic_r start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT is the rank, σrsubscript𝜎𝑟\sigma_{r}italic_σ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT are scalars defining the amplitude of each contribution, ϕr(x)subscriptitalic-ϕ𝑟x\phi_{r}(\textbf{x})italic_ϕ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( x ) and ψr(t)subscript𝜓𝑟𝑡\psi_{r}(t)italic_ψ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( italic_t ) are the spatial and the temporal structures of the POD modes. The methodology is made of four steps:

Step 1: Analytical snapshot representation with RBF

Starting from scattered data available for the different time realizations, it is possible to obtain an analytical approximation of the field for each time instant as a linear combination of a set of basis functions. This approximation is denoted as:

u~(x,ti)=q=1Nbwq(ti)γq(x),~𝑢xsubscript𝑡𝑖superscriptsubscript𝑞1subscript𝑁𝑏subscript𝑤𝑞subscript𝑡𝑖subscript𝛾𝑞x\tilde{u}(\textbf{x},t_{i})=\sum_{q=1}^{N_{b}}w_{q}(t_{i})\gamma_{q}(\textbf{x% }),over~ start_ARG italic_u end_ARG ( x , italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) = ∑ start_POSTSUBSCRIPT italic_q = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_w start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) italic_γ start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ( x ) , (2)

where Nbsubscript𝑁𝑏N_{b}italic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT is the number of basis functions used in the approximation, γqsubscript𝛾𝑞\gamma_{q}italic_γ start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT is the qthsuperscript𝑞𝑡q^{th}italic_q start_POSTSUPERSCRIPT italic_t italic_h end_POSTSUPERSCRIPT regression basis, and wqsubscript𝑤𝑞w_{q}italic_w start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT are the corresponding weights. The vector x contains the coordinates on which the data are evaluated at the ithsuperscript𝑖𝑡i^{th}italic_i start_POSTSUPERSCRIPT italic_t italic_h end_POSTSUPERSCRIPT time instant tisubscript𝑡𝑖t_{i}italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT among the total Ntsubscript𝑁𝑡N_{t}italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT time instances. With no loss of generality, the basis functions employed in this work are thin-plate RBF. These only require the definition of the collocation points and have no shape factor parameters. We use the sampling locations xpsubscriptx𝑝\textbf{x}_{p}x start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT as collocation points. Therefore, the current approach has no hyper-parameters to tune, needs no user input and only requires extra memory storage for the weights. Future work will explore more advanced formulations and the use of physics constraints as in Sperotto et al. [38].

Step 2: Temporal correlation matrix

The temporal correlation matrix KNt×NtKsuperscriptsubscript𝑁𝑡subscript𝑁𝑡\textbf{K}\in\mathbb{R}^{N_{t}\times N_{t}}K ∈ blackboard_R start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT × italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUPERSCRIPT is defined as the matrix whose elements are the inner products in space, denoted as ,ssubscript𝑠\langle\cdot,\cdot\rangle_{s}⟨ ⋅ , ⋅ ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT, between all the snapshots:

K=[u1,u1su1,u2su1,uNtsu2,u1su2,u2su2,uNtsuNt,u1suNt,u2suNt,uNts]Kmatrixsubscriptsubscript𝑢1subscript𝑢1𝑠subscriptsubscript𝑢1subscript𝑢2𝑠subscriptsubscript𝑢1subscript𝑢𝑁𝑡𝑠subscriptsubscript𝑢2subscript𝑢1𝑠subscriptsubscript𝑢2subscript𝑢2𝑠subscriptsubscript𝑢2subscript𝑢𝑁𝑡𝑠subscriptsubscript𝑢𝑁𝑡subscript𝑢1𝑠subscriptsubscript𝑢𝑁𝑡subscript𝑢2𝑠subscriptsubscript𝑢𝑁𝑡subscript𝑢𝑁𝑡𝑠\textbf{K}=\begin{bmatrix}\langle u_{1},u_{1}\rangle_{s}&\langle u_{1},u_{2}% \rangle_{s}&\dots&\langle u_{1},u_{Nt}\rangle_{s}\\ \langle u_{2},u_{1}\rangle_{s}&\langle u_{2},u_{2}\rangle_{s}&\dots&\langle u_% {2},u_{Nt}\rangle_{s}\\ \vdots&\vdots&\ddots&\vdots\\ \langle u_{Nt},u_{1}\rangle_{s}&\langle u_{Nt},u_{2}\rangle_{s}&\dots&\langle u% _{Nt},u_{Nt}\rangle_{s}\\ \end{bmatrix}K = [ start_ARG start_ROW start_CELL ⟨ italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_CELL start_CELL ⟨ italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_CELL start_CELL … end_CELL start_CELL ⟨ italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_u start_POSTSUBSCRIPT italic_N italic_t end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL ⟨ italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_CELL start_CELL ⟨ italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_CELL start_CELL … end_CELL start_CELL ⟨ italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_u start_POSTSUBSCRIPT italic_N italic_t end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL ⋮ end_CELL start_CELL ⋮ end_CELL start_CELL ⋱ end_CELL start_CELL ⋮ end_CELL end_ROW start_ROW start_CELL ⟨ italic_u start_POSTSUBSCRIPT italic_N italic_t end_POSTSUBSCRIPT , italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_CELL start_CELL ⟨ italic_u start_POSTSUBSCRIPT italic_N italic_t end_POSTSUBSCRIPT , italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_CELL start_CELL … end_CELL start_CELL ⟨ italic_u start_POSTSUBSCRIPT italic_N italic_t end_POSTSUBSCRIPT , italic_u start_POSTSUBSCRIPT italic_N italic_t end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ] (3)

In the original formulation of the POD [9], the inner product in space or time was defined in terms of correlation of square-integrable real-valued functions, that is:

Kij=1ΩΩu~(x,ti)u~(x,tj)𝑑Ω,subscript𝐾𝑖𝑗1normΩsubscriptΩ~𝑢xsubscript𝑡𝑖~𝑢xsubscript𝑡𝑗differential-dΩK_{ij}=\frac{1}{\parallel\Omega\parallel}\int_{\Omega}\tilde{u}(\textbf{x},t_{% i})~{}\tilde{u}(\textbf{x},t_{j})~{}d\Omega\,,italic_K start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG ∥ roman_Ω ∥ end_ARG ∫ start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT over~ start_ARG italic_u end_ARG ( x , italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) over~ start_ARG italic_u end_ARG ( x , italic_t start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) italic_d roman_Ω , (4)

with ΩΩ\Omegaroman_Ω the area (in 2D) or volume (in 3D) of the spatial domain considered. If the spatial domain is partitioned in uniform Nssubscript𝑁𝑠N_{s}italic_N start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT elements of area (or volume) ΔΩΔΩ\Delta\Omegaroman_Δ roman_Ω, indexed by k[0,Ns1]𝑘0subscript𝑁𝑠1k\in[0,N_{s}-1]italic_k ∈ [ 0 , italic_N start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT - 1 ] the most natural approximation of Eq. 4 reads

Kij1NsΔΩku~(xk,ti)u~(xk,tj)ΔΩ=1Ns𝐮jT𝐮i,subscript𝐾𝑖𝑗1subscript𝑁𝑠ΔΩsubscript𝑘~𝑢subscriptx𝑘subscript𝑡𝑖~𝑢subscriptx𝑘subscript𝑡𝑗ΔΩ1subscript𝑁𝑠subscriptsuperscript𝐮𝑇𝑗subscript𝐮𝑖K_{ij}\approx\frac{1}{N_{s}\Delta\Omega}\sum_{k}\tilde{u}(\textbf{x}_{k},t_{i}% )~{}\tilde{u}(\textbf{x}_{k},t_{j})~{}\Delta\Omega=\frac{1}{N_{s}}\mathbf{u}^{% T}_{j}\mathbf{u}_{i}\,,italic_K start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ≈ divide start_ARG 1 end_ARG start_ARG italic_N start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT roman_Δ roman_Ω end_ARG ∑ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT over~ start_ARG italic_u end_ARG ( x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) over~ start_ARG italic_u end_ARG ( x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , italic_t start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) roman_Δ roman_Ω = divide start_ARG 1 end_ARG start_ARG italic_N start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_ARG bold_u start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT bold_u start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , (5)

where 𝐮i,𝐮jNssubscript𝐮𝑖subscript𝐮𝑗superscriptsubscript𝑁𝑠\mathbf{u}_{i},\mathbf{u}_{j}\in\mathbb{R}^{N_{s}}bold_u start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , bold_u start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_POSTSUPERSCRIPT are the vectors collecting the data sampled at time steps tisubscript𝑡𝑖t_{i}italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and tjsubscript𝑡𝑗t_{j}italic_t start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT and corresponding to each of the Nssubscript𝑁𝑠N_{s}italic_N start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT portions of the spatial domain. In this setting, the correlation matrix can be computed using simple matrix multiplication as K=UTUKsuperscriptU𝑇U\textbf{K}=\textbf{U}^{T}\textbf{U}K = U start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT U. Eq. 5 is an approximation of Eq. 4 using the mid-point rule [39], that is assuming a piecewise-constant approximation of the velocity field.

POD algorithms for gridded data use Eq. 5. Interpolation-based formulations seek to bring the scattered data onto a uniform grid so that Eq. 5 can still be used. We here propose to use the original version in Eq. 4, leveraging the regression in Eq. 2 from step 1 and a quadrature method that interrogates the integrand on specific quadrature points. With no loss of generality, we here use Gauss-Legendre quadrature because of its excellent trade-off between accuracy and computational costs. It is worth emphasizing that quadrature methods based on analytic approximations of the integrand yield higher accuracy compared to the mid-point rule implied in Eq. 5.

Step 3: Computing Temporal Structures

The temporal structure of POD modes are eigenvectors of the temporal correlation matrix. Therefore, given the temporal correlation matrix these reads

K=ΨΣ2ΨT,KsuperscriptΨΣ2superscriptΨ𝑇\textbf{K}=\textbf{$\Psi$}\textbf{$\Sigma$}^{2}\textbf{$\Psi$}^{T},K = bold_Ψ bold_Σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT roman_Ψ start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT , (6)

where the matrix ΨNt×rcΨsuperscriptsubscript𝑁𝑡subscript𝑟𝑐\textbf{$\Psi$}\in\mathbb{R}^{N_{t}\times r_{c}}roman_Ψ ∈ blackboard_R start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT × italic_r start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT end_POSTSUPERSCRIPT collects the eigenvectors along its columns and ΣΣ\Sigmaroman_Σ is a diagonal matrix whose elements σisubscript𝜎𝑖\sigma_{i}italic_σ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT represent the mode amplitudes (cf. Eq. 1). This step is identical to the classic snapshot POD for gridded data.

Step 4: Computing Spatial Structures

The rthsuperscript𝑟𝑡r^{th}italic_r start_POSTSUPERSCRIPT italic_t italic_h end_POSTSUPERSCRIPT spatial structure ϕrsubscriptitalic-ϕ𝑟\phi_{r}italic_ϕ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT is the result of projecting the dataset u~(x,t)~𝑢xt\tilde{u}(\textbf{x},\textbf{t})over~ start_ARG italic_u end_ARG ( x , t ) onto the temporal structure ψrsubscript𝜓𝑟\psi_{r}italic_ψ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT. This project requires an inner product in time. The same discussion in Step 2 now applies to the inner product in the time domain, which reads

ϕr(x)=1σru~(x,t),ψr(t)=1σrTtu~(x,t)ψr(t)𝑑t1σrNtk=1Ntu~(x,tk)ψr(tk),subscriptitalic-ϕ𝑟x1subscript𝜎𝑟~𝑢x𝑡subscript𝜓𝑟t1subscript𝜎𝑟𝑇subscript𝑡~𝑢xtsubscript𝜓𝑟tdifferential-d𝑡1subscript𝜎𝑟subscript𝑁𝑡subscriptsuperscriptsubscript𝑁𝑡𝑘1~𝑢xsubscriptt𝑘subscript𝜓𝑟subscriptt𝑘\phi_{r}(\textbf{x})=\frac{1}{\sigma_{r}}\langle\tilde{u}(\textbf{x},t),\psi_{% r}(\textbf{t})\rangle=\frac{1}{\sigma_{r}T}\int_{t}\tilde{u}(\textbf{x},% \textbf{t})~{}\psi_{r}(\textbf{t})~{}dt\,\approx\frac{1}{\sigma_{r}N_{t}}\sum^% {N_{t}}_{k=1}\tilde{u}(\textbf{x},\textbf{t}_{k})\psi_{r}(\textbf{t}_{k}),italic_ϕ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( x ) = divide start_ARG 1 end_ARG start_ARG italic_σ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT end_ARG ⟨ over~ start_ARG italic_u end_ARG ( x , italic_t ) , italic_ψ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( t ) ⟩ = divide start_ARG 1 end_ARG start_ARG italic_σ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT italic_T end_ARG ∫ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT over~ start_ARG italic_u end_ARG ( x , t ) italic_ψ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( t ) italic_d italic_t ≈ divide start_ARG 1 end_ARG start_ARG italic_σ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_ARG ∑ start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT over~ start_ARG italic_u end_ARG ( x , t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) italic_ψ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ( t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) , (7)

assuming that the time domain is t[0,T]𝑡0𝑇t\in[0,T]italic_t ∈ [ 0 , italic_T ]. Here both the temporal structures and the regression of the snapshots are available on a discrete set of times tNttsuperscriptsubscript𝑁𝑡\textbf{t}\in\mathbb{R}^{N_{t}}t ∈ blackboard_R start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUPERSCRIPT. Although advanced quadratures could also be used in Eq. 7 together with an analytic regression in time, the gain in accuracy was deemed not sufficient to justify the increase in computational cost. Therefore, in this work, this projection is carried out via mid-point approximation in time.

We stress that Eq. 7 holds for any set of points x. Hence Eq. 7 enables super-resolution of the POD modes. In this work, we compute it over a uniform grid for plotting purposes and compare it with interpolation-based approaches.

3 Validation

3.1 Analytical testcase

Refer to caption
Figure 1: Root mean square error of the ithsuperscript𝑖𝑡i^{th}italic_i start_POSTSUPERSCRIPT italic_t italic_h end_POSTSUPERSCRIPT temporal mode ψisubscript𝜓𝑖\psi_{i}italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (δRMSψisubscript𝛿𝑅𝑀subscript𝑆subscript𝜓𝑖\delta_{RMS_{\psi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT) as function of d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG for the analytical testcase. The values are normalized with the standard deviation of its corresponding reference modes ψrefisubscript𝜓𝑟𝑒subscript𝑓𝑖\psi_{ref_{i}}italic_ψ start_POSTSUBSCRIPT italic_r italic_e italic_f start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT. The red curve with star markers depicts the meshless POD, while the blue curve with square markers represents the gridded POD.
\begin{overpic}[scale={1},unit=1mm]{figs/Analytical_Spatial_modes_comparison_N% .png} \par\put(7.0,88.0){\parbox{71.13188pt}{\centering{{Reference POD}}% \@add@centering}} \put(45.0,88.0){\parbox{71.13188pt}{\centering{{Gridded POD}}\@add@centering}} \put(81.0,88.0){\parbox{71.13188pt}{\centering{{Meshless POD}}\@add@centering}% } \put(0.0,0.0){\parbox{28.45274pt}{\centering{{b)}}\@add@centering}} \put(0.0,92.0){\parbox{28.45274pt}{\centering{{a)}}\@add@centering}} \par\end{overpic}
Refer to caption
Figure 2: a) Comparison of the spatial modes ϕisubscriptitalic-ϕ𝑖\phi_{i}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT for the analytical test case. The reference POD is represented in the third row, while the gridded and the meshless approaches are displayed in the first and the second row, respectively. Only the first two modes are pictured. Case d^=0.2^𝑑0.2\hat{d}=0.2over^ start_ARG italic_d end_ARG = 0.2 b) Root mean square error of the first two spatial modes ϕitalic-ϕ\phiitalic_ϕ (δRMSϕisubscript𝛿𝑅𝑀subscript𝑆subscriptitalic-ϕ𝑖\delta_{RMS_{\phi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT) as a function of d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG. The values are normalized with the standard deviation of its corresponding reference modes ϕrefisubscriptitalic-ϕ𝑟𝑒subscript𝑓𝑖\phi_{ref_{i}}italic_ϕ start_POSTSUBSCRIPT italic_r italic_e italic_f start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT. The dashed circles highlight the case pictured on the top part of the figure.

A synthetic test case was first considered for benchmarking the proposed method on a problem for which the POD is analytically available. This allows for isolating all sources of errors. In this test case, we consider a spatial domain of length Lxsubscript𝐿𝑥L_{x}italic_L start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT and Lysubscript𝐿𝑦L_{y}italic_L start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT that spans x,y[0,12]𝑥𝑦012x,y\in[0,12]italic_x , italic_y ∈ [ 0 , 12 ], discretized with 100100100100 grid points. Two 2D sinusoidal orthogonal waves have been chosen as spatial structures, with a leading wavenumber kx=ky=3subscript𝑘𝑥subscript𝑘𝑦3k_{x}=k_{y}=3italic_k start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT = italic_k start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT = 3 in both x𝑥xitalic_x and y𝑦yitalic_y direction:

ϕ1(x,y)=sin(kxx)sin(kyy),ϕ2(x,y)=sin(2kxx)sin(2kyy).formulae-sequencesubscriptitalic-ϕ1𝑥𝑦subscript𝑘𝑥𝑥subscript𝑘𝑦𝑦subscriptitalic-ϕ2𝑥𝑦2subscript𝑘𝑥𝑥2subscript𝑘𝑦𝑦\phi_{1}(x,y)=\sin\left(k_{x}x\right)\sin\left(k_{y}y\right)\,,\,\phi_{2}(x,y)% =\sin\left(2k_{x}x\right)\sin\left(2k_{y}y\right).italic_ϕ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_x , italic_y ) = roman_sin ( italic_k start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT italic_x ) roman_sin ( italic_k start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT italic_y ) , italic_ϕ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_x , italic_y ) = roman_sin ( 2 italic_k start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT italic_x ) roman_sin ( 2 italic_k start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT italic_y ) . (8)

Similarly, the associated temporal structures are sampled over a period T=10𝑇10T=10italic_T = 10 discretized with 500500500500 time instants:

ψ1=sin(4πTt),ψ2=sin(2πTt).formulae-sequencesubscript𝜓14𝜋𝑇𝑡subscript𝜓22𝜋𝑇𝑡\psi_{1}=\sin\left(\frac{4\pi}{T}\cdot t\right)\,,\,\psi_{2}=\sin\left(\frac{2% \pi}{T}t\right).italic_ψ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = roman_sin ( divide start_ARG 4 italic_π end_ARG start_ARG italic_T end_ARG ⋅ italic_t ) , italic_ψ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = roman_sin ( divide start_ARG 2 italic_π end_ARG start_ARG italic_T end_ARG italic_t ) . (9)

Assigning an amplitude σ1=10subscript𝜎110\sigma_{1}=10italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = 10 and σ2=5subscript𝜎25\sigma_{2}=5italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 5 to each mode respectively, the synthetic dataset D is built as a linear combination of these structures:

D=σ1ϕ1ψ1T+σ2ϕ2ψ2T,Dsubscript𝜎1subscriptitalic-ϕ1superscriptsubscript𝜓1𝑇subscript𝜎2subscriptitalic-ϕ2superscriptsubscript𝜓2𝑇\textbf{D}=\sigma_{1}\phi_{1}\psi_{1}^{T}+\sigma_{2}\phi_{2}\psi_{2}^{T},D = italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT + italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT , (10)

whose analytical modes represent the ground truth. It is easy to show that the modes with spatial structures (8) and temporal structures (9) are indeed POD modes, being orthogonal both in space and time according to the standard inner product. We consider them as “reference POD”.

A parametric study involving the influence of the sparsity of data has been carried out. To this end, we define the parameter d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG to refer to the different cases, which represents the average distance among the “sensors” normalized with the reference lengthscale lrefsubscript𝑙𝑟𝑒𝑓l_{ref}italic_l start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT:

d^=lrefNppp,^𝑑subscript𝑙𝑟𝑒𝑓subscript𝑁𝑝𝑝𝑝\hat{d}=\frac{l_{ref}}{\sqrt{N_{ppp}}},over^ start_ARG italic_d end_ARG = divide start_ARG italic_l start_POSTSUBSCRIPT italic_r italic_e italic_f end_POSTSUBSCRIPT end_ARG start_ARG square-root start_ARG italic_N start_POSTSUBSCRIPT italic_p italic_p italic_p end_POSTSUBSCRIPT end_ARG end_ARG , (11)

where Npppsubscript𝑁𝑝𝑝𝑝N_{ppp}italic_N start_POSTSUBSCRIPT italic_p italic_p italic_p end_POSTSUBSCRIPT represents the sensor density. In this specific application, the reference length-scale is the smallest achievable wavelength λ=Lx2kxsuperscript𝜆subscript𝐿𝑥2subscript𝑘𝑥\lambda^{*}=\frac{L_{x}}{2k_{x}}italic_λ start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = divide start_ARG italic_L start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG start_ARG 2 italic_k start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG.

The different datasets are obtained for 0.1<d^<0.50.1^𝑑0.50.1<\hat{d}<0.50.1 < over^ start_ARG italic_d end_ARG < 0.5, whose corresponding number of particles is computed inverting Eq. 11. These datasets serve as test cases for the meshless POD computation, and the corresponding results are denoted as “meshless POD”. The mapping onto a regular grid is performed using a moving average, with bin sizes determined ensuring approximately 10101010 sensor within each bin. This is a common practice in particle-image-based measurements, for instance, and introduces a high degree of robustness to noise and outliers, although at the expenses of spatial resolution. From this mapping, the classic matrix-factorization approach for the POD is employed. In what follows, we refer to the results of this approach on the gridded data as “gridded POD”.

Fig.1 illustrates the root mean square error of the temporal structures ψisubscript𝜓𝑖\psi_{i}italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (δRMSψisubscript𝛿𝑅𝑀subscript𝑆subscript𝜓𝑖\delta_{RMS_{\psi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT), normalized by the square root of the number of samples Ntsubscript𝑁𝑡N_{t}italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT, for different values of d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG. This test highlights the superior accuracy of the meshless approach compared to the traditional one, even for sparser datasets. A similar trend is observed for the spatial structures depicted in Fig.2.

In Fig.2.a, a visual comparison is presented for the case d^=0.2^𝑑0.2\hat{d}=0.2over^ start_ARG italic_d end_ARG = 0.2 between the reconstructed spatial structures and the analytical ones. It is evident from this figure, particularly in the second mode, that the gridded approach filters out smaller scales more aggressively. This observation is further confirmed by the plot shown in Fig.2.b. As for the temporal case, the δRMSϕisubscript𝛿𝑅𝑀subscript𝑆subscriptitalic-ϕ𝑖\delta_{RMS_{\phi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT is normalized with the standard deviation of the corresponding reference modes.

The benchmark on this toy problem, despite its simplicity, enabled us to eliminate external sources of error in the data originating from simulations or experiments. This process allowed us to evaluate the suitability of the proposed methodology when compared with the traditional approach.

3.2 Fluidic pinball

Refer to caption
Figure 3: Root mean square error of the ithsuperscript𝑖𝑡i^{th}italic_i start_POSTSUPERSCRIPT italic_t italic_h end_POSTSUPERSCRIPT temporal mode ψisubscript𝜓𝑖\psi_{i}italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (δRMSψisubscript𝛿𝑅𝑀subscript𝑆subscript𝜓𝑖\delta_{RMS_{\psi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT) as function of d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG for the fludic pinball testcase. The values are normalized with the standard deviation of its corresponding reference modes ψrefisubscript𝜓𝑟𝑒subscript𝑓𝑖\psi_{ref_{i}}italic_ψ start_POSTSUBSCRIPT italic_r italic_e italic_f start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT. The red curve with star markers depicts the meshless POD, while the blue curve with square markers represents the gridded POD.
\begin{overpic}[scale={1},unit=1mm]{figs/Spatial_modes_comparison_N.png} \par\put(50.0,83.0){\parbox{113.81102pt}{\centering{{Reference POD}}% \@add@centering}} \put(50.0,64.0){\parbox{113.81102pt}{\centering{{Gridded POD}}\@add@centering}% } \put(50.0,45.0){\parbox{113.81102pt}{\centering{{Meshless POD}}\@add@centering% }} \put(0.0,0.0){\parbox{28.45274pt}{\centering{{b)}}\@add@centering}} \put(0.0,85.0){\parbox{28.45274pt}{\centering{{a)}}\@add@centering}} \par\end{overpic}
Refer to caption
Figure 4: a) Comparison of the spatial modes ϕisubscriptitalic-ϕ𝑖\phi_{i}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT associated to the streamwise velocity component u𝑢uitalic_u for the fluidic pinball. The reference POD is represented in the third row, while the gridded and the meshless approaches are displayed in the first and the second row respectively. Only the first three modes are pictured. Case d^=0.6^𝑑0.6\hat{d}=0.6over^ start_ARG italic_d end_ARG = 0.6 b) Root mean square error of the first three spatial modes ϕisubscriptitalic-ϕ𝑖\phi_{i}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (δRMSϕisubscript𝛿𝑅𝑀subscript𝑆subscriptitalic-ϕ𝑖\delta_{RMS_{\phi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT) as a function of d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG. The values are normalized with the standard deviation of its corresponding reference modes ϕrefisubscriptitalic-ϕ𝑟𝑒subscript𝑓𝑖\phi_{ref_{i}}italic_ϕ start_POSTSUBSCRIPT italic_r italic_e italic_f start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT. The dashed circles highlight the case pictured on the top part of the figure. The red curve with star markers depicts the meshless approach, while the blue curve with square markers represents the gridded method.

The wake of a fluidic pinball is a popular test case in fluid mechanics. A pinball consists of a configuration of three cylinders with diameter D𝐷Ditalic_D, with axes positioned at the vertices of an equilateral triangle with side 3D/23𝐷23D/23 italic_D / 2. The pinball is invested by a uniform constant wind of velocity Usubscript𝑈U_{\infty}italic_U start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT. Direct Numerical Simulation (DNS) data from Deng et al. [40] is used for this case. The DNS data consists of an unstructured grid with 3536353635363536 points within the domain x/D[1,11]𝑥𝐷111x/D\in[1,11]italic_x / italic_D ∈ [ 1 , 11 ] and y/D[2,2]𝑦𝐷22y/D\in[-2,2]italic_y / italic_D ∈ [ - 2 , 2 ] (with x,y𝑥𝑦x,yitalic_x , italic_y being the stream-wise and cross-wise directions, respectively). The triangle formed by the cylinder axis points upstream, with the downstream edge orthogonal to the mean flow direction. The midpoint of the downstream edge is located at (0,0)00(0,0)( 0 , 0 ). The flow kinematic viscosity ν𝜈\nuitalic_ν is set to have a Reynolds number Re=UD/ν=150𝑅𝑒subscript𝑈𝐷𝜈150Re=U_{\infty}D/\nu=150italic_R italic_e = italic_U start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT italic_D / italic_ν = 150, in which the flow exhibits a chaotic behavior [40].

We reproduce the standard configuration of Particle Tracking Velocimetry (PTV) measurements. To this purpose, it is assumed that the domain is observed by a camera with a resolution of 32323232 pixels per diameter. The velocity fields are randomly sampled (as if they were originated from particle measurements) at different concentrations, resulting in a mean distance in the range 0.1<d^<10.1^𝑑10.1<\hat{d}<10.1 < over^ start_ARG italic_d end_ARG < 1. The cylinder diameter D𝐷Ditalic_D has been used here as reference lengthscale. A baseline for the POD computation is obtained by interpolation of the DNS data on a fine grid with spacing Δx=D/16Δ𝑥𝐷16\Delta x=D/16roman_Δ italic_x = italic_D / 16, and computation of the POD with the traditional matrix factorization approach.

Since the dataset is statistically stationary, focus is placed on the decomposition of the fluctuating component of the velocity field. To ensure a fair comparison across the methodologies, an ensemble high-resolution mean is subtracted from all the datasets as explained in Tirelli et al. [41].

Fig. 3 shows the root mean square error of the first six temporal structures ψisubscript𝜓𝑖\psi_{i}italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (δRMSψisubscript𝛿𝑅𝑀subscript𝑆subscript𝜓𝑖\delta_{RMS_{\psi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT), normalized with the square root of the number of samples Ntsubscript𝑁𝑡N_{t}italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT, for different d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG. This kind of comparison is limited to the most energetic modes. Higher-order modes have generally smaller energy difference, which might result in different sorting for different decomposition methods. The illustrated modes suggest that the meshless approach closely follows the reference. The overall trend indicates that an increase in the average distance among sensors corresponds to an increase in the error. Notably, the gridded approach is more susceptible to this effect. To accommodate the larger distances, the method tends to augment the bin size to ensure a sufficient number of sampling points within the bin. Nonetheless, this strategy unintentionally filters out smaller scales and exacerbates spatial resolution limitations. On the other hand, the meshless regression appears more robust to the detrimental effect of increasing the sparsity of the sensors. It is worth remarking that values of δRMSψi>1subscript𝛿𝑅𝑀subscript𝑆subscript𝜓𝑖1\delta_{RMS_{\psi_{i}}}>1italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT > 1 might also be originating from the comparison among temporal structures that are not synchronized with the reference ones due to different sorting of modes with similar energy, as discussed before.

The first 3333 spatial modes ϕisubscriptitalic-ϕ𝑖\phi_{i}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT associated with the stream-wise velocity component U𝑈Uitalic_U are analysed in Fig. 4.a. The contours in the top part of this figure are representative of the case d^=0.6^𝑑0.6\hat{d}=0.6over^ start_ARG italic_d end_ARG = 0.6, as highlighted by the dashed circles in the bottom part of the picture. These contours show that the gridded POD exhibits more attenuated modes if compared to the ones computed with the meshless approach. In Fig. 4.b the δRMSϕisubscript𝛿𝑅𝑀subscript𝑆subscriptitalic-ϕ𝑖\delta_{RMS_{\phi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT of these modes normalized with the rms of the corresponding reference modes is reported as a function of d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG. Similarly to the temporal modes, the meshless approach shows superior robustness to increased data sparsity.

3.3 C3S precipitation from satellite microwave observations

Refer to caption
Figure 5: Root mean square error of the ithsuperscript𝑖𝑡i^{th}italic_i start_POSTSUPERSCRIPT italic_t italic_h end_POSTSUPERSCRIPT temporal mode ψisubscript𝜓𝑖\psi_{i}italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (δRMSψisubscript𝛿𝑅𝑀subscript𝑆subscript𝜓𝑖\delta_{RMS_{\psi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ψ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT) as function of d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG for the C3S precipitation testcase. The values are normalized with the standard deviation of its corresponding reference modes ψrefisubscript𝜓𝑟𝑒subscript𝑓𝑖\psi_{ref_{i}}italic_ψ start_POSTSUBSCRIPT italic_r italic_e italic_f start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT. The red curve with star markers depicts the meshless POD, while the blue curve with square markers represents the gridded POD.
\begin{overpic}[scale={1},unit=1mm]{figs/Prep_Spatial_modes_comparison_N.png} \par\put(40.0,105.0){\parbox{113.81102pt}{\centering{{Reference POD}}% \@add@centering}} \put(40.0,79.0){\parbox{113.81102pt}{\centering{{Gridded POD}}\@add@centering}% } \put(40.0,53.5){\parbox{113.81102pt}{\centering{{Meshless POD}}\@add@centering% }} \put(0.0,0.0){\parbox{28.45274pt}{\centering{{b)}}\@add@centering}} \put(0.0,106.0){\parbox{28.45274pt}{\centering{{a)}}\@add@centering}} \par\end{overpic}
Refer to caption
Figure 6: a) Comparison of the spatial modes ϕisubscriptitalic-ϕ𝑖\phi_{i}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT associated to the anomaly of daily precipitation. The reference POD is represented in the third row, while the gridded and the meshless approach are displayed in the first and the second row respectively. Only the first three modes are pictured. Case d^=0.025^𝑑0.025\hat{d}=0.025over^ start_ARG italic_d end_ARG = 0.025 b) Root mean square error of the first three spatial modes ϕisubscriptitalic-ϕ𝑖\phi_{i}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (δRMSϕisubscript𝛿𝑅𝑀subscript𝑆subscriptitalic-ϕ𝑖\delta_{RMS_{\phi_{i}}}italic_δ start_POSTSUBSCRIPT italic_R italic_M italic_S start_POSTSUBSCRIPT italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT) as a function of d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG. The values are normalized with the standard deviation of its corresponding reference modes ϕrefisubscriptitalic-ϕ𝑟𝑒subscript𝑓𝑖\phi_{ref_{i}}italic_ϕ start_POSTSUBSCRIPT italic_r italic_e italic_f start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT.

The experimental testcase is based on the “Copernicus Climate Change Service (C3S) Climate Data Store (CDS): Precipitation monthly and daily gridded data from 2000 to 2017 derived from satellite microwave observations.”([42], DOI:10.24381/cds.ada9c583). This dataset provides global estimates of daily accumulated precipitation and monthly mean precipitation. These estimates are derived from a combination of passive microwave observations obtained from two different types of radiometers on various Low Earth Orbit satellites. Firstly, there are the conically scanning microwave imagers, which undergo processing utilizing methodologies established by the Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite initiative, as part of the Satellite Application Facility on Climate Monitoring. Secondly, the cross-track scanning microwave sounders whose data undergoes processing through the Passive microwave Neural network Precipitation Retrieval for Climate Applications algorithm.

The main variable in this dataset is the precipitation 𝒫𝒫\mathcal{P}caligraphic_P, expressed in mm/day𝑚𝑚𝑑𝑎𝑦mm/dayitalic_m italic_m / italic_d italic_a italic_y, which represents the water-equivalent volume rate per area and per day of atmospheric water in liquid or solid phase reaching the Earth’s surface. The original data are on a uniform grid with a spacing of 1superscript11^{\circ}1 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT, where latitudinal and longitudinal coordinates span [90,90]superscript90superscript90[-90^{\circ},90^{\circ}][ - 90 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT , 90 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT ] and [180,180]superscript180superscript180[-180^{\circ},180^{\circ}][ - 180 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT , 180 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT ] respectively. The ground-truth for this application is a downsampled version of the original data on a coarser grid with a spacing of 2superscript22^{\circ}2 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT, restricted to the region [50,50]superscript50superscript50[-50^{\circ},50^{\circ}][ - 50 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT , 50 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT ] in longitude (LON) and [0,90]superscript0superscript90[0^{\circ},90^{\circ}][ 0 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT , 90 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT ] in latitude (LAT). The observations are made daily and cover the period from 2000 to 2017, leading to a total of 6576657665766576 days (treated here as snapshots in the POD formulation).

To simulate sparse sensor acquisition from this data, values for d^=[0.015,0.02,0.025,0.03,0.035]^𝑑0.0150.020.0250.030.035\hat{d}=[0.015,0.02,0.025,0.03,0.035]over^ start_ARG italic_d end_ARG = [ 0.015 , 0.02 , 0.025 , 0.03 , 0.035 ] are interpolated for random positions. To compute these values of d^^𝑑\hat{d}over^ start_ARG italic_d end_ARG, the latitude and longitude coordinates have been converted into radians, aligning with the reference length scale of the Earth’s radius. These sensor distributions are then employed to perform gridded and meshless POD, as in the previous sections.

Fig. 5 displays the root mean square error of the first 6666 temporal structures, normalized as usual with the square root of the number of snapshots. In Fig. 6, a comparison among the spatial structures for different sparsity levels is proposed. Specifically, Fig. 6.b shows the overall comparison, while Fig. 6.a highlights the specific case of d^=0.02^𝑑0.02\hat{d}=0.02over^ start_ARG italic_d end_ARG = 0.02.

4 Conclusions

We propose a novel meshless approach to compute spatial and temporal modes from scattered measurements eliminating the need for interpolation onto a structured fixed grid. Key enablers are RBF regression and advanced quadrature techniques to compute inner products in space and time, providing an analytic (mesh-independent) representation of the modes. By avoiding the modulation effects associated with the unnecessary mapping of scattered measurements onto a Cartesian grid, our method offers advantages in accuracy and efficiency. Furthermore, the quadrature methods enabled by RBF regression allow for more precise computation of the temporal correlation matrix and, consequently, the temporal eigenvectors. We demonstrate that our method enables the recovery of scales filtered out by binning onto a regular grid.

This work serves as the first step towards a broader endeavour. The concepts presented here can be readily extended to the different variants and applications discussed in previous sections. We envisage future developments to incorporate physical constraints in the regression process, extension to a wide range of modal decompositions, and applications of the meshless analytical treatment to method such as Galerkin projections, among others.

Acknowledgment

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No 949085). Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.

References

  • \bibcommenthead
  • Holmes et al. [1996] Holmes, P., Lumley, J.L., Berkooz, G.: Turbulence, Coherent Structures, Dynamical Systems and Symmetry. Cambridge Monographs on Mechanics. Cambridge University Press, Cambridge (1996). https://doi.org/10.1017/CBO9780511622700
  • Martinson [2018] Martinson, D.G.: Empirical Orthogonal Function (EOF) Analysis, pp. 495–534. Cambridge University Press, Cambridge (2018). https://doi.org/10.1017/9781139342568.016
  • Pearson [1901] Pearson, K.: Principal components analysis. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 6(2), 559 (1901) https://doi.org/10.1080/14786440109462720
  • Hotelling [1936] Hotelling, H.: Simplified calculation of principal components. Psychometrika 1(1), 27–35 (1936) https://doi.org/10.1007/BF02287921
  • Schmidt [1907] Schmidt, E.: Zur theorie der linearen und nicht linearen integralgleichungen zweite abhandlung: Auflösung der allgemeinen linearen integralgleichung. Mathematische Annalen 64(2), 161–174 (1907) https://doi.org/10.1007/BF01449890
  • Obukhov [1947] Obukhov, A.M.: Statistically homogeneous fields on a sphere. Usp. Mat. Nauk 2(2), 196–198 (1947) https://doi.org/%****␣main_final.bbl␣Line␣125␣****10.1016/S0924-7963(02)00240-3
  • Lorenz [1956] Lorenz, E.N.: Empirical Orthogonal Functions and Statistical Weather Prediction vol. 1. Massachusetts Institute of Technology, Department of Meteorology Cambridge, Cambridge (1956)
  • Kutzbach [1967] Kutzbach, J.E.: Empirical eigenvectors of sea-level pressure, surface temperature and precipitation complexes over north america. Journal of Applied Meteorology and Climatology 6(5), 791–802 (1967) https://doi.org/10.1175/1520-0450(1967)006<0791:EEOSLP>2.0.CO;2
  • Lumley [1967] Lumley, J.L.: The structure of inhomogeneous turbulent flows. Atmospheric turbulence and radio wave propagation, 166–178 (1967)
  • Sirovich [1987] Sirovich, L.: Turbulence and the dynamics of coherent structures. i. coherent structures. Quarterly of applied mathematics 45(3), 561–571 (1987) https://doi.org/10.1090/qam/910463
  • Sirovich [1991] Sirovich, L.: Analysis of turbulent flows by means of the empirical eigenfunctions. Fluid Dynamics Research 8(1-4), 85 (1991) https://doi.org/10.1016/0169-5983(91)90033-F
  • Ghojogh and Crowley [2019] Ghojogh, B., Crowley, M.: Unsupervised and Supervised Principal Component Analysis: Tutorial. arXiv (2019). https://doi.org/10.48550/ARXIV.1906.03148
  • Ghojogh et al. [2019] Ghojogh, B., Samad, M.N., Mashhadi, S.A., Kapoor, T., Ali, W., Karray, F., Crowley, M.: Feature Selection and Feature Extraction in Pattern Analysis: A Literature Review. arXiv (2019). https://doi.org/10.48550/ARXIV.1905.02845
  • Schölkopf et al. [1997] Schölkopf, B., Smola, A., Müller, K.-R.: Kernel principal component analysis, pp. 583–588. Springer, Berlin, Heidelberg (1997). https://doi.org/10.1007/bfb0020217
  • Jolliffe and Cadima [2016] Jolliffe, I.T., Cadima, J.: Principal component analysis: a review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 374(2065), 20150202 (2016) https://doi.org/%****␣main_final.bbl␣Line␣250␣****10.1098/rsta.2015.0202
  • Mendez [2023] Mendez, M.A.: Linear and nonlinear dimensionality reduction from fluid mechanics to machine learning. Measurement Science and Technology 34(4), 042001 (2023) https://doi.org/10.1088/1361-6501/acaffe
  • Jiménez [2023] Jiménez, J.: Coherent structures in turbulence. In: Mendez, M.A., Ianiro, A., Noack, B.R., Brunton, S.L.E. (eds.) Data-Driven Fluid Mechanics: Combining First Principles and Machine Learning, pp. 20–33. Cambridge University Press, Cambridge (2023). https://doi.org/10.1017/9781108896214
  • Benner et al. [2015] Benner, P., Gugercin, S., Willcox, K.: A survey of projection-based model reduction methods for parametric dynamical systems. SIAM Review 57(4), 483–531 (2015) https://doi.org/10.1137/130932715
  • Ahmed et al. [2021] Ahmed, S.E., Pawar, S., San, O., Rasheed, A., Iliescu, T., Noack, B.R.: On closures for reduced order models—a spectrum of first-principle to machine-learned avenues. Physics of Fluids 33(9) (2021) https://doi.org/10.1063/5.0061577
  • Bernd R. et al. [2011] Bernd R., N., Marek, M., Gilead, T. (eds.): Reduced-Order Modelling for Flow Control. Springer, Berlin, Heidelberg (2011). https://doi.org/10.1007/978-3-7091-0758-4
  • Girfoglio et al. [2021] Girfoglio, M., Quaini, A., Rozza, G.: A pod-galerkin reduced order model for a les filtering approach. Journal of Computational Physics 436, 110260 (2021) https://doi.org/10.1016/j.jcp.2021.110260
  • Hannachi et al. [2007] Hannachi, A., Jolliffe, I.T., Stephenson, D.B.: Empirical orthogonal functions and related techniques in atmospheric science: A review. International Journal of Climatology: A Journal of the Royal Meteorological Society 27(9), 1119–1152 (2007) https://doi.org/10.1002/joc.1499
  • Monahan et al. [2009] Monahan, A.H., Fyfe, J.C., Ambaum, M.H.P., Stephenson, D.B., North, G.R.: Empirical orthogonal functions: The medium is the message. Journal of Climate 22(24), 6501–6514 (2009) https://doi.org/10.1175/2009JCLI3062.1
  • Shen et al. [1998] Shen, S.S., Smith, T.M., Ropelewski, C.F., Livezey, R.E.: An optimal regional averaging method with error estimates and a test using tropical pacific sst data. Journal of Climate 11(9), 2340–2350 (1998) https://doi.org/10.1175/1520-0442(1998)011<2340:aoramw>2.0.co;2
  • North et al. [1982] North, G.R., Bell, T.L., Cahalan, R.F., Moeng, F.J.: Sampling errors in the estimation of empirical orthogonal functions. Monthly Weather Review 110(7), 699–706 (1982) https://doi.org/10.1175/1520-0493(1982)110<0699:SEITEO>2.0.CO;2
  • Everson and Sirovich [1995] Everson, R., Sirovich, L.: Karhunen–loève procedure for gappy data. Journal of the Optical Society of America A 12(8), 1657 (1995) https://doi.org/10.1364/josaa.12.001657
  • Beckers and Rixen [2003] Beckers, J.-M., Rixen, M.: Eof calculations and data filling from incomplete oceanographic datasets. Journal of Atmospheric and oceanic technology 20(12), 1839–1856 (2003) https://doi.org/10.1175/1520-0426(2003)020<1839:ECADFF>2.0.CO;2
  • [28] Willcox, K.: Unsteady Flow Sensing and Estimation via the Gappy Proper Orthogonal Decomposition. https://doi.org/10.2514/6.2004-2415
  • Tsering-xiao and Xu [2019] Tsering-xiao, B., Xu, Q.: Gappy pod-based reconstruction of the temperature field in tibet. Theoretical and Applied Climatology 138(1–2), 1179–1188 (2019) https://doi.org/10.1007/s00704-019-02898-6
  • Alvera‐Azcárate et al. [2007] Alvera‐Azcárate, A., Barth, A., Beckers, J.M., Weisberg, R.H.: Multivariate reconstruction of missing data in sea surface temperature, chlorophyll, and wind satellite fields. Journal of Geophysical Research: Oceans 112(C3) (2007) https://doi.org/10.1029/2006jc003660
  • Tipping and Bishop [1999] Tipping, M.E., Bishop, C.M.: Probabilistic principal component analysis. Journal of the Royal Statistical Society Series B: Statistical Methodology 61(3), 611–622 (1999) https://doi.org/10.1111/1467-9868.00196
  • Goodin et al. [1979] Goodin, W.R., McRa, G.J., Seinfeld, J.H.: A comparison of interpolation methods for sparse data: Application to wind and concentration fields. Journal of Applied Meteorology and Climatology 18(6), 761–771 (1979) https://doi.org/10.1175/1520-0450(1979)018<0761:ACOIMF>2.0.CO;2
  • Miró et al. [2017] Miró, J.J., Caselles, V., Estrela, M.J.: Multiple imputation of rainfall missing data in the iberian mediterranean context. Atmospheric Research 197, 313–330 (2017) https://doi.org/10.1016/j.atmosres.2017.07.016
  • Schanz et al. [2016] Schanz, D., Gesemann, S., Schröder, A.: Shake-The-Box: Lagrangian particle tracking at high particle image densities. Experiments in Fluids 57 (2016) https://doi.org/10.1007/s00348-016-2157-1
  • Tan et al. [2020] Tan, S., Salibindla, A., Masuk, A.U.M., Ni, R.: Introducing OpenLPT: new method of removing ghost particles and high‑concentration particle shadow tracking. Experiments in Fluids 61(47) (2020) https://doi.org/10.1007/s00348-019-2875-2
  • Schröder and Schanz [2023] Schröder, A., Schanz, D.: 3d lagrangian particle tracking in fluid mechanics. Annual Review of Fluid Mechanics 55(1), 511–540 (2023) https://doi.org/10.1146/annurev-fluid-031822-041721
  • Wang et al. [2017] Wang, Y., Akeju, O.V., Zhao, T.: Interpolation of spatially varying but sparsely measured geo-data: A comparative study. Engineering Geology 231, 200–217 (2017) https://doi.org/10.1016/j.enggeo.2017.10.019
  • Sperotto et al. [2022] Sperotto, P., Pieraccini, S., Mendez, M.A.: A meshless method to compute pressure fields from image velocimetry. Measurement Science and Technology 33(9), 094005 (2022) https://doi.org/10.1088/1361-6501/ac70a9
  • Griffiths and Smith [2006] Griffiths, D.V., Smith, I.M.: Numerical Methods for Engineers. Routledge, Taylor & Francis, Boca Raton, Florida (2006). https://doi.org/10.1201/9781420010244
  • Deng et al. [2020] Deng, N., Noack, B.R., Morzyński, M., Pastur, L.R.: Low-order model for successive bifurcations of the fluidic pinball. Journal of fluid mechanics 884, 37 (2020) https://doi.org/10.1017/jfm.2019.959
  • Tirelli et al. [2023] Tirelli, I., Ianiro, A., Discetti, S.: A simple trick to improve the accuracy of PIV/PTV data. Experimental Thermal and Fluid Science 145, 110872 (2023) https://doi.org/10.1016/j.expthermflusci.2023.110872
  • Konrad et al. [2022] Konrad, H., Panegrossi, G., Bagaglini, L., Sanò, P., Sikorski, T., Cattani, E., Schröder, M., Mikalsen, A., Hollmann, R.: Precipitation monthly and daily gridded data from 2000 to 2017 derived from satellite microwave observations. Copernicus Climate Change Service (C3S) Climate Data Store (CDS) (2022) https://doi.org/10.24381/cds.ada9c583