[1]\fnmIacopo \surTirelli

[1]\orgdivDepartment of Aerospace Engineering, \orgnameUniversidad Carlos III de Madrid, \orgaddress\streetAvda. Universidad 30, \cityLeganés, \postcode28911, \stateMadrid, \countrySpain

2]\orgdivEnvironmental and Applied Fluid Dynamics, \orgnamevon Karman Institute for Fluid Dynamics, \orgaddress\streetWaterloosesteenweg 72, \citySint-Genesius-Rode, \postcode1640, \stateBruxelles, \countryBelgium

A meshless method to compute the proper orthogonal decomposition and its variants from scattered data

iacopo.tirelli@uc3m.es \scalerel* — \fnmMiguel Alfonso \surMendez mendez@vki.ac.be \scalerel* — \fnmAndrea \surIaniro aianiro@ing.uc3m.es \scalerel* — \fnmStefano \surDiscetti sdiscett@ing.uc3m.es \scalerel* — * [

Abstract

Complex phenomena can be better understood when broken down into a limited number of simpler “components”. Linear statistical methods such as the principal component analysis and its variants are widely used across various fields of applied science to identify and rank these components based on the variance they represent in the data. These methods can be seen as factorizations of the matrix collecting all the data, which are assumed to be a collection of time series sampled from fixed points in space. However, when data sampling locations vary over time, as with mobile monitoring stations in meteorology and oceanography or with particle tracking velocimetry in experimental fluid dynamics, advanced interpolation techniques are required to project the data onto a fixed grid before carrying out the factorization. This interpolation is often expensive and inaccurate. This work proposes a method to decompose scattered data without interpolating. The approach is based on physics-constrained radial basis function regression to compute inner products in space and time. The method provides an analytical and mesh-independent decomposition in space and time, demonstrating higher accuracy than the traditional approach. Our results show that it is possible to distill the most relevant “components” even for measurements whose natural output is a distribution of data scattered in space and time, maintaining high accuracy and mesh independence.

keywords:

POD, RBF, meshless algorithm, PCA, EOF, KL transform, modal decomposition

1 Introduction

Methods from dimensionality reduction and pattern identification have become ubiquitous in many applied sciences, allowing streamlining data analysis, simplifying complex datasets, highlighting essential features and uncovering critical patterns and relationships [1, 2].

The simplest and fundamental approach to linear dimensionality reduction is the Principal Component Analysis (PCA), originally formulated in multivariate statistics [3, 4] and known as Karhunen-Loève transform (KL) in mathematics [5], Empirical Orthogonal Functions (EOF) in meteorology and climatology [6, 7, 8] and Proper Orthogonal Decomposition (POD) in fluid mechanics [9, 10, 11].

The PCA simplifies large datasets by decomposing them into a set of linearly uncorrelated variables known as principal components, ordered by the amount of variance they capture from the data. Its simplicity, versatility and effectiveness have made the PCA one of the crucial first steps in many data analysis workflows, and many extensions and variants have been proposed [12, 13, 14, 15, 16].

Beyond multivariate statistical analysis, the principal components are often expected to provide information about the spatial correlation of time series. In fluid mechanics, for example, the extension of PCA onto POD aimed at using such dimensionality reduction methods to identify coherent structures in turbulent flows [9, 1, 17]. In computational physics, the Galerkin projection of a PDE onto these modes is the cornerstone to reduce the computational cost of large simulations [18], to build a reduced order model that enables model-based control [19, 20] and to derive more efficient Large Eddy Simulation formulations [21]. In climatology, the EOF [22, 23, 2] has been developed to identify and analyze dominant patterns of variability in climate data and to reduce the dimensionality of large climate datasets.

In all these applications, the leading patterns are identified as eigenvectors of the correlation matrices, computed in space or in time. However, the computation of these matrices implies the definition of a grid, to approximate the integral in the continuous formulation of the problems (see [24, 25]). A major research effort in both fluid mechanics and climatology is the problem of handling missing data from these grids [26, 27, 28, 29]. Common approaches to handle these cases are the Gappy POD, initially introduced in [26], Data Interpolating EOF (DINEOF, [30]), or probabilistic PCA (PPCA, [31]). An extensive overview of gap-filling methods is provided in [32, 33].

When the level of missing data is extreme, the concept of a grid becomes irrelevant and data could be treated as randomly scattered. This is the natural settings in image velocimetry measurements carried out with particle tracking methods [34, 35, 36], in the measurement of wind or concentration fields from limited stations [32, 33] or in the interpolation of geological data from limited locations [37].

This article proposes a grid-less approach to the computation of POD/EOF, removing the need for a grid. The idea consists of computing correlation matrices of scattered data using constrained Radial Basis Functions (RBFs) as introduced in [38], together with a Gauss-Legendre quadrature for the integration. We show that the approach provides higher accuracy than the usual interpolation-based approach and provides an analytical (mesh-independent) representation of the spatial structures. The theoretical background of the methodology is detailed in §2, while the validation is presented in §3 on two different synthetic datasets: an analytical sinusoidal test case and the wake of a fluidic pinball. An experimental validation is carried out using the precipitation data collected from satellite microwave observations.

2 Methodology

The proposed methodology follows the classic snapshot-based POD by Sirovich [11], but replaces all inner products with an RBF formulation that requires no grid and thus no interpolation. The data can be scattered both in space and time. The final result is a linear decomposition of the data in the form:

u(\textbf{x},t_{i})=\sum^{r_{c}}_{r=1}\sigma_{r}\phi_{r}(\textbf{x})\psi_{r}(t% )\,,

(1)

where $r_{c}$ is the rank, $\sigma_{r}$ are scalars defining the amplitude of each contribution, $\phi_{r}(\textbf{x})$ and $\psi_{r}(t)$ are the spatial and the temporal structures of the POD modes. The methodology is made of four steps:

Step 1: Analytical snapshot representation with RBF

Starting from scattered data available for the different time realizations, it is possible to obtain an analytical approximation of the field for each time instant as a linear combination of a set of basis functions. This approximation is denoted as:

\tilde{u}(\textbf{x},t_{i})=\sum_{q=1}^{N_{b}}w_{q}(t_{i})\gamma_{q}(\textbf{x% }),

(2)

where $N_{b}$ is the number of basis functions used in the approximation, $\gamma_{q}$ is the $q^{th}$ regression basis, and $w_{q}$ are the corresponding weights. The vector x contains the coordinates on which the data are evaluated at the $i^{th}$ time instant $t_{i}$ among the total $N_{t}$ time instances. With no loss of generality, the basis functions employed in this work are thin-plate RBF. These only require the definition of the collocation points and have no shape factor parameters. We use the sampling locations $\textbf{x}_{p}$ as collocation points. Therefore, the current approach has no hyper-parameters to tune, needs no user input and only requires extra memory storage for the weights. Future work will explore more advanced formulations and the use of physics constraints as in Sperotto et al. [38].

Step 2: Temporal correlation matrix

The temporal correlation matrix $\textbf{K}\in\mathbb{R}^{N_{t}\times N_{t}}$ is defined as the matrix whose elements are the inner products in space, denoted as $\langle\cdot,\cdot\rangle_{s}$ , between all the snapshots:

\textbf{K}=\begin{bmatrix}\langle u_{1},u_{1}\rangle_{s}&\langle u_{1},u_{2}% \rangle_{s}&\dots&\langle u_{1},u_{Nt}\rangle_{s}\\ \langle u_{2},u_{1}\rangle_{s}&\langle u_{2},u_{2}\rangle_{s}&\dots&\langle u_% {2},u_{Nt}\rangle_{s}\\ \vdots&\vdots&\ddots&\vdots\\ \langle u_{Nt},u_{1}\rangle_{s}&\langle u_{Nt},u_{2}\rangle_{s}&\dots&\langle u% _{Nt},u_{Nt}\rangle_{s}\\ \end{bmatrix}

(3)

In the original formulation of the POD [9], the inner product in space or time was defined in terms of correlation of square-integrable real-valued functions, that is:

K_{ij}=\frac{1}{\parallel\Omega\parallel}\int_{\Omega}\tilde{u}(\textbf{x},t_{% i})~{}\tilde{u}(\textbf{x},t_{j})~{}d\Omega\,,

(4)

with $\Omega$ the area (in 2D) or volume (in 3D) of the spatial domain considered. If the spatial domain is partitioned in uniform $N_{s}$ elements of area (or volume) $\Delta\Omega$ , indexed by $k\in[0,N_{s}-1]$ the most natural approximation of Eq. 4 reads

K_{ij}\approx\frac{1}{N_{s}\Delta\Omega}\sum_{k}\tilde{u}(\textbf{x}_{k},t_{i}% )~{}\tilde{u}(\textbf{x}_{k},t_{j})~{}\Delta\Omega=\frac{1}{N_{s}}\mathbf{u}^{% T}_{j}\mathbf{u}_{i}\,,

(5)

where $\mathbf{u}_{i},\mathbf{u}_{j}\in\mathbb{R}^{N_{s}}$ are the vectors collecting the data sampled at time steps $t_{i}$ and $t_{j}$ and corresponding to each of the $N_{s}$ portions of the spatial domain. In this setting, the correlation matrix can be computed using simple matrix multiplication as $\textbf{K}=\textbf{U}^{T}\textbf{U}$ . Eq. 5 is an approximation of Eq. 4 using the mid-point rule [39], that is assuming a piecewise-constant approximation of the velocity field.

POD algorithms for gridded data use Eq. 5. Interpolation-based formulations seek to bring the scattered data onto a uniform grid so that Eq. 5 can still be used. We here propose to use the original version in Eq. 4, leveraging the regression in Eq. 2 from step 1 and a quadrature method that interrogates the integrand on specific quadrature points. With no loss of generality, we here use Gauss-Legendre quadrature because of its excellent trade-off between accuracy and computational costs. It is worth emphasizing that quadrature methods based on analytic approximations of the integrand yield higher accuracy compared to the mid-point rule implied in Eq. 5.

Step 3: Computing Temporal Structures

The temporal structure of POD modes are eigenvectors of the temporal correlation matrix. Therefore, given the temporal correlation matrix these reads

\textbf{K}=\textbf{$\Psi$}\textbf{$\Sigma$}^{2}\textbf{$\Psi$}^{T},

(6)

where the matrix $\textbf{$\Psi$}\in\mathbb{R}^{N_{t}\times r_{c}}$ collects the eigenvectors along its columns and $\Sigma$ is a diagonal matrix whose elements $\sigma_{i}$ represent the mode amplitudes (cf. Eq. 1). This step is identical to the classic snapshot POD for gridded data.

Step 4: Computing Spatial Structures

The $r^{th}$ spatial structure $\phi_{r}$ is the result of projecting the dataset $\tilde{u}(\textbf{x},\textbf{t})$ onto the temporal structure $\psi_{r}$ . This project requires an inner product in time. The same discussion in Step 2 now applies to the inner product in the time domain, which reads

\phi_{r}(\textbf{x})=\frac{1}{\sigma_{r}}\langle\tilde{u}(\textbf{x},t),\psi_{% r}(\textbf{t})\rangle=\frac{1}{\sigma_{r}T}\int_{t}\tilde{u}(\textbf{x},% \textbf{t})~{}\psi_{r}(\textbf{t})~{}dt\,\approx\frac{1}{\sigma_{r}N_{t}}\sum^% {N_{t}}_{k=1}\tilde{u}(\textbf{x},\textbf{t}_{k})\psi_{r}(\textbf{t}_{k}),

(7)

assuming that the time domain is $t\in[0,T]$ . Here both the temporal structures and the regression of the snapshots are available on a discrete set of times $\textbf{t}\in\mathbb{R}^{N_{t}}$ . Although advanced quadratures could also be used in Eq. 7 together with an analytic regression in time, the gain in accuracy was deemed not sufficient to justify the increase in computational cost. Therefore, in this work, this projection is carried out via mid-point approximation in time.

We stress that Eq. 7 holds for any set of points x. Hence Eq. 7 enables super-resolution of the POD modes. In this work, we compute it over a uniform grid for plotting purposes and compare it with interpolation-based approaches.

3 Validation

3.1 Analytical testcase

Refer to caption — Figure 1: Root mean square error of the $i^{th}$ temporal mode $\psi_{i}$ ( $\delta_{RMS_{\psi_{i}}}$ ) as function of $\hat{d}$ for the analytical testcase. The values are normalized with the standard deviation of its corresponding reference modes $\psi_{ref_{i}}$ . The red curve with star markers depicts the meshless POD, while the blue curve with square markers represents the gridded POD.

A synthetic test case was first considered for benchmarking the proposed method on a problem for which the POD is analytically available. This allows for isolating all sources of errors. In this test case, we consider a spatial domain of length $L_{x}$ and $L_{y}$ that spans $x,y\in[0,12]$ , discretized with $100$ grid points. Two 2D sinusoidal orthogonal waves have been chosen as spatial structures, with a leading wavenumber $k_{x}=k_{y}=3$ in both $x$ and $y$ direction:

\phi_{1}(x,y)=\sin\left(k_{x}x\right)\sin\left(k_{y}y\right)\,,\,\phi_{2}(x,y)% =\sin\left(2k_{x}x\right)\sin\left(2k_{y}y\right).

(8)

Similarly, the associated temporal structures are sampled over a period $T=10$ discretized with $500$ time instants:

\psi_{1}=\sin\left(\frac{4\pi}{T}\cdot t\right)\,,\,\psi_{2}=\sin\left(\frac{2% \pi}{T}t\right).

(9)

Assigning an amplitude $\sigma_{1}=10$ and $\sigma_{2}=5$ to each mode respectively, the synthetic dataset D is built as a linear combination of these structures:

\textbf{D}=\sigma_{1}\phi_{1}\psi_{1}^{T}+\sigma_{2}\phi_{2}\psi_{2}^{T},

(10)

whose analytical modes represent the ground truth. It is easy to show that the modes with spatial structures (8) and temporal structures (9) are indeed POD modes, being orthogonal both in space and time according to the standard inner product. We consider them as “reference POD”.

A parametric study involving the influence of the sparsity of data has been carried out. To this end, we define the parameter $\hat{d}$ to refer to the different cases, which represents the average distance among the “sensors” normalized with the reference lengthscale $l_{ref}$ :

\hat{d}=\frac{l_{ref}}{\sqrt{N_{ppp}}},

(11)

where $N_{ppp}$ represents the sensor density. In this specific application, the reference length-scale is the smallest achievable wavelength $\lambda^{*}=\frac{L_{x}}{2k_{x}}$ .

The different datasets are obtained for $0.1<\hat{d}<0.5$ , whose corresponding number of particles is computed inverting Eq. 11. These datasets serve as test cases for the meshless POD computation, and the corresponding results are denoted as “meshless POD”. The mapping onto a regular grid is performed using a moving average, with bin sizes determined ensuring approximately $10$ sensor within each bin. This is a common practice in particle-image-based measurements, for instance, and introduces a high degree of robustness to noise and outliers, although at the expenses of spatial resolution. From this mapping, the classic matrix-factorization approach for the POD is employed. In what follows, we refer to the results of this approach on the gridded data as “gridded POD”.

Fig.1 illustrates the root mean square error of the temporal structures $\psi_{i}$ ( $\delta_{RMS_{\psi_{i}}}$ ), normalized by the square root of the number of samples $N_{t}$ , for different values of $\hat{d}$ . This test highlights the superior accuracy of the meshless approach compared to the traditional one, even for sparser datasets. A similar trend is observed for the spatial structures depicted in Fig.2.

In Fig.2.a, a visual comparison is presented for the case $\hat{d}=0.2$ between the reconstructed spatial structures and the analytical ones. It is evident from this figure, particularly in the second mode, that the gridded approach filters out smaller scales more aggressively. This observation is further confirmed by the plot shown in Fig.2.b. As for the temporal case, the $\delta_{RMS_{\phi_{i}}}$ is normalized with the standard deviation of the corresponding reference modes.

The benchmark on this toy problem, despite its simplicity, enabled us to eliminate external sources of error in the data originating from simulations or experiments. This process allowed us to evaluate the suitability of the proposed methodology when compared with the traditional approach.

3.2 Fluidic pinball

The wake of a fluidic pinball is a popular test case in fluid mechanics. A pinball consists of a configuration of three cylinders with diameter $D$ , with axes positioned at the vertices of an equilateral triangle with side $3D/2$ . The pinball is invested by a uniform constant wind of velocity $U_{\infty}$ . Direct Numerical Simulation (DNS) data from Deng et al. [40] is used for this case. The DNS data consists of an unstructured grid with $3536$ points within the domain $x/D\in[1,11]$ and $y/D\in[-2,2]$ (with $x,y$ being the stream-wise and cross-wise directions, respectively). The triangle formed by the cylinder axis points upstream, with the downstream edge orthogonal to the mean flow direction. The midpoint of the downstream edge is located at $(0,0)$ . The flow kinematic viscosity $\nu$ is set to have a Reynolds number $Re=U_{\infty}D/\nu=150$ , in which the flow exhibits a chaotic behavior [40].

We reproduce the standard configuration of Particle Tracking Velocimetry (PTV) measurements. To this purpose, it is assumed that the domain is observed by a camera with a resolution of $32$ pixels per diameter. The velocity fields are randomly sampled (as if they were originated from particle measurements) at different concentrations, resulting in a mean distance in the range $0.1<\hat{d}<1$ . The cylinder diameter $D$ has been used here as reference lengthscale. A baseline for the POD computation is obtained by interpolation of the DNS data on a fine grid with spacing $\Delta x=D/16$ , and computation of the POD with the traditional matrix factorization approach.

Since the dataset is statistically stationary, focus is placed on the decomposition of the fluctuating component of the velocity field. To ensure a fair comparison across the methodologies, an ensemble high-resolution mean is subtracted from all the datasets as explained in Tirelli et al. [41].

Fig. 3 shows the root mean square error of the first six temporal structures $\psi_{i}$ ( $\delta_{RMS_{\psi_{i}}}$ ), normalized with the square root of the number of samples $N_{t}$ , for different $\hat{d}$ . This kind of comparison is limited to the most energetic modes. Higher-order modes have generally smaller energy difference, which might result in different sorting for different decomposition methods. The illustrated modes suggest that the meshless approach closely follows the reference. The overall trend indicates that an increase in the average distance among sensors corresponds to an increase in the error. Notably, the gridded approach is more susceptible to this effect. To accommodate the larger distances, the method tends to augment the bin size to ensure a sufficient number of sampling points within the bin. Nonetheless, this strategy unintentionally filters out smaller scales and exacerbates spatial resolution limitations. On the other hand, the meshless regression appears more robust to the detrimental effect of increasing the sparsity of the sensors. It is worth remarking that values of $\delta_{RMS_{\psi_{i}}}>1$ might also be originating from the comparison among temporal structures that are not synchronized with the reference ones due to different sorting of modes with similar energy, as discussed before.

The first $3$ spatial modes $\phi_{i}$ associated with the stream-wise velocity component $U$ are analysed in Fig. 4.a. The contours in the top part of this figure are representative of the case $\hat{d}=0.6$ , as highlighted by the dashed circles in the bottom part of the picture. These contours show that the gridded POD exhibits more attenuated modes if compared to the ones computed with the meshless approach. In Fig. 4.b the $\delta_{RMS_{\phi_{i}}}$ of these modes normalized with the rms of the corresponding reference modes is reported as a function of $\hat{d}$ . Similarly to the temporal modes, the meshless approach shows superior robustness to increased data sparsity.

3.3 C3S precipitation from satellite microwave observations

The experimental testcase is based on the “Copernicus Climate Change Service (C3S) Climate Data Store (CDS): Precipitation monthly and daily gridded data from 2000 to 2017 derived from satellite microwave observations.”([42], DOI:10.24381/cds.ada9c583). This dataset provides global estimates of daily accumulated precipitation and monthly mean precipitation. These estimates are derived from a combination of passive microwave observations obtained from two different types of radiometers on various Low Earth Orbit satellites. Firstly, there are the conically scanning microwave imagers, which undergo processing utilizing methodologies established by the Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite initiative, as part of the Satellite Application Facility on Climate Monitoring. Secondly, the cross-track scanning microwave sounders whose data undergoes processing through the Passive microwave Neural network Precipitation Retrieval for Climate Applications algorithm.

The main variable in this dataset is the precipitation $\mathcal{P}$ , expressed in $mm/day$ , which represents the water-equivalent volume rate per area and per day of atmospheric water in liquid or solid phase reaching the Earth’s surface. The original data are on a uniform grid with a spacing of $1^{\circ}$ , where latitudinal and longitudinal coordinates span $[-90^{\circ},90^{\circ}]$ and $[-180^{\circ},180^{\circ}]$ respectively. The ground-truth for this application is a downsampled version of the original data on a coarser grid with a spacing of $2^{\circ}$ , restricted to the region $[-50^{\circ},50^{\circ}]$ in longitude (LON) and $[0^{\circ},90^{\circ}]$ in latitude (LAT). The observations are made daily and cover the period from 2000 to 2017, leading to a total of $6576$ days (treated here as snapshots in the POD formulation).

To simulate sparse sensor acquisition from this data, values for $\hat{d}=[0.015,0.02,0.025,0.03,0.035]$ are interpolated for random positions. To compute these values of $\hat{d}$ , the latitude and longitude coordinates have been converted into radians, aligning with the reference length scale of the Earth’s radius. These sensor distributions are then employed to perform gridded and meshless POD, as in the previous sections.

Fig. 5 displays the root mean square error of the first $6$ temporal structures, normalized as usual with the square root of the number of snapshots. In Fig. 6, a comparison among the spatial structures for different sparsity levels is proposed. Specifically, Fig. 6.b shows the overall comparison, while Fig. 6.a highlights the specific case of $\hat{d}=0.02$ .

4 Conclusions

We propose a novel meshless approach to compute spatial and temporal modes from scattered measurements eliminating the need for interpolation onto a structured fixed grid. Key enablers are RBF regression and advanced quadrature techniques to compute inner products in space and time, providing an analytic (mesh-independent) representation of the modes. By avoiding the modulation effects associated with the unnecessary mapping of scattered measurements onto a Cartesian grid, our method offers advantages in accuracy and efficiency. Furthermore, the quadrature methods enabled by RBF regression allow for more precise computation of the temporal correlation matrix and, consequently, the temporal eigenvectors. We demonstrate that our method enables the recovery of scales filtered out by binning onto a regular grid.

This work serves as the first step towards a broader endeavour. The concepts presented here can be readily extended to the different variants and applications discussed in previous sections. We envisage future developments to incorporate physical constraints in the regression process, extension to a wide range of modal decompositions, and applications of the meshless analytical treatment to method such as Galerkin projections, among others.

Acknowledgment

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No 949085). Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.

References

\bibcommenthead
Holmes et al. [1996] Holmes, P., Lumley, J.L., Berkooz, G.: Turbulence, Coherent Structures, Dynamical Systems and Symmetry. Cambridge Monographs on Mechanics. Cambridge University Press, Cambridge (1996). https://doi.org/10.1017/CBO9780511622700
Martinson [2018] Martinson, D.G.: Empirical Orthogonal Function (EOF) Analysis, pp. 495–534. Cambridge University Press, Cambridge (2018). https://doi.org/10.1017/9781139342568.016
Pearson [1901] Pearson, K.: Principal components analysis. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 6(2), 559 (1901) https://doi.org/10.1080/14786440109462720
Hotelling [1936] Hotelling, H.: Simplified calculation of principal components. Psychometrika 1(1), 27–35 (1936) https://doi.org/10.1007/BF02287921
Schmidt [1907] Schmidt, E.: Zur theorie der linearen und nicht linearen integralgleichungen zweite abhandlung: Auflösung der allgemeinen linearen integralgleichung. Mathematische Annalen 64(2), 161–174 (1907) https://doi.org/10.1007/BF01449890
Obukhov [1947] Obukhov, A.M.: Statistically homogeneous fields on a sphere. Usp. Mat. Nauk 2(2), 196–198 (1947) https://doi.org/%****␣main_final.bbl␣Line␣125␣****10.1016/S0924-7963(02)00240-3
Lorenz [1956] Lorenz, E.N.: Empirical Orthogonal Functions and Statistical Weather Prediction vol. 1. Massachusetts Institute of Technology, Department of Meteorology Cambridge, Cambridge (1956)
Kutzbach [1967] Kutzbach, J.E.: Empirical eigenvectors of sea-level pressure, surface temperature and precipitation complexes over north america. Journal of Applied Meteorology and Climatology 6(5), 791–802 (1967) https://doi.org/10.1175/1520-0450(1967)006<0791:EEOSLP>2.0.CO;2
Lumley [1967] Lumley, J.L.: The structure of inhomogeneous turbulent flows. Atmospheric turbulence and radio wave propagation, 166–178 (1967)
Sirovich [1987] Sirovich, L.: Turbulence and the dynamics of coherent structures. i. coherent structures. Quarterly of applied mathematics 45(3), 561–571 (1987) https://doi.org/10.1090/qam/910463
Sirovich [1991] Sirovich, L.: Analysis of turbulent flows by means of the empirical eigenfunctions. Fluid Dynamics Research 8(1-4), 85 (1991) https://doi.org/10.1016/0169-5983(91)90033-F
Ghojogh and Crowley [2019] Ghojogh, B., Crowley, M.: Unsupervised and Supervised Principal Component Analysis: Tutorial. arXiv (2019). https://doi.org/10.48550/ARXIV.1906.03148
Ghojogh et al. [2019] Ghojogh, B., Samad, M.N., Mashhadi, S.A., Kapoor, T., Ali, W., Karray, F., Crowley, M.: Feature Selection and Feature Extraction in Pattern Analysis: A Literature Review. arXiv (2019). https://doi.org/10.48550/ARXIV.1905.02845
Schölkopf et al. [1997] Schölkopf, B., Smola, A., Müller, K.-R.: Kernel principal component analysis, pp. 583–588. Springer, Berlin, Heidelberg (1997). https://doi.org/10.1007/bfb0020217
Jolliffe and Cadima [2016] Jolliffe, I.T., Cadima, J.: Principal component analysis: a review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 374(2065), 20150202 (2016) https://doi.org/%****␣main_final.bbl␣Line␣250␣****10.1098/rsta.2015.0202
Mendez [2023] Mendez, M.A.: Linear and nonlinear dimensionality reduction from fluid mechanics to machine learning. Measurement Science and Technology 34(4), 042001 (2023) https://doi.org/10.1088/1361-6501/acaffe
Jiménez [2023] Jiménez, J.: Coherent structures in turbulence. In: Mendez, M.A., Ianiro, A., Noack, B.R., Brunton, S.L.E. (eds.) Data-Driven Fluid Mechanics: Combining First Principles and Machine Learning, pp. 20–33. Cambridge University Press, Cambridge (2023). https://doi.org/10.1017/9781108896214
Benner et al. [2015] Benner, P., Gugercin, S., Willcox, K.: A survey of projection-based model reduction methods for parametric dynamical systems. SIAM Review 57(4), 483–531 (2015) https://doi.org/10.1137/130932715
Ahmed et al. [2021] Ahmed, S.E., Pawar, S., San, O., Rasheed, A., Iliescu, T., Noack, B.R.: On closures for reduced order models—a spectrum of first-principle to machine-learned avenues. Physics of Fluids 33(9) (2021) https://doi.org/10.1063/5.0061577
Bernd R. et al. [2011] Bernd R., N., Marek, M., Gilead, T. (eds.): Reduced-Order Modelling for Flow Control. Springer, Berlin, Heidelberg (2011). https://doi.org/10.1007/978-3-7091-0758-4
Girfoglio et al. [2021] Girfoglio, M., Quaini, A., Rozza, G.: A pod-galerkin reduced order model for a les filtering approach. Journal of Computational Physics 436, 110260 (2021) https://doi.org/10.1016/j.jcp.2021.110260
Hannachi et al. [2007] Hannachi, A., Jolliffe, I.T., Stephenson, D.B.: Empirical orthogonal functions and related techniques in atmospheric science: A review. International Journal of Climatology: A Journal of the Royal Meteorological Society 27(9), 1119–1152 (2007) https://doi.org/10.1002/joc.1499
Monahan et al. [2009] Monahan, A.H., Fyfe, J.C., Ambaum, M.H.P., Stephenson, D.B., North, G.R.: Empirical orthogonal functions: The medium is the message. Journal of Climate 22(24), 6501–6514 (2009) https://doi.org/10.1175/2009JCLI3062.1
Shen et al. [1998] Shen, S.S., Smith, T.M., Ropelewski, C.F., Livezey, R.E.: An optimal regional averaging method with error estimates and a test using tropical pacific sst data. Journal of Climate 11(9), 2340–2350 (1998) https://doi.org/10.1175/1520-0442(1998)011<2340:aoramw>2.0.co;2
North et al. [1982] North, G.R., Bell, T.L., Cahalan, R.F., Moeng, F.J.: Sampling errors in the estimation of empirical orthogonal functions. Monthly Weather Review 110(7), 699–706 (1982) https://doi.org/10.1175/1520-0493(1982)110<0699:SEITEO>2.0.CO;2
Everson and Sirovich [1995] Everson, R., Sirovich, L.: Karhunen–loève procedure for gappy data. Journal of the Optical Society of America A 12(8), 1657 (1995) https://doi.org/10.1364/josaa.12.001657
Beckers and Rixen [2003] Beckers, J.-M., Rixen, M.: Eof calculations and data filling from incomplete oceanographic datasets. Journal of Atmospheric and oceanic technology 20(12), 1839–1856 (2003) https://doi.org/10.1175/1520-0426(2003)020<1839:ECADFF>2.0.CO;2
[28] Willcox, K.: Unsteady Flow Sensing and Estimation via the Gappy Proper Orthogonal Decomposition. https://doi.org/10.2514/6.2004-2415
Tsering-xiao and Xu [2019] Tsering-xiao, B., Xu, Q.: Gappy pod-based reconstruction of the temperature field in tibet. Theoretical and Applied Climatology 138(1–2), 1179–1188 (2019) https://doi.org/10.1007/s00704-019-02898-6
Alvera‐Azcárate et al. [2007] Alvera‐Azcárate, A., Barth, A., Beckers, J.M., Weisberg, R.H.: Multivariate reconstruction of missing data in sea surface temperature, chlorophyll, and wind satellite fields. Journal of Geophysical Research: Oceans 112(C3) (2007) https://doi.org/10.1029/2006jc003660
Tipping and Bishop [1999] Tipping, M.E., Bishop, C.M.: Probabilistic principal component analysis. Journal of the Royal Statistical Society Series B: Statistical Methodology 61(3), 611–622 (1999) https://doi.org/10.1111/1467-9868.00196
Goodin et al. [1979] Goodin, W.R., McRa, G.J., Seinfeld, J.H.: A comparison of interpolation methods for sparse data: Application to wind and concentration fields. Journal of Applied Meteorology and Climatology 18(6), 761–771 (1979) https://doi.org/10.1175/1520-0450(1979)018<0761:ACOIMF>2.0.CO;2
Miró et al. [2017] Miró, J.J., Caselles, V., Estrela, M.J.: Multiple imputation of rainfall missing data in the iberian mediterranean context. Atmospheric Research 197, 313–330 (2017) https://doi.org/10.1016/j.atmosres.2017.07.016
Schanz et al. [2016] Schanz, D., Gesemann, S., Schröder, A.: Shake-The-Box: Lagrangian particle tracking at high particle image densities. Experiments in Fluids 57 (2016) https://doi.org/10.1007/s00348-016-2157-1
Tan et al. [2020] Tan, S., Salibindla, A., Masuk, A.U.M., Ni, R.: Introducing OpenLPT: new method of removing ghost particles and high‑concentration particle shadow tracking. Experiments in Fluids 61(47) (2020) https://doi.org/10.1007/s00348-019-2875-2
Schröder and Schanz [2023] Schröder, A., Schanz, D.: 3d lagrangian particle tracking in fluid mechanics. Annual Review of Fluid Mechanics 55(1), 511–540 (2023) https://doi.org/10.1146/annurev-fluid-031822-041721
Wang et al. [2017] Wang, Y., Akeju, O.V., Zhao, T.: Interpolation of spatially varying but sparsely measured geo-data: A comparative study. Engineering Geology 231, 200–217 (2017) https://doi.org/10.1016/j.enggeo.2017.10.019
Sperotto et al. [2022] Sperotto, P., Pieraccini, S., Mendez, M.A.: A meshless method to compute pressure fields from image velocimetry. Measurement Science and Technology 33(9), 094005 (2022) https://doi.org/10.1088/1361-6501/ac70a9
Griffiths and Smith [2006] Griffiths, D.V., Smith, I.M.: Numerical Methods for Engineers. Routledge, Taylor & Francis, Boca Raton, Florida (2006). https://doi.org/10.1201/9781420010244
Deng et al. [2020] Deng, N., Noack, B.R., Morzyński, M., Pastur, L.R.: Low-order model for successive bifurcations of the fluidic pinball. Journal of fluid mechanics 884, 37 (2020) https://doi.org/10.1017/jfm.2019.959
Tirelli et al. [2023] Tirelli, I., Ianiro, A., Discetti, S.: A simple trick to improve the accuracy of PIV/PTV data. Experimental Thermal and Fluid Science 145, 110872 (2023) https://doi.org/10.1016/j.expthermflusci.2023.110872
Konrad et al. [2022] Konrad, H., Panegrossi, G., Bagaglini, L., Sanò, P., Sikorski, T., Cattani, E., Schröder, M., Mikalsen, A., Hollmann, R.: Precipitation monthly and daily gridded data from 2000 to 2017 derived from satellite microwave observations. Copernicus Climate Change Service (C3S) Climate Data Store (CDS) (2022) https://doi.org/10.24381/cds.ada9c583