-
Towards the Use of Anderson Acceleration in Coupled Transport-Gyrokinetic Turbulence Simulations
Authors:
David J. Gardner,
Lynda L. LoDestro,
Carol S. Woodward
Abstract:
Predicting the behavior of a magnetically confined fusion plasma over long time periods requires methods that can bridge the difference between transport and turbulent time scales. The nonlinear transport solver, Tango, enables simulations of very long times, in particular to steady state, by advancing each process independently with different time step sizes and coupling them through a relaxed iteration scheme. We examine the use of Anderson Acceleration (AA) to reduce the total number of coupling iterations required by interfacing Tango with the AA implementation, including several extensions to AA, provided by the KINSOL nonlinear solver package in SUNDIALS. The ability to easily enable and adjust algorithmic options through KINSOL allows for rapid experimentation to evaluate different approaches with minimal effort. Additionally, we leverage the GPTune library to automate the optimization of algorithmic parameters within KINSOL. We show that AA can enable faster convergence in stiff and very stiff test cases without noise present and that, in all cases, including those with noisy fluxes, it increases robustness and reduces sensitivity to the choice of relaxation strength.
Submitted 3 July, 2024;
originally announced July 2024.
-
SUNDIALS Time Integrators for Exascale Applications with Many Independent ODE Systems
Authors:
Cody J. Balos,
Marc Day,
Lucas Esclapez,
Anne M. Felden,
David J. Gardner,
Malik Hassanaly,
Daniel R. Reynolds,
Jon Rood,
Jean M. Sexton,
Nicholas T. Wimer,
Carol S. Woodward
Abstract:
Many complex systems can be accurately modeled as a set of coupled time-dependent partial differential equations (PDEs). However, solving such equations can be prohibitively expensive, easily taxing the world's largest supercomputers. One pragmatic strategy for attacking such problems is to split the PDEs into components that can more easily be solved in isolation. This operator splitting approach is used ubiquitously across scientific domains, and in many cases leads to a set of ordinary differential equations (ODEs) that need to be solved as part of a larger "outer-loop" time-stepping approach. The SUNDIALS library provides a plethora of robust time integration algorithms for solving ODEs, and the U.S. Department of Energy Exascale Computing Project (ECP) has supported its extension to applications on exascale-capable computing hardware. In this paper, we highlight some SUNDIALS capabilities and its deployment in combustion and cosmology application codes (Pele and Nyx, respectively) where operator splitting gives rise to numerous, small ODE systems that must be solved concurrently.
Submitted 2 May, 2024;
originally announced May 2024.
-
Numerical coupling of aerosol emissions, dry removal, and turbulent mixing in the E3SM Atmosphere Model version 1 (EAMv1), part II: a semi-discrete error analysis framework for assessing coupling schemes
Authors:
Christopher J. Vogl,
Hui Wan,
Carol S. Woodward,
Quan M. Bui
Abstract:
This paper complements the empirical justification of the revised scheme in Part I of this work with a mathematical justification leveraging a semi-discrete analysis framework for assessing the splitting error of process coupling methods. The novelty of the framework is that splitting error is distinguished from the process time integration errors, i.e., the errors caused by discrete time integration of individual processes, leading to expressions that are more easily interpreted utilizing existing physical understanding of the processes that the terms represent. The application of this framework to the dust life cycle in EAMv1 showcases such an interpretation, using the leading-order splitting error that results from the framework to confirm (i) that the original EAMv1 scheme artificially strengthens the effect of dry removal processes, and (ii) that the revised splitting reduces that artificial strengthening. While the error analysis framework is presented in the context of the dust life cycle in EAMv1, the framework can be broadly leveraged to evaluate process coupling schemes, both in other physical problems and for any number of processes. This framework will be particularly powerful when the various process implementations support a variety of time integration approaches. Whereas traditional local truncation error approaches require separate consideration of each combination of time integration methods, this framework enables evaluation of coupling schemes independent of particular time integration approaches for each process while still allowing for the incorporation of these specific time integration errors if so desired. The framework also explains how the splitting error terms result from (i) the integration of individual processes in isolation from other processes, and (ii) the choices of input state and timestep size for the isolated integration of processes.
Submitted 20 February, 2024; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Performance of explicit and IMEX MRI multirate methods on complex reactive flow problems within modern parallel adaptive structured grid frameworks
Authors:
John J. Loffeld,
Andy Nonaka,
Daniel R. Reynolds,
David J. Gardner,
Carol S. Woodward
Abstract:
Large-scale multiphysics simulations are computationally challenging due to the coupling of multiple processes with widely disparate time scales. The advent of exascale computing systems exacerbates these challenges, since these enable ever-increasing size and complexity. Recently, there has been renewed interest in developing multirate methods as a means to handle the large range of time scales, as these methods may afford greater accuracy and efficiency than more traditional approaches of using IMEX and low-order operator splitting schemes. However, there have been few performance studies that compare different classes of multirate integrators on complex application problems. We study the performance of several newly developed multirate infinitesimal (MRI) methods, implemented in the SUNDIALS solver package, on two reacting flow model problems built on structured mesh frameworks. The first model revisits the work of Emmett et al. (2014) on a compressible reacting flow problem with complex chemistry that is implemented using BoxLib, but where we now include comparisons between a new explicit MRI scheme and the multirate spectral deferred correction (SDC) methods in the original paper. The second problem uses the same complex chemistry as the first problem, combined with a simplified flow model, but run at a large spatial scale where explicit methods become infeasible due to stability constraints. Two recently developed implicit-explicit MRI multirate methods are tested. These methods rely on advanced features of the AMReX framework on which the model is built, such as multilevel grids and multilevel preconditioners. The results from these two problems show that MRI multirate methods can offer significant performance benefits on complex multiphysics application problems and that these methods may be combined with advanced spatial discretization to compound the advantages of both.
Submitted 6 November, 2022;
originally announced November 2022.
-
ARKODE: a flexible IVP solver infrastructure for one-step methods
Authors:
Daniel R. Reynolds,
David J. Gardner,
Carol S. Woodward,
Rujeko Chinomona
Abstract:
We describe the ARKODE library of one-step time integration methods for ordinary differential equation (ODE) initial-value problems (IVPs). In addition to providing standard explicit and diagonally implicit Runge-Kutta methods, ARKODE also supports one-step methods designed to treat additive splittings of the IVP, including implicit-explicit (ImEx) additive Runge-Kutta methods and multirate infinitesimal (MRI) methods. We present the role of ARKODE within the SUNDIALS suite of time integration and nonlinear solver libraries, the core ARKODE infrastructure for utilities common to large classes of one-step methods, as well as its use of "time stepper" modules enabling easy incorporation of novel algorithms into the library. Numerical results show example problems of increasing complexity, highlighting the algorithmic flexibility afforded through this infrastructure, and include a larger multiphysics application leveraging multiple algorithmic features from ARKODE and SUNDIALS.
Submitted 21 December, 2022; v1 submitted 27 May, 2022;
originally announced May 2022.
-
Performance of Low Synchronization Orthogonalization Methods in Anderson Accelerated Fixed Point Solvers
Authors:
Shelby Lockhart,
David J. Gardner,
Carol S. Woodward,
Stephen Thomas,
Luke N. Olson
Abstract:
Anderson Acceleration (AA) is a method to accelerate the convergence of fixed point iterations for nonlinear, algebraic systems of equations. Due to the requirement of solving a least squares problem at each iteration and a reliance on modified Gram-Schmidt for updating the iteration space, AA requires extra costly synchronization steps for global reductions. Moreover, the number of reductions in each iteration depends on the size of the iteration space. In this work, we introduce three low synchronization orthogonalization algorithms into AA within SUNDIALS that reduce the total number of global reductions per iteration to a constant of 2 or 3, independent of the size of the iteration space. A performance study demonstrates the reduced time required by the new algorithms at large processor counts with CPUs and demonstrates the predicted performance on multi-GPU architectures. Most importantly, we provide convergence and timing data for multiple numerical experiments to demonstrate reliability of the algorithms within AA and improved performance at parallel strong-scaling limits.
Submitted 18 October, 2021;
originally announced October 2021.
-
Implicit multirate GARK methods
Authors:
Steven Roberts,
John Loffeld,
Arash Sarshar,
Carol S. Woodward,
Adrian Sandu
Abstract:
This work considers multirate generalized-structure additively partitioned Runge-Kutta (MrGARK) methods for solving stiff systems of ordinary differential equations (ODEs) with multiple time scales. These methods treat different partitions of the system with different timesteps for a more targeted and efficient solution compared to monolithic single rate approaches. With implicit methods used across all partitions, methods must find a balance between stability and the cost of solving nonlinear equations for the stages. In order to characterize this important trade-off, we explore multirate coupling strategies, problems for assessing linear stability, and techniques to efficiently implement Newton iterations for stage equations. Unlike much of the existing multirate stability analysis which is limited in scope to particular methods, we present general statements on stability and describe fundamental limitations for certain types of multirate schemes. New implicit multirate methods up to fourth order are derived, and their accuracy and efficiency properties are verified with numerical tests.
Submitted 30 November, 2020; v1 submitted 30 October, 2019;
originally announced October 2019.
-
Evaluation of Implicit-Explicit Additive Runge-Kutta Integrators for the HOMME-NH Dynamical Core
Authors:
Christopher J. Vogl,
Andrew Steyer,
Daniel R. Reynolds,
Paul A. Ullrich,
Carol S. Woodward
Abstract:
The nonhydrostatic High Order Method Modeling Environment (HOMME-NH) atmospheric dynamical core supports acoustic waves that propagate significantly faster than the advective wind speed, thus greatly limiting the timestep size that can be used with standard explicit time-integration methods. Resolving acoustic waves is unnecessary for accurate climate and weather prediction. This numerical stiffness is addressed herein by considering implicit-explicit additive Runge-Kutta (ARK IMEX) methods that can treat the acoustic waves in a stable manner without requiring implicit treatment of non-stiff modes. Various ARK IMEX methods are evaluated for their efficiency in producing accurate solutions, ability to take large timestep sizes, and sensitivity to grid cell length ratio. Both the Gravity Wave test and Baroclinic Instability test from the 2012 Dynamical Core Model Intercomparison Project (DCMIP) are used to recommend 5 of the 27 ARK IMEX methods for use in HOMME-NH.
Submitted 4 December, 2019; v1 submitted 22 April, 2019;
originally announced April 2019.
-
Research and Education in Computational Science and Engineering
Authors:
Ulrich Rüde,
Karen Willcox,
Lois Curfman McInnes,
Hans De Sterck,
George Biros,
Hans Bungartz,
James Corones,
Evin Cramer,
James Crowley,
Omar Ghattas,
Max Gunzburger,
Michael Hanke,
Robert Harrison,
Michael Heroux,
Jan Hesthaven,
Peter Jimack,
Chris Johnson,
Kirk E. Jordan,
David E. Keyes,
Rolf Krause,
Vipin Kumar,
Stefan Mayer,
Juan Meza,
Knut Martin Mørken,
J. Tinsley Oden,
et al. (8 additional authors not shown)
Abstract:
Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society; and the CSE community is at the core of this transformation. However, a combination of disruptive developments, including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers, is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade.
Submitted 31 December, 2017; v1 submitted 8 October, 2016;
originally announced October 2016.