subscribe to arXiv mailings

Dynamical Measure Transport and Neural PDE Solvers for Sampling

Authors: Jingtong Sun, Julius Berner, Lorenz Richter, Marius Zeinhofer, Johannes Müller, Kamyar Azizzadenesheli, Anima Anandkumar

Abstract: The task of sampling from a probability density can be approached as transporting a tractable density function to the target, known as dynamical measure transport. In this work, we tackle it through a principled unified framework using deterministic or stochastic evolutions described by partial differential equations (PDEs). This framework incorporates prior trajectory-based sampling methods, such… ▽ More The task of sampling from a probability density can be approached as transporting a tractable density function to the target, known as dynamical measure transport. In this work, we tackle it through a principled unified framework using deterministic or stochastic evolutions described by partial differential equations (PDEs). This framework incorporates prior trajectory-based sampling methods, such as diffusion models or Schrödinger bridges, without relying on the concept of time-reversals. Moreover, it allows us to propose novel numerical methods for solving the transport task and thus sampling from complicated targets without the need for the normalization constant or data samples. We employ physics-informed neural networks (PINNs) to approximate the respective PDE solutions, implying both conceptional and computational advantages. In particular, PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently, leading to significantly better mode coverage in the sampling task compared to alternative methods. Moreover, they can readily be fine-tuned with Gauss-Newton methods to achieve high accuracy in sampling. △ Less

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.03356 [pdf, other]

AI Driven Laser Parameter Search: Inverse Design of Photonic Surfaces using Greedy Surrogate-based Optimization

Authors: Luka Grbcic, Minok Park, Juliane Müller, Vassilia Zorba, Wibe Albert de Jong

Abstract: Photonic surfaces designed with specific optical characteristics are becoming increasingly important for use in in various energy harvesting and storage systems. , In this study, we develop a surrogate-based optimization approach for designing such surfaces. The surrogate-based optimization framework employs the Random Forest algorithm and uses a greedy, prediction-based exploration strategy to id… ▽ More Photonic surfaces designed with specific optical characteristics are becoming increasingly important for use in in various energy harvesting and storage systems. , In this study, we develop a surrogate-based optimization approach for designing such surfaces. The surrogate-based optimization framework employs the Random Forest algorithm and uses a greedy, prediction-based exploration strategy to identify the laser fabrication parameters that minimize the discrepancy relative to a user-defined target optical characteristics. We demonstrate the approach on two synthetic benchmarks and two specific cases of photonic surface inverse design targets. It exhibits superior performance when compared to other optimization algorithms across all benchmarks. Additionally, we demonstrate a technique of inverse design warm starting for changed target optical characteristics which enhances the performance of the introduced approach. △ Less

Submitted 20 June, 2024; originally announced July 2024.

arXiv:2406.04163 [pdf, ps, other]

Essentially Sharp Estimates on the Entropy Regularization Error in Discrete Discounted Markov Decision Processes

Authors: Johannes Müller, Semih Cayci

Abstract: We study the error introduced by entropy regularization of infinite-horizon discrete discounted Markov decision processes. We show that this error decreases exponentially in the inverse regularization strength both in a weighted KL-divergence and in value with a problem-specific exponent. We provide a lower bound matching our upper bound up to a polynomial factor. Our proof relies on the correspon… ▽ More We study the error introduced by entropy regularization of infinite-horizon discrete discounted Markov decision processes. We show that this error decreases exponentially in the inverse regularization strength both in a weighted KL-divergence and in value with a problem-specific exponent. We provide a lower bound matching our upper bound up to a polynomial factor. Our proof relies on the correspondence of the solutions of entropy-regularized Markov decision processes with gradient flows of the unregularized reward with respect to a Riemannian metric common in natural policy gradient methods. Further, this correspondence allows us to identify the limit of the gradient flow as the generalized maximum entropy optimal policy, thereby characterizing the implicit bias of the Kakade gradient flow which corresponds to a time-continuous version of the natural policy gradient method. We use this to show that for entropy-regularized natural policy gradient methods the overall error decays exponentially in the square root of the number of iterations improving existing sublinear guarantees. △ Less

Submitted 25 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

Comments: 26 pages, 1 figure

MSC Class: 37N40; 65K05; 90C05; 90C40; 90C53

arXiv:2405.03588 [pdf, other]

Effective Quadratic Error Bounds for Floating-Point Algorithms Computing the Hypotenuse Function

Authors: Jean-Michel Muller, Bruno Salvy

Abstract: We provide tools to help automate the error analysis of algorithms that evaluate simple functions over the floating-point numbers. The aim is to obtain tight relative error bounds for these algorithms, expressed as a function of the unit round-off. Due to the discrete nature of the set of floating-point numbers, the largest errors are often intrinsically "arithmetic" in the sense that their appear… ▽ More We provide tools to help automate the error analysis of algorithms that evaluate simple functions over the floating-point numbers. The aim is to obtain tight relative error bounds for these algorithms, expressed as a function of the unit round-off. Due to the discrete nature of the set of floating-point numbers, the largest errors are often intrinsically "arithmetic" in the sense that their appearance may depend on specific bit patterns in the binary representations of intermediate variables, which may be present only for some precisions. We focus on generic (i.e., parameterized by the precision) and analytic over-estimations that still capture the correlations between the errors made at each step of the algorithms. Using methods from computer algebra, which we adapt to the particular structure of the polynomial systems that encode the errors, we obtain bounds with a linear term in the unit round-off that is sharp in manycases. An explicit quadratic bound is given, rather than the $O()$-estimate that is more common in this area. This is particularly important when using low precision formats, which are increasingly common in modern processors. Using this approach, we compare five algorithms for computing the hypotenuse function, ranging from elementary to quite challenging. △ Less

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.00053 [pdf, other]

Adaptive Computing for Scale-up Problems

Authors: Hilary Egan, Kevin Patrick Griffin, Marc T. Henry de Frahan, Juliane Mueller, Deepthi Vaidhynatha, Dylan Wald, Rohit Chintala, Olga A. Doronina, Ryan King, Jibonananda Sanyal, Marc Day

Abstract: Adaptive Computing is an application-agnostic outer loop framework to strategically deploy simulations and experiments to guide decision making for scale-up analysis. Resources are allocated over successive batches, which makes the allocation adaptive to some objective such as optimization or model training. The framework enables the characterization and management of uncertainties associated with… ▽ More Adaptive Computing is an application-agnostic outer loop framework to strategically deploy simulations and experiments to guide decision making for scale-up analysis. Resources are allocated over successive batches, which makes the allocation adaptive to some objective such as optimization or model training. The framework enables the characterization and management of uncertainties associated with predictive models of complex systems when scale-up questions lead to significant model extrapolation. A key feature of this framework is the ability to explicitly utilize user-specified uncertainty priors, which we call model-specific local trust estimates, that are provided directly together with the problem specification and exploited in adaptive sampling strategies. A multi-fidelity model hierarchy is supported to allow trade-offs in accuracy and data acquisition cost while exploring the search space given a specified budget of potentially distributed, heterogeneous resources. We discuss application of this framework to problems in the renewable energy space, including biofuels production, material synthesis, perovskite crystal growth, and building electrical loads. △ Less

Submitted 25 March, 2024; originally announced April 2024.

Comments: 11 pages, 5 figures

arXiv:2403.19448 [pdf, other]

Fisher-Rao Gradient Flows of Linear Programs and State-Action Natural Policy Gradients

Authors: Johannes Müller, Semih Çaycı, Guido Montúfar

Abstract: Kakade's natural policy gradient method has been studied extensively in the last years showing linear convergence with and without regularization. We study another natural gradient method which is based on the Fisher information matrix of the state-action distributions and has received little attention from the theoretical side. Here, the state-action distributions follow the Fisher-Rao gradient f… ▽ More Kakade's natural policy gradient method has been studied extensively in the last years showing linear convergence with and without regularization. We study another natural gradient method which is based on the Fisher information matrix of the state-action distributions and has received little attention from the theoretical side. Here, the state-action distributions follow the Fisher-Rao gradient flow inside the state-action polytope with respect to a linear potential. Therefore, we study Fisher-Rao gradient flows of linear programs more generally and show linear convergence with a rate that depends on the geometry of the linear program. Equivalently, this yields an estimate on the error induced by entropic regularization of the linear program which improves existing results. We extend these results and show sublinear convergence for perturbed Fisher-Rao gradient flows and natural gradient flows up to an approximation error. In particular, these general results cover the case of state-action natural policy gradients. △ Less

Submitted 28 March, 2024; originally announced March 2024.

Comments: 27 pages, 4 figures, under review

MSC Class: 65K05; 90C05; 90C08; 90C40; 90C53

arXiv:2402.07318 [pdf, other]

Position: Optimization in SciML Should Employ the Function Space Geometry

Authors: Johannes Müller, Marius Zeinhofer

Abstract: Scientific machine learning (SciML) is a relatively new field that aims to solve problems from different fields of natural sciences using machine learning tools. It is well-documented that the optimizers commonly used in other areas of machine learning perform poorly on many SciML problems. We provide an infinite-dimensional view on optimization problems encountered in scientific machine learning… ▽ More Scientific machine learning (SciML) is a relatively new field that aims to solve problems from different fields of natural sciences using machine learning tools. It is well-documented that the optimizers commonly used in other areas of machine learning perform poorly on many SciML problems. We provide an infinite-dimensional view on optimization problems encountered in scientific machine learning and advocate for the paradigm first optimize, then discretize for their solution. This amounts to first choosing an appropriate infinite-dimensional algorithm which is then discretized in a second step. To illustrate this point, we show that recently proposed state-of-the-art algorithms for SciML applications can be derived within this framework. As the infinite-dimensional viewpoint is presently underdeveloped in scientific machine learning, we formalize it here and advocate for its use in SciML in the development of efficient optimization algorithms. △ Less

Submitted 28 May, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

Comments: 9 pages main body, 2 figures, 1 table

Journal ref: Proceedings of the 41 st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

arXiv:2402.00169 [pdf, ps, other]

Algorithms for $p$-adic Heights on Hyperelliptic Curves of Arbitrary Reduction

Authors: Francesca Bianchi, Enis Kaya, J. Steffen Müller

Abstract: In this paper, we develop an algorithm for computing Coleman--Gross (and hence Nekovář) $p$-adic heights on hyperelliptic curves over number fields with arbitrary reduction type above $p$. This height is defined as a sum of local heights at each finite place and we use algorithms for Vologodsky integrals, developed by Katz and the second-named author, to compute the local heights above $p$. We als… ▽ More In this paper, we develop an algorithm for computing Coleman--Gross (and hence Nekovář) $p$-adic heights on hyperelliptic curves over number fields with arbitrary reduction type above $p$. This height is defined as a sum of local heights at each finite place and we use algorithms for Vologodsky integrals, developed by Katz and the second-named author, to compute the local heights above $p$. We also discuss an alternative method to compute these for odd degree genus 2 curves via $p$-adic sigma functions, via work of the first-named author. For both approaches one needs to choose a splitting of the Hodge filtration. A canonical choice for this is due to Blakestad in the case of an odd degree curve of genus $2$ that has semistable ordinary reduction at $p$. We provide an algorithm to compute Blakestad's splitting, which is conjecturally the unit root splitting for the action of Frobenius. We give several numerical examples, including the first worked quadratic Chabauty example in the literature for a curve with bad reduction. △ Less

Submitted 31 January, 2024; originally announced February 2024.

MSC Class: 14G40 (Primary) 11G50; 11S80; 14G05; 14F30 (Secondary)

arXiv:2310.15049 [pdf, ps, other]

Coleman-Gross Heights and $p$-adic Néron Functions on Jacobians of Genus $2$ Curves

Authors: Francesca Bianchi, Enis Kaya, J. Steffen Müller

Abstract: We develop a theory of $p$-adic Néron functions on abelian varieties, depending on various auxiliary choices, and show that the global $p$-adic height functions constructed by Mazur and Tate can be decomposed into a sum of $p$-adic Néron functions if the same auxiliary choices are made. We also decompose the $p$-adic height constructed by Coleman and Gross, and extended to arbitrary reduction by C… ▽ More We develop a theory of $p$-adic Néron functions on abelian varieties, depending on various auxiliary choices, and show that the global $p$-adic height functions constructed by Mazur and Tate can be decomposed into a sum of $p$-adic Néron functions if the same auxiliary choices are made. We also decompose the $p$-adic height constructed by Coleman and Gross, and extended to arbitrary reduction by Colmez and Besser, into a sum of local height functions for Jacobians of odd degree genus $2$ curves. We show that this local height function is equal to the $p$-adic Néron function with the same auxiliary choices, regardless of the reduction type of the curve. This extends work of Balakrishnan and Besser for elliptic curves. When the curve has semistable reduction and the reduction of the Jacobian is ordinary, we also describe the $p$-adic Néron function that arises from the canonical Mazur-Tate splitting explicitly in terms of a generalisation of the $p$-adic sigma function constructed by Blakestad. △ Less

Submitted 23 October, 2023; originally announced October 2023.

Comments: 38 pages

MSC Class: Primary: 11G50; 14G40. Secondary: 14H42; 14K25; 11S40; 11S80

arXiv:2307.15787 [pdf, ps, other]

Computing p-adic heights on hyperelliptic curves

Authors: Stevan Gajović, J. Steffen Müller

Abstract: We describe an algorithm to compute the local Coleman-Gross p-adic height at p on a hyperelliptic curve. Previously, this was only possible using an algorithm due to Balakrishnan and Besser, which was limited to odd degree. While we follow their general strategy, our algorithm is significantly faster and simpler and works for both odd and even degree. We discuss a precision analysis and an impleme… ▽ More We describe an algorithm to compute the local Coleman-Gross p-adic height at p on a hyperelliptic curve. Previously, this was only possible using an algorithm due to Balakrishnan and Besser, which was limited to odd degree. While we follow their general strategy, our algorithm is significantly faster and simpler and works for both odd and even degree. We discuss a precision analysis and an implementation in SageMath. Our work has several applications, also discussed in this article. These include various versions of the quadratic Chabauty method, and numerical evidence for a p-adic version of the conjecture of Birch and Swinnerton-Dyer in cases where this was not previously possible. △ Less

Submitted 10 January, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

Comments: 24 pages

arXiv:2307.15781 [pdf, ps, other]

Linear quadratic Chabauty

Authors: Stevan Gajović, J. Steffen Müller

Abstract: We present a new quadratic Chabauty method to compute the integral points on certain even degree hyperelliptic curves. Our approach relies on a nontrivial degree zero divisor supported at the two points at infinity to restrict the $p$-adic height to a linear function; we can then express this restriction in terms of holomorphic Coleman integrals under the standard quadratic Chabauty assumption. Th… ▽ More We present a new quadratic Chabauty method to compute the integral points on certain even degree hyperelliptic curves. Our approach relies on a nontrivial degree zero divisor supported at the two points at infinity to restrict the $p$-adic height to a linear function; we can then express this restriction in terms of holomorphic Coleman integrals under the standard quadratic Chabauty assumption. Then we use this linear relation to extract the integral points on the curve. We also generalize our method to integral points over number fields. Our method is significantly simpler and faster than all other existing versions of the quadratic Chabauty method. We give examples over $\Q$ and $\Q(\sqrt{7})$. △ Less

Submitted 2 August, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

Comments: 20 pages

arXiv:2307.08370 [pdf, other]

doi 10.1098/rsif.2023.0409

Parameter estimation for contact tracing in graph-based models

Authors: Augustine Okolie, Johannes Müller, Mirjam Kretzschmar

Abstract: We adopt a maximum-likelihood framework to estimate parameters of a stochastic susceptible-infected-recovered (SIR) model with contact tracing on a rooted random tree. Given the number of detectees per index case, our estimator allows to determine the degree distribution of the random tree as well as the tracing probability. Since we do not discover all infectees via contact tracing, this estimati… ▽ More We adopt a maximum-likelihood framework to estimate parameters of a stochastic susceptible-infected-recovered (SIR) model with contact tracing on a rooted random tree. Given the number of detectees per index case, our estimator allows to determine the degree distribution of the random tree as well as the tracing probability. Since we do not discover all infectees via contact tracing, this estimation is non-trivial. To keep things simple and stable, we develop an approximation suited for realistic situations (contract tracing probability small, or the probability for the detection of index cases small). In this approximation, the only epidemiological parameter entering the estimator is $R_0$. The estimator is tested in a simulation study and is furthermore applied to covid-19 contact tracing data from India. The simulation study underlines the efficiency of the method. For the empirical covid-19 data, we compare different degree distributions and perform a sensitivity analysis. We find that particularly a power-law and a negative binomial degree distribution fit the data well and that the tracing probability is rather large. The sensitivity analysis shows no strong dependency of the estimates on the reproduction number. Finally, we discuss the relevance of our findings. △ Less

Submitted 22 November, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

Comments: 24 pages, 8 figures, 3 tables

MSC Class: 92D30

Journal ref: Royal Society Interface 2023

arXiv:2302.13163 [pdf, other]

Achieving High Accuracy with PINNs via Energy Natural Gradients

Authors: Johannes Müller, Marius Zeinhofer

Abstract: We propose energy natural gradient descent, a natural gradient method with respect to a Hessian-induced Riemannian metric as an optimization algorithm for physics-informed neural networks (PINNs) and the deep Ritz method. As a main motivation we show that the update direction in function space resulting from the energy natural gradient corresponds to the Newton direction modulo an orthogonal proje… ▽ More We propose energy natural gradient descent, a natural gradient method with respect to a Hessian-induced Riemannian metric as an optimization algorithm for physics-informed neural networks (PINNs) and the deep Ritz method. As a main motivation we show that the update direction in function space resulting from the energy natural gradient corresponds to the Newton direction modulo an orthogonal projection onto the model's tangent space. We demonstrate experimentally that energy natural gradient descent yields highly accurate solutions with errors several orders of magnitude smaller than what is obtained when training PINNs with standard optimizers like gradient descent or Adam, even when those are allowed significantly more computation time. △ Less

Submitted 15 August, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

Comments: Published version

arXiv:2301.11193 [pdf, ps, other]

Linear and quadratic Chabauty for affine hyperbolic curves

Authors: Marius Leonhardt, Martin Lüdtke, J. Steffen Müller

Abstract: We give sufficient conditions for finiteness of linear and quadratic refined Chabauty-Kim loci of affine hyperbolic curves. We achieve this by constructing depth $\leq 2$ quotients of the fundamental group, following a construction of Balakrishnan-Dogra in the projective case. We also apply Betts' machinery of weight filtrations to give unconditional explicit upper bounds on the number of S-integr… ▽ More We give sufficient conditions for finiteness of linear and quadratic refined Chabauty-Kim loci of affine hyperbolic curves. We achieve this by constructing depth $\leq 2$ quotients of the fundamental group, following a construction of Balakrishnan-Dogra in the projective case. We also apply Betts' machinery of weight filtrations to give unconditional explicit upper bounds on the number of S-integral points when our conditions are satisfied. △ Less

Submitted 27 July, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

Comments: 20 pages; comments welcome

MSC Class: 14G05 (Primary) 11G30; 11D45 (Secondary)

arXiv:2212.09922 [pdf, ps, other]

On the cohomology of the ramified PEL unitary Rapoport-Zink space of signature $(1,n-1)$

Authors: Joseph Muller

Abstract: In this paper, we study the cohomology of the ramified PEL unitary Rapoport-Zink space of signature $(1,n-1)$ by using the Bruhat-Tits stratification on its special fiber. As such, we apply the same method that we developped for the unramified case in two previous papers. More precisely, we first investigate the cohomology of a given closed Bruhat-Tits stratum. It is isomorphic to a generalized De… ▽ More In this paper, we study the cohomology of the ramified PEL unitary Rapoport-Zink space of signature $(1,n-1)$ by using the Bruhat-Tits stratification on its special fiber. As such, we apply the same method that we developped for the unramified case in two previous papers. More precisely, we first investigate the cohomology of a given closed Bruhat-Tits stratum. It is isomorphic to a generalized Deligne-Lusztig variety which is in general not smooth, and is associated to a finite group of symplectic similitudes. We determine the weights of the Frobenius and most of the unipotent representations occuring in its cohomology. This computation involves the spectral sequence associated to a stratification by classical Deligne-Lusztig varieties, which are parabolically induced from Coxeter varieties of smaller symplectic groups. In particular, all the unipotent representations contribute to only two cuspidal series. Then, we introduce the analytical tubes of the closed Bruhat-Tits strata, which give an open cover of the generic fiber of the Rapoport-Zink space. Using the associated Čech spectral sequence, we prove that certain cohomology groups of the Rapoport-Zink space at hyperspecial level fail to be admissible if $n$ is large enough. Eventually, when $n=2$ in the split case, when $n=3$ and when $n=4$ in the non-split case, we give a complete description of the cohomology of the supersingular locus of the associated Shimura variety at hyperspecial level, in terms of automorphic representations. In particular, certain automorphic representations occur with a multiplicity depending on $p$. Thus, our computations in the ramified case recover all the main features of the unramified case, despite new technical difficulties due to the closed Bruhat-Tits strata not being smooth. △ Less

Submitted 30 December, 2022; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: 54 pages. The previous version contained some mistakes. They are now fixed in this new version. The last section and the main theorem have been modified accordingly. Comments are welcome!

MSC Class: 14G35; 11G18; 14F20; 20C33

arXiv:2211.09439 [pdf, other]

Algebraic optimization of sequential decision problems

Authors: Mareike Dressler, Marina Garrote-López, Guido Montúfar, Johannes Müller, Kemal Rose

Abstract: We study the optimization of the expected long-term reward in finite partially observable Markov decision processes over the set of stationary stochastic policies. In the case of deterministic observations, also known as state aggregation, the problem is equivalent to optimizing a linear objective subject to quadratic constraints. We characterize the feasible set of this problem as the intersectio… ▽ More We study the optimization of the expected long-term reward in finite partially observable Markov decision processes over the set of stationary stochastic policies. In the case of deterministic observations, also known as state aggregation, the problem is equivalent to optimizing a linear objective subject to quadratic constraints. We characterize the feasible set of this problem as the intersection of a product of affine varieties of rank one matrices and a polytope. Based on this description, we obtain bounds on the number of critical points of the optimization problem. Finally, we conduct experiments in which we solve the KKT equations or the Lagrange equations over different boundary components of the feasible set, and compare the result to the theoretical bounds and to other constrained optimization methods. △ Less

Submitted 17 November, 2022; originally announced November 2022.

Comments: 19 pages, 3 figures

MSC Class: 62R01; 90C23; 90C40

arXiv:2211.03372 [pdf, ps, other]

Computing torsion subgroups of Jacobians of hyperelliptic curves of genus 3

Authors: J. Steffen Müller, Berno Reitsma

Abstract: We introduce an algorithm to compute the rational torsion subgroup of the Jacobian of a hyperelliptic curve of genus 3 over the rationals. We apply a Magma implementation of our algorithm to a database of curves with low discriminant due to Sutherland as well as a list of curves with small coefficients. In the process, we find several torsion structures not previously described in the literature.… ▽ More We introduce an algorithm to compute the rational torsion subgroup of the Jacobian of a hyperelliptic curve of genus 3 over the rationals. We apply a Magma implementation of our algorithm to a database of curves with low discriminant due to Sutherland as well as a list of curves with small coefficients. In the process, we find several torsion structures not previously described in the literature. The algorithm is a generalisation of an algorithm for genus 2 due to Stoll, which we extend to abelian varieties satisfying certain conditions. The idea is to compute p-adic torsion lifts of points over finite fields using the Kummer variety and to check whether they are rational using heights. Both have been made explicit for Jacobians of hyperelliptic curves of genus 3 by Stoll. This article is partially based on the second-named author's Master thesis. △ Less

Submitted 16 March, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

Comments: Fixed typos; to appear in Research in Number Theory

MSC Class: 11G10; 11G30; 14H40; 14K15

arXiv:2211.02105 [pdf, other]

doi 10.1007/s41884-023-00106-z

Geometry and convergence of natural policy gradient methods

Authors: Johannes Müller, Guido Montúfar

Abstract: We study the convergence of several natural policy gradient (NPG) methods in infinite-horizon discounted Markov decision processes with regular policy parametrizations. For a variety of NPGs and reward functions we show that the trajectories in state-action space are solutions of gradient flows with respect to Hessian geometries, based on which we obtain global convergence guarantees and convergen… ▽ More We study the convergence of several natural policy gradient (NPG) methods in infinite-horizon discounted Markov decision processes with regular policy parametrizations. For a variety of NPGs and reward functions we show that the trajectories in state-action space are solutions of gradient flows with respect to Hessian geometries, based on which we obtain global convergence guarantees and convergence rates. In particular, we show linear convergence for unregularized and regularized NPG flows with the metrics proposed by Kakade and Morimura and co-authors by observing that these arise from the Hessian geometries of conditional entropy and entropy respectively. Further, we obtain sublinear convergence rates for Hessian geometries arising from other convex functions like log-barriers. Finally, we interpret the discrete-time NPG methods with regularized rewards as inexact Newton methods if the NPG is defined with respect to the Hessian geometry of the regularizer. This yields local quadratic convergence rates of these methods for step size equal to the penalization strength. △ Less

Submitted 3 November, 2022; originally announced November 2022.

Comments: 33 pages, 5 figures, under review

MSC Class: 90C40; 53B12; 90C53

arXiv:2209.05439 [pdf, ps, other]

A Fourier integral formula for logarithmic energy

Authors: Leonhard Frerick, Jürgen Müller, Tobias Thomaser

Abstract: A formula which expresses logarithmic energy of Borel measures on R^n in terms of the Fourier transforms of the measures is established and some applications are given. In addition, using similar techniques a (known) formula for Riesz energy is reinvented. A formula which expresses logarithmic energy of Borel measures on R^n in terms of the Fourier transforms of the measures is established and some applications are given. In addition, using similar techniques a (known) formula for Riesz energy is reinvented. △ Less

Submitted 8 November, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

Comments: 14 pages

MSC Class: 31A15; 31B15

arXiv:2206.15273 [pdf, ps, other]

doi 10.1007/s41884-022-00067-9

Invariance Properties of the Natural Gradient in Overparametrised Systems

Authors: Jesse van Oostrum, Johannes Müller, Nihat Ay

Abstract: The natural gradient field is a vector field that lives on a model equipped with a distinguished Riemannian metric, e.g. the Fisher-Rao metric, and represents the direction of steepest ascent of an objective function on the model with respect to this metric. In practice, one tries to obtain the corresponding direction on the parameter space by multiplying the ordinary gradient by the inverse of th… ▽ More The natural gradient field is a vector field that lives on a model equipped with a distinguished Riemannian metric, e.g. the Fisher-Rao metric, and represents the direction of steepest ascent of an objective function on the model with respect to this metric. In practice, one tries to obtain the corresponding direction on the parameter space by multiplying the ordinary gradient by the inverse of the Gram matrix associated with the metric. We refer to this vector on the parameter space as the natural parameter gradient. In this paper we study when the pushforward of the natural parameter gradient is equal to the natural gradient. Furthermore we investigate the invariance properties of the natural parameter gradient. Both questions are addressed in an overparametrised setting. △ Less

Submitted 30 June, 2022; originally announced June 2022.

Journal ref: Information Geometry, Springer, 2022

arXiv:2206.07449 [pdf, other]

Self-Assessment for Single-Object Tracking in Clutter Using Subjective Logic

Authors: Thomas Griebel, Johannes Müller, Paul Geisler, Charlotte Hermann, Martin Herrmann, Michael Buchholz, Klaus Dietmayer

Abstract: Reliable tracking algorithms are essential for automated driving. However, the existing consistency measures are not sufficient to meet the increasing safety demands in the automotive sector. Therefore, this work presents a novel method for self-assessment of single-object tracking in clutter based on Kalman filtering and subjective logic. A key feature of the approach is that it additionally prov… ▽ More Reliable tracking algorithms are essential for automated driving. However, the existing consistency measures are not sufficient to meet the increasing safety demands in the automotive sector. Therefore, this work presents a novel method for self-assessment of single-object tracking in clutter based on Kalman filtering and subjective logic. A key feature of the approach is that it additionally provides a measure of the collected statistical evidence in its online reliability scores. In this way, various aspects of reliability, such as the correctness of the assumed measurement noise, detection probability, and clutter rate, can be monitored in addition to the overall assessment based on the available evidence. Here, we present a mathematical derivation of the reference distribution used in our self-assessment module for our studied problem. Moreover, we introduce a formula that describes how a threshold should be chosen for the degree of conflict, the subjective logic comparison measure used for the reliability decision making. Our approach is evaluated in a challenging simulation scenario designed to model adverse weather conditions. The simulations show that our method can significantly improve the reliability checking of single-object tracking in clutter in several aspects. △ Less

Submitted 15 June, 2022; originally announced June 2022.

Comments: Accepted for presentation at the 2022 IEEE 25th International Conference on Information Fusion (FUSION), July 4 - 7, 2022, Linköping, Sweden

arXiv:2205.14744 [pdf, ps, other]

Rational points on $X^+_0(125)$

Authors: Vishal Arul, J. Steffen Müller

Abstract: We compute the rational points on the Atkin-Lehner quotient $X^+_0(125)$ using the quadratic Chabauty method. Our work completes the study of exceptional rational points on the curves $X^+_0(N)$ of genus between 2 and 6. Together with the work of several authors, this completes the proof of a conjecture of Galbraith. We compute the rational points on the Atkin-Lehner quotient $X^+_0(125)$ using the quadratic Chabauty method. Our work completes the study of exceptional rational points on the curves $X^+_0(N)$ of genus between 2 and 6. Together with the work of several authors, this completes the proof of a conjecture of Galbraith. △ Less

Submitted 26 December, 2022; v1 submitted 29 May, 2022; originally announced May 2022.

Comments: 8 pages; minor changes following referee report

MSC Class: 14g05; 11g30; 11g18

arXiv:2205.14098 [pdf, other]

Solving infinite-horizon POMDPs with memoryless stochastic policies in state-action space

Authors: Johannes Müller, Guido Montúfar

Abstract: Reward optimization in fully observable Markov decision processes is equivalent to a linear program over the polytope of state-action frequencies. Taking a similar perspective in the case of partially observable Markov decision processes with memoryless stochastic policies, the problem was recently formulated as the optimization of a linear objective subject to polynomial constraints. Based on thi… ▽ More Reward optimization in fully observable Markov decision processes is equivalent to a linear program over the polytope of state-action frequencies. Taking a similar perspective in the case of partially observable Markov decision processes with memoryless stochastic policies, the problem was recently formulated as the optimization of a linear objective subject to polynomial constraints. Based on this we present an approach for Reward Optimization in State-Action space (ROSA). We test this approach experimentally in maze navigation tasks. We find that ROSA is computationally efficient and can yield stability improvements over other existing methods. △ Less

Submitted 27 May, 2022; originally announced May 2022.

Comments: Accepted as an extended abstract at RLDM 2022, 5 pages, 2 figures

arXiv:2204.07331 [pdf, other]

Accelerating Noisy VQE Optimization with Gaussian Processes

Authors: Juliane Mueller, Wim Lavrijsen, Costin Iancu, Wibe de Jong

Abstract: Hybrid variational quantum algorithms, which combine a classical optimizer with evaluations on a quantum chip, are the most promising candidates to show quantum advantage on current noisy, intermediate-scale quantum (NISQ) devices. The classical optimizer is required to perform well in the presence of noise in the objective function evaluations, or else it becomes the weakest link in the algorithm… ▽ More Hybrid variational quantum algorithms, which combine a classical optimizer with evaluations on a quantum chip, are the most promising candidates to show quantum advantage on current noisy, intermediate-scale quantum (NISQ) devices. The classical optimizer is required to perform well in the presence of noise in the objective function evaluations, or else it becomes the weakest link in the algorithm. We introduce the use of Gaussian Processes (GP) as surrogate models to reduce the impact of noise and to provide high quality seeds to escape local minima, whether real or noise-induced. We build this as a framework on top of local optimizations, for which we choose Implicit Filtering (ImFil) in this study. ImFil is a state-of-the-art, gradient-free method, which in comparative studies has been shown to outperform on noisy VQE problems. The result is a new method: "GP+ImFil". We show that when noise is present, the GP+ImFil approach finds results closer to the true global minimum in fewer evaluations than standalone ImFil, and that it works particularly well for larger dimensional problems. Using GP to seed local searches in a multi-modal landscape shows mixed results: although it is capable of improving on ImFil standalone, it does not do so consistently and would only be preferred over other, more exhaustive, multistart methods if resources are constrained. △ Less

Submitted 3 August, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

Comments: 11 pages, 14 figures

arXiv:2203.01987 [pdf, other]

doi 10.1145/3517746

OptiTrap: Optimal Trap Trajectories for Acoustic Levitation Displays

Authors: Viktorija Paneva, Arthur Fleig, Diego Martínez Plasencia, Timm Faulwasser, Jörg Müller

Abstract: Acoustic levitation has recently demonstrated the ability to create volumetric content by trapping and quickly moving particles along reference paths to reveal shapes in mid-air. However, the problem of specifying physically feasible trap trajectories to display desired shapes remains unsolved. Even if only the final shape is of interest to the content creator, the trap trajectories need to determ… ▽ More Acoustic levitation has recently demonstrated the ability to create volumetric content by trapping and quickly moving particles along reference paths to reveal shapes in mid-air. However, the problem of specifying physically feasible trap trajectories to display desired shapes remains unsolved. Even if only the final shape is of interest to the content creator, the trap trajectories need to determine where and when the traps need to be, for the particle to reveal the intended shape. We propose OptiTrap, the first structured numerical approach to compute trap trajectories for acoustic levitation displays. Our approach generates trap trajectories that are physically feasible and nearly time-optimal, and reveal generic mid-air shapes, given only a reference path (i.e., a shape with no time information). We provide a multi-dimensional model of the acoustic forces around a trap to model the trap-particle system dynamics and compute optimal trap trajectories by formulating and solving a non-linear path following problem. We formulate our approach and evaluate it, demonstrating how OptiTrap consistently produces feasible and nearly optimal paths, with increases in size, frequency, and accuracy of the shapes rendered, allowing us to demonstrate larger and more complex shapes than ever shown to date. △ Less

Submitted 3 March, 2022; originally announced March 2022.

Comments: 14 pages, 9 figures, 4 tables. To appear in the ACM Transaction on Graphics journal

MSC Class: 35Q93 ACM Class: I.3.0

arXiv:2202.12441 [pdf, other]

Long-Term Missing Value Imputation for Time Series Data Using Deep Neural Networks

Authors: Jangho Park, Juliane Muller, Bhavna Arora, Boris Faybishenko, Gilberto Pastorello, Charuleka Varadharajan, Reetik Sahu, Deborah Agarwal

Abstract: We present an approach that uses a deep learning model, in particular, a MultiLayer Perceptron (MLP), for estimating the missing values of a variable in multivariate time series data. We focus on filling a long continuous gap (e.g., multiple months of missing daily observations) rather than on individual randomly missing observations. Our proposed gap filling algorithm uses an automated method for… ▽ More We present an approach that uses a deep learning model, in particular, a MultiLayer Perceptron (MLP), for estimating the missing values of a variable in multivariate time series data. We focus on filling a long continuous gap (e.g., multiple months of missing daily observations) rather than on individual randomly missing observations. Our proposed gap filling algorithm uses an automated method for determining the optimal MLP model architecture, thus allowing for optimal prediction performance for the given time series. We tested our approach by filling gaps of various lengths (three months to three years) in three environmental datasets with different time series characteristics, namely daily groundwater levels, daily soil moisture, and hourly Net Ecosystem Exchange. We compared the accuracy of the gap-filled values obtained with our approach to the widely-used R-based time series gap filling methods ImputeTS and mtsdi. The results indicate that using an MLP for filling a large gap leads to better results, especially when the data behave nonlinearly. Thus, our approach enables the use of datasets that have a large gap in one variable, which is common in many long-term environmental monitoring observations. △ Less

Submitted 24 February, 2022; originally announced February 2022.

arXiv:2201.10229 [pdf, ps, other]

Cohomology of the basic unramified PEL unitary Rapoport-Zink space of signature $(1,n-1)$

Authors: Joseph Muller

Abstract: In this paper, we study the cohomology of the unitary unramified PEL Rapoport-Zink space of signature $(1,n-1)$ at maximal level. Our method revolves around the spectral sequence associated to the open cover by the analytical tubes of the closed Bruhat-Tits strata in the special fiber, which were constructed by Vollaard and Wedhorn. The cohomology of these strata, which are isomorphic to generaliz… ▽ More In this paper, we study the cohomology of the unitary unramified PEL Rapoport-Zink space of signature $(1,n-1)$ at maximal level. Our method revolves around the spectral sequence associated to the open cover by the analytical tubes of the closed Bruhat-Tits strata in the special fiber, which were constructed by Vollaard and Wedhorn. The cohomology of these strata, which are isomorphic to generalized Deligne-Lusztig varieties, has been computed in arXiv:2110.00614 [math.RT]. This spectral sequence allows us to prove the semisimplicity of the Frobenius action and to describe the inertial supports of the irreducible subquotients in the individual cohomology groups. In particular, when $n\geq 5$ no such subquotient is supercuspidal, but when $1\leq n \leq 4$ supercuspidal subquotients do occur. Moreover, we also prove that the cohomology groups need not be admissible in general. Via $p$-adic uniformization, we relate the cohomology of the Rapoport-Zink space to the cohomology of the basic stratum of a Shimura variety with no level at $p$. In the case $n=3$ or $4$, we give a complete description of the cohomology of the basic stratum in terms of automorphic representations. In particular, some automorphic representations occur with multiplicities dependent on $p$. △ Less

Submitted 30 December, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

Comments: 66 pages. The previous version contained some mistakes. They are now fixed in this new version. The last section and the main theorem have been modified accordingly. Comments are welcome!

MSC Class: 14G35; 11G18; 14F20; 11E95; 14L05

arXiv:2112.04230 [pdf, ps, other]

On isospectral metric graphs

Authors: Pavel Kurasov, Jacob Muller

Abstract: A new class of isospectral graphs is presented. These graphs are isospectral with respect to both the normalised Laplacian on the discrete graph and the standard differential Laplacian on the corresponding metric graph. The new class of graphs is obtained by gluing together subgraphs with the Steklov maps possessing special properties. It turns out that isospectrality is related to the degeneracy… ▽ More A new class of isospectral graphs is presented. These graphs are isospectral with respect to both the normalised Laplacian on the discrete graph and the standard differential Laplacian on the corresponding metric graph. The new class of graphs is obtained by gluing together subgraphs with the Steklov maps possessing special properties. It turns out that isospectrality is related to the degeneracy of the Steklov eigenvalues. △ Less

Submitted 8 December, 2021; originally announced December 2021.

Comments: 18 pages

MSC Class: 34L05; 58C40

arXiv:2112.03873 [pdf, ps, other]

p-adic adelic metrics and Quadratic Chabauty I

Authors: Amnon Besser, J. Steffen Müller, Padmavathi Srinivasan

Abstract: We give a new construction of $p$-adic heights on varieties over number fields using $p$-adic Arakelov theory. In analogy with Zhang's construction of real-valued heights in terms of adelic metrics, these heights are given in terms of $p$-adic adelic metrics on line bundles. In particular, we describe a construction of canonical $p$-adic heights on abelian varieties and we show that we recover the… ▽ More We give a new construction of $p$-adic heights on varieties over number fields using $p$-adic Arakelov theory. In analogy with Zhang's construction of real-valued heights in terms of adelic metrics, these heights are given in terms of $p$-adic adelic metrics on line bundles. In particular, we describe a construction of canonical $p$-adic heights on abelian varieties and we show that we recover the canonical Mazur--Tate height and, for Jacobians, the height constructed by Coleman and Gross. Our main application is a new and simplified approach to the Quadratic Chabauty method for the computation of rational points on certain curves over the rationals, by pulling back the canonical height on the Jacobian with respect to a carefully chosen line bundle. We show that our construction allows us to reprove, without using $p$-adic Hodge theory or arithmetic fundamental groups, several results due to Balakrishnan and Dogra. Our method also extends to primes $p$ of bad reduction. One consequence of our work is that for any canonical height ($p$-adic or $\mathbb{R}$-valued) on an abelian variety (and hence on pull-backs to other varieties), the local contribution at a finite prime $q$ can be constructed using $q$-analytic methods. △ Less

Submitted 23 November, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

Comments: Updates include a p-adic Arakelov-theory proof that the method extends when $r>g$ and $r<g+\mathrm{rank}(\mathrm{NS}(J))-1$ (Sections 3.4 and 7.2), comparison to Colmez's work (Sections 4.5 and 9.3), an updated comparison to Balakrishnan and Dogra's work (Section 8), and a new unified q-analytic construction of local heights (p-adic and real) on abelian varieties at a finite prime q (Section 9.2)

arXiv:2111.05637 [pdf, other]

Uniform Convergence Guarantees for the Deep Ritz Method for Nonlinear Problems

Authors: Patrick Dondl, Johannes Müller, Marius Zeinhofer

Abstract: We provide convergence guarantees for the Deep Ritz Method for abstract variational energies. Our results cover non-linear variational problems such as the $p$-Laplace equation or the Modica-Mortola energy with essential or natural boundary conditions. Under additional assumptions, we show that the convergence is uniform across % bounded families of right-hand sides. We provide convergence guarantees for the Deep Ritz Method for abstract variational energies. Our results cover non-linear variational problems such as the $p$-Laplace equation or the Modica-Mortola energy with essential or natural boundary conditions. Under additional assumptions, we show that the convergence is uniform across % bounded families of right-hand sides. △ Less

Submitted 10 November, 2021; originally announced November 2021.

Comments: 13 pages, 2 figures. arXiv admin note: text overlap with arXiv:2103.01007

MSC Class: 65M12; 65N30; 68T07

arXiv:2110.07409 [pdf, other]

The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs

Authors: Johannes Müller, Guido Montúfar

Abstract: We consider the problem of finding the best memoryless stochastic policy for an infinite-horizon partially observable Markov decision process (POMDP) with finite state and action spaces with respect to either the discounted or mean reward criterion. We show that the (discounted) state-action frequencies and the expected cumulative reward are rational functions of the policy, whereby the degree is… ▽ More We consider the problem of finding the best memoryless stochastic policy for an infinite-horizon partially observable Markov decision process (POMDP) with finite state and action spaces with respect to either the discounted or mean reward criterion. We show that the (discounted) state-action frequencies and the expected cumulative reward are rational functions of the policy, whereby the degree is determined by the degree of partial observability. We then describe the optimization problem as a linear optimization problem in the space of feasible state-action frequencies subject to polynomial constraints that we characterize explicitly. This allows us to address the combinatorial and geometric complexity of the optimization problem using recent tools from polynomial optimization. In particular, we estimate the number of critical points and use the polynomial programming description of reward maximization to solve a navigation problem in a grid world. △ Less

Submitted 29 April, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: Camera ready version for ICLR 2022, 45 pages, 8 figures

MSC Class: 90C40; 93E20; 49M37; 90C23

Journal ref: Proceedings of the 10th International Conference on Learning Representations (ICLR 2022)

arXiv:2110.00614 [pdf, ps, other]

doi 10.1017/nmj.2022.39

Cohomology of the Bruhat-Tits strata in the unramified unitary Rapoport-Zink space of signature $(1,n-1)$

Authors: Joseph Muller

Abstract: In [Inventiones mathematicae, 184 (2011)], Vollaard and Wedhorn defined a stratification on the special fiber of the unitary unramified PEL Rapoport-Zink space with signature $(1,n-1)$. They constructed an isomorphism between the closure of a stratum, called a closed Bruhat-Tits stratum, and a Deligne-Lusztig variety which is not of classical type. In this paper, we describe the $\ell$-adic cohomo… ▽ More In [Inventiones mathematicae, 184 (2011)], Vollaard and Wedhorn defined a stratification on the special fiber of the unitary unramified PEL Rapoport-Zink space with signature $(1,n-1)$. They constructed an isomorphism between the closure of a stratum, called a closed Bruhat-Tits stratum, and a Deligne-Lusztig variety which is not of classical type. In this paper, we describe the $\ell$-adic cohomology groups over $\overline{\mathbb Q_{\ell}}$ of these Deligne-Lusztig varieties, where $\ell \not = p$. The computations involve the spectral sequence associated to the Ekedahl-Oort stratification of a closed Bruhat-Tits stratum, which translates into a stratification by Coxeter varieties whose cohomology is known. Eventually, we find out that the irreducible representations of the finite unitary group which appear inside the cohomology contribute to only two different unipotent Harish-Chandra series, one of them belonging to the principal series. △ Less

Submitted 26 November, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

Comments: 29 pages. Final version published online in Nagoya Mathematical Journal

MSC Class: 14F20; 20C33; 11E95; 14L05

arXiv:2110.00443 [pdf, other]

doi 10.1145/3524122

Optimal Feedback Control for Modeling Human-Computer Interaction

Authors: Florian Fischer, Arthur Fleig, Markus Klar, Jörg Müller

Abstract: Optimal feedback control (OFC) is a theory from the motor control literature that explains how humans move their body to achieve a certain goal, e.g., pointing with the finger. OFC is based on the assumption that humans aim to control their body optimally, within the constraints imposed by body, environment, and task. In this paper, we explain how this theory can be applied to understanding Human-… ▽ More Optimal feedback control (OFC) is a theory from the motor control literature that explains how humans move their body to achieve a certain goal, e.g., pointing with the finger. OFC is based on the assumption that humans aim to control their body optimally, within the constraints imposed by body, environment, and task. In this paper, we explain how this theory can be applied to understanding Human-Computer Interaction (HCI) in the case of pointing. We propose that the human body and computer dynamics can be interpreted as a single dynamical system. The system state is controlled by the user via muscle control signals, and estimated from observations. Between-trial variability arises from signal-dependent control noise and observation noise. We compare four different models from optimal control theory and evaluate to what degree these models can replicate movements in the case of mouse pointing. We introduce a procedure to identify parameters that best explain observed user behavior. To support HCI researchers in simulating, analyzing, and optimizing interaction movements, we provide the Python toolbox OFC4HCI. We conclude that OFC presents a powerful framework for HCI to understand and simulate motion of the human body and of the interface on a moment by moment basis. △ Less

Submitted 20 April, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

Comments: 66 pages, 21 figures, two appendices

arXiv:2109.02319 [pdf]

doi 10.1007/s40072-021-00200-2

An improved characterisation of regular generalised functions of white noise and an application to singular SPDEs

Authors: Martin Grothaus, Jan Müller, Andreas Nonnenmacher

Abstract: A characterisation of the spaces $\mathcal{G}_K$ and $\mathcal{G}_K'$ introduced in Grothaus et al. (Methods Funct Anal Topol 3(2):46-64, 1997) and Potthoff and Timpel (Potential Anal 4(6):637-654, 1995) is given. A first characterisation of these spaces provided in Grothaus et al. (Methods Funct Anal Topol 3(2):46-64, 1997) uses the concepts of holomorphy on infinite dimensional spaces. We, inste… ▽ More A characterisation of the spaces $\mathcal{G}_K$ and $\mathcal{G}_K'$ introduced in Grothaus et al. (Methods Funct Anal Topol 3(2):46-64, 1997) and Potthoff and Timpel (Potential Anal 4(6):637-654, 1995) is given. A first characterisation of these spaces provided in Grothaus et al. (Methods Funct Anal Topol 3(2):46-64, 1997) uses the concepts of holomorphy on infinite dimensional spaces. We, instead, give a characterisation in terms of U-functionals, i.e., classic holomorphic function on the one dimensional field of complex numbers. We apply our new characterisation to derive new results concerning a stochastic transport equation and the stochastic heat equation with multiplicative noise. △ Less

Submitted 6 September, 2021; originally announced September 2021.

MSC Class: 60G20; 60H40; 60H17

Journal ref: Stochastics and Partial Differential Equations: Analysis and Computations (2021)

arXiv:2105.02550 [pdf, ps, other]

Notes on Exact Boundary Values in Residual Minimisation

Authors: Johannes Müller, Marius Zeinhofer

Abstract: We analyse the difference in convergence mode using exact versus penalised boundary values for the residual minimisation of PDEs with neural network type ansatz functions, as is commonly done in the context of physics informed neural networks. It is known that using an $L^2$ boundary penalty leads to a loss of regularity of $3/2$ meaning that approximation in $H^2$ yields a priori estimates in… ▽ More We analyse the difference in convergence mode using exact versus penalised boundary values for the residual minimisation of PDEs with neural network type ansatz functions, as is commonly done in the context of physics informed neural networks. It is known that using an $L^2$ boundary penalty leads to a loss of regularity of $3/2$ meaning that approximation in $H^2$ yields a priori estimates in $H^{1/2}$. These notes demonstrate how this loss of regularity can be circumvented if the functions in the ansatz class satisfy the boundary values exactly. Furthermore, it is shown that in this case, the loss function provides a consistent a posteriori error estimator in $H^2$ norm made by the residual minimisation method. We provide analogue results for linear time dependent problems and discuss the implications of measuring the residual in Sobolev norms. △ Less

Submitted 5 September, 2022; v1 submitted 6 May, 2021; originally announced May 2021.

Comments: 3rd Annual Conference on Mathematical and Scientific Machine Learning (MSML22), 12 pages, no figures

arXiv:2103.05751 [pdf, other]

BROOD: Bilevel and Robust Optimization and Outlier Detection for Efficient Tuning of High-Energy Physics Event Generators

Authors: Wenjing Wang, Mohan Krishnamoorthy, Juliane Muller, Stephen Mrenna, Holger Schulz, Xiangyang Ju, Sven Leyffer, Zachary Marshall

Abstract: The parameters in Monte Carlo (MC) event generators are tuned on experimental measurements by evaluating the goodness of fit between the data and the MC predictions. The relative importance of each measurement is adjusted manually in an often time-consuming, iterative process to meet different experimental needs. In this work, we introduce several optimization formulations and algorithms with new… ▽ More The parameters in Monte Carlo (MC) event generators are tuned on experimental measurements by evaluating the goodness of fit between the data and the MC predictions. The relative importance of each measurement is adjusted manually in an often time-consuming, iterative process to meet different experimental needs. In this work, we introduce several optimization formulations and algorithms with new decision criteria for streamlining and automating this process. These algorithms are designed for two formulations: bilevel optimization and robust optimization. Both formulations are applied to the datasets used in the ATLAS A14 tune and to the dedicated hadronization datasets generated by the sherpa generator, respectively. The corresponding tuned generator parameters are compared using three metrics. We compare the quality of our automatic tunes to the published ATLAS A14 tune. Moreover, we analyze the impact of a pre-processing step that excludes data that cannot be described by the physics models used in the MC event generators. △ Less

Submitted 11 March, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: 87 pages, Submission to SciPost

arXiv:2103.01007 [pdf, ps, other]

Error Estimates for the Deep Ritz Method with Boundary Penalty

Authors: Johannes Müller, Marius Zeinhofer

Abstract: We estimate the error of the Deep Ritz Method for linear elliptic equations. For Dirichlet boundary conditions, we estimate the error when the boundary values are imposed through the boundary penalty method. Our results apply to arbitrary sets of ansatz functions and estimate the error in dependence of the optimization accuracy, the approximation capabilities of the ansatz class and -- in the case… ▽ More We estimate the error of the Deep Ritz Method for linear elliptic equations. For Dirichlet boundary conditions, we estimate the error when the boundary values are imposed through the boundary penalty method. Our results apply to arbitrary sets of ansatz functions and estimate the error in dependence of the optimization accuracy, the approximation capabilities of the ansatz class and -- in the case of Dirichlet boundary values -- the penalization strength $λ$. To the best of our knowledge, our results are presently the only ones in the literature that treat the case of Dirichlet boundary conditions in full generality, i.e., without a lower order term that leads to coercivity on all of $H^1(Ω)$. Further, we discuss the implications of our results for ansatz classes which are given through ReLU networks and the relation to existing estimates for finite element functions. For high dimensional problems our results show that the favourable approximation capabilities of neural networks for smooth functions are inherited by the Deep Ritz Method. △ Less

Submitted 5 September, 2022; v1 submitted 1 March, 2021; originally announced March 2021.

Comments: 3rd Annual Conference on Mathematical and Scientific Machine Learning (MSML22), 20 pages, no figures

MSC Class: 65N15; 68T07

arXiv:2101.01862 [pdf, ps, other]

Quadratic Chabauty for modular curves: Algorithms and examples

Authors: Jennifer S. Balakrishnan, Netan Dogra, Jan Steffen Müller, Jan Tuitman, Jan Vonk

Abstract: We describe how the quadratic Chabauty method may be applied to explicitly determine the set of rational points on modular curves of genus $g>1$ whose Jacobians have Mordell--Weil rank $g$. This extends our previous work on the split Cartan curve of level 13 and allows us to consider modular curves that may have few known rational points or nontrivial local height contributions at primes of bad re… ▽ More We describe how the quadratic Chabauty method may be applied to explicitly determine the set of rational points on modular curves of genus $g>1$ whose Jacobians have Mordell--Weil rank $g$. This extends our previous work on the split Cartan curve of level 13 and allows us to consider modular curves that may have few known rational points or nontrivial local height contributions at primes of bad reduction. We illustrate our algorithms with a number of examples where we determine the set of rational points on several modular curves of genus 2 and 3: this includes Atkin--Lehner quotients $X_0^+(N)$ of prime level $N$, the curve $X_{S_4}(13)$, as well as a few other curves relevant to Mazur's Program B. We also describe the computation of rational points on the genus 6 non-split Cartan modular curve $X_{\textrm{ns}} ^+ (17)$. △ Less

Submitted 7 March, 2023; v1 submitted 5 January, 2021; originally announced January 2021.

Comments: Updated following referee's comments. To appear in Comp. Math

arXiv:2007.08777 [pdf, other]

A direct reconstruction algorithm for the anisotropic inverse conductivity problem based on Calderón's method in the plane

Authors: Rashmi Murthy, Yi-Hsuan Lin, Kwancheol Shin, Jennifer L. Mueller

Abstract: A direct reconstruction algorithm based on Calderón's linearization method for the reconstruction of isotropic conductivities is proposed for anisotropic conductivities in two-dimensions. To overcome the non-uniqueness of the anisotropic inverse conductivity problem, the entries of the unperturbed anisotropic tensors are assumed known \emph{a priori}, and it remains to reconstruct the multiplicati… ▽ More A direct reconstruction algorithm based on Calderón's linearization method for the reconstruction of isotropic conductivities is proposed for anisotropic conductivities in two-dimensions. To overcome the non-uniqueness of the anisotropic inverse conductivity problem, the entries of the unperturbed anisotropic tensors are assumed known \emph{a priori}, and it remains to reconstruct the multiplicative scalar field. The quasi-conformal map in the plane facilitates the Calderón-based approach for anisotropic conductivities. The method is demonstrated on discontinuous radially symmetric conductivities of high and low contrast. △ Less

Submitted 17 July, 2020; originally announced July 2020.

arXiv:2007.00550 [pdf, ps, other]

Kalman Filter Meets Subjective Logic: A Self-Assessing Kalman Filter Using Subjective Logic

Authors: Thomas Griebel, Johannes Müller, Michael Buchholz, Klaus Dietmayer

Abstract: Self-assessment is a key to safety and robustness in automated driving. In order to design safer and more robust automated driving functions, the goal is to self-assess the performance of each module in a whole automated driving system. One crucial component in automated driving systems is the tracking of surrounding objects, where the Kalman filter is the most fundamental tracking algorithm. For… ▽ More Self-assessment is a key to safety and robustness in automated driving. In order to design safer and more robust automated driving functions, the goal is to self-assess the performance of each module in a whole automated driving system. One crucial component in automated driving systems is the tracking of surrounding objects, where the Kalman filter is the most fundamental tracking algorithm. For Kalman filters, some classical online consistency measures exist for self-assessment, which are based on classical probability theory. However, these classical approaches lack the ability to measure the explicit statistical uncertainty within the self-assessment, which is an important quality measure, particularly, if only a small number of samples is available for the self-assessment. In this work, we propose a novel online self-assessment method using subjective logic, which is a modern extension of probabilistic logic that explicitly models the statistical uncertainty. Thus, by embedding classical Kalman filtering into subjective logic, our method additionally features an explicit measure for statistical uncertainty in the self-assessment. △ Less

Submitted 1 July, 2020; originally announced July 2020.

Comments: Accepted for presentation at the 2020 IEEE 23rd International Conference on Information Fusion (FUSION), July 6 - 9, 2020, Virtual Conference

arXiv:2006.15091 [pdf, ps, other]

The Krein-von Neumann extension for Schrödinger operators on metric graphs

Authors: Jacob Muller, Jonathan Rohleder

Abstract: The Krein-von Neumann extension is studied for Schrödinger operators on metric graphs. Among other things, its vertex conditions are expressed explicitly, and its relation to other self-adjoint vertex conditions (e.g. continuity-Kirchhoff) is explored. A variational characterisation for its positive eigenvalues is obtained. Based on this, the behaviour of its eigenvalues under perturbations of the… ▽ More The Krein-von Neumann extension is studied for Schrödinger operators on metric graphs. Among other things, its vertex conditions are expressed explicitly, and its relation to other self-adjoint vertex conditions (e.g. continuity-Kirchhoff) is explored. A variational characterisation for its positive eigenvalues is obtained. Based on this, the behaviour of its eigenvalues under perturbations of the metric graph is investigated, and so-called surgery principles are established. Moreover, isoperimetric eigenvalue inequalities are obtained. △ Less

Submitted 17 December, 2020; v1 submitted 26 June, 2020; originally announced June 2020.

arXiv:2005.06291 [pdf, other]

doi 10.1145/3313831.3376409

Levitation Simulator: Prototyping Ultrasonic Levitation Interfaces in Virtual Reality

Authors: Viktorija Paneva, Myroslav Bachynskyi, Jörg Müller

Abstract: We present the Levitation Simulator, a system that enables researchers and designers to iteratively develop and prototype levitation interface ideas in Virtual Reality. This includes user tests and formal experiments. We derive a model of the movement of a levitating particle in such an interface. Based on this, we develop an interactive simulation of the levitation interface in VR, which exhibits… ▽ More We present the Levitation Simulator, a system that enables researchers and designers to iteratively develop and prototype levitation interface ideas in Virtual Reality. This includes user tests and formal experiments. We derive a model of the movement of a levitating particle in such an interface. Based on this, we develop an interactive simulation of the levitation interface in VR, which exhibits the dynamical properties of the real interface. The results of a Fitts' Law pointing study show that the Levitation Simulator enables performance, comparable to the real prototype. We developed the first two interactive games, dedicated for levitation interfaces: LeviShooter and BeadBounce, in the Levitation Simulator, and then implemented them on the real interface. Our results indicate that participants experienced similar levels of user engagement when playing the games, in the two environments. We share our Levitation Simulator as Open Source, thereby democratizing levitation research, without the need for a levitation apparatus. △ Less

Submitted 13 May, 2020; originally announced May 2020.

Comments: 12 pages, 14 figures, CHI'20

Journal ref: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (2020)

arXiv:2001.07389 [pdf, ps, other]

Mixing operators with prescribed unimodular eigenvectors

Authors: Hans-Peter Beise, Leonhard Frerick, Jürgen Müller

Abstract: For arbitrary closed countable subsets $Z$ of the unit circle examples of topologically mixing operators on Hilbert spaces are given which have a densely spanning set of unimodular eigenvectors with eigenvalues restricted to $Z$. In particular, these operators cannot be ergodic in the Gaussian sense. For arbitrary closed countable subsets $Z$ of the unit circle examples of topologically mixing operators on Hilbert spaces are given which have a densely spanning set of unimodular eigenvectors with eigenvalues restricted to $Z$. In particular, these operators cannot be ergodic in the Gaussian sense. △ Less

Submitted 16 September, 2022; v1 submitted 21 January, 2020; originally announced January 2020.

Comments: 8 pages

MSC Class: 47A16

Journal ref: Ergodic Theory and Dynamical Systems, 42 (2022), 1-8

arXiv:2001.02952 [pdf, ps, other]

Dynamics of the Taylor shift on Bergman spaces

Authors: Jürgen Müller, Maike Thelen

Abstract: The Taylor (backward) shift on Bergman spaces $A^p(\om)$ for general open sets $\om$ in the extended complex plane shows rich variety concerning its dynamical behaviour. Different aspects are worked out, where in the case $p<2$ a recent result of Bayart and Matheron plays a central role. The Taylor (backward) shift on Bergman spaces $A^p(\om)$ for general open sets $\om$ in the extended complex plane shows rich variety concerning its dynamical behaviour. Different aspects are worked out, where in the case $p<2$ a recent result of Bayart and Matheron plays a central role. △ Less

Submitted 9 January, 2020; originally announced January 2020.

MSC Class: 47A35 (Primary) 47A16 (Secondary)

arXiv:1912.03937 [pdf, ps, other]

Deep Ritz revisited

Authors: Johannes Müller, Marius Zeinhofer

Abstract: Recently, progress has been made in the application of neural networks to the numerical analysis of partial differential equations (PDEs). In the latter the variational formulation of the Poisson problem is used in order to obtain an objective function - a regularised Dirichlet energy - that was used for the optimisation of some neural networks. In this notes we use the notion of $Γ$-convergence t… ▽ More Recently, progress has been made in the application of neural networks to the numerical analysis of partial differential equations (PDEs). In the latter the variational formulation of the Poisson problem is used in order to obtain an objective function - a regularised Dirichlet energy - that was used for the optimisation of some neural networks. In this notes we use the notion of $Γ$-convergence to show that ReLU networks of growing architecture that are trained with respect to suitably regularised Dirichlet energies converge to the true solution of the Poisson problem. We discuss how this approach generalises to arbitrary variational problems under certain universality assumptions of neural networks and see that this covers some nonlinear stationary PDEs like the $p$-Laplace. △ Less

Submitted 10 January, 2020; v1 submitted 9 December, 2019; originally announced December 2019.

Comments: 10 pages, work in progress, corrected typos in the second version

arXiv:1912.02272 [pdf, other]

doi 10.1016/j.cpc.2020.107663

Multivariate Rational Approximation

Authors: Anthony P. Austin, Mohan Krishnamoorthy, Sven Leyffer, Stephen Mrenna, Juliane Muller, Holger Schulz

Abstract: We present two approaches for computing rational approximations to multivariate functions, motivated by their effectiveness as surrogate models for high-energy physics (HEP) applications. Our first approach builds on the Stieltjes process to efficiently and robustly compute the coefficients of the rational approximation. Our second approach is based on an optimization formulation that allows us to… ▽ More We present two approaches for computing rational approximations to multivariate functions, motivated by their effectiveness as surrogate models for high-energy physics (HEP) applications. Our first approach builds on the Stieltjes process to efficiently and robustly compute the coefficients of the rational approximation. Our second approach is based on an optimization formulation that allows us to include structural constraints on the rational approximation, resulting in a semi-infinite optimization problem that we solve using an outer approximation approach. We present results for synthetic and real-life HEP data, and we compare the approximation quality of our approaches with that of traditional polynomial approximations. △ Less

Submitted 2 December, 2019; originally announced December 2019.

MSC Class: 2010 MSC: 41A20; 41A63; 65D15

arXiv:1910.12755 [pdf, ps, other]

Two recent p-adic approaches towards the (effective) Mordell conjecture

Authors: Jennifer S. Balakrishnan, Alex J. Best, Francesca Bianchi, Brian Lawrence, J. Steffen Müller, Nicholas Triantafillou, Jan Vonk

Abstract: We give an introductory account of two recent approaches towards an effective proof of the Mordell conjecture, due to Lawrence--Venkatesh and Kim. The latter method, which is usually called the method of Chabauty--Kim or non-abelian Chabauty in the literature, has the advantage that in some cases it has been turned into an effective method to determine the set of rational points on a curve, and we… ▽ More We give an introductory account of two recent approaches towards an effective proof of the Mordell conjecture, due to Lawrence--Venkatesh and Kim. The latter method, which is usually called the method of Chabauty--Kim or non-abelian Chabauty in the literature, has the advantage that in some cases it has been turned into an effective method to determine the set of rational points on a curve, and we illustrate this by presenting three new examples of modular curves where this set can be determined. △ Less

Submitted 19 January, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

Comments: Corrected Theorem 6.3 and its proof. Added short discussion of non-proper hyperbolic curves

arXiv:1910.09599 [pdf, ps, other]

On the space-time expressivity of ResNets

Authors: Johannes Müller

Abstract: Residual networks (ResNets) are a deep learning architecture that substantially improved the state of the art performance in certain supervised learning tasks. Since then, they have received continuously growing attention. ResNets have a recursive structure $x_{k+1} = x_k + R_k(x_k)$ where $R_k$ is a neural network called a residual block. This structure can be seen as the Euler discretisation of… ▽ More Residual networks (ResNets) are a deep learning architecture that substantially improved the state of the art performance in certain supervised learning tasks. Since then, they have received continuously growing attention. ResNets have a recursive structure $x_{k+1} = x_k + R_k(x_k)$ where $R_k$ is a neural network called a residual block. This structure can be seen as the Euler discretisation of an associated ordinary differential equation (ODE) which is called a neural ODE. Recently, ResNets were proposed as the space-time approximation of ODEs which are not of this neural type. To elaborate this connection we show that by increasing the number of residual blocks as well as their expressivity the solution of an arbitrary ODE can be approximated in space and time simultaneously by deep ReLU ResNets. Further, we derive estimates on the complexity of the residual blocks required to obtain a prescribed accuracy under certain regularity assumptions. △ Less

Submitted 27 February, 2020; v1 submitted 21 October, 2019; originally announced October 2019.

Comments: Extended abstract of master's thesis; presented at the ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations; full version of the thesis available under https://freidok.uni-freiburg.de/data/151788

arXiv:1910.06860 [pdf, ps, other]

doi 10.1016/j.mbs.2020.108320

Exact and approximate formulas for contact tracing on random trees

Authors: Augustine Okolie, Johannes Müller

Abstract: We consider a stochastic susceptible-infected-recovered (SIR) model with contact tracing on random trees and on the configuration model. On a rooted tree, where initially all individuals are susceptible apart from the root which is infected, we are able to find exact formulas for the distribution of the infectious period. Thereto, we show how to extend the existing theory for contact tracing in ho… ▽ More We consider a stochastic susceptible-infected-recovered (SIR) model with contact tracing on random trees and on the configuration model. On a rooted tree, where initially all individuals are susceptible apart from the root which is infected, we are able to find exact formulas for the distribution of the infectious period. Thereto, we show how to extend the existing theory for contact tracing in homogeneously mixing populations to trees. Based on these formulas, we discuss the influence of randomness in the tree and the basic reproduction. We find the well known results for the homogeneously mixing case as a limit of the present model (tree-shaped contact graph). Furthermore, we develop approximate mean field equations for the dynamics on trees, and - using the message passing method - also for the configuration model. The interpretation and implications of the results are discussed. △ Less

Submitted 22 February, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

Comments: 24 pages, 9 figures

MSC Class: 92Bxx (Primary) 92B05; 05Cxx (Secondary)

Journal ref: Mathematical Biosciences, 108320 (2020)

arXiv:1910.04653 [pdf, ps, other]

Explicit quadratic Chabauty over number fields

Authors: Jennifer S. Balakrishnan, Amnon Besser, Francesca Bianchi, J. Steffen Müller

Abstract: We generalize the explicit quadratic Chabauty techniques for integral points on odd degree hyperelliptic curves and for rational points on genus 2 bielliptic curves to arbitrary number fields using restriction of scalars. This is achieved by combining equations coming from Siksek's extension of classical Chabauty with equations defined in terms of p-adic heights attached to independent continuous… ▽ More We generalize the explicit quadratic Chabauty techniques for integral points on odd degree hyperelliptic curves and for rational points on genus 2 bielliptic curves to arbitrary number fields using restriction of scalars. This is achieved by combining equations coming from Siksek's extension of classical Chabauty with equations defined in terms of p-adic heights attached to independent continuous idele class characters. We give several examples to show the practicality of our methods. △ Less

Submitted 15 June, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

Comments: Fixed minor issues following the referee's suggestions; 33 pages

MSC Class: Primary 11G30; Secondary 11S80; 11Y50; 14G40

Showing 1–50 of 127 results for author: Müller, J