You are currently browsing the category archive for the ‘math.MP’ category.

This post is an unofficial sequel to one of my first blog posts from 2007, which was entitled “Quantum mechanics and Tomb Raider“.

One of the oldest and most famous allegories is Plato’s allegory of the cave. This allegory centers around a group of people chained to a wall in a cave that cannot see themselves or each other, but only the two-dimensional shadows of themselves cast on the wall in front of them by some light source they cannot directly see. Because of this, they identify reality with this two-dimensional representation, and have significant conceptual difficulties in trying to view themselves (or the world as a whole) as three-dimensional, until they are freed from the cave and able to venture into the sunlight.

There is a similar conceptual difficulty when trying to understand Einstein’s theory of special relativity (and more so for general relativity, but let us focus on special relativity for now). We are very much accustomed to thinking of reality as a three-dimensional space endowed with a Euclidean geometry that we traverse through in time, but in order to have the clearest view of the universe of special relativity it is better to think of reality instead as a four-dimensional spacetime that is endowed instead with a Minkowski geometry, which mathematically is similar to a (four-dimensional) Euclidean space but with a crucial change of sign in the underlying metric. Indeed, whereas the distance {ds} between two points in Euclidean space {{\bf R}^3} is given by the three-dimensional Pythagorean theorem

\displaystyle  ds^2 = dx^2 + dy^2 + dz^2

under some standard Cartesian coordinate system {(x,y,z)} of that space, and the distance {ds} in a four-dimensional Euclidean space {{\bf R}^4} would be similarly given by

\displaystyle  ds^2 = dx^2 + dy^2 + dz^2 + du^2

under a standard four-dimensional Cartesian coordinate system {(x,y,z,u)}, the spacetime interval {ds} in Minkowski space is given by

\displaystyle  ds^2 = dx^2 + dy^2 + dz^2 - c^2 dt^2

(though in many texts the opposite sign convention {ds^2 = -dx^2 -dy^2 - dz^2 + c^2dt^2} is preferred) in spacetime coordinates {(x,y,z,t)}, where {c} is the speed of light. The geometry of Minkowski space is then quite similar algebraically to the geometry of Euclidean space (with the sign change replacing the traditional trigonometric functions {\sin, \cos, \tan}, etc. by their hyperbolic counterparts {\sinh, \cosh, \tanh}, and with various factors involving “{c}” inserted in the formulae), but also has some qualitative differences to Euclidean space, most notably a causality structure connected to light cones that has no obvious counterpart in Euclidean space.

That said, the analogy between Minkowski space and four-dimensional Euclidean space is strong enough that it serves as a useful conceptual aid when first learning special relativity; for instance the excellent introductory text “Spacetime physics” by Taylor and Wheeler very much adopts this view. On the other hand, this analogy doesn’t directly address the conceptual problem mentioned earlier of viewing reality as a four-dimensional spacetime in the first place, rather than as a three-dimensional space that objects move around in as time progresses. Of course, part of the issue is that we aren’t good at directly visualizing four dimensions in the first place. This latter problem can at least be easily addressed by removing one or two spatial dimensions from this framework – and indeed many relativity texts start with the simplified setting of only having one spatial dimension, so that spacetime becomes two-dimensional and can be depicted with relative ease by spacetime diagrams – but still there is conceptual resistance to the idea of treating time as another spatial dimension, since we clearly cannot “move around” in time as freely as we can in space, nor do we seem able to easily “rotate” between the spatial and temporal axes, the way that we can between the three coordinate axes of Euclidean space.

With this in mind, I thought it might be worth attempting a Plato-type allegory to reconcile the spatial and spacetime views of reality, in a way that can be used to describe (analogues of) some of the less intuitive features of relativity, such as time dilation, length contraction, and the relativity of simultaneity. I have (somewhat whimsically) decided to place this allegory in a Tolkienesque fantasy world (similarly to how my previous allegory to describe quantum mechanics was phrased in a world based on the computer game “Tomb Raider”). This is something of an experiment, and (like any other analogy) the allegory will not be able to perfectly capture every aspect of the phenomenon it is trying to represent, so any feedback to improve the allegory would be appreciated.

Read the rest of this entry »

Let {u: {\bf R}^3 \rightarrow {\bf R}^3} be a divergence-free vector field, thus {\nabla \cdot u = 0}, which we interpret as a velocity field. In this post we will proceed formally, largely ignoring the analytic issues of whether the fields in question have sufficient regularity and decay to justify the calculations. The vorticity field {\omega: {\bf R}^3 \rightarrow {\bf R}^3} is then defined as the curl of the velocity:

\displaystyle  \omega = \nabla \times u.

(From a differential geometry viewpoint, it would be more accurate (especially in other dimensions than three) to define the vorticity as the exterior derivative {\omega = d(g \cdot u)} of the musical isomorphism {g \cdot u} of the Euclidean metric {g} applied to the velocity field {u}; see these previous lecture notes. However, we will not need this geometric formalism in this post.)

Assuming suitable regularity and decay hypotheses of the velocity field {u}, it is possible to recover the velocity from the vorticity as follows. From the general vector identity {\nabla \times \nabla \times X = \nabla(\nabla \cdot X) - \Delta X} applied to the velocity field {u}, we see that

\displaystyle  \nabla \times \omega = -\Delta u

and thus (by the commutativity of all the differential operators involved)

\displaystyle  u = - \nabla \times \Delta^{-1} \omega.

Using the Newton potential formula

\displaystyle  -\Delta^{-1} \omega(x) := \frac{1}{4\pi} \int_{{\bf R}^3} \frac{\omega(y)}{|x-y|}\ dy

and formally differentiating under the integral sign, we obtain the Biot-Savart law

\displaystyle  u(x) = \frac{1}{4\pi} \int_{{\bf R}^3} \frac{\omega(y) \times (x-y)}{|x-y|^3}\ dy. \ \ \ \ \ (1)

This law is of fundamental importance in the study of incompressible fluid equations, such as the Euler equations

\displaystyle  \partial_t u + (u \cdot \nabla) u = -\nabla p; \quad \nabla \cdot u = 0

since on applying the curl operator one obtains the vorticity equation

\displaystyle  \partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u \ \ \ \ \ (2)

and then by substituting (1) one gets an autonomous equation for the vorticity field {\omega}. Unfortunately, this equation is non-local, due to the integration present in (1).

In a recent work, it was observed by Elgindi that in a certain regime, the Biot-Savart law can be approximated by a more “low rank” law, which makes the non-local effects significantly simpler in nature. This simplification was carried out in spherical coordinates, and hinged on a study of the invertibility properties of a certain second order linear differential operator in the latitude variable {\theta}; however in this post I would like to observe that the approximation can also be seen directly in Cartesian coordinates from the classical Biot-Savart law (1). As a consequence one can also initiate the beginning of Elgindi’s analysis in constructing somewhat regular solutions to the Euler equations that exhibit self-similar blowup in finite time, though I have not attempted to execute the entirety of the analysis in this setting.

Elgindi’s approximation applies under the following hypotheses:

  • (i) (Axial symmetry without swirl) The velocity field {u} is assumed to take the form

    \displaystyle  u(x_1,x_2,x_3) = ( u_r(r,x_3) \frac{x_1}{r}, u_r(r,x_3) \frac{x_2}{r}, u_3(r,x_3) ) \ \ \ \ \ (3)

    for some functions {u_r, u_3: [0,+\infty) \times {\bf R} \rightarrow {\bf R}} of the cylindrical radial variable {r := \sqrt{x_1^2+x_2^2}} and the vertical coordinate {x_3}. As a consequence, the vorticity field {\omega} takes the form

    \displaystyle  \omega(x_1,x_2,x_3) = (\omega_{r3}(r,x_3) \frac{x_2}{r}, \omega_{r3}(r,x_3) \frac{-x_1}{r}, 0) \ \ \ \ \ (4)

    where {\omega_{r3}: [0,+\infty) \times {\bf R} \rightarrow {\bf R}} is the field

    \displaystyle  \omega_{r3} = \partial_r u_3 - \partial_3 u_r.

  • (ii) (Odd symmetry) We assume that {u_3(r,-x_3) = -u_3(r,x_3)} and {u_r(r,-x_3)=u_r(r,x_3)}, so that {\omega_{r3}(r,-x_3)=\omega_{r3}(r,x_3)}.

A model example of a divergence-free vector field obeying these properties (but without good decay at infinity) is the linear vector field

\displaystyle  X(x) = (x_1, x_2, -2x_3) \ \ \ \ \ (5)

which is of the form (3) with {u_r(r,x_3) = r} and {u_3(r,x_3) = -2x_3}. The associated vorticity {\omega} vanishes.

We can now give an illustration of Elgindi’s approximation:

Proposition 1 (Elgindi’s approximation) Under the above hypotheses (and assuing suitable regularity and decay), we have the pointwise bounds

\displaystyle  u(x) = \frac{1}{2} {\mathcal L}_{12}(\omega)(|x|) X(x) + O( |x| \|\omega\|_{L^\infty({\bf R}^3)} )

for any {x \in {\bf R}^3}, where {X} is the vector field (5), and {{\mathcal L}_{12}(\omega): {\bf R}^+ \rightarrow {\bf R}} is the scalar function

\displaystyle  {\mathcal L}_{12}(\omega)(\rho) := \frac{3}{4\pi} \int_{|y| \geq \rho} \frac{r y_3}{|y|^5} \omega_{r3}(r,y_3)\ dy.

Thus under the hypotheses (i), (ii), and assuming that {\omega} is slowly varying, we expect {u} to behave like the linear vector field {X} modulated by a radial scalar function. In applications one needs to control the error in various function spaces instead of pointwise, and with {\omega} similarly controlled in other function space norms than the {L^\infty} norm, but this proposition already gives a flavour of the approximation. If one uses spherical coordinates

\displaystyle  \omega_{r3}( \rho \cos \theta, \rho \sin \theta ) = \Omega( \rho, \theta )

then we have (using the spherical change of variables formula {dy = \rho^2 \cos \theta d\rho d\theta d\phi} and the odd nature of {\Omega})

\displaystyle  {\mathcal L}_{12}(\omega) = L_{12}(\Omega),

where

\displaystyle L_{12}(\Omega)(\rho) = 3 \int_\rho^\infty \int_0^{\pi/2} \frac{\Omega(r, \theta) \sin(\theta) \cos^2(\theta)}{r}\ d\theta dr

is the operator introduced in Elgindi’s paper.

Proof: By a limiting argument we may assume that {x} is non-zero, and we may normalise {\|\omega\|_{L^\infty({\bf R}^3)}=1}. From the triangle inequality we have

\displaystyle  \int_{|y| \leq 10|x|} \frac{\omega(y) \times (x-y)}{|x-y|^3}\ dy \leq \int_{|y| \leq 10|x|} \frac{1}{|x-y|^2}\ dy

\displaystyle  \leq \int_{|z| \leq 11 |x|} \frac{1}{|z|^2}\ dz

\displaystyle  = O( |x| )

and hence by (1)

\displaystyle  u(x) = \frac{1}{4\pi} \int_{|y| > 10|x|} \frac{\omega(y) \times (x-y)}{|x-y|^3}\ dy + O(|x|).

In the regime {|y| > 2|x|} we may perform the Taylor expansion

\displaystyle  \frac{x-y}{|x-y|^3} = \frac{x-y}{|y|^3} (1 - \frac{2 x \cdot y}{|y|^2} + \frac{|x|^2}{|y|^2})^{-3/2}

\displaystyle  = \frac{x-y}{|y|^3} (1 + \frac{3 x \cdot y}{|y|^2} + O( \frac{|x|^2}{|y|^2} ) )

\displaystyle  = -\frac{y}{|y|^3} + \frac{x}{|y|^3} - \frac{3 (x \cdot y) y}{|y|^5} + O( \frac{|x|^2}{|y|^4} ).

Since

\displaystyle  \int_{|y| > 10|x|} \frac{|x|^2}{|y|^4}\ dy = O(|x|)

we see from the triangle inequality that the error term contributes {O(|x|)} to {u(x)}. We thus have

\displaystyle  u(x) = -A_0(x) + A_1(x) - 3A'_1(x) + O(|x|)

where {A_0} is the constant term

\displaystyle  A_0 := \int_{|y| > 10|x|} \frac{\omega(y) \times y}{|y|^3}\ dy,

and {A_1, A'_1} are the linear term

\displaystyle  A_1 := \int_{|y| > 10|x|} \frac{\omega(y) \times x}{|y|^3}\ dy,

\displaystyle  A'_1 := \int_{|y| > 10|x|} (x \cdot y) \frac{\omega(y) \times y}{|y|^5}\ dy.

By the hypotheses (i), (ii), we have the symmetries

\displaystyle  \omega(y_1,y_2,-y_3) = - \omega(y_1,y_2,y_3) \ \ \ \ \ (6)

and

\displaystyle  \omega(-y_1,-y_2,y_3) = - \omega(y_1,y_2,y_3) \ \ \ \ \ (7)

and hence also

\displaystyle  \omega(-y_1,-y_2,-y_3) = \omega(y_1,y_2,y_3). \ \ \ \ \ (8)

The even symmetry (8) ensures that the integrand in {A_0} is odd, so {A_0} vanishes. The symmetry (6) or (7) similarly ensures that {\int_{|y| > 10|x|} \frac{\omega(y)}{|y|^3}\ dy = 0}, so {A_1} vanishes. Since {\int_{|x| < y \leq 10|x|} \frac{|x \cdot y| |y|}{|y|^5}\ dy = O( |x| )}, we conclude that

\displaystyle  \omega(x) = -3\int_{|y| \geq |x|} (x \cdot y) \frac{\omega(y) \times y}{|y|^5}\ dy + O(|x|).

Using (4), the right-hand side is

\displaystyle  -3\int_{|y| \geq |x|} (x_1 y_1 + x_2 y_2 + x_3 y_3) \frac{\omega_{r3}(r,y_3) (-y_1 y_3, -y_2 y_3, y_1^2+y_2^2)}{r|y|^5}\ dy

\displaystyle + O(|x|)

where {r := \sqrt{y_1^2+y_2^2}}. Because of the odd nature of {\omega_{r3}}, only those terms with one factor of {y_3} give a non-vanishing contribution to the integral. Using the rotation symmetry {(y_1,y_2,y_3) \mapsto (-y_2,y_1,y_3)} we also see that any term with a factor of {y_1 y_2} also vanishes. We can thus simplify the above expression as

\displaystyle  -3\int_{|y| \geq |x|} \frac{\omega_{r3}(r,y_3) (-x_1 y_1^2 y_3, -x_2 y_2^2 y_3, x_3 (y_1^2+y_2^2) y_3)}{r|y|^5}\ dy + O(|x|).

Using the rotation symmetry {(y_1,y_2,y_3) \mapsto (-y_2,y_1,y_3)} again, we see that the term {y_1^2} in the first component can be replaced by {y_2^2} or by {\frac{1}{2} (y_1^2+y_2^2) = \frac{r^2}{2}}, and similarly for the {y_2^2} term in the second component. Thus the above expression is

\displaystyle  \frac{3}{2} \int_{|y| \geq |x|} \frac{\omega_{r3}(r,y_3) (x_1 , x_2, -2x_3) r y_3}{|y|^5}\ dy + O(|x|)

giving the claim. \Box

Example 2 Consider the divergence-free vector field {u := \nabla \times \psi}, where the vector potential {\psi} takes the form

\displaystyle  \psi(x_1,x_2,x_3) := (x_2 x_3, -x_1 x_3, 0) \eta(|x|)

for some bump function {\eta: {\bf R} \rightarrow {\bf R}} supported in {(0,+\infty)}. We can then calculate

\displaystyle  u(x_1,x_2,x_3) = X(x) \eta(|x|) + (x_1 x_3, x_2 x_3, -x_1^2-x_2^2) \frac{\eta'(|x|) x_3}{|x|}.

and

\displaystyle  \omega(x_1,x_2,x_3) = (-6x_2 x_3, 6x_1 x_3, 0) \frac{\eta'(|x|)}{|x|} + (-x_2 x_3, x_1 x_3, 0) \eta''(|x|).

In particular the hypotheses (i), (ii) are satisfied with

\displaystyle  \omega_{r3}(r,x_3) = - 6 \eta'(|x|) \frac{x_3 r}{|x|} - \eta''(|x|) x_3 r.

One can then calculate

\displaystyle  L_{12}(\omega)(\rho) = -\frac{3}{4\pi} \int_{|y| \geq \rho} (6\frac{\eta'(|y|)}{|y|^6} + \frac{\eta''(|y|)}{|y|^5}) r^2 y_3^2\ dy

\displaystyle  = -\frac{2}{5} \int_\rho^\infty 6\eta'(s) + s\eta''(s)\ ds

\displaystyle  = 2\eta(\rho) + \frac{2}{5} \rho \eta'(\rho).

If we take the specific choice

\displaystyle  \eta(\rho) = \varphi( \rho^\alpha )

where {\varphi} is a fixed bump function supported some interval {[c,C] \subset (0,+\infty)} and {\alpha>0} is a small parameter (so that {\eta} is spread out over the range {\rho \in [c^{1/\alpha},C^{1/\alpha}]}), then we see that

\displaystyle  \| \omega \|_{L^\infty} = O( \alpha )

(with implied constants allowed to depend on {\varphi}),

\displaystyle  L_{12}(\omega)(\rho) = 2\eta(\rho) + O(\alpha),

and

\displaystyle  u = X(x) \eta(|x|) + O( \alpha |x| ),

which is completely consistent with Proposition 1.

One can use this approximation to extract a plausible ansatz for a self-similar blowup to the Euler equations. We let {\alpha>0} be a small parameter and let {\omega_{rx_3}} be a time-dependent vorticity field obeying (i), (ii) of the form

\displaystyle  \omega_{rx_3}(t,r,x_3) \approx \alpha \Omega( t, R ) \mathrm{sgn}(x_3)

where {R := |x|^\alpha = (r^2+x_3^2)^{\alpha/2}} and {\Omega: {\bf R} \times [0,+\infty) \rightarrow {\bf R}} is a smooth field to be chosen later. Admittedly the signum function {\mathrm{sgn}} is not smooth at {x_3}, but let us ignore this issue for now (to rigorously make an ansatz one will have to smooth out this function a little bit; Elgindi uses the choice {(|\sin \theta| \cos^2 \theta)^{\alpha/3} \mathrm{sgn}(x_3)}, where {\theta := \mathrm{arctan}(x_3/r)}). With this ansatz one may compute

\displaystyle  {\mathcal L}_{12}(\omega(t))(\rho) \approx \frac{3\alpha}{2\pi} \int_{|y| \geq \rho; y_3 \geq 0} \Omega(t,R) \frac{r y_3}{|y|^5}\ dy

\displaystyle  = \alpha \int_\rho^\infty \Omega(t, s^\alpha) \frac{ds}{s}

\displaystyle  = \int_{\rho^\alpha}^\infty \Omega(t,s) \frac{ds}{s}.

By Proposition 1, we thus expect to have the approximation

\displaystyle  u(t,x) \approx \frac{1}{2} \int_{|x|^\alpha}^\infty \Omega(t,s) \frac{ds}{s} X(x).

We insert this into the vorticity equation (2). The transport term {(u \cdot \nabla) \omega} will be expected to be negligible because {R}, and hence {\omega_{rx_3}}, is slowly varying (the discontinuity of {\mathrm{sgn}(x_3)} will not be encountered because the vector field {X} is parallel to this singularity). The modulating function {\frac{1}{2} \int_{|x|^\alpha}^\infty \Omega(t,s) \frac{ds}{s}} is similarly slowly varying, so derivatives falling on this function should be lower order. Neglecting such terms, we arrive at the approximation

\displaystyle  (\omega \cdot \nabla) u \approx \frac{1}{2} \int_{|x|^\alpha}^\infty \Omega(t,s) \frac{ds}{s} \omega

and so in the limit {\alpha \rightarrow 0} we expect obtain a simple model equation for the evolution of the vorticity envelope {\Omega}:

\displaystyle  \partial_t \Omega(t,R) = \frac{1}{2} \int_R^\infty \Omega(t,S) \frac{dS}{S} \Omega(t,R).

If we write {L(t,R) := \int_R^\infty \Omega(t,S)\frac{dS}{S}} for the logarithmic primitive of {\Omega}, then we have {\Omega = - R \partial_R L} and hence

\displaystyle  \partial_t (R \partial_R L) = \frac{1}{2} L (R \partial_R L)

which integrates to the Ricatti equation

\displaystyle  \partial_t L = \frac{1}{4} L^2

which can be explicitly solved as

\displaystyle  L(t,R) = \frac{2}{f(R) - t/2}

where {f(R)} is any function of {R} that one pleases. (In Elgindi’s work a time dilation is used to remove the unsightly factor of {1/2} appearing here in the denominator.) If for instance we set {f(R) = 1+R}, we obtain the self-similar solution

\displaystyle  L(t,R) = \frac{2}{1+R-t/2}

and then on applying {-R \partial_R}

\displaystyle  \Omega(t,R) = \frac{2R}{(1+R-t/2)^2}.

Thus, we expect to be able to construct a self-similar blowup to the Euler equations with a vorticity field approximately behaving like

\displaystyle  \omega(t,x) \approx \alpha \frac{2R}{(1+R-t/2)^2} \mathrm{sgn}(x_3) (\frac{x_2}{r}, -\frac{x_1}{r}, 0)

and velocity field behaving like

\displaystyle  u(t,x) \approx \frac{1}{1+R-t/2} X(x).

In particular, {u} would be expected to be of regularity {C^{1,\alpha}} (and smooth away from the origin), and blows up in (say) {L^\infty} norm at time {t/2 = 1}, and one has the self-similarity

\displaystyle  u(t,x) = (1-t/2)^{\frac{1}{\alpha}-1} u( 0, \frac{x}{(1-t/2)^{1/\alpha}} )

and

\displaystyle  \omega(t,x) = (1-t/2)^{-1} \omega( 0, \frac{x}{(1-t/2)^{1/\alpha}} ).

A self-similar solution of this approximate shape is in fact constructed rigorously in Elgindi’s paper (using spherical coordinates instead of the Cartesian approach adopted here), using a nonlinear stability analysis of the above ansatz. It seems plausible that one could also carry out this stability analysis using this Cartesian coordinate approach, although I have not tried to do this in detail.

I was recently asked to contribute a short comment to Nature Reviews Physics, as part of a series of articles on fluid dynamics on the occasion of the 200th anniversary (this August) of the birthday of George Stokes.  My contribution is now online as “Searching for singularities in the Navier–Stokes equations“, where I discuss the global regularity problem for Navier-Stokes and my thoughts on how one could try to construct a solution that blows up in finite time via an approximately discretely self-similar “fluid computer”.  (The rest of the series does not currently seem to be available online, but I expect they will become so shortly.)

 

This coming fall quarter, I am teaching a class on topics in the mathematical theory of incompressible fluid equations, focusing particularly on the incompressible Euler and Navier-Stokes equations. These two equations are by no means the only equations used to model fluids, but I will focus on these two equations in this course to narrow the focus down to something manageable. I have not fully decided on the choice of topics to cover in this course, but I would probably begin with some core topics such as local well-posedness theory and blowup criteria, conservation laws, and construction of weak solutions, then move on to some topics such as boundary layers and the Prandtl equations, the Euler-Poincare-Arnold interpretation of the Euler equations as an infinite dimensional geodesic flow, and some discussion of the Onsager conjecture. I will probably also continue to more advanced and recent topics in the winter quarter.

In this initial set of notes, we begin by reviewing the physical derivation of the Euler and Navier-Stokes equations from the first principles of Newtonian mechanics, and specifically from Newton’s famous three laws of motion. Strictly speaking, this derivation is not needed for the mathematical analysis of these equations, which can be viewed if one wishes as an arbitrarily chosen system of partial differential equations without any physical motivation; however, I feel that the derivation sheds some insight and intuition on these equations, and is also worth knowing on purely intellectual grounds regardless of its mathematical consequences. I also find it instructive to actually see the journey from Newton’s law

\displaystyle F = ma

to the seemingly rather different-looking law

\displaystyle \partial_t u + (u \cdot \nabla) u = -\nabla p + \nu \Delta u

\displaystyle \nabla \cdot u = 0

for incompressible Navier-Stokes (or, if one drops the viscosity term {\nu \Delta u}, the Euler equations).

Our discussion in this set of notes is physical rather than mathematical, and so we will not be working at mathematical levels of rigour and precision. In particular we will be fairly casual about interchanging summations, limits, and integrals, we will manipulate approximate identities {X \approx Y} as if they were exact identities (e.g., by differentiating both sides of the approximate identity), and we will not attempt to verify any regularity or convergence hypotheses in the expressions being manipulated. (The same holds for the exercises in this text, which also do not need to be justified at mathematical levels of rigour.) Of course, once we resume the mathematical portion of this course in subsequent notes, such issues will be an important focus of careful attention. This is a basic division of labour in mathematical modeling: non-rigorous heuristic reasoning is used to derive a mathematical model from physical (or other “real-life”) principles, but once a precise model is obtained, the analysis of that model should be completely rigorous if at all possible (even if this requires applying the model to regimes which do not correspond to the original physical motivation of that model). See the discussion by John Ball quoted at the end of these slides of Gero Friesecke for an expansion of these points.

Note: our treatment here will differ slightly from that presented in many fluid mechanics texts, in that it will emphasise first-principles derivations from many-particle systems, rather than relying on bulk laws of physics, such as the laws of thermodynamics, which we will not cover here. (However, the derivations from bulk laws tend to be more robust, in that they are not as reliant on assumptions about the particular interactions between particles. In particular, the physical hypotheses we assume in this post are probably quite a bit stronger than the minimal assumptions needed to justify the Euler or Navier-Stokes equations, which can hold even in situations in which one or more of the hypotheses assumed here break down.)

Read the rest of this entry »

The 2014 Fields medallists have just been announced as (in alphabetical order of surname) Artur Avila, Manjul Bhargava, Martin Hairer, and Maryam Mirzakhani (see also these nice video profiles for the winners, which is a new initiative of the IMU and the Simons foundation). This time four years ago, I wrote a blog post discussing one result from each of the 2010 medallists; I thought I would try to repeat the exercise here, although the work of the medallists this time around is a little bit further away from my own direct area of expertise than last time, and so my discussion will unfortunately be a bit superficial (and possibly not completely accurate) in places. As before, I am picking these results based on my own idiosyncratic tastes, and they should not be viewed as necessarily being the “best” work of these medallists. (See also the press releases for Avila, Bhargava, Hairer, and Mirzakhani.)

Artur Avila works in dynamical systems and in the study of Schrödinger operators. The work of Avila that I am most familiar with is his solution with Svetlana Jitormiskaya of the ten martini problem of Kac, the solution to which (according to Barry Simon) he offered ten martinis for, hence the name. (The problem had also been previously posed in the work of Azbel and of Hofstadter.) The problem involves perhaps the simplest example of a Schrödinger operator with non-trivial spectral properties, namely the almost Mathieu operator {H^{\lambda,\alpha}_\omega: \ell^2({\bf Z}) \rightarrow \ell^2({\bf Z})} defined for parameters {\alpha,\omega \in {\bf R}/{\bf Z}} and {\lambda>0} by a discrete one-dimensional Schrödinger operator with cosine potential:

\displaystyle (H^{\lambda,\alpha}_\omega u)_n := u_{n+1} + u_{n-1} + 2\lambda (\cos 2\pi(\theta+n\alpha)) u_n.

This is a bounded self-adjoint operator and thus has a spectrum {\sigma( H^{\lambda,\alpha}_\omega )} that is a compact subset of the real line; it arises in a number of physical contexts, most notably in the theory of the integer quantum Hall effect, though I will not discuss these applications here. Remarkably, the structure of this spectrum depends crucially on the Diophantine properties of the frequency {\alpha}. For instance, if {\alpha = p/q} is a rational number, then the operator is periodic with period {q}, and then basic (discrete) Floquet theory tells us that the spectrum is simply the union of {q} (possibly touching) intervals. But for irrational {\alpha} (in which case the spectrum is independent of the phase {\theta}), the situation is much more fractal in nature, for instance in the critical case {\lambda=1} the spectrum (as a function of {\alpha}) gives rise to the Hofstadter butterfly. The “ten martini problem” asserts that for every irrational {\alpha} and every choice of coupling constant {\lambda > 0}, the spectrum is homeomorphic to a Cantor set. Prior to the work of Avila and Jitormiskaya, there were a number of partial results on this problem, notably the result of Puig establishing Cantor spectrum for a full measure set of parameters {(\lambda,\alpha)}, as well as results requiring a perturbative hypothesis, such as {\lambda} being very small or very large. The result was also already known for {\alpha} being either very close to rational (i.e. a Liouville number) or very far from rational (a Diophantine number), although the analyses for these two cases failed to meet in the middle, leaving some cases untreated. The argument uses a wide variety of existing techniques, both perturbative and non-perturbative, to attack this problem, as well as an amusing argument by contradiction: they assume (in certain regimes) that the spectrum fails to be a Cantor set, and use this hypothesis to obtain additional Lipschitz control on the spectrum (as a function of the frequency {\alpha}), which they can then use (after much effort) to improve existing arguments and conclude that the spectrum was in fact Cantor after all!

Manjul Bhargava produces amazingly beautiful mathematics, though most of it is outside of my own area of expertise. One part of his work that touches on an area of my own interest (namely, random matrix theory) is his ongoing work with many co-authors on modeling (both conjecturally and rigorously) the statistics of various key number-theoretic features of elliptic curves (such as their rank, their Selmer group, or their Tate-Shafarevich groups). For instance, with Kane, Lenstra, Poonen, and Rains, Manjul has proposed a very general random matrix model that predicts all of these statistics (for instance, predicting that the {p}-component of the Tate-Shafarevich group is distributed like the cokernel of a certain random {p}-adic matrix, very much in the spirit of the Cohen-Lenstra heuristics discussed in this previous post). But what is even more impressive is that Manjul and his coauthors have been able to verify several non-trivial fragments of this model (e.g. showing that certain moments have the predicted asymptotics), giving for the first time non-trivial upper and lower bounds for various statistics, for instance obtaining lower bounds on how often an elliptic curve has rank {0} or rank {1}, leading most recently (in combination with existing work of Gross-Zagier and of Kolyvagin, among others) to his amazing result with Skinner and Zhang that at least {66\%} of all elliptic curves over {{\bf Q}} (ordered by height) obey the Birch and Swinnerton-Dyer conjecture. Previously it was not even known that a positive proportion of curves obeyed the conjecture. This is still a fair ways from resolving the conjecture fully (in particular, the situation with the presumably small number of curves of rank {2} and higher is still very poorly understood, and the theory of Gross-Zagier and Kolyvagin that this work relies on, which was initially only available for {{\bf Q}}, has only been extended to totally real number fields thus far, by the work of Zhang), but it certainly does provide hope that the conjecture could be within reach in a statistical sense at least.

Martin Hairer works in at the interface between probability and partial differential equations, and in particular in the theory of stochastic differential equations (SDEs). The result of his that is closest to my own interests is his remarkable demonstration with Jonathan Mattingly of unique invariant measure for the two-dimensional stochastically forced Navier-Stokes equation

\displaystyle \partial_t u + (u \cdot \nabla u) = \nu \Delta u - \nabla p + \xi

\displaystyle \nabla \cdot u = 0

on the two-torus {({\bf R}/{\bf Z})^2}, where {\xi} is a Gaussian field that forces a fixed set of frequencies. It is expected that for any reasonable choice of initial data, the solution to this equation should asymptotically be distributed according to Kolmogorov’s power law, as discussed in this previous post. This is still far from established rigorously (although there are some results in this direction for dyadic models, see e.g. this paper of Cheskidov, Shvydkoy, and Friedlander). However, Hairer and Mattingly were able to show that there was a unique probability distribution to almost every initial data would converge to asymptotically; by the ergodic theorem, this is equivalent to demonstrating the existence and uniqueness of an invariant measure for the flow. Existence can be established using standard methods, but uniqueness is much more difficult. One of the standard routes to uniqueness is to establish a “strong Feller property” that enforces some continuity on the transition operators; among other things, this would mean that two ergodic probability measures with intersecting supports would in fact have a non-trivial common component, contradicting the ergodic theorem (which forces different ergodic measures to be mutually singular). Since all ergodic measures for Navier-Stokes can be seen to contain the origin in their support, this would give uniqueness. Unfortunately, the strong Feller property is unlikely to hold in the infinite-dimensional phase space for Navier-Stokes; but Hairer and Mattingly develop a clean abstract substitute for this property, which they call the asymptotic strong Feller property, which is again a regularity property on the transition operator; this in turn is then demonstrated by a careful application of Malliavin calculus.

Maryam Mirzakhani has mostly focused on the geometry and dynamics of Teichmuller-type moduli spaces, such as the moduli space of Riemann surfaces with a fixed genus and a fixed number of cusps (or with a fixed number of boundaries that are geodesics of a prescribed length). These spaces have an incredibly rich structure, ranging from geometric structure (such as the Kahler geometry given by the Weil-Petersson metric), to dynamical structure (through the action of the mapping class group on this and related spaces), to algebraic structure (viewing these spaces as algebraic varieties), and are thus connected to many other objects of interest in geometry and dynamics. For instance, by developing a new recursive formula for the Weil-Petersson volume of this space, Mirzakhani was able to asymptotically count the number of simple prime geodesics of length up to some threshold {L} in a hyperbolic surface (or more precisely, she obtained asymptotics for the number of such geodesics in a given orbit of the mapping class group); the answer turns out to be polynomial in {L}, in contrast to the much larger class of non-simple prime geodesics, whose asymptotics are exponential in {L} (the “prime number theorem for geodesics”, developed in a classic series of works by Delsart, Huber, Selberg, and Margulis); she also used this formula to establish a new proof of a conjecture of Witten on intersection numbers that was first proven by Kontsevich. More recently, in two lengthy papers with Eskin and with Eskin-Mohammadi, Mirzakhani established rigidity theorems for the action of {SL_2({\bf R})} on such moduli spaces that are close analogues of Ratner’s celebrated rigidity theorems for unipotently generated groups (discussed in this previous blog post). Ratner’s theorems are already notoriously difficult to prove, and rely very much on the polynomial stability properties of unipotent flows; in this even more complicated setting, the unipotent flows are no longer tractable, and Mirzakhani instead uses a recent “exponential drift” method of Benoist and Quint as a substitute. Ratner’s theorems are incredibly useful for all sorts of problems connected to homogeneous dynamics, and the analogous theorems established by Mirzakhani, Eskin, and Mohammadi have a similarly broad range of applications, for instance in counting periodic billiard trajectories in rational polygons.

Many fluid equations are expected to exhibit turbulence in their solutions, in which a significant portion of their energy ends up in high frequency modes. A typical example arises from the three-dimensional periodic Navier-Stokes equations

\displaystyle \partial_t u + u \cdot \nabla u = \nu \Delta u - \nabla p + f

\displaystyle \nabla \cdot u = 0

where {u: {\bf R} \times {\bf R}^3/{\bf Z}^3 \rightarrow {\bf R}^3} is the velocity field, {f: {\bf R} \times {\bf R}^3/{\bf Z}^3 \rightarrow {\bf R}^3} is a forcing term, {p: {\bf R} \times {\bf R}^3/{\bf Z}^3 \rightarrow {\bf R}} is a pressure field, and {\nu > 0} is the viscosity. To study the dynamics of energy for this system, we first pass to the Fourier transform

\displaystyle \hat u(t,k) := \int_{{\bf R}^3/{\bf Z}^3} u(t,x) e^{-2\pi i k \cdot x}

so that the system becomes

\displaystyle \partial_t \hat u(t,k) + 2\pi \sum_{k = k_1 + k_2} (\hat u(t,k_1) \cdot ik_2) \hat u(t,k_2) =

\displaystyle - 4\pi^2 \nu |k|^2 \hat u(t,k) + 2\pi ik \hat p(t,k) + \hat f(t,k) \ \ \ \ \ (1)

 

\displaystyle k \cdot \hat u(t,k) = 0.

We may normalise {u} (and {f}) to have mean zero, so that {\hat u(t,0)=0}. Then we introduce the dyadic energies

\displaystyle E_N(t) := \sum_{|k| \sim N} |\hat u(t,k)|^2

where {N \geq 1} ranges over the powers of two, and {|k| \sim N} is shorthand for {N \leq |k| < 2N}. Taking the inner product of (1) with {\hat u(t,k)}, we obtain the energy flow equation

\displaystyle \partial_t E_N = \sum_{N_1,N_2} \Pi_{N,N_1,N_2} - D_N + F_N \ \ \ \ \ (2)

 

where {N_1,N_2} range over powers of two, {\Pi_{N,N_1,N_2}} is the energy flow rate

\displaystyle \Pi_{N,N_1,N_2} := -2\pi \sum_{k=k_1+k_2: |k| \sim N, |k_1| \sim N_1, |k_2| \sim N_2}

\displaystyle (\hat u(t,k_1) \cdot ik_2) (\hat u(t,k) \cdot \hat u(t,k_2)),

{D_N} is the energy dissipation rate

\displaystyle D_N := 4\pi^2 \nu \sum_{|k| \sim N} |k|^2 |\hat u(t,k)|^2

and {F_N} is the energy injection rate

\displaystyle F_N := \sum_{|k| \sim N} \hat u(t,k) \cdot \hat f(t,k).

The Navier-Stokes equations are notoriously difficult to solve in general. Despite this, Kolmogorov in 1941 was able to give a convincing heuristic argument for what the distribution of the dyadic energies {E_N} should become over long times, assuming that some sort of distributional steady state is reached. It is common to present this argument in the form of dimensional analysis, but one can also give a more “first principles” form Kolmogorov’s argument, which I will do here. Heuristically, one can divide the frequency scales {N} into three regimes:

  • The injection regime in which the energy injection rate {F_N} dominates the right-hand side of (2);
  • The energy flow regime in which the flow rates {\Pi_{N,N_1,N_2}} dominate the right-hand side of (2); and
  • The dissipation regime in which the dissipation {D_N} dominates the right-hand side of (2).

If we assume a fairly steady and smooth forcing term {f}, then {\hat f} will be supported on the low frequency modes {k=O(1)}, and so we heuristically expect the injection regime to consist of the low scales {N=O(1)}. Conversely, if we take the viscosity {\nu} to be small, we expect the dissipation regime to only occur for very large frequencies {N}, with the energy flow regime occupying the intermediate frequencies.

We can heuristically predict the dividing line between the energy flow regime. Of all the flow rates {\Pi_{N,N_1,N_2}}, it turns out in practice that the terms in which {N_1,N_2 = N+O(1)} (i.e., interactions between comparable scales, rather than widely separated scales) will dominate the other flow rates, so we will focus just on these terms. It is convenient to return back to physical space, decomposing the velocity field {u} into Littlewood-Paley components

\displaystyle u_N(t,x) := \sum_{|k| \sim N} \hat u(t,k) e^{2\pi i k \cdot x}

of the velocity field {u(t,x)} at frequency {N}. By Plancherel’s theorem, this field will have an {L^2} norm of {E_N(t)^{1/2}}, and as a naive model of turbulence we expect this field to be spread out more or less uniformly on the torus, so we have the heuristic

\displaystyle |u_N(t,x)| = O( E_N(t)^{1/2} ),

and a similar heuristic applied to {\nabla u_N} gives

\displaystyle |\nabla u_N(t,x)| = O( N E_N(t)^{1/2} ).

(One can consider modifications of the Kolmogorov model in which {u_N} is concentrated on a lower-dimensional subset of the three-dimensional torus, leading to some changes in the numerology below, but we will not consider such variants here.) Since

\displaystyle \Pi_{N,N_1,N_2} = - \int_{{\bf R}^3/{\bf Z}^3} u_N \cdot ( (u_{N_1} \cdot \nabla) u_{N_2} )\ dx

we thus arrive at the heuristic

\displaystyle \Pi_{N,N_1,N_2} = O( N_2 E_N^{1/2} E_{N_1}^{1/2} E_{N_2}^{1/2} ).

Of course, there is the possibility that due to significant cancellation, the energy flow is significantly less than {O( N E_N(t)^{3/2} )}, but we will assume that cancellation effects are not that significant, so that we typically have

\displaystyle \Pi_{N,N_1,N_2} \sim N_2 E_N^{1/2} E_{N_1}^{1/2} E_{N_2}^{1/2} \ \ \ \ \ (3)

 

or (assuming that {E_N} does not oscillate too much in {N}, and {N_1,N_2} are close to {N})

\displaystyle \Pi_{N,N_1,N_2} \sim N E_N^{3/2}.

On the other hand, we clearly have

\displaystyle D_N \sim \nu N^2 E_N.

We thus expect to be in the dissipation regime when

\displaystyle N \gtrsim \nu^{-1} E_N^{1/2} \ \ \ \ \ (4)

 

and in the energy flow regime when

\displaystyle 1 \lesssim N \lesssim \nu^{-1} E_N^{1/2}. \ \ \ \ \ (5)

 

Now we study the energy flow regime further. We assume a “statistically scale-invariant” dynamics in this regime, in particular assuming a power law

\displaystyle E_N \sim A N^{-\alpha} \ \ \ \ \ (6)

 

for some {A,\alpha > 0}. From (3), we then expect an average asymptotic of the form

\displaystyle \Pi_{N,N_1,N_2} \approx A^{3/2} c_{N,N_1,N_2} (N N_1 N_2)^{1/3 - \alpha/2} \ \ \ \ \ (7)

 

for some structure constants {c_{N,N_1,N_2} \sim 1} that depend on the exact nature of the turbulence; here we have replaced the factor {N_2} by the comparable term {(N N_1 N_2)^{1/3}} to make things more symmetric. In order to attain a steady state in the energy flow regime, we thus need a cancellation in the structure constants:

\displaystyle \sum_{N_1,N_2} c_{N,N_1,N_2} (N N_1 N_2)^{1/3 - \alpha/2} \approx 0. \ \ \ \ \ (8)

 

On the other hand, if one is assuming statistical scale invariance, we expect the structure constants to be scale-invariant (in the energy flow regime), in that

\displaystyle c_{\lambda N, \lambda N_1, \lambda N_2} = c_{N,N_1,N_2} \ \ \ \ \ (9)

 

for dyadic {\lambda > 0}. Also, since the Euler equations conserve energy, the energy flows {\Pi_{N,N_1,N_2}} symmetrise to zero,

\displaystyle \Pi_{N,N_1,N_2} + \Pi_{N,N_2,N_1} + \Pi_{N_1,N,N_2} + \Pi_{N_1,N_2,N} + \Pi_{N_2,N,N_1} + \Pi_{N_2,N_1,N} = 0,

which from (7) suggests a similar cancellation among the structure constants

\displaystyle c_{N,N_1,N_2} + c_{N,N_2,N_1} + c_{N_1,N,N_2} + c_{N_1,N_2,N} + c_{N_2,N,N_1} + c_{N_2,N_1,N} \approx 0.

Combining this with the scale-invariance (9), we see that for fixed {N}, we may organise the structure constants {c_{N,N_1,N_2}} for dyadic {N_1,N_2} into sextuples which sum to zero (including some degenerate tuples of order less than six). This will automatically guarantee the cancellation (8) required for a steady state energy distribution, provided that

\displaystyle \frac{1}{3} - \frac{\alpha}{2} = 0

or in other words

\displaystyle \alpha = \frac{2}{3};

for any other value of {\alpha}, there is no particular reason to expect this cancellation (8) to hold. Thus we are led to the heuristic conclusion that the most stable power law distribution for the energies {E_N} is the {2/3} law

\displaystyle E_N \sim A N^{-2/3} \ \ \ \ \ (10)

 

or in terms of shell energies, we have the famous Kolmogorov 5/3 law

\displaystyle \sum_{|k| = k_0 + O(1)} |\hat u(t,k)|^2 \sim A k_0^{-5/3}.

Given that frequency interactions tend to cascade from low frequencies to high (if only because there are so many more high frequencies than low ones), the above analysis predicts a stablising effect around this power law: scales at which a law (6) holds for some {\alpha > 2/3} are likely to lose energy in the near-term, while scales at which a law (6) hold for some {\alpha< 2/3} are conversely expected to gain energy, thus nudging the exponent of power law towards {2/3}.

We can solve for {A} in terms of energy dissipation as follows. If we let {N_*} be the frequency scale demarcating the transition from the energy flow regime (5) to the dissipation regime (4), we have

\displaystyle N_* \sim \nu^{-1} E_{N_*}

and hence by (10)

\displaystyle N_* \sim \nu^{-1} A N_*^{-2/3}.

On the other hand, if we let {\epsilon := D_{N_*}} be the energy dissipation at this scale {N_*} (which we expect to be the dominant scale of energy dissipation), we have

\displaystyle \epsilon \sim \nu N_*^2 E_N \sim \nu N_*^2 A N_*^{-2/3}.

Some simple algebra then lets us solve for {A} and {N_*} as

\displaystyle N_* \sim (\frac{\epsilon}{\nu^3})^{1/4}

and

\displaystyle A \sim \epsilon^{2/3}.

Thus, we have the Kolmogorov prediction

\displaystyle \sum_{|k| = k_0 + O(1)} |\hat u(t,k)|^2 \sim \epsilon^{2/3} k_0^{-5/3}

for

\displaystyle 1 \lesssim k_0 \lesssim (\frac{\epsilon}{\nu^3})^{1/4}

with energy dissipation occuring at the high end {k_0 \sim (\frac{\epsilon}{\nu^3})^{1/4}} of this scale, which is counterbalanced by the energy injection at the low end {k_0 \sim 1} of the scale.

As in the previous post, all computations here are at the formal level only.

In the previous blog post, the Euler equations for inviscid incompressible fluid flow were interpreted in a Lagrangian fashion, and then Noether’s theorem invoked to derive the known conservation laws for these equations. In a bit more detail: starting with Lagrangian space {{\cal L} = ({\bf R}^n, \hbox{vol})} and Eulerian space {{\cal E} = ({\bf R}^n, \eta, \hbox{vol})}, we let {M} be the space of volume-preserving, orientation-preserving maps {\Phi: {\cal L} \rightarrow {\cal E}} from Lagrangian space to Eulerian space. Given a curve {\Phi: {\bf R} \rightarrow M}, we can define the Lagrangian velocity field {\dot \Phi: {\bf R} \times {\cal L} \rightarrow T{\cal E}} as the time derivative of {\Phi}, and the Eulerian velocity field {u := \dot \Phi \circ \Phi^{-1}: {\bf R} \times {\cal E} \rightarrow T{\cal E}}. The volume-preserving nature of {\Phi} ensures that {u} is a divergence-free vector field:

\displaystyle  \nabla \cdot u = 0. \ \ \ \ \ (1)

If we formally define the functional

\displaystyle  J[\Phi] := \frac{1}{2} \int_{\bf R} \int_{{\cal E}} |u(t,x)|^2\ dx dt = \frac{1}{2} \int_R \int_{{\cal L}} |\dot \Phi(t,x)|^2\ dx dt

then one can show that the critical points of this functional (with appropriate boundary conditions) obey the Euler equations

\displaystyle  [\partial_t + u \cdot \nabla] u = - \nabla p

\displaystyle  \nabla \cdot u = 0

for some pressure field {p: {\bf R} \times {\cal E} \rightarrow {\bf R}}. As discussed in the previous post, the time translation symmetry of this functional yields conservation of the Hamiltonian

\displaystyle  \frac{1}{2} \int_{{\cal E}} |u(t,x)|^2\ dx = \frac{1}{2} \int_{{\cal L}} |\dot \Phi(t,x)|^2\ dx;

the rigid motion symmetries of Eulerian space give conservation of the total momentum

\displaystyle  \int_{{\cal E}} u(t,x)\ dx

and total angular momentum

\displaystyle  \int_{{\cal E}} x \wedge u(t,x)\ dx;

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

\displaystyle  \int_{\Phi(\gamma)} u^*

for any closed loop {\gamma} in {{\cal L}}, or equivalently pointwise conservation of the Lagrangian vorticity {\Phi^* \omega = \Phi^* du^*}, where {u^*} is the {1}-form associated with the vector field {u} using the Euclidean metric {\eta} on {{\cal E}}, with {\Phi^*} denoting pullback by {\Phi}.

It turns out that one can generalise the above calculations. Given any self-adjoint operator {A} on divergence-free vector fields {u: {\cal E} \rightarrow {\bf R}}, we can define the functional

\displaystyle  J_A[\Phi] := \frac{1}{2} \int_{\bf R} \int_{{\cal E}} u(t,x) \cdot A u(t,x)\ dx dt;

as we shall see below the fold, critical points of this functional (with appropriate boundary conditions) obey the generalised Euler equations

\displaystyle  [\partial_t + u \cdot \nabla] Au + (\nabla u) \cdot Au= - \nabla \tilde p \ \ \ \ \ (2)

\displaystyle  \nabla \cdot u = 0

for some pressure field {\tilde p: {\bf R} \times {\cal E} \rightarrow {\bf R}}, where {(\nabla u) \cdot Au} in coordinates is {\partial_i u_j Au_j} with the usual summation conventions. (When {A=1}, {(\nabla u) \cdot Au = \nabla(\frac{1}{2} |u|^2)}, and this term can be absorbed into the pressure {\tilde p}, and we recover the usual Euler equations.) Time translation symmetry then gives conservation of the Hamiltonian

\displaystyle  \frac{1}{2} \int_{{\cal E}} u(t,x) \cdot A u(t,x)\ dx.

If the operator {A} commutes with rigid motions on {{\cal E}}, then we have conservation of total momentum

\displaystyle  \int_{{\cal E}} Au(t,x)\ dx

and total angular momentum

\displaystyle  \int_{{\cal E}} x \wedge Au(t,x)\ dx,

and the diffeomorphism symmetries of Lagrangian space give conservation of circulation

\displaystyle  \int_{\Phi(\gamma)} (Au)^*

or pointwise conservation of the Lagrangian vorticity {\Phi^* \theta := \Phi^* d(Au)^*}. These applications of Noether’s theorem proceed exactly as the previous post; we leave the details to the interested reader.

One particular special case of interest arises in two dimensions {n=2}, when {A} is the inverse derivative {A = |\nabla|^{-1} = (-\Delta)^{-1/2}}. The vorticity {\theta = d(Au)^*} is a {2}-form, which in the two-dimensional setting may be identified with a scalar. In coordinates, if we write {u = (u_1,u_2)}, then

\displaystyle  \theta = \partial_{x_1} |\nabla|^{-1} u_2 - \partial_{x_2} |\nabla|^{-1} u_1.

Since {u} is also divergence-free, we may therefore write

\displaystyle  u = (- \partial_{x_2} \psi, \partial_{x_1} \psi )

where the stream function {\psi} is given by the formula

\displaystyle  \psi = |\nabla|^{-1} \theta.

If we take the curl of the generalised Euler equation (2), we obtain (after some computation) the surface quasi-geostrophic equation

\displaystyle  [\partial_t + u \cdot \nabla] \theta = 0 \ \ \ \ \ (3)

\displaystyle  u = (-\partial_{x_2} |\nabla|^{-1} \theta, \partial_{x_1} |\nabla|^{-1} \theta).

This equation has strong analogies with the three-dimensional incompressible Euler equations, and can be viewed as a simplified model for that system; see this paper of Constantin, Majda, and Tabak for details.

Now we can specialise the general conservation laws derived previously to this setting. The conserved Hamiltonian is

\displaystyle  \frac{1}{2} \int_{{\bf R}^2} u\cdot |\nabla|^{-1} u\ dx = \frac{1}{2} \int_{{\bf R}^2} \theta \psi\ dx = \frac{1}{2} \int_{{\bf R}^2} \theta |\nabla|^{-1} \theta\ dx

(a law previously observed for this equation in the abovementioned paper of Constantin, Majda, and Tabak). As {A} commutes with rigid motions, we also have (formally, at least) conservation of momentum

\displaystyle  \int_{{\bf R}^2} Au\ dx

(which up to trivial transformations is also expressible in impulse form as {\int_{{\bf R}^2} \theta x\ dx}, after integration by parts), and conservation of angular momentum

\displaystyle  \int_{{\bf R}^2} x \wedge Au\ dx

(which up to trivial transformations is {\int_{{\bf R}^2} \theta |x|^2\ dx}). Finally, diffeomorphism invariance gives pointwise conservation of Lagrangian vorticity {\Phi^* \theta}, thus {\theta} is transported by the flow (which is also evident from (3). In particular, all integrals of the form {\int F(\theta)\ dx} for a fixed function {F} are conserved by the flow.

Read the rest of this entry »

Mathematicians study a variety of different mathematical structures, but perhaps the structures that are most commonly associated with mathematics are the number systems, such as the integers {{\bf Z}} or the real numbers {{\bf R}}. Indeed, the use of number systems is so closely identified with the practice of mathematics that one sometimes forgets that it is possible to do mathematics without explicit reference to any concept of number. For instance, the ancient Greeks were able to prove many theorems in Euclidean geometry, well before the development of Cartesian coordinates and analytic geometry in the seventeenth century, or the formal constructions or axiomatisations of the real number system that emerged in the nineteenth century (not to mention precursor concepts such as zero or negative numbers, whose very existence was highly controversial, if entertained at all, to the ancient Greeks). To do this, the Greeks used geometric operations as substitutes for the arithmetic operations that would be more familiar to modern mathematicians. For instance, concatenation of line segments or planar regions serves as a substitute for addition; the operation of forming a rectangle out of two line segments would serve as a substitute for multiplication; the concept of similarity can be used as a substitute for ratios or division; and so forth.

A similar situation exists in modern physics. Physical quantities such as length, mass, momentum, charge, and so forth are routinely measured and manipulated using the real number system {{\bf R}} (or related systems, such as {{\bf R}^3} if one wishes to measure a vector-valued physical quantity such as velocity). Much as analytic geometry allows one to use the laws of algebra and trigonometry to calculate and prove theorems in geometry, the identification of physical quantities with numbers allows one to express physical laws and relationships (such as Einstein’s famous mass-energy equivalence {E=mc^2}) as algebraic (or differential) equations, which can then be solved and otherwise manipulated through the extensive mathematical toolbox that has been developed over the centuries to deal with such equations.

However, as any student of physics is aware, most physical quantities are not represented purely by one or more numbers, but instead by a combination of a number and some sort of unit. For instance, it would be a category error to assert that the length of some object was a number such as {10}; instead, one has to say something like “the length of this object is {10} yards”, combining both a number {10} and a unit (in this case, the yard). Changing the unit leads to a change in the numerical value assigned to this physical quantity, even though no physical change to the object being measured has occurred. For instance, if one decides to use feet as the unit of length instead of yards, then the length of the object is now {30} feet; if one instead uses metres, the length is now {9.144} metres; and so forth. But nothing physical has changed when performing this change of units, and these lengths are considered all equal to each other:

\displaystyle  10 \hbox{ yards } = 30 \hbox{ feet } = 9.144 \hbox{ metres}.

It is then common to declare that while physical quantities and units are not, strictly speaking, numbers, they should be manipulated using the laws of algebra as if they were numerical quantities. For instance, if an object travels {10} metres in {5} seconds, then its speed should be

\displaystyle  (10 m) / (5 s) = 2 ms^{-1}

where we use the usual abbreviations of {m} and {s} for metres and seconds respectively. Similarly, if the speed of light {c} is {c=299 792 458 ms^{-1}} and an object has mass {10 kg}, then Einstein’s mass-energy equivalence {E=mc^2} then tells us that the energy-content of this object is

\displaystyle  (10 kg) (299 792 458 ms^{-1})^2 \approx 8.99 \times 10^{17} kg m^2 s^{-2}.

Note that the symbols {kg, m, s} are being manipulated algebraically as if they were mathematical variables such as {x} and {y}. By collecting all these units together, we see that every physical quantity gets assigned a unit of a certain dimension: for instance, we see here that the energy {E} of an object can be given the unit of {kg m^2 s^{-2}} (more commonly known as a Joule), which has the dimension of {M L^2 T^{-2}} where {M, L, T} are the dimensions of mass, length, and time respectively.

There is however one important limitation to the ability to manipulate “dimensionful” quantities as if they were numbers: one is not supposed to add, subtract, or compare two physical quantities if they have different dimensions, although it is acceptable to multiply or divide two such quantities. For instance, if {m} is a mass (having the units {M}) and {v} is a speed (having the units {LT^{-1}}), then it is physically “legitimate” to form an expression such as {\frac{1}{2} mv^2}, but not an expression such as {m+v} or {m-v}; in a similar spirit, statements such as {m=v} or {m\geq v} are physically meaningless. This combines well with the mathematical distinction between vector, scalar, and matrix quantities, which among other things prohibits one from adding together two such quantities if their vector or matrix type are different (e.g. one cannot add a scalar to a vector, or a vector to a matrix), and also places limitations on when two such quantities can be multiplied together. A related limitation, which is not always made explicit in physics texts, is that transcendental mathematical functions such as {\sin} or {\exp} should only be applied to arguments that are dimensionless; thus, for instance, if {v} is a speed, then {\hbox{arctanh}(v)} is not physically meaningful, but {\hbox{arctanh}(v/c)} is (this particular quantity is known as the rapidity associated to this speed).

These limitations may seem like a weakness in the mathematical modeling of physical quantities; one may think that one could get a more “powerful” mathematical framework if one were allowed to perform dimensionally inconsistent operations, such as add together a mass and a velocity, add together a vector and a scalar, exponentiate a length, etc. Certainly there is some precedent for this in mathematics; for instance, the formalism of Clifford algebras does in fact allow one to (among other things) add vectors with scalars, and in differential geometry it is quite common to formally apply transcendental functions (such as the exponential function) to a differential form (for instance, the Liouville measure {\frac{1}{n!} \omega^n} of a symplectic manifold can be usefully thought of as a component of the exponential {\exp(\omega)} of the symplectic form {\omega}).

However, there are several reasons why it is advantageous to retain the limitation to only perform dimensionally consistent operations. One is that of error correction: one can often catch (and correct for) errors in one’s calculations by discovering a dimensional inconsistency, and tracing it back to the first step where it occurs. Also, by performing dimensional analysis, one can often identify the form of a physical law before one has fully derived it. For instance, if one postulates the existence of a mass-energy relationship involving only the mass of an object {m}, the energy content {E}, and the speed of light {c}, dimensional analysis is already sufficient to deduce that the relationship must be of the form {E = \alpha mc^2} for some dimensionless absolute constant {\alpha}; the only remaining task is then to work out the constant of proportionality {\alpha}, which requires physical arguments beyond that provided by dimensional analysis. (This is a simple instance of a more general application of dimensional analysis known as the Buckingham {\pi} theorem.)

The use of units and dimensional analysis has certainly been proven to be very effective tools in physics. But one can pose the question of whether it has a properly grounded mathematical foundation, in order to settle any lingering unease about using such tools in physics, and also in order to rigorously develop such tools for purely mathematical purposes (such as analysing identities and inequalities in such fields of mathematics as harmonic analysis or partial differential equations).

The example of Euclidean geometry mentioned previously offers one possible approach to formalising the use of dimensions. For instance, one could model the length of a line segment not by a number, but rather by the equivalence class of all line segments congruent to the original line segment (cf. the Frege-Russell definition of a number). Similarly, the area of a planar region can be modeled not by a number, but by the equivalence class of all regions that are equidecomposable with the original region (one can, if one wishes, restrict attention here to measurable sets in order to avoid Banach-Tarski-type paradoxes, though that particular paradox actually only arises in three and higher dimensions). As mentioned before, it is then geometrically natural to multiply two lengths to form an area, by taking a rectangle whose line segments have the stated lengths, and using the area of that rectangle as a product. This geometric picture works well for units such as length and volume that have a spatial geometric interpretation, but it is less clear how to apply it for more general units. For instance, it does not seem geometrically natural (or, for that matter, conceptually helpful) to envision the equation {E=mc^2} as the assertion that the energy {E} is the volume of a rectangular box whose height is the mass {m} and whose length and width is given by the speed of light {c}.

But there are at least two other ways to formalise dimensionful quantities in mathematics, which I will discuss below the fold. The first is a “parametric” model in which dimensionful objects are modeled as numbers (or vectors, matrices, etc.) depending on some base dimensional parameters (such as units of length, mass, and time, or perhaps a coordinate system for space or spacetime), and transforming according to some representation of a structure group that encodes the range of these parameters; this type of “coordinate-heavy” model is often used (either implicitly or explicitly) by physicists in order to efficiently perform calculations, particularly when manipulating vector or tensor-valued quantities. The second is an “abstract” model in which dimensionful objects now live in an abstract mathematical space (e.g. an abstract vector space), in which only a subset of the operations available to general-purpose number systems such as {{\bf R}} or {{\bf R}^3} are available, namely those operations which are “dimensionally consistent” or invariant (or more precisely, equivariant) with respect to the action of the underlying structure group. This sort of “coordinate-free” approach tends to be the one which is preferred by pure mathematicians, particularly in the various branches of modern geometry, in part because it can lead to greater conceptual clarity, as well as results of great generality; it is also close to the more informal practice of treating mathematical manipulations that do not preserve dimensional consistency as being physically meaningless.

Read the rest of this entry »

Things are pretty quiet here during the holiday season, but one small thing I have been working on recently is a set of notes on special relativity that I will be working through in a few weeks with some bright high school students here at our local math circle.  I have only two hours to spend with this group, and it is unlikely that we will reach the end of the notes (in which I derive the famous mass-energy equivalence relation E=mc^2, largely following Einstein’s original derivation as discussed in this previous blog post); instead we will probably spend a fair chunk of time on related topics which do not actually require special relativity per se, such as spacetime diagrams, the Doppler shift effect, and an analysis of my airport puzzle.  This will be my first time doing something of this sort (in which I will be spending as much time interacting directly with the students as I would lecturing);  I’m not sure exactly how it will play out, being a little outside of my usual comfort zone of undergraduate and graduate teaching, but am looking forward to finding out how it goes.   (In particular, it may end up that the discussion deviates somewhat from my prepared notes.)

The material covered in my notes is certainly not new, but I ultimately decided that it was worth putting up here in case some readers here had any corrections or other feedback to contribute (which, as always, would be greatly appreciated).

[Dec 24 and then Jan 21: notes updated, in response to comments.]

Way back in 2007, I wrote a blog post giving Einstein’s derivation of his famous equation {E=mc^2} for the rest energy of a body with mass {m}. (Throughout this post, mass is used to refer to the invariant mass (also known as rest mass) of an object.) This derivation used a number of physical assumptions, including the following:

  1. The two postulates of special relativity: firstly, that the laws of physics are the same in every inertial reference frame, and secondly that the speed of light in vacuum is equal {c} in every such inertial frame.
  2. Planck’s relation and de Broglie’s law for photons, relating the frequency, energy, and momentum of such photons together.
  3. The law of conservation of energy, and the law of conservation of momentum, as well as the additivity of these quantities (i.e. the energy of a system is the sum of the energy of its components, and similarly for momentum).
  4. The Newtonian approximations {E \approx E_0 + \frac{1}{2} m|v|^2}, {p \approx m v} to energy and momentum at low velocities.

The argument was one-dimensional in nature, in the sense that only one of the three spatial dimensions was actually used in the proof.

As was pointed out in comments in the previous post by Laurens Gunnarsen, this derivation has the curious feature of needing some laws from quantum mechanics (specifically, the Planck and de Broglie laws) in order to derive an equation in special relativity (which does not ostensibly require quantum mechanics). One can then ask whether one can give a derivation that does not require such laws. As pointed out in previous comments, one can use the representation theory of the Lorentz group {SO(d,1)} to give a nice derivation that avoids any quantum mechanics, but it now needs at least two spatial dimensions instead of just one. I decided to work out this derivation in a way that does not explicitly use representation theory (although it is certainly lurking beneath the surface). The concept of momentum is only barely used in this derivation, and the main ingredients are now reduced to the following:

  1. The two postulates of special relativity;
  2. The law of conservation of energy (and the additivity of energy);
  3. The Newtonian approximation {E \approx E_0 + \frac{1}{2} m|v|^2} at low velocities.

The argument (which uses a little bit of calculus, but is otherwise elementary) is given below the fold. Whereas Einstein’s original argument considers a mass emitting two photons in several different reference frames, the argument here considers a large mass breaking up into two equal smaller masses. Viewing this situation in different reference frames gives a functional equation for the relationship between energy, mass, and velocity, which can then be solved using some calculus, using the Newtonian approximation as a boundary condition, to give the famous {E=mc^2} formula.

Disclaimer: As with the previous post, the arguments here are physical arguments rather than purely mathematical ones, and thus do not really qualify as a rigorous mathematical argument, due to the implicit use of a number of physical and metaphysical hypotheses beyond the ones explicitly listed above. (But it would be difficult to say anything non-tautological at all about the physical world if one could rely solely on {100\%} rigorous mathematical reasoning.)

Read the rest of this entry »

Archives