11
$\begingroup$

It is common in Physics to speak of the variation in the context of the calculus of variations. In particular, given an action $S$ that is a functional we talk about the variation $\delta S$.

In the context of Classical Mechanics one can make sense of the variational principle in the following way: we consider the configuration manifold $M$ of the system of particles and we consider the Lagrangian as $L : TM\to \mathbb{R}$ defined on the tangent bundle.

In that case, we define the action $S$ as the functional defined on paths $\gamma : [a,b]\to \mathbb{R}\to M$ by

$$S[\gamma]=\int_a^bL(\gamma(t),\gamma'(t))dt.$$

Then we consider a variation on the path as a parametrized family of paths $\phi : [a,b]\times (-\epsilon,\epsilon)\to M$ so that $\phi(t,s)=\gamma_s(t)$.

In that case we get a function

$$S[\gamma_s]=\int_a^b L(\phi(t,s),\partial_t\phi(t,s))dt$$

So that we can study the extremum of this $\mathbb{R}\to \mathbb{R}$ function by differentiating

$$\dfrac{dS[\gamma_s]}{ds}=\dfrac{d}{ds}\int_a^bL(\phi(t,s),\partial_t\phi(t,s))dt$$

and this can be resolved using a chart $(U,q)$ on $M$ which lifts to a chart $(TU,(q,\dot{q}))$ on $TM$.

Although rigorous this doesn't define the variation operation. Currently I'm studying General Relativity, and in the context of Classical Field Theory this variation concept appeared a lot.

Indeed when deriving Einstein's equations from the Einstein-Hilbert action one computes the variaton $\delta S$ and sets it to zero. This includes computing $\delta R_{ab}$ for example.

In other words, this variation operation acts on the action functional, so that the variational principle becomes $\delta S = 0$, however it also acts on $C^\infty$ function on the manifold like $\delta R_{ab}$.

My question here is: in this context of General Relativity and Classical Field Theory, how does one rigorously define the variation operation $\delta$? How can it act both on functionals and functions? How does it relate to the traditional approach from Classical Mechanics that I outlined above?

$\endgroup$
4
  • $\begingroup$ I feel like your question is similar to for a function $f(x)$, why can you write $df$ and $dx$ for some differential operator $d$. I can't imagine the variational operator $\delta$ being any different from the normal one $d$, except for the space in which it lives. $\endgroup$
    – Aaron
    Commented Mar 10, 2017 at 15:47
  • $\begingroup$ But a function like the one you speak of is defined on a smooth manifold $M$ and has a precise definition of $df$. We define $df$ to be the covector field such that $df(X)= Xf$ for all vector fields $X$. We can't do the same for $S$, since $S$ isn't defined on a smooth manifold, but at a set of functions. In principle this set of functions could be endowed with such structure, but I believe it would be highly not natural and cumbersome. So I believe the precise way to make sense of $\delta$ is a little bit different. That is my point here. $\endgroup$
    – Gold
    Commented Mar 10, 2017 at 16:31
  • $\begingroup$ I don't believe that the function space $S$ is structureless. We have addition on functions at least, so $S$ is a vector space. In physics, we assume functions can converge to each other. This requires some notion of norm. We also typically assume $S$ is complete under this norm. Hence $S$ is a complete normed vector (Banach) space, which admits a Frechet (functional) derivative. Admittedly I am unfamiliar with the formal mathematics, so this lies short of a full answer. $\endgroup$
    – Aaron
    Commented Mar 10, 2017 at 17:18
  • 2
    $\begingroup$ That's not quite the point IMHO. The action $S$ is a functional. It is defined on the space of paths $C^\infty(I,M)$ where $I = [a,b]\subset \mathbb{R}$. Now, the space $C^\infty(I,M)$ doesn't carry the structure of a vector space. $M$ doesn't necessarily have operations over points. Since for each $t\in [a,b]$, $\gamma(t)\in M$ it is not clear how we can form $\lambda \gamma + \alpha$, since pointwise operations aren't available. The action is a map $S : C^\infty(I,M)\to \mathbb{R}$ and since $C^\infty(I,M)$ isn't a vector space we can't talk about Frechet derivative of $S$. $\endgroup$
    – Gold
    Commented Mar 10, 2017 at 17:26

2 Answers 2

5
$\begingroup$

The user coconut was faster (and shorter), so may answer aims rather at expanding a bit on the technical aspects of calculus of variations for (local) classical field theory.

A physics book that does a pretty good job at defining variation of an action functional for field theory is General Relativity by Robert M. Wald (University of Chicago Press, 1984) - see Appendix E, pp. 450ff. However, since it still leaves a few technical details aside, I will outline the procedure below.

The idea is essentially the same as you wrote for classical mechanics. One understands field configurations (among these, the space-time metric $g$) as smooth sections $$\phi\in\Gamma^\infty(\pi):=\{\phi\in\mathscr{C}^\infty(M,E)\ |\ \pi\circ\phi=\mathrm{id}_M\}$$ of some fiber bundle $\pi:E\rightarrow M$ over the space-time manifold $M$. (Finite) field variations are then just smooth maps $\Phi:M\times I\rightarrow E$, where $I\subset\mathbb{R}$ is an open interval, such that $$\phi_s:=\Phi(\cdot,s)\in\Gamma^\infty(\pi)$$ for all $s\in I$. The latter means that $\phi_s(p)=\Phi(p,s)\in\pi^{-1}(p)$ for all $p\in M$, $s\in I$ - particularly, if (say) $0\in I$ and $\phi_0=\phi$, then an infinitesimal field variation around $\phi$ would be given at each $p\in M$ by $\delta\phi(p)=\left.\dfrac{\partial\Phi(p,s)}{\partial s}\right|_{s=0}$. It follows that $\delta\phi$ may be seen as a smooth section of the pullback $$\phi^*VE:=\{(p,X)\in M\times VE\ |\ \phi(p)=\pi_{TE}(X)\}$$ of the vertical bundle $$\pi_{TE}:VE:=\{X\in TE\ |\ T\pi(X)=0_{TM}\}\rightarrow E$$ under $\phi$, which may be seen as a "tangent vector" to $\Gamma^\infty(\pi)$ at $\phi$. Conversely, if you put a (complete) Riemannian metric on the fibers of $E$, you may use the exponential map associated to them to build a field variation from an infinitesimal one.

This is roughly the picture of $\Gamma^\infty(\pi)$ as an infinite-dimensional manifold (there are a few caveats which are briefly discussed in the technical appendix at the end of this answer but these are of no consequence in what follows).

To move from there to action variations, one first needs to outline what an action functional is. Recall that a functional on $\Gamma^\infty(\pi)$ is just a map $F:\Gamma^\infty(\pi)\rightarrow\mathbb{C}$. It turns out that an action is not a single functional, but a family of functionals $\{S_K\ |K\subset M\text{ compact}\}$ such that $S_K(\phi_1)=S_K(\phi_2)$ for all $\phi_1,\phi_2\in\Gamma^\infty(\pi)$ such that $\phi_1=\phi_2$ on $K$. The point here is to account for the (most often encountered) possibility that the Lagrangian density evaluated on $\phi$ is not integrable on the whole of $M$. If one requires in addition that each $S_K$ is local and depends on the derivatives of $\phi$ up to order (say) $r$ (in a precise sense that I will not define here), it follows that $$S_K(\phi)=\int_K L(p,\phi(p),\nabla\phi(p),\ldots,\nabla^r\phi(p))\mathrm{d}^n x\ ,$$ where the Lagrangian density $L$ is smooth on its arguments (we also require that $L$ does not depend on $K$, of course - this can be encoded into compatibility conditions among the $S_K$'s, whose details are irrelevant to us here).

I am being deliberately loose on the definition of (first- and higher-order) derivatives $\nabla^k\phi$ of smooth sections $\phi$ of $\pi$ (which encode, in the case of $\phi=g$, the curvature of $g$ and so on) since this requires the notion of jet bundles of $\pi$, which is slightly lengthy and will deviate us from our main goal here (I may add a few details on this later if you feel necessary to do so). Once this has all been set, the (finite) variation of $S_K$ corresponding to $\Phi$ is just $S_K(\phi_s)$, and the corresponding infinitesimal variation is just $$\delta S_K(\phi)=\left.\dfrac{d}{ds}\right|_{s=0}S_K(\phi_s)\ .$$ At this point, this becomes just standard differentiation under the integral sign, which is perfectly allowed under the above requirements.

A key step in computing $\delta S_K(\phi)$ is to show that fiber and base derivatives commute, i.e. $$\dfrac{\partial\nabla^k\Phi(p,s)}{\partial s}=\nabla^k \dfrac{\partial\Phi(p,s)}{\partial s}\ ,$$ so that $\nabla^k(\delta\phi)=\delta(\nabla^k\phi)$, for all $k\leq r$. In this way, you get the usual divergence terms which appear in standard treatments of calculus of variations. To get rid of those (when deriving the Euler-Lagrange equations, for instance), one may assume that the infinitesimal field variations are supported in the interior of $K$ (see the technical appendix below for more on this) when needed.

It is important to notice that the variational principle as expounded above is inherently local, so that the above considerations are actually independent of $K$.

(Technical appendix: if you want to have some sort of smooth manifold structure on $\Gamma^\infty(\pi)$, you need to specify which model vector space(s) you are employing. It turns out that you need to use $$\Gamma^\infty_c(\phi^*VE\rightarrow M):=\{X_\phi\in\Gamma^\infty(\phi^*VE\rightarrow M)\ |\ X_\phi\text{ has compact support}\}$$ as models, otherwise the resulting topology of $\Gamma^\infty(\pi)$ (using the exponential maps mentioned in the second paragraph above to build an atlas) is not guaranteed to be locally pathwise connected (this may fail if $M$ is not compact), hence not a manifold topology. This entails that one should restrict field variations $\Phi$ to be such that for each compact (= closed and bounded) subinterval $J\subset I$ there is a compact subset $K\subset M$ such that $\Phi(p,s)$ is constant on $(M\smallsetminus K)\times J$. The reason is that with the aforementioned atlas, these field variations become precisely the smooth curves of $\Gamma^\infty(\pi)$, and hence $\Gamma^\infty_c(\phi^*VE\rightarrow M)=T_\phi\Gamma^\infty(\pi)$ becomes the tangent space (without quotes) to $\Gamma^\infty(\pi)$ at $\phi$. Interestingly enough, these are precisely the kind of field variations needed to derive the Euler-Lagrange equations, which lends an additional weight to their importance which goes beyond the mere aesthetical requirement of consistency with a manifold structure on $\Gamma^\infty(\pi)$. Another technical detail is that if you use the standard (inductive limit) locally convex vector space topology of $\Gamma^\infty_c(\phi^*VE\rightarrow M)$ to induce the topology of $\Gamma^\infty(\pi)$ through the above atlas, you get a topological manifold structure (which, by the way, is the so-called Whitney topology on $\Gamma^\infty(\pi)$) but not a smooth one. For the latter, you need to use the final topology induced by the smooth curves $$\Xi:\mathbb{R}\rightarrow\Gamma^\infty_c(\phi^*VE\rightarrow M)$$ on $\Gamma^\infty_c(\phi^*VE\rightarrow M)$, which is finer than its standard one. The smooth curves $\Xi$ on $\Gamma^\infty_c(\phi^*VE\rightarrow M)$, on their turn, are smooth maps $\Xi:M\times I\rightarrow E$, where $I\subset\mathbb{R}$ is an open interval, such that $$X_s:=\Xi(\cdot,s)\in\Gamma^\infty_c(\phi^*VE\rightarrow M)$$ for all $s\in I$ such that for each compact subinterval $J\subset I$ there is a compact subset $K\subset M$ such that the support of $X_s$ is contained in $K$ for all $s\in J$. (recall that the support of a section $X$ of a vector bundle over $M$ is the smallest closed subset $\mathrm{supp}\,X$ of $M$ such that $X$ equals the zero section outside $\mathrm{supp}\,X$) For (many) more details on the procedure outlined here, see the book of Andreas Kriegl and Peter W. Michor, The Convenient Setting of Global Analysis (AMS, 1997))

$\endgroup$
3
  • $\begingroup$ Thanks for the answer. Is it really necessary to deal with the space of sections with a smooth manifold structure? Using the jet bundle formalism can't we avoid this and do everything in a finite dimensional manifold? I've heard about them when trying to make sense of the field Lagrangian in this question (physics.stackexchange.com/questions/143543/…), so I believe it is related to this new question about the variation. $\endgroup$
    – Gold
    Commented Mar 12, 2017 at 15:41
  • $\begingroup$ If you are only interested in the kinematical structure of field theoretical models from the viewpoint of field configurations and how to derive variational equations such as the Euler-Lagrange equations, indeed the jet bundle formalism suffices. The reason is that the notion of field variation we have defined is itself local. This is no longer enough if you want to study the spaces of solutions of variational equations. People working in the jet bundle formalism usually treat the field solution spaces formally, with little regard for the specific structure of the equations of motion. $\endgroup$ Commented Mar 12, 2017 at 23:33
  • $\begingroup$ Moreover, if you want to study field theory from the viewpoint of observables (= functionals), local functionals such as the action functional in some compact region are not closed under products. If you really want an algebra of observables, you are forced to deal with nonlocal functionals. This gets even worse if you want to have some sort of Poisson structure - you may try to identify solutions with initial data in order to get a local formula for the Poisson bracket, but that very identification is non-local and gets screwed up after quantization, specially for interacting models. $\endgroup$ Commented Mar 12, 2017 at 23:39
4
$\begingroup$

A definition can be given by direct generalization of the one on the question. Take the space of fields to be the space $C^\infty(X, M)$ of smooth functions between two manifolds. For any smooth functional $S:C^\infty(X, M)\to\mathbb{R}$ and any one-parameter family of fields $h:\mathbb{R}\times X\to M$ we can define a function $f_{S,h}\in C^\infty(\mathbb{R})$ by $f_{S,h}(t) = S(h_t)$. The variation $\delta S$ is the differential $df_{S,h}$ of this map.

In the case of mechanics, we are just specializing this definition for $X=[a,b]$. In general relativity, $X$ is the spacetime. $R_{\mu\nu}$ should be seen in this context as function over the manifold that assigns to any point $x\in X$ a functional $R_{\mu\nu}(x)$ of the metric.

$\endgroup$