$\begingroup$

Upon learning about the Lagrangian formulation of GR, where varying an action with respect to the metric (for instance, to arrive at the Einstein field equations) is common, I can't help but wonder how this operation is even well defined. The reason is that you can write down a tensor in a number of different ways by raising and lowering its indices (or contracting them) with the metric itself, so that any number of metric tensors can be factored out in front of another tensor with the appropriate combination of upstairs/downstairs indices. This means that varying a tensor with respect to the metric can give a variety of different answers depending on how you write down your tensor (in particular, on how many factors of the metric have been extracted in the expression for the tensor).

Let me give a concrete example. Consider a rank (0,4) tensor $C_{\mu\nu\rho\sigma}$. We assume it satisfies the relation $$C_{\mu\nu\rho\sigma}=C_{\nu\mu\sigma\rho}\tag{1}$$ We can contract the first and third indices: $$C_{\nu\sigma}=g^{\mu\rho}C_{\mu\nu\rho\sigma}\tag{2}.$$ We can also form a scalar by contracting once more: $$C=g^{\nu\sigma}C_{\nu\sigma}\tag{3}.$$ Hence we also have $$C=g^{\nu\sigma}g^{\mu\rho}C_{\mu\nu\rho\sigma}\tag{4}$$

Now say we wish to calculate the expression $\frac{{\delta}C}{{\delta}g^{\alpha\beta}}$. If we substitute eq. 3 for $C$, and vary with respect to $g^{\alpha\beta}$ while keeping $C_{\nu\sigma}$ fixed, we would get $$\frac{{\delta}C}{{\delta}g^{\alpha\beta}}=C_{(\alpha\beta)}\tag{5},$$ where parentheses denote that the tensor is symmetrized. However, consider what happens if we substitute eq. 4 for $C$ and vary it while keeping $C_{\mu\nu\rho\sigma}$ fixed. Then using the product rule (since we now have two factors of the metric with $g^{\nu\sigma}$ and $g^{\mu\rho}$), exploiting eq. 1 and contracting using eq. 2, we now get $$\frac{{\delta}C}{{\delta}g^{\alpha\beta}}=2C_{(\alpha\beta)}\tag{6}.$$ Clearly, a contradiction between eq. 5 and 6. What's going on? How are you supposed to properly vary a tensor with respect to the metric? The key would seem to be to figure out what stays fixed during the variation. But the tensor I chose is a very general tensor that may not have any explicit dependence on the metric at all (you may have noticed that the Riemann tensor could easily be a candidate for $C_{\mu\nu\rho\sigma}$, but since the Riemann tensor in GR typically has an explicit dependence on the metric due to our choice of affine connection, I'll ignore that easy option). If such an explicit dependence existed, then we could simply use it to vary everything directly like we do when computing the variation of the Riemann tensor. But in general, how do you determine what should be fixed during the variation of a tensor, even when the tensor has no explicit dependence on the metric?
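
For reference, the product-rule step behind eq. 6 can be written out as follows. This is only a sketch of the computation described above, with $C_{\mu\nu\rho\sigma}$ held fixed and $\delta g^{\alpha\beta}$ taken to be symmetric:

\begin{align}
\delta C &= \delta g^{\nu\sigma}\, g^{\mu\rho} C_{\mu\nu\rho\sigma} + g^{\nu\sigma}\, \delta g^{\mu\rho}\, C_{\mu\nu\rho\sigma} \\
&= \delta g^{\nu\sigma}\, C_{\nu\sigma} + \delta g^{\mu\rho}\, g^{\nu\sigma} C_{\nu\mu\sigma\rho} \\
&= \delta g^{\nu\sigma}\, C_{\nu\sigma} + \delta g^{\mu\rho}\, C_{\mu\rho} = 2\,\delta g^{\alpha\beta}\, C_{(\alpha\beta)} .
\end{align}

The first line is the product rule, the second uses eq. 2 on the first term and eq. 1 on the second, and the third uses eq. 2 once more. Only the symmetric part of $C_{\alpha\beta}$ survives because $\delta g^{\alpha\beta}$ is symmetric.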

$\endgroup$
  • $\begingroup$ It seems you are arbitrarily "keeping" something fixed, and unsurprisingly find a contradiction. This is not specific to tensors: if you consider $y = c\cdot x$ and differentiate $xy$ with respect to $x$ while "keeping $y$ fixed", you get the same inconsistency. $\endgroup$
    – fqq
    Commented Sep 3, 2022 at 13:01
  • $\begingroup$ @fqq See my main question at the bottom. No question the problem is with what should be fixed during the variation. The issue is therefore what I should do to figure out what should be fixed for any tensor. $\endgroup$ Commented Sep 3, 2022 at 13:05
  • $\begingroup$ Related: Variation of the metric with respect to the metric $\endgroup$
    – Qmechanic
    Commented Sep 3, 2022 at 13:18

3 Answers

$\begingroup$

The fact that you get a different result of a variation when you hold different quantities fixed is well-known (as pointed out by @fqq in the comments.) This is a feature of regular old Math-200 partial derivatives as well.

But as far as the calculus of variations goes, the resulting equations will be equivalent regardless of what we view as the "fundamental variables" of the system; and you can choose any set of "fundamental variables" you like, so long as they completely specify the configuration of the system. For example, suppose we have a Lagrangian involving the metric and a vector field $A^\mu$. We can construct the Euler-Lagrange equations as $$ \left( \frac{\delta \mathcal{L}}{\delta g_{\mu \nu}} \right)_{A^\mu} = 0 \qquad \left( \frac{\delta \mathcal{L}}{\delta A^\mu} \right)_{g_{\mu \nu}} = 0 \tag{1} $$ But alternately, we could rewrite the Lagrangian in terms of a one-form field $A_\mu = g_{\mu \nu} A^\nu$, and view the metric $g_{\mu \nu}$ and $A_\mu$ as the "fundamental fields"; in which case the Euler-Lagrange equations would be $$ \left( \frac{\delta \mathcal{L}}{\delta g_{\mu \nu}} \right)_{A_\mu} = 0 \qquad \left( \frac{\delta \mathcal{L}}{\delta A_\mu} \right)_{g_{\mu \nu}} = 0 \tag{2} $$ The resulting Euler-Lagrange equations (2) will not be identical to those found in (1), but they will be equivalent; in general, they can be written as linear combinations of each other. (Regular old Math-200 partial derivatives also have this feature.)
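
To make the "linear combinations" statement concrete, here is a sketch of the chain-rule relation between the two sets of derivatives. It assumes only the pointwise relation $A_\mu = g_{\mu\nu} A^\nu$, so that $\delta A_\rho = A^\nu\, \delta g_{\rho\nu}$ when $A^\nu$ is held fixed, and it uses unit-weight symmetrization, $X^{(\mu} Y^{\nu)} = \tfrac12 (X^\mu Y^\nu + X^\nu Y^\mu)$:

\begin{align}
\left( \frac{\delta \mathcal{L}}{\delta A^\mu} \right)_{g_{\mu \nu}} &= g_{\mu\nu} \left( \frac{\delta \mathcal{L}}{\delta A_\nu} \right)_{g_{\mu \nu}}, \\
\left( \frac{\delta \mathcal{L}}{\delta g_{\mu \nu}} \right)_{A^\mu} &= \left( \frac{\delta \mathcal{L}}{\delta g_{\mu \nu}} \right)_{A_\mu} + A^{(\mu} \left( \frac{\delta \mathcal{L}}{\delta A_{\nu)}} \right)_{g_{\mu \nu}} .
\end{align}

So the metric equation in (1) is the metric equation in (2) plus a multiple of the vector-field equation, and both sets vanish together as long as the metric is invertible.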

In the case you propose, however, viewing the singly-traced tensor $C_{\mu \nu}$ as a fundamental field would not yield a set of equations equivalent to those obtained by viewing $C_{\mu \nu \rho \sigma}$ as fundamental; the equations obtained from $C_{\mu \nu}$ would be an incomplete set. This is because there are multiple possible choices of $C_{\mu \nu \rho \sigma}$ that correspond to a single value of $C_{\mu \nu}$ (for a fixed metric). So specifying $C_{\mu \nu}$ does not specify the fields completely, meaning that the resulting Euler-Lagrange equations do not enforce stationarity of the action in some of those "directions in field space". It'd be like varying the quantity $$ A^2 = A_\mu A^\mu $$ with respect to the vector norm $|A| = \sqrt{A^2}$ instead of the vector itself. If you varied this quantity with respect to the norm, you'd find that you had to have $A^2 = 0$, but if you varied it with respect to $A_\mu$, you'd find that you had to have $A^\mu = 0$, a more restrictive condition.

$\endgroup$
  • $\begingroup$ Hm. That's interesting. Physically, it would make sense that eq. 1 and 2 are equivalent, but mathematically, I'm not sure why it would generally be true (for any tensor, not just a rank 1 field). For example, wouldn't the computation of the stress energy tensor (which is proportional to the variation of the matter action wrt the metric) depend on the choice of your fundamental fields (which surely has to be unique, due to its appearance in Einstein's field equations)? $\endgroup$ Commented Sep 3, 2022 at 13:58
  • $\begingroup$ @Physics2718: It would be the same up to terms that vanish according to the equations of motion; so the "on-shell" stress-energy tensor would still have the same values. This makes a particular difference in models with non-trivial curvature couplings (e.g. $A^\mu A^\nu R_{\mu \nu}$), where there are second derivatives of the fields in the stress-energy tensor. In that case, the stress-energies you get when viewing $A^\mu$ and $A_\mu$ as the "fundamental fields" generally differ by some multiple of the equation of motion for the vector field. $\endgroup$ Commented Sep 3, 2022 at 17:06
  • $\begingroup$ Thank you, that makes a lot of sense. $\endgroup$ Commented Sep 4, 2022 at 14:07
$\begingroup$

Apparent contradiction. It seems to me that you get the same "contradiction" that you'd find when computing the derivative of a function defined as

$f(x) = x^2 c$

if you define $c_1(x) = x \cdot c$ and forget the dependence of $c_1$ on $x$ when you compute the derivative of $f(x)$ written in the form $f(x) = x \cdot c_1(x)$. Keeping that dependence, the product rule gives

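$\dfrac{df}{dx} = \dfrac{d}{dx} \left[ x c_1(x) \right] = c_1(x) + x \dfrac{d c_1(x)}{d x} = c x + x c = 2 c x$.

Dropping the second term, i.e. treating $c_1$ as a constant, would instead give only $c_1(x) = c x$, half the correct result.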
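
A quick numerical illustration of the same point, as a minimal sketch in plain Python. The function names and the values `c = 3.0`, `x0 = 2.0` are arbitrary choices made for this example:

```python
def central_difference(func, x, h=1e-6):
    """Symmetric finite-difference estimate of d(func)/dx at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


def compare_derivatives(c=3.0, x0=2.0):
    """Return (full, frozen) derivative estimates of f(x) = c * x**2 at x0.

    full   : c1(x) = c * x is allowed to vary with x.
    frozen : c1 is held at its value c * x0, i.e. its x-dependence is forgotten.
    """
    def f_full(x):
        return x * (c * x)          # f(x) = x * c1(x) with c1(x) = c * x

    c1_frozen = c * x0              # c1 evaluated once and then held fixed

    def f_frozen(x):
        return x * c1_frozen        # same value at x0, different function of x

    return central_difference(f_full, x0), central_difference(f_frozen, x0)


if __name__ == "__main__":
    full, frozen = compare_derivatives()
    print("derivative with c1 varying:", full)
    print("derivative with c1 frozen: ", frozen)
```

Both functions agree at $x = x_0$, but their derivatives there differ by a factor of two ($2cx_0$ versus $cx_0$), which is the same mismatch as between eqs. 5 and 6 in the question.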

Derivative of tensor fields. As for the variation, or better the derivative, of a tensor: it really depends on what you're doing. If you're computing the derivative w.r.t. an independent variable $v$, differentiate w.r.t. that variable while keeping all the other independent variables constant.

In doing this, please remember that a tensor is not made up of its components alone: the components refer to the elements of a basis, which may itself depend on the variable you're differentiating with respect to. If you ignore this, you're implicitly assuming that the basis elements are constant, which is not the most general case, and you're likely to get wrong results.

$\endgroup$
$\begingroup$

I suggest taking a look at Derivative of the Lagrangian with respect to the metric tensor. The point is that you have to be careful with your notation about what you’re considering to be independent variables. The two seemingly different answers are related by the chain rule. Currently, there are two different functions involved:

  • First, let $V_1$ be the set of pairs of tensor fields $(\gamma,H)$, of valence $(2,0)$ and $(0,2)$ respectively, such that $\gamma^{ab}=\gamma^{ba}$. Then, define $\Phi_1:V_1\to C^{\infty}(M)$ as the contraction $\Phi_1(\gamma,H)=\gamma^{ab}H_{ab}$.

  • Next, let $V_2$ be the set of pairs of tensor fields $(\gamma,K)$ of valence $(2,0)$ and $(0,4)$ respectively, such that $\gamma^{ab}=\gamma^{ba}$ and $K_{abcd}=K_{badc}$. Then, we define the map $\Phi_2:V_2\to C^{\infty}(M)$ by the contraction $\Phi_2(\gamma,K)=\gamma^{ac}\gamma^{bd}K_{abcd}$.

So you see, $\Phi_1$ and $\Phi_2$ are completely different mathematical functions: they eat different types of tensor fields, and contract them in different ways to produce a scalar field… in particular, $\Phi_1$ is linear in $\gamma$ while $\Phi_2$ is not. What does agree, however, is the result of evaluating $\Phi_1,\Phi_2$ on specifically chosen tensor fields: fix a tensor field $C_{abcd}$ with $C_{abcd}=C_{badc}$ and set $C_{bd}=g^{ac}C_{abcd}$; then evaluating $\Phi_1$ on $(g^{ab},C_{ab})$ gives the same result as evaluating $\Phi_2$ on $(g^{ab},C_{abcd})$. Let me repeat once again for emphasis: the functions $\Phi_1,\Phi_2$ are different, but once you compose them with appropriate maps, we get the same function of $g^{ab}$. Take a look at the link for a discussion of a similar issue.

Now, the ‘contradiction’ you talk about isn’t really a contradiction at all. It is just an unfortunate consequence of the time-honoured tradition in physics to use the same letter to denote two different functions! The correct statements are that \begin{align} \frac{\delta \Phi_1}{\delta\gamma^{ab}}\bigg|_{(g^{\cdot\cdot},C_{\cdot\cdot})}=C_{(ab)}\quad\text{and}\quad \frac{\delta\Phi_2}{\delta \gamma^{ab}}\bigg|_{(g^{\cdot\cdot},C_{\cdot\cdot\cdot\cdot})}=2 C_{(ab)}. \end{align} The fact that these are different is not at all a surprise, because $\Phi_1\neq\Phi_2$, so there’s obviously no reason to expect that the variations with respect to $\gamma^{ab}$ are equal (again, see the link).
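
As a rough numerical sanity check of the factor of 2, here is a sketch in plain Python that compares directional derivatives of the two maps along a symmetric perturbation of $\gamma$. Everything in it is an illustrative assumption rather than something from the linked post: the helper names (`phi1`, `phi2`, `trace_13`, `directional_derivatives`), the dimension `N = 4`, the random test data, and the seed.

```python
import itertools
import random

N = 4            # dimension, chosen only for illustration
IDX = range(N)


def random_symmetric(rng):
    """Random symmetric N x N array, standing in for gamma^{ab} or its variation."""
    m = [[0.0] * N for _ in IDX]
    for a in IDX:
        for b in range(a, N):
            m[a][b] = m[b][a] = rng.uniform(-1.0, 1.0)
    return m


def random_K(rng):
    """Random rank-4 array obeying K[a, b, c, d] == K[b, a, d, c]."""
    raw = {i: rng.uniform(-1.0, 1.0) for i in itertools.product(IDX, repeat=4)}
    return {(a, b, c, d): 0.5 * (raw[a, b, c, d] + raw[b, a, d, c])
            for (a, b, c, d) in raw}


def trace_13(gamma, K):
    """H_{bd} = gamma^{ac} K_{abcd}: contract the first and third slots."""
    return [[sum(gamma[a][c] * K[a, b, c, d] for a in IDX for c in IDX)
             for d in IDX] for b in IDX]


def phi1(gamma, H):
    """Phi_1(gamma, H) = gamma^{ab} H_{ab}."""
    return sum(gamma[a][b] * H[a][b] for a in IDX for b in IDX)


def phi2(gamma, K):
    """Phi_2(gamma, K) = gamma^{ac} gamma^{bd} K_{abcd}."""
    return sum(gamma[a][c] * gamma[b][d] * K[a, b, c, d]
               for a, b, c, d in itertools.product(IDX, repeat=4))


def shifted(gamma, dgamma, eps):
    """gamma + eps * dgamma, entry by entry."""
    return [[gamma[a][b] + eps * dgamma[a][b] for b in IDX] for a in IDX]


def directional_derivatives(seed=0, eps=1e-3):
    """Central-difference derivatives of Phi_1 and Phi_2 along dgamma."""
    rng = random.Random(seed)
    g, dg, K = random_symmetric(rng), random_symmetric(rng), random_K(rng)
    H = trace_13(g, K)   # second argument of Phi_1, frozen at the base point
    d1 = (phi1(shifted(g, dg, eps), H) - phi1(shifted(g, dg, -eps), H)) / (2 * eps)
    d2 = (phi2(shifted(g, dg, eps), K) - phi2(shifted(g, dg, -eps), K)) / (2 * eps)
    return d1, d2


if __name__ == "__main__":
    d1, d2 = directional_derivatives()
    print("dPhi_1 along dgamma:", d1)
    print("dPhi_2 along dgamma:", d2)
```

Because $\Phi_2$ is quadratic in $\gamma$, the central difference is exact up to rounding, so the second number should come out as twice the first. The two maps take the same value at the base point, yet their derivatives with respect to $\gamma$ differ.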

$\endgroup$
