6
$\begingroup$

I am having trouble understanding when and why we can use partial derivatives in place of covariant derivatives for electrodynamics in a curved spacetime, and how to interpret intuitively what is going on.

Below are lengthy details, but the summary is:

  1. Why can we swap covariant for partial derivatives in the source equations and the charge conservation equation?

  2. I'm also having trouble wrapping my head around what exactly it means for Maxwell's equations to seem independent of the Christoffel symbols even though light curves. Even more confusingly, when the equations are written in terms of the potential we suddenly have to use the covariant derivative, and the Ricci curvature even comes in explicitly! Are Maxwell's equations kind of "misleading" / "hiding" subtleties from us? What is going on?


In the Wikipedia article Maxwell's equations in curved spacetime, it states without calculation that despite the use of partial derivatives, the equations are invariant under arbitrary curvilinear coordinate transformations. Because of symmetry, it is not too hard to see that:

$$ F_{ab} \, = \, \nabla_a A_b \, - \, \nabla_b A_a = (\partial_a A_b -{\Gamma^c}_{ab} A_c) - (\partial_b A_a - {\Gamma^c}_{ba} A_c) = \partial_a A_b - \partial_b A_a - {\Gamma^c}_{ab} A_c + {\Gamma^c}_{ba} A_c $$

The last two terms cancel because the Christoffel symbols are symmetric in the lower indices. Similarly, it is not too hard to see that it also works for the Faraday–Gauss equation $$ \nabla_\lambda F_{\mu \nu} + \nabla_\mu F_{\nu \lambda} + \nabla_\nu F_{\lambda \mu} = \partial_\lambda F_{\mu \nu} + \partial _\mu F_{\nu \lambda} + \partial_\nu F_{\lambda \mu} = 0$$ if I expand out in Christoffel symbols using:

$$\nabla_\gamma F_{\alpha \beta} \, = \, \partial_\gamma F_{\alpha \beta} - {\Gamma^{\mu}}_{\alpha \gamma} F_{\mu \beta} - {\Gamma^{\mu}}_{\beta \gamma} F_{\alpha \mu}. $$
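
For concreteness, here is a sketch of that expansion, assuming a torsion-free connection (the dummy index $\sigma$ is just my relabeling). Collecting the six connection terms of the cyclic sum in pairs, using the symmetry of the Christoffel symbols in their lower indices:

$$\begin{aligned} \nabla_\lambda F_{\mu \nu} + \nabla_\mu F_{\nu \lambda} + \nabla_\nu F_{\lambda \mu} = {} & \partial_\lambda F_{\mu \nu} + \partial_\mu F_{\nu \lambda} + \partial_\nu F_{\lambda \mu} \\ & - {\Gamma^{\sigma}}_{\mu \lambda} \left(F_{\sigma \nu} + F_{\nu \sigma}\right) - {\Gamma^{\sigma}}_{\nu \lambda} \left(F_{\mu \sigma} + F_{\sigma \mu}\right) - {\Gamma^{\sigma}}_{\nu \mu} \left(F_{\sigma \lambda} + F_{\lambda \sigma}\right). \end{aligned}$$

Each bracket vanishes because $F$ is antisymmetric, so only the partial-derivative terms survive.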

But I can't figure out how it works for the source equation:

$$\mathcal{D}^{\mu\nu} \, = \, \frac{1}{\mu_{0}} \, g^{\mu\alpha} \, F_{\alpha\beta} \, g^{\beta\nu} \, \sqrt{-g} \\ J^{\mu} \, = \, \partial_\nu \mathcal{D}^{\mu \nu}$$

Later they also take the partial derivative, instead of the covariant derivative, of the source equation:

$$\partial_\mu J^{\mu} \, = \, \partial_\mu \partial_\nu \mathcal{D}^{\mu \nu} = 0$$

Can someone show me why we can replace covariant derivatives with partial derivatives here?

Secondly, I have trouble understanding how this is even possible. If Maxwell's equations don't depend on the Christoffel symbols, then how can light curve? And why, when we write them in terms of the potentials, do they now seem to depend on the Christoffel symbols and even the Ricci curvature?

For instance the flat space-time, inertial frame equations: $$\partial^\mu \partial_\mu A^\nu = - \mu_0 J^\nu $$ apparently turn out to generalize to: $$\nabla^\mu \nabla_\mu A^\nu = - \mu_0 J^\nu + {R^{\nu}}_{\mu} A^{\mu} $$ So now we need the covariant derivatives and even an explicit curvature term. I'm having trouble understanding / reconciling this with Maxwell's equations seeming to be independent of that. Are Maxwell's equations kind of "misleading" / "hiding" subtleties from us? What is going on?

$\endgroup$
1
  • 1
    $\begingroup$ Regarding the source terms, and to answer with the notation you seem comfortable with, I think you may have just forgotten that $D^{ab}$ is a tensor density of weight +1, so there is an extra term when expanding with the Christoffel symbols: $\nabla_a D^{ab} = \partial_a D^{ab} + {\Gamma^a}_{ca} D^{cb} + {\Gamma^b}_{ca} D^{ac} - {\Gamma^c}_{ca} D^{ab}$. The second Christoffel term vanishes because the Christoffel symbol is symmetric in its lower indices while $D$ is antisymmetric. The other two cancel each other after relabeling the repeated (dummy) indices and using the symmetry of the Christoffel symbol. $\endgroup$
    – PPenguin
    Commented Sep 12, 2017 at 14:46

3 Answers 3

8
$\begingroup$

Replacing covariants with partials:

The source equation you cite involves a skew-symmetric tensor density.

You may know that if $\nabla$ is the Levi-Civita connection, you can calculate divergences of vector fields without using the connection coefficients/Christoffel symbols: $$ \nabla_\mu X^\mu=\frac{1}{\sqrt{-g}}\partial_\mu(X^\mu\sqrt{-g}). $$

Now, let $\mathcal{J}^\mu=X^\mu\sqrt{-g}$ be a vector density. Since the Levi-Civita connection satisfies $\nabla_\mu\sqrt{-g}=0$, $$ (\nabla_\mu X^\mu)\sqrt{-g}=\nabla_\mu(X^\mu\sqrt{-g})=\nabla_\mu\mathcal{J}^\mu=\partial_\mu(X^\mu\sqrt{-g})=\partial_\mu\mathcal{J}^\mu, $$ so the divergence of a vector density can be computed with partial derivatives alone.

However, as it turns out, the situation is the same for arbitrary $(k,0)$-type antisymmetric tensor fields. Let $F^{\mu\nu}$ be such a field. Then $$ \nabla_\nu F^{\mu\nu}=\partial_\nu F^{\mu\nu}+\Gamma^\mu_{\ \nu\sigma}F^{\sigma\nu}+\Gamma^{\nu}_{\ \nu\sigma}F^{\mu\sigma}. $$ The second term here vanishes because $\Gamma$ is symmetric in its lower indices while $F$ is skew-symmetric in its upper indices, so we're left with $$ \nabla_\nu F^{\mu\nu}=\partial_\nu F^{\mu\nu}+\Gamma^\nu_{\ \nu\sigma}F^{\mu\sigma}=\partial_\nu F^{\mu\nu}+\left(\partial_\sigma\ln\sqrt{-g}\right)F^{\mu\sigma} \\ =\frac{1}{\sqrt{-g}}\partial_\nu(F^{\mu\nu}\sqrt{-g}). $$
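
The contraction $\Gamma^\nu_{\ \nu\sigma}=\partial_\sigma\ln\sqrt{-g}$ used in the last step is worth spelling out. A short sketch, assuming the Levi-Civita connection and Jacobi's formula $\partial_\sigma g = g\,g^{\nu\rho}\partial_\sigma g_{\nu\rho}$ for the determinant $g=\det g_{\mu\nu}$:

$$ \Gamma^\nu_{\ \nu\sigma}=\tfrac{1}{2}g^{\nu\rho}\left(\partial_\nu g_{\rho\sigma}+\partial_\sigma g_{\rho\nu}-\partial_\rho g_{\nu\sigma}\right)=\tfrac{1}{2}g^{\nu\rho}\partial_\sigma g_{\nu\rho}=\frac{\partial_\sigma g}{2g}=\partial_\sigma\ln\sqrt{-g}. $$

The first and third terms in the bracket cancel against each other because $g^{\nu\rho}$ is symmetric.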

Defining $\mathcal{F}^{\mu\nu}=F^{\mu\nu}\sqrt{-g}$ gives $$ \nabla_\nu \mathcal{F}^{\mu\nu}=\partial_\nu \mathcal{F}^{\mu\nu}. $$
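
The same reasoning covers the charge conservation equation from the question. A sketch, writing $\mathcal{J}^\mu=\partial_\nu\mathcal{F}^{\mu\nu}$ for the current density (dropping the constant $1/\mu_0$) and $j^\mu=\mathcal{J}^\mu/\sqrt{-g}$ for the corresponding ordinary vector field; these two symbols are my notation, not Wikipedia's:

$$ \partial_\mu\mathcal{J}^\mu=\partial_\mu\partial_\nu\mathcal{F}^{\mu\nu}=0, \qquad \nabla_\mu j^\mu=\frac{1}{\sqrt{-g}}\partial_\mu\left(\sqrt{-g}\,j^\mu\right)=\frac{1}{\sqrt{-g}}\partial_\mu\mathcal{J}^\mu=0. $$

The first expression vanishes because partial derivatives commute while $\mathcal{F}^{\mu\nu}$ is antisymmetric, so the partial-derivative statement about the density and the covariant statement about the vector say the same thing.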

Dependence of Maxwell's equations on the (pseudo-)Riemannian structure:

Remember that the fundamental field here is $A_\mu$, which is a covector field. The only Maxwell equation that doesn't depend on the Riemannian structure is $\partial_{[\mu} F_{\nu\sigma]}=0$, because you can replace the covariants here with partials due to skew-symmetry.

Also do remember that $F_{\mu\nu}=\partial_\mu A_\nu -\partial_\nu A_\mu$ is also well-defined without covariants.

When we get to the source equation things change, however, because

  • to define divergence we need upper indices, however $F$ naturally has lower indices, so we need the metric to raise them;
  • you can replace covariants with partials there only if you multiply with $\sqrt{-g}$ to create densities. Which, of course, depends on the metric.

So you can weasel out of using the connection (which is great for computations), but you cannot weasel out of using the metric, therefore Maxwell's equations are absolutely not topological.
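
As for the potential form quoted in the question, the curvature term comes from commuting covariant derivatives, not from anything hidden in the field equations. A sketch, assuming the convention $[\nabla_\mu,\nabla_\nu]A^\rho={R^\rho}_{\sigma\mu\nu}A^\sigma$ with $R_{\sigma\nu}={R^\mu}_{\sigma\mu\nu}$, the Lorenz gauge $\nabla_\mu A^\mu=0$, and the sign convention $\nabla_\mu F^{\mu\nu}=-\mu_0 J^\nu$ with $J^\nu$ an ordinary vector (chosen to match the question's formula; other references differ by signs):

$$ \begin{aligned} \nabla_\mu F^{\mu\nu} &= \nabla_\mu\nabla^\mu A^\nu-\nabla_\mu\nabla^\nu A^\mu \\ &= \nabla_\mu\nabla^\mu A^\nu-\nabla^\nu\nabla_\mu A^\mu-{R^\nu}_{\mu}A^\mu \\ &= \nabla_\mu\nabla^\mu A^\nu-{R^\nu}_{\mu}A^\mu. \end{aligned} $$

Setting this equal to $-\mu_0 J^\nu$ reproduces $\nabla^\mu\nabla_\mu A^\nu=-\mu_0 J^\nu+{R^\nu}_{\mu}A^\mu$. Written in terms of $F$, the connection drops out by antisymmetry; written in terms of second derivatives of $A$, it does not, and reordering those derivatives produces the Ricci term.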


An aside on differential forms:

You should read about differential forms. I am trying to think of a reference that a physicist could read quickly; Flanders' book is probably a good one. Anthony Zee's General Relativity and QFT books also cover differential forms, but only heuristically. Sean Carroll's GR book also has a decent account of them.

Basically, differential forms are totally antisymmetric covariant tensor fields. Instead of using index notation as in, say $\omega_{\mu_1,...,\mu_k}$ to denote them, usually 'abstract' notation is used as $\omega=\sum_{\mu_1<...<\mu_k}\omega_{\mu_1...\mu_k}\mathrm{d}x^{\mu_1}\wedge...\wedge\mathrm{d}x^{\mu_k}$, where the basis is written out explicitly. The wedge symbols are skew-symmetric tensor products.

Differential forms are good because they generalize vector calculus to higher dimensions, arbitrary manifolds and also to cases when you don't have a metric. Differential forms can be differentiated ($\omega\mapsto\mathrm{d}\omega$), where the "$\mathrm{d}$" operator, called the exterior derivative, turns a $k$-form into a $(k+1)$-form without the need for a metric or a connection, and generalizes grad, div and curl, all in one.
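
For instance, a sketch in three dimensions with coordinates $(x,y,z)$ (chosen only for illustration), for a function $f$ and a 1-form $A=A_x\mathrm{d}x+A_y\mathrm{d}y+A_z\mathrm{d}z$:

$$ \mathrm{d}f=\partial_x f\,\mathrm{d}x+\partial_y f\,\mathrm{d}y+\partial_z f\,\mathrm{d}z, \\ \mathrm{d}A=(\partial_y A_z-\partial_z A_y)\,\mathrm{d}y\wedge\mathrm{d}z+(\partial_z A_x-\partial_x A_z)\,\mathrm{d}z\wedge\mathrm{d}x+(\partial_x A_y-\partial_y A_x)\,\mathrm{d}x\wedge\mathrm{d}y. $$

The components are those of $\text{grad}\,f$ and $\text{curl}\,A$, and only partial derivatives appear. Applying $\mathrm{d}$ twice gives $\mathrm{d}(\mathrm{d}f)=\partial_\mu\partial_\nu f\,\mathrm{d}x^\mu\wedge\mathrm{d}x^\nu=0$, since the symmetric second derivative is contracted with the antisymmetric wedge product.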

The integral theorems of Green, Gauss and Stokes are also generalized.

The point is, if you also have a metric, the theory of differential forms is enriched. You get an option to turn $k$-forms into $(n-k)$-forms via the Hodge star ($n$ is the dimension of your manifold), and also to define a "dual" operation to the exterior derivative, called the codifferential. The codifferential essentially brings the concept of divergence to differential forms.

Written with differential forms, Maxwell's equations are given by $$ \mathrm{d}F=0 \\ \mathrm{d}^\dagger F=kJ, $$ where $\mathrm{d}^\dagger$ is the codifferential, and $k$ is some constant I care not about right now.

I am noting two things:

  • The $F$ field strength 2-form is given by $F=\mathrm{d}A$, where $A$ is of course the 4-potential. The exterior derivative satisfies $\mathrm{d}^2=0$ (think of $\text{div}\ \text{curl}=0$ and $\text{curl}\ \text{grad}=0$), so with potentials, the first equation is $\mathrm{d}F=\mathrm{d}^2A=0$, which is trivially true.
  • The first equation contains only $\mathrm{d}$, which is well-defined without a metric. The second equation depends on the codifferential $\mathrm{d}^\dagger$, which does depend on the metric. There is your metric dependence!
$\endgroup$
1
  • $\begingroup$ Explain the downvote please. The answer addresses the question and is factually correct. $\endgroup$ Commented Sep 24, 2017 at 13:54
2
$\begingroup$

Bence Racskó's answer is great! I want to add something. You're saying that:

"In the wikipedia article Maxwell's equations in curved spacetime, it states without calculation that despite the use of partial derivatives, the equations are invariant under arbitrary curvilinear coordinate transformations. Because of symmetry, it is not too hard to see that:"

$F_{ab} \, = \, \nabla_a A_b \, - \, \nabla_b A_a = (\partial_a A_b -{\Gamma^c}_{ab} A_c) - (\partial_b A_a - {\Gamma^c}_{ba} A_c) = \partial_a A_b - \partial_b A_a - {\Gamma^c}_{ab} A_c + {\Gamma^c}_{ba} A_c$

"And the last two terms cancel because the Christoffel symbols are symmetric in the lower indicies."

The Faraday tensor $\textbf{is defined}$ as $F_{\mu\nu} = \partial_{\mu}A_{\nu} - \partial_{\nu}A_{\mu}$. Suppose we allow torsion. Then we can no longer say $\Gamma^{k}_{\mu\nu} = \Gamma^{k}_{\nu\mu}$, since the connection coefficients are no longer symmetric in their lower indices. $\textbf{This does not mean that the Faraday tensor will change.}$ My point is that the Faraday tensor takes the form $F_{\mu\nu} = \partial_{\mu}A_{\nu} - \partial_{\nu}A_{\mu}$ not because "the Christoffel symbols are symmetric in the lower indices", but because it is defined that way.
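
To make the distinction concrete, here is a sketch assuming the torsion tensor is defined as ${T^c}_{ab} = {\Gamma^c}_{ab} - {\Gamma^c}_{ba}$ (sign and index-order conventions vary between references):

$$ \nabla_a A_b - \nabla_b A_a = \partial_a A_b - \partial_b A_a - {T^c}_{ab} A_c. $$

With torsion, the antisymmetrized covariant derivative differs from the Faraday tensor by the last term. That term is also not invariant under the gauge transformation $A_c \to A_c + \partial_c \chi$, which is one more reason to take the partial-derivative expression as the definition.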

$\endgroup$
0
$\begingroup$

The covariance with respect to arbitrary coordinate transformations is expressed directly as the invariance of the potential one form $A = A_μ dx^μ$ and the field strength two-form $F = ½ F_{μν} dx^μ ∧ dx^ν$. They are geometric invariants just as much as is $dx^μ ∂_μ$. Correspondingly, the equations $dA = F$ and $dF = 0$ remain unchanged under arbitrary coordinate transforms. In any given coordinate system, their component forms are the equations you cited - with partial derivatives, not covariant derivatives.

There's no involvement of the metric at all. The operator "d" is an instance of a "natural operator" and the objects are called "natural objects". They are defined in a geometry at a much deeper level in which there is no notion of metrics, connections, parallelism, congruence, orthogonality, angle, speed or anything of the like.

Similarly, the response fields and sources arise at this level, too - out of the action principle, which is stated in integral form $$S = \int L, \hspace 1em L = 𝔏 d^4x,$$ in terms of a Lagrangian 4-form $L$, with the Lagrangian density $𝔏$ being its component. They arise as the derivatives of the Lagrangian, and can be expressed in terms of the variation of the Lagrangian 4-form as: $$ΔS = \int ΔL, \hspace 1em ΔL = (ΔA) ∧ J - (ΔF) ∧ G.$$ In 4-D, $J$ will be a 3-form ... the 3-current density, and $G$ will be a 2-form. You can read the second set of equations directly off of this as the Euler-Lagrange equations by integrating by parts: $$(ΔF) ∧ G = (Δ(dA)) ∧ G = d(ΔA) ∧ G = d(ΔA ∧ G) + ΔA ∧ dG,$$ remembering that for odd-degree forms, such as $ΔA$, the Leibniz rule picks up a minus sign in its second term: $$d(ΔA ∧ \_) = d(ΔA) ∧ (\_) - ΔA ∧ d(\_).$$ Substituting back into the integral, this leads to: $$ΔS = -\int d((ΔA) ∧ G) + \int (ΔA) ∧ (J - dG).$$ The boundary integral $\int d((ΔA) ∧ G)$ drops out from the analysis (by Stokes' theorem, for variations $ΔA$ that vanish on the boundary), leaving you with the Euler-Lagrange equation $J = dG$ and - as a consequence - $dJ = 0$.

All this is entirely non-metrical and lives at the deeper layer in geometry, so it involves no metrics or connections. In component form, the response field 2-form and source 3-form would be: $$G = ½ 𝔊^{μν} ∂_ν ˩ ∂_μ ˩ d^4 x, \hspace 1em J = 𝔍^μ ∂_μ ˩ d^4 x,$$ where the contraction operator $(\_)˩(\_)$ is defined recursively by $$∂_μ ˩ \left(dx^ν ∧ α\right) = δ_μ^ν α - dx^ν \left(∂_μ ˩ α\right), \hspace 1em ∂_μ ˩ f = 0, $$ for differential forms $α$ and 0-forms/scalars $f$.

That's also a "natural operation" and is totally non-metrical. That's in contrast with the Hodge duality operator, which is given by $$\star{\left(dx^μ ∧ ⋯ ∧ dx^ν\right)} = ∂^ν ˩ ⋯ ˩ ∂^μ \sqrt{|g|} d^4 x = g^{μμ'} ⋯ g^{νν'} ∂_{ν'} ˩ ⋯ ˩ ∂_{μ'} \sqrt{|g|} d^4 x,$$ where the dependence on a metric $g_{μν}$ is put clearly on display.

The forms $A$, $F$, $G$, $J$ are written in more familiar terms, as: $$ A = 𝐀·d𝐫 - φ dt, \hspace 1em F = 𝐁·d𝐒 + 𝐄·d𝐫∧dt, \\ G = 𝐃·d𝐒 - 𝐇·d𝐫∧dt, \hspace 1em J = ρdV - 𝐉·d𝐒∧dt, $$ where $$d𝐫 = (dx, dy, dz), \hspace 1em d𝐒 = (dy∧dz, dz∧dx, dx∧dy), \hspace 1em dV = dx∧dy∧dz.$$

This corresponds to the coordinates, operators and components: $$ t = x^0, \hspace 1em 𝐫 = (x, y, z) = \left(x^1, x^2, x^3\right), \\ \frac{∂}{∂t} = ∂_0, \hspace 1em ∇ = \left(\frac{∂}{∂x}, \frac{∂}{∂y}, \frac{∂}{∂z}\right) = \left(∂_1, ∂_2, ∂_3\right), \\ φ = -A_0, 𝐀 = \left(A_x, A_y, A_z\right) = \left(A_1, A_2, A_3\right), \\ 𝐁 = \left(B^x, B^y, B^z\right) = \left(F_{23}, F_{31}, F_{12}\right), 𝐄 = \left(E_x, E_y, E_z\right) = \left(F_{10}, F_{20}, F_{30}\right), \\ 𝐃 = \left(D^x, D^y, D^z\right) = \left(𝔊^{01}, 𝔊^{02}, 𝔊^{03}\right), 𝐇 = \left(H_x, H_y, H_z\right) = \left(𝔊^{23}, 𝔊^{31}, 𝔊^{12}\right), \\ ρ = 𝔍^0, \hspace 1em 𝐉 = \left(J^x, J^y, J^z\right) = \left(𝔍^1, 𝔍^2, 𝔍^3\right), $$ except that $x$, $y$, $z$ don't have to be Cartesian coordinates - or even space-like coordinates at all, nor does $t$ have to be a time-coordinate or even time-like. They can denote any four independent functions of the coordinates.
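
With these identifications, the four form equations unpack into the familiar vector equations. A sketch of the bookkeeping, using only the definitions above together with $dt∧d𝐒 = d𝐒∧dt$ and $dt∧dV = -dV∧dt$:

$$ \begin{aligned} F = dA &: \hspace 1em 𝐁 = ∇×𝐀, \hspace 1em 𝐄 = -∇φ - \frac{∂𝐀}{∂t}, \\ dF = 0 &: \hspace 1em ∇·𝐁 = 0, \hspace 1em ∇×𝐄 + \frac{∂𝐁}{∂t} = 0, \\ J = dG &: \hspace 1em ∇·𝐃 = ρ, \hspace 1em ∇×𝐇 - \frac{∂𝐃}{∂t} = 𝐉, \\ dJ = 0 &: \hspace 1em \frac{∂ρ}{∂t} + ∇·𝐉 = 0. \end{aligned} $$

In index form, $J = dG$ reads $𝔍^μ = ∂_ν 𝔊^{μν}$ and $dJ = 0$ reads $∂_μ 𝔍^μ = 0$, which are the source and charge conservation equations from the question, with partial derivatives only.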

You ask: how, then, does one get light-speed motion out of this, if there is no metric-dependence or any reference to such primitives of chrono-geometry, such as orthogonality, space-like versus time-like, speed, congruence, distance, etc.? The answer is that it doesn't come from there.

The empirical content conveyed by the theory doesn't reside with those equations at all! They're just a framework. The only actual empirical statement being made by them is that the system in question (here: the electromagnetic field) actually has those attributes $A$ (and, thus $F$) as part of its description, and that its dynamics is described in terms of some first-order Lagrangian (and thus, that there should be $G$ and $J$ involved in the description of its dynamics).

The content resides with the Lagrangian, and with the relations conveyed by it; namely, the constitutive relations, here expressed in general form in terms of the Lagrangian density by: $$ρ = -\frac{∂𝔏}{∂φ}, \hspace 1em 𝐉 = \frac{∂𝔏}{∂𝐀}, \hspace 1em 𝐃 = \frac{∂𝔏}{∂𝐄}, \hspace 1em 𝐇 = -\frac{∂𝔏}{∂𝐁}.$$

In contrast to the equations already laid out, they are not symmetric under arbitrary coordinate transforms. In fact, the whole point behind the choice of $𝔏$ is to call out a specific set of symmetries: gauge symmetry and Lorentz symmetry.

Gauge symmetry - i.e. symmetry under gauge transforms: $$δA = -dχ \hspace 1em⇒\hspace 1em (δφ, δ𝐀) = \left(\frac{∂χ}{∂t}, -∇χ\right),$$ mandates that an $𝔏$ that is a function of $A$ and its first derivatives (and other fields that have non-trivial transforms under this gauge transform) may not depend directly on $A$, may depend on the gradients of $A$ only in the anti-symmetric combinations $∂_μ A_ν - ∂_ν A_μ$ that make up the components of $F$, and may depend on the gradients $∂q$ of other fields $q$ that have non-trivial transforms under $χ$ only through their gauge-covariant derivatives $∇_A q = ∂q + (⋯A⋯q⋯)$ (out of which, an indirect dependency on $A$ in the Lagrangian may arise).

That's an instance of Utiyama's Theorem.

The requirement of Lorentz symmetry mandates that the Lagrangian's dependence on $F$ only be through its Lorentz invariants: $$ℑ^1 = \frac{E^2 - B^2 c^2}{2}, \hspace 1em ℑ^2 = 𝐁·𝐄,$$ and that most definitely breaks general coordinate covariance and calls out a specific geometry and has explicit dependence on a metric - one that is locally Minkowski.

For instance, a Lagrangian density of the general form $$𝔏 = 𝔏_0\left(ℑ^1, ℑ^2\right) + 𝔏_1\left(q, ∇_Aq\right)$$ yields constitutive relations of the form $$ρ = -\frac{∂𝔏_1}{∂φ}, \hspace 1em 𝐉 = \frac{∂𝔏_1}{∂𝐀}$$ for the sources, which depend on what extra fields are present, and has response fields determined by the derivatives of $𝔏_0$: $$ε_1 = \frac{∂𝔏_0}{∂ℑ^1}, \hspace 1em ε_2 = \frac{∂𝔏_0}{∂ℑ^2},$$ given by: $$𝐃 = ε_1 𝐄 + ε_2 𝐁, \hspace 1em 𝐇 = ε_1 c^2 𝐁 - ε_2 𝐄,$$ where $ε_1\left(ℑ^1, ℑ^2\right)$ and $ε_2\left(ℑ^1, ℑ^2\right)$ are generally functions of the invariants $ℑ^1$ and $ℑ^2$ satisfying: $$\frac{∂ε_1}{∂ℑ^2} = \frac{∂ε_2}{∂ℑ^1}.$$ You'll recognize $ε_1$ as the permittivity, while $ε_2$ is the axial (parity-violating) version of permittivity.
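
The response-field relations follow from the chain rule. A sketch, using only the definitions of $ℑ^1$ and $ℑ^2$ given above:

$$ 𝐃 = \frac{∂𝔏_0}{∂𝐄} = ε_1 \frac{∂ℑ^1}{∂𝐄} + ε_2 \frac{∂ℑ^2}{∂𝐄} = ε_1 𝐄 + ε_2 𝐁, \\ 𝐇 = -\frac{∂𝔏_0}{∂𝐁} = -ε_1 \frac{∂ℑ^1}{∂𝐁} - ε_2 \frac{∂ℑ^2}{∂𝐁} = ε_1 c^2 𝐁 - ε_2 𝐄. $$

The condition relating $ε_1$ and $ε_2$ is just the equality of the mixed second derivatives of $𝔏_0$ with respect to $ℑ^1$ and $ℑ^2$.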

For null fields $ℑ^1 = 0$ and $ℑ^2 = 0$, these reduce to constants, $ε_1(0, 0)$ and $ε_2(0, 0)$, the latter of which can be set to 0 by redefining the response fields as: $$𝐃 → 𝐃 - ε_2(0, 0) 𝐁, \hspace 1em 𝐇 → 𝐇 + ε_2(0, 0) 𝐄,$$ without affecting the Euler-Lagrange equations.

That essentially recovers the constitutive relations: $$𝐃 = ε_0 𝐄, \hspace 1em 𝐁 = μ_0 𝐇, \hspace 1em ε_0 = ε_1(0, 0), \hspace 1em μ_0 = \frac{1}{ε_1(0,0)c^2},$$ that come out of the Maxwell-Lorentz Lagrangian density: $$\frac{ε_0|𝐄|^2}{2} - \frac{|𝐁|^2}{2μ_0}.$$

That's where the metric-dependence resides ... and where all the metric-dependence is confined to. That's where the light speed trajectories arise from.

It's coming out of the constitutive laws, not out of the other equations.
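
To see the light speed emerge explicitly, here is a sketch for the source-free case ($ρ = 0$, $𝐉 = 0$), assuming Cartesian coordinates of an inertial frame so that the constitutive relations take the form $𝐃 = ε_0 𝐄$, $𝐁 = μ_0 𝐇$ with constant coefficients:

$$ ∇×(∇×𝐄) = -\frac{∂}{∂t}(∇×𝐁) = -μ_0 \frac{∂}{∂t}(∇×𝐇) = -μ_0 \frac{∂^2 𝐃}{∂t^2} = -μ_0 ε_0 \frac{∂^2 𝐄}{∂t^2}. $$

Since $∇·𝐄 = 0$ here, the left side equals $-∇^2 𝐄$, giving the wave equation $∇^2 𝐄 = \frac{1}{c^2} \frac{∂^2 𝐄}{∂t^2}$ with $c^2 = 1/(ε_0 μ_0)$. The metric-free equations supply only the first and third equalities; the second and fourth, where the speed enters, are the constitutive relations.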

$\endgroup$
