2
$\begingroup$

The Lagrangian for pure Yang-Mills theory is given by $$-\frac14 F^{a\mu\nu}F^a_{\mu\nu} \tag{1}$$ where $$F^a_{\mu\nu} = \partial_\mu A_\nu^a - \partial_\nu A_\mu^a + gf^{abc}A^b_\mu A_\nu^c.\tag{2}$$ Here $A_\mu$ is the gauge field, $g$ is the coupling constant, and $f^{abc}$ are structure constants.

The typical motivation for $F_{\mu\nu}^a$ given in textbooks is that it arises from the commutator of gauge covariant derivatives, $$F_{\mu\nu}\sim [D_\mu, D_\nu].\tag{3}$$ From a more mathematical point of view, the gauge field $A$ is a connection 1-form on a principal $G$-bundle, and thus $F$ is its curvature 2-form.
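
To make (3) concrete, here is a short worked version. It assumes the convention $D_\mu = \partial_\mu - igA_\mu^aT^a$ with generators satisfying $[T^a,T^b]=if^{abc}T^c$ and totally antisymmetric $f^{abc}$; none of this is fixed by the text above, and other sign conventions change the prefactor:

$$[D_\mu,D_\nu] = -ig\left(\partial_\mu A_\nu^a-\partial_\nu A_\mu^a\right)T^a - g^2A_\mu^bA_\nu^c\,[T^b,T^c] = -ig\left(\partial_\mu A_\nu^a-\partial_\nu A_\mu^a+gf^{abc}A_\mu^bA_\nu^c\right)T^a = -igF^a_{\mu\nu}T^a.$$

Under these assumptions the bare derivatives cancel, so the commutator acts as a multiplication operator and reproduces (2).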

My questions are as follows:

  1. Why is the commutator $[D_\mu, D_\nu]$ important and how does it motivate the Lagrangian (1)?
  2. How do we know this is the "right" Lagrangian for a pure gauge theory? It can be shown that there cannot be a mass term due to gauge invariance, but what about other terms? Other than the fact that this is also the Lagrangian of classical electrodynamics, why $F^{a\mu\nu}F^a_{\mu\nu}$?
  3. In terms of geometry, how does the Lagrangian (1) follow from the fact that $F^a_{\mu\nu}$ is a curvature 2-form? In other words, when varying (1) to obtain the equations of motion, why is it important to minimize the $L^2$ norm of the curvature?
$\endgroup$
2
  • $\begingroup$ You are looking for something that simply doesn't exist. There is no nice philosophical reason why our universe must work the way it does. I once sat in a lecture of Yang, that of Yang-Mills, and he simply replied to a question similar to yours, that it was pure curiosity to try such mathematical form. In particular, other people tried making theories without the $f^{abc}A_\mu^bA_\nu^c$ term, but when churning through the standard QFT methods, such terms abound. So they simply tried to use a term like this to cancel them out, and it worked. Their guiding principle was symmetry. $\endgroup$ Commented Jun 13 at 1:59
  • $\begingroup$ @naturallyInconsistent Thank you, I suppose I will have to just take it for what it is. I will leave the question up in case anyone has anything to add about minimizing the curvature, I have a feeling there could be something deep there. $\endgroup$
    – CBBAM
    Commented Jun 13 at 2:37

3 Answers

2
$\begingroup$

You can look at this from various angles, but the Lagrangean as you wrote it is (basically) fixed by basic QFT considerations:

  • The $A_\mu^a$ are the basic degrees of freedom of the theory. This is the difference from GR, where the $\Gamma_{\mu\nu}^\rho$ (or the spin connection $\omega^{ab}_\mu$), which is the analogous object, is not fundamental but derived from the metric.
  • The $A_\mu^a$ are bosons, and you want to impose Poincaré invariance, i.e. the mass-shell condition $P^2=M^2$. Hence, you want an equation of motion of the form $$\Box A=M^2 A+\text{other stuff, including interactions}.$$ So there should be a "kinetic term" in the Lagrangean with two derivatives and two powers of $A$.
  • You want gauge invariance to decouple the ghost (the unphysical negative-energy component).
  • You don't want terms with mass dimension larger than four, so up to four fields+derivatives.

I haven't checked it, but I guess the gauge field Lagrangean is already fixed (up to normalisation) by these requirements.
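
As a rough check of the counting in the last bullet, assuming four spacetime dimensions and natural units (so the action is dimensionless and $[\partial_\mu]=1$), the mass dimensions are

$$[A_\mu^a]=1,\qquad [g]=0,\qquad [F^a_{\mu\nu}]=2,\qquad [F^{a\mu\nu}F^a_{\mu\nu}]=4.$$

A further gauge-invariant contraction of field strengths, for example $f^{abc}F^{a}{}_{\mu}{}^{\nu}F^{b}{}_{\nu}{}^{\rho}F^{c}{}_{\rho}{}^{\mu}$, already has dimension six. The one other dimension-four candidate I am aware of is $\epsilon^{\mu\nu\rho\sigma}F^a_{\mu\nu}F^a_{\rho\sigma}$, which is a total derivative and so does not change the classical equations of motion.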

$\endgroup$
4
  • $\begingroup$ This is a much better answer than the accepted one. $\endgroup$
    – MadMax
    Commented Jun 13 at 14:55
  • $\begingroup$ Thank you. Would you be able to elaborate on your second point? Wouldn't we also want to impose the Lorentz group for fermions? How did you obtain that equation of motion for $A$? $\endgroup$
    – CBBAM
    Commented Jun 13 at 18:10
  • $\begingroup$ "You don't want terms with mass dimension larger than four": this is a point the other answers failed to mention. If we lift the restriction of "mass dimension four", there could be an infinite number of Lagrangian terms in addition to the Yang-Mills term that satisfy both Lorentz invariance and gauge invariance. This is exactly the reason this answer is superior to the other answer. $\endgroup$
    – MadMax
    Commented Jun 13 at 19:22
  • $\begingroup$ @CBBAM For fermions, you also have the Lorentz (Poincaré actually, with the $P_\mu$) group, of course, but the representations are constructed from the Dirac algebra (it's actually the covering group $Spin(1,3)$). Thus, you can form an equation with $\gamma^\mu P_\mu$, i.e. the Dirac equation, first order in derivatives. Hence, the kinetic term has two fields, but only one derivative for fermions (and fermions have mass dimension $3/2$). $\endgroup$
    – Toffomat
    Commented Jun 13 at 21:10
1
$\begingroup$

Toffomat's answer is a great answer. Let me try to add a little bit more physical motivation for what it means to "decouple the ghosts."

SUMMARY: The requirement of gauge invariance comes out of trying to describe a relativistic theory of quantum particles with local Lorentz-invariant field theories. Specifically, trying to construct a local field theory of spin-1 particles is surprisingly challenging, and one of the few ways to make it work is to adopt gauge invariance.

LONG ANSWER: We believe that particles are ultimately irreducible unitary (or projective) representations of the Poincare group. The math works out so that these are defined by an energy-momentum relation $p^2 = m^2$ together with a unitary irreducible representation of $SU(2)$ (ignoring massless particles for now). These are the familiar "spin-0", "spin 1/2", etc. representations.

On the other hand, we believe that to represent local dynamics for these particles we need to describe the equations of motion for the particles using fields, specifically using field functions that are irreducible representations of the Lorentz group. Despite sounding similar, this is actually a very different requirement than the one above. Unitary irreducible representations of the Poincare group are unitary but infinite-dimensional. Irreducible representations of the Lorentz group are finite-dimensional but non-unitary.

Most importantly, irreducible representations of the Lorentz group are equivalent to irreducible representations of $SU(2)\times SU(2)$. So while particles are naturally "spin 1/2" or "spin 1", fields are naturally "spin 1/2 times spin 1/2". To construct a local field theory of relativistic particles we have to overcome this mismatch.

The big idea is that we have to find a subset of the field variable that corresponds to the particle representation. In the case of spin-0 it is easy. Representation theory tells us that "spin 0 times spin 0" = "spin 0" and so making the correspondence is simple. Spin 1/2 is also pretty simple. Dirac spinors are basically the field representation "spin 1/2 times spin 0" = "spin 1/2".

We first encounter a problem with spin 1. You can try "spin 1 times spin 0" = "spin 1" and this is valid, but it gives you unfamiliar equations of motion so no one is very interested in it. This would correspond to trying to make $F_{\mu\nu}$ your fundamental fields.

The other thing to try is "spin 1/2 times spin 1/2", which corresponds to a vector field $A_\mu$. The problem is that in terms of representation theory "spin 1/2 times spin 1/2" = "spin 1 plus spin 0." We want to make the "spin 1" part correspond to our particle, but we have to deal with the extra "spin 0" without introducing an extra particle. The first thing to try is separating out the spin 1 part of the field, for example $A^i$, from the spin 0 part, $A^0$, and treating them separately, writing equations of motion (or a Lagrangian) that do not contain the spin-0 part. But that doesn't work because this separation is not Lorentz invariant. There are Lorentz transformations that mix $A^i$ and $A^0$, but Lorentz transformations can't change a spin 1 particle into a spin 0 particle.
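
A small worked version of this counting, labelling Lorentz representations by a pair of spins $(j_1,j_2)$ (a standard labelling, though not used explicitly above): under the rotation subgroup the vector field decomposes as

$$\left(\tfrac12,\tfrac12\right)\;\to\;\tfrac12\otimes\tfrac12 = 1\oplus 0,\qquad 2\times 2 = 3+1,$$

with the three components $A^i$ forming the spin 1 and $A^0$ the spin 0. A boost with velocity $\beta$ along $x$ (units with $c=1$ and $\gamma = 1/\sqrt{1-\beta^2}$) mixes them:

$$A'^0=\gamma\left(A^0-\beta A^1\right),\qquad A'^1=\gamma\left(A^1-\beta A^0\right),$$

which is why the split into $A^i$ and $A^0$ cannot be Lorentz invariant.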

But there is a way around it. You start with some separation of $A^\mu$ into a spin 1 part and a spin 0 part and ask how those parts mix into each other when you do a Lorentz transformation. Then if you can find equations of motion for the spin 1 part with no equations of motion for the spin 0 part and those equations of motion are invariant under the mixing transformation then you have equations of motion that really do describe just one spin-1 particle without the extra spin-0 particle.

If you do this for a vector field you find that the mixing term you have to avoid is $A_\mu \mapsto A_\mu + \partial_\mu f$ where $f$ is some arbitrary function. This is the requirement of gauge invariance.
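
For the abelian case this is quick to check. A minimal sketch, taking $F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu$ (equation (2) of the question with $f^{abc}=0$) and assuming $f$ is smooth so that partial derivatives commute:

$$F_{\mu\nu}\;\mapsto\;\partial_\mu\left(A_\nu+\partial_\nu f\right)-\partial_\nu\left(A_\mu+\partial_\mu f\right)=F_{\mu\nu}+\left(\partial_\mu\partial_\nu-\partial_\nu\partial_\mu\right)f=F_{\mu\nu}.$$

So any Lagrangian built only from $F_{\mu\nu}$, such as $-\frac14F^{\mu\nu}F_{\mu\nu}$, is automatically blind to the shift.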

But this is not the only solution! One might ask whether there are other ways to do it. It turns out that the most general way to have a theory of spin-1 particles using vector fields is to have a theory that is invariant under local non-abelian group transformations, exactly as in Yang-Mills theory.

Weinberg does all the gory details of this proof in his quantum field theory book. I believe Schwartz's book "Quantum Field Theory and the Standard Model" also gives a good summary of this argument in his early chapters. He phrases it in terms of "degrees of freedom," referring to the mismatch between the "spin 1" particle and the "spin 1 plus spin 0" field. I believe my outline here is cribbing heavily from him.

Note that this whole problem exists even for massive spin-1 particles. Gauge theories naturally lend themselves to describing massless particles because the natural mass term $m^2 A_\mu A^\mu$ is not gauge-invariant, but vector-field theories of massive spin-1 particles must also be gauge-invariant, leading one naturally to Higgs theory etc.
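
The non-invariance of the mass term is a one-line computation in the abelian case, using the shift $A_\mu\mapsto A_\mu+\partial_\mu f$ from above:

$$m^2A_\mu A^\mu\;\mapsto\;m^2A_\mu A^\mu+2m^2A^\mu\partial_\mu f+m^2\,\partial_\mu f\,\partial^\mu f,$$

and the extra terms do not vanish for general $f$.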

$\endgroup$
2
  • $\begingroup$ Thank you, this is an amazing answer! I think my understanding of particles and QFT from a representation theory point of view is lacking. You mention Schwartz's book as a good summary, would this be enough to learn from or does one have to consult Weinberg (or another source)? $\endgroup$
    – CBBAM
    Commented Jun 13 at 18:18
  • 1
    $\begingroup$ I think of Weinberg as the ultimate source, but it is only really helpful once you already have a sense of the overall point because he writes pretty densely. I think Schwartz and Peskin do a pretty good job giving an introduction to how representation theory comes into QFT, though I don't remember exact chapters. $\endgroup$ Commented Jun 13 at 18:33
1
$\begingroup$

Classical Yang-Mills theory is a generalisation of classical EM. More precisely, it generalises the gauge structure group from an abelian group like $U(1)$ to a non-abelian group like $SU(2)$ or $SU(3)$. (I say gauge structure group because physicists call two different but related groups the gauge group, whereas mathematicians call one of these the structure group, so a portmanteau of the terms seemed best.)

In the language of differential forms on manifolds, the equations of motion (eom) for EM are:

$dF = 0$ and $\delta F = J$

Whilst the eom for YM are:

$d^{\nabla} F = 0$ and $\delta^{\nabla} F = J$

The parallel between these two sets of equations is obvious. (Here, $\nabla$ is the connection whilst $d^{\nabla}$ is the exterior covariant derivative.)
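
In components, a sketch of the same pair. It assumes the conventions of the question's equation (2) together with the adjoint covariant derivative $(D_\mu F_{\nu\rho})^a=\partial_\mu F^a_{\nu\rho}+gf^{abc}A^b_\mu F^c_{\nu\rho}$ (my assumption; signs and factors in front of $J$ vary between references):

$$D_\mu F^a_{\nu\rho}+D_\nu F^a_{\rho\mu}+D_\rho F^a_{\mu\nu}=0,\qquad D_\mu F^{a\mu\nu}=J^{a\nu}.$$

Setting $f^{abc}=0$ turns $D_\mu$ into $\partial_\mu$ and recovers the EM pair. The first equation is the Bianchi identity and holds for any connection; the second, with $J=0$, is what one gets by varying $-\frac14F^{a\mu\nu}F^a_{\mu\nu}$ with respect to $A^a_\nu$.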

$\endgroup$
3
  • $\begingroup$ Thank you. So the motivation is nothing more than generalizing EM by replacing $U(1)$ with a more general group? If so, why do most physics books emphasize the commutator $[D_\mu, D_\nu]$? $\endgroup$
    – CBBAM
    Commented Jun 13 at 5:09
  • 1
    $\begingroup$ That's obviously because that is the most salient difference that happens when you replace the Abelian U(1) by non-Abelian SU(n); and indeed, that is what Yang and Mills were also focused upon when they were deriving the theory. $\endgroup$ Commented Jun 13 at 5:17
  • 2
    $\begingroup$ @CBBAM: That's motivation given by looking at the situation in hindsight. Yang & Mills were driven by physics reasons whilst mathematicians who developed the fibre bundle and differential forms language were driven by natural questions in the math. Yang is on record as saying he was overawed when he realised that field strength could be understood as curvature as defined by the mathematicians. $\endgroup$ Commented Jun 13 at 5:24
