125
$\begingroup$

I am a mathematics student with a hobby interest in physics. This means that I've taken graduate courses in quantum dynamics and general relativity without the bulk of undergraduate physics courses, and without the sheer volume of training in physical tools and mindset that the other students in those courses had, such as Noether's theorem, Lagrangian and Hamiltonian mechanics, statistical methods, and so on.

The courses themselves went well enough. My mathematical experience more or less made up for a lacking physical understanding. However, I still haven't found an elementary explanation of gauge invariance (if there is such a thing). I am aware of some examples, like how the magnetic potential is unique only up to a (time-)constant gradient. I also came across it in linearised general relativity, where there are several different perturbations to the spacetime metric that give the same observable dynamics.

However, to really understand what's going on, I like to have simpler examples. Unfortunately, I haven't been able to find any. I guess, since "gauge invariance" is such a frightening phrase, no one uses it when writing for a high school student.

So, my (very simple) question is: In many high school physics calculations, you measure or calculate time, distance, potential energy, temperature, and other quantities. These calculations very often depend only on the difference between two values, not the concrete values themselves. You are therefore free to choose a zero to your liking. Is this an example of gauge invariance in the same sense as the graduate examples above? Or are these two different concepts?

$\endgroup$
2
  • 1
    $\begingroup$ If you like this question you may also enjoy reading this Phys.SE post. $\endgroup$
    – Qmechanic
    Commented Jul 10, 2016 at 19:15
  • $\begingroup$ John Baez writes: "The gauge principle says, in simple terms, that you can only tell if two particles are in the same state if you move them next to each other so you can compare them. Working out the mathematical consequences of this principle leads to gauge theories which explain the forces we see in nature." $\endgroup$ Commented Nov 25, 2017 at 5:09

9 Answers

101
$\begingroup$

The reason that it's so hard to understand what physicists mean when they talk about "gauge freedom" is that there are at least four inequivalent definitions that I've seen used:

  • Definition 1: A mathematical theory has a gauge freedom if some of the mathematical degrees of freedom are "redundant" in the sense that two different mathematical expressions describe the exact same physical system. Then the redundant (or "gauge dependent") degrees of freedom are "unphysical" in the sense that no possible experiment could uniquely determine their values, even in principle. One famous example is the overall phase of a quantum state - it's completely unmeasurable and two vectors in Hilbert space that differ only by an overall phase describe the exact same state. Another example, as you mentioned, is any kind of potential which must be differentiated to yield a physical quantity - for example, a potential energy function. (Although some of your other examples, like temperature, are not examples of gauge-dependent quantities, because there is a well-defined physical sense of zero temperature.)

    For physical systems that are described by mathematical structures with a gauge freedom, the best way to mathematically define a specific physical configuration is as an equivalence class of gauge-dependent functions which differ only in their gauge degrees of freedom. For example, in quantum mechanics, a physical state isn't actually described by a single vector in Hilbert space, but rather by an equivalence class of vectors that differ by an overall scalar multiple. Or more simply, by a line of vectors in Hilbert space. (If you want to get fancy, the space of physical states is called a "projective Hilbert space," which is the set of lines in Hilbert space, or more precisely a version of the Hilbert space in which vectors are identified if they are proportional to each other.) I suppose you could also define "physical potential energies" as sets of potential energy functions that differ only by an additive constant, although in practice that's kind of overkill. These equivalence classes remove the gauge freedom by construction, and so are "gauge invariant."

    Sometimes (though not always) there's a simple mathematical operation that removes all the redundant degrees of freedom while preserving all the physical ones. For example, given a potential energy, one can take the gradient to yield a force field, which is directly measurable. And in the case of classical E&M, there are certain linear combinations of partial derivatives that reduce the potentials to directly measurable ${\bf E}$ and ${\bf B}$ fields without losing any physical information. However, in the case of a vector in a quantum Hilbert space, there's no simple derivative operation that removes the phase freedom without losing anything else.

  • Definition 2: The same as Definition 1, but with the additional requirement that the redundant degrees of freedom be local. What this means is that there exists some kind of mathematical operation that depends on an arbitrary smooth function $\lambda(x)$ on spacetime that leaves the physical degrees of freedom (i.e. the physically measurable quantities) invariant. The canonical example of course is that if you take any smooth function $\lambda(x)$, then adding $\partial_\mu \lambda(x)$ to the electromagnetic four-potential $A_\mu(x)$ leaves the physical quantities (the ${\bf E}$ and ${\bf B}$ fields) unchanged. (In field theory, the requirement that the "physical degrees of freedom" are unchanged is phrased as requiring that the Lagrangian density $\mathcal{L}[\varphi(x)]$ be unchanged, but other formulations are possible.) This definition is clearly much stricter - the examples given above in Definition 1 don't count under this definition - and most of the time when physicists talk about "gauge freedom" this is the definition they mean. In this case, instead of having just a few redundant/unphysical degrees of freedom (like the overall constant for your potential energy), you have a continuously infinite number. (To make matters even more confusing, some people use the phrase "global gauge symmetry" in the sense of Definition 1 to describe things like the global phase freedom of a quantum state, which would clearly be a contradiction in terms in the sense of Definition 2.)

    It turns out that in order to deal with this in quantum field theory, you need to substantially change your approach to quantization (technically, you need to "gauge fix your path integral") in order to eliminate all the unphysical degrees of freedom. When people talk about "gauge invariant" quantities under this definition, in practice they usually mean the directly physically measurable derivatives, like the electromagnetic tensor $F_{\mu \nu}$, that remain unchanged ("invariant") under any gauge transformation. But technically, there are other gauge-invariant quantities as well, e.g. a uniform quantum superposition of $A_\mu(x) + \partial_\mu \lambda(x)$ over all possible $\lambda(x)$ for some particular $A_\mu(x).$

    See Terry Tao's blog post for a great explanation of this second sense of gauge symmetry from a more mathematical perspective.

  • Definition 3: A Lagrangian is sometimes said to possess a "gauge symmetry" if there exists some operation that depends on an arbitrary continuous function on spacetime that leaves it invariant, even if the degrees of freedom being changed are physically measurable.

  • Definition 4: For a "lattice gauge theory" defined on local lattice Hamiltonians, there exists an operator supported on each lattice site that commutes with the Hamiltonian. In some cases, this operator corresponds to a physically measurable quantity.

The cases of Definitions 3 and 4 are a bit conceptually subtle so I won't go into them here - I can address them in a follow-up question if anyone's interested.
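The two examples given under Definition 1 (the overall phase of a quantum state and the additive constant in a potential energy) can be checked numerically. The following is only a minimal Python sketch: the two-level state, the choice of observable, the harmonic potential, and the size of the shift are all made-up illustrative values, not anything taken from a particular physical system.

```python
import cmath

def expectation(state, observable):
    """Return <state|observable|state> for a 2-component state and a 2x2 matrix."""
    total = 0j
    for i in range(2):
        for j in range(2):
            total += state[i].conjugate() * observable[i][j] * state[j]
    return total.real

def force(potential, x, h=1e-4):
    """Force F = -dV/dx, estimated with a central difference."""
    return -(potential(x + h) - potential(x - h)) / (2 * h)

# A normalised two-level state and a Hermitian observable (Pauli y).
psi = [complex(0.6, 0.0), complex(0.0, 0.8)]
sigma_y = [[0, -1j], [1j, 0]]

# Multiply the state by an arbitrary overall phase (hypothetical angle).
phase = cmath.exp(1j * 1.234)
psi_rotated = [phase * c for c in psi]

# A potential energy, and the same potential with a different choice of zero.
def v(x):
    return 0.5 * x**2

def v_shifted(x):
    return v(x) + 100.0

# The measurable quantities agree even though the descriptions differ.
print(expectation(psi, sigma_y), expectation(psi_rotated, sigma_y))
print(force(v, 1.5), force(v_shifted, 1.5))
```

In both cases the redundant degree of freedom is a single global number, which is why these examples count under Definition 1 but not under Definition 2.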

Update: I've written follow-up answers regarding whether there's any sense in which the gauge degrees of freedom can be physically measurable in the Hamiltonian case and the Lagrangian case.

$\endgroup$
4
  • 6
    $\begingroup$ Excellent answer! This is one of the best explanations (in a single place) I've come across yet!!!! :D $\endgroup$
    – user122066
    Commented Jul 9, 2016 at 17:29
  • 3
    $\begingroup$ I've asked the follow-up question on the subtleties between #3 and #4 $\endgroup$
    – user122066
    Commented Jul 9, 2016 at 17:35
  • $\begingroup$ physics.stackexchange.com/q/267175/122066 $\endgroup$
    – user122066
    Commented Jul 9, 2016 at 17:37
  • $\begingroup$ @user122066 See the update at the end of my answer for links to my follow-ups. $\endgroup$
    – tparker
    Commented Jul 14, 2016 at 4:43
25
$\begingroup$

I only understood this after taking classes in general relativity (GR), differential geometry and quantum field theory (QFT). The essence is just a change of coordinate systems that needs to be reflected in the derivative. I'll explain what I mean.

You have a theory that is invariant under some symmetry group. So in quantum electrodynamics you have a Lagrangian density for the fermions (no photons yet) $$ \mathcal L = \bar\psi(x) [\mathrm i \gamma^\mu \partial_\mu - m] \psi(x) \,.$$ Here $\bar\psi$ is just $\psi^\dagger \gamma^0$; what matters is that it is complex conjugated. The fact that $\psi$ is a four-component spinor is of no concern here. What one can do now is transform $\psi \to \exp(\mathrm i \alpha) \psi$ with some $\alpha \in \mathbb R$. Then $\bar\psi \to \bar\psi \exp(-\mathrm i \alpha)$ and the Lagrangian is invariant, because the derivative does not act on the exponential; it is just a constant phase factor. There you have a global symmetry.

Now promote the symmetry to a local one, why not? Instead of a global $\alpha$ one now has $\alpha(x)$. This means we choose a different $\alpha$ at each point in spacetime. The problem is that when we transform now, one picks up the $\partial_\mu \alpha(x)$ with the chain and product rules of differentiation. That seems like a technical complication at first.

There is a more telling way to see this:
You take a derivative of a field $\psi(x)$. This means taking a difference quotient like $$ \partial_\mu \psi(x) = \lim_{\epsilon \to 0} \frac{\psi(x + \epsilon \vec e_\mu) - \psi(x)}{\epsilon} \,.$$ This works just fine with a global transformation. But with the local transformation, you basically subtract two values that are gauged differently. In differential geometry, the tangent spaces at different points of the manifold are different, and therefore one cannot just compare vectors by their components. One needs a connection with connection coefficients to provide parallel transport. It is similar here. We have now promoted $\psi$ from living on $\mathbb R^4$ to living in the bundle $\mathbb R^4 \times S^1$, as we have a U(1) gauge group. Therefore we need some sort of connection in order to transport the transformed $\psi$ from $x + \epsilon \vec e_\mu$ to $x$. This is where one has to introduce a connection, which is $$ \partial_\mu \to \mathrm D_\mu := \partial_\mu + \mathrm i A_\mu \,.$$

If you plug that into the Lagrange density to make it $$ \mathcal L = \bar\psi(x) [\mathrm i \gamma^\mu \mathrm D_\mu - m] \psi(x)$$ and let $A_\mu$ transform as $A_\mu \to A_\mu - \partial_\mu \alpha$ along with $\psi$, you will see that the Lagrangian density does stay invariant even under local transformations, as the shift in the connection coefficient exactly cancels the unwanted term from the product/chain rule.
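The cancellation can be written out explicitly. This is a short worked check using the sign convention $\mathrm D_\mu = \partial_\mu + \mathrm i A_\mu$ from above; the primed symbols and the transformation rule assumed for $A_\mu$ are notation introduced here for illustration, and other sign conventions are common.

$$ \psi \to \psi' = e^{\mathrm i \alpha(x)} \psi \,, \qquad A_\mu \to A'_\mu = A_\mu - \partial_\mu \alpha $$

$$ \mathrm D'_\mu \psi' = \left(\partial_\mu + \mathrm i A_\mu - \mathrm i\, \partial_\mu \alpha\right) e^{\mathrm i \alpha} \psi = e^{\mathrm i \alpha} \left[\partial_\mu \psi + \mathrm i (\partial_\mu \alpha) \psi + \mathrm i A_\mu \psi - \mathrm i (\partial_\mu \alpha) \psi\right] = e^{\mathrm i \alpha}\, \mathrm D_\mu \psi $$

So $\mathrm D_\mu \psi$ picks up the same phase factor as $\psi$ itself, and that factor cancels against the $e^{-\mathrm i \alpha}$ from $\bar\psi$, exactly as in the global case.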

In general relativity you have the symmetry under arbitrary diffeomorphisms. The price is that you have to change the derivative to a covariant one built from a connection, $$ \partial \to \nabla := \partial + \Gamma + \cdots \,.$$

$\endgroup$
0
18
$\begingroup$

Since you mentioned coming from a mathematics background, you might find it nice to take an answer in terms of equivalence classes.

A gauge theory is a physical theory where the observable quantities (that is, things you could measure with an experiment given perfect measuring equipment) are equivalence classes in a vector space.

Electromagnetism is the most common example. Modern physics theories are always written as fiber bundles where the underlying manifold is spacetime and the fibers are some tangent space associated with each point (called an event) in spacetime. E&M in free space (no charges present) is described by associating a 4-component object called $A_{\mu}$ to each spacetime point, $x$, and requiring $A_{\mu}(x)$ to satisfy Maxwell's equations.

However, the observable (i.e. measurable) quantities in nature are the electric and magnetic fields, $\vec{E}(x)$ and $\vec{B}(x)$. These are derived from $A_{\mu}(x)$ using the definition given in this wiki (look at the matrix elements of $F_{\mu \nu}(x)$).

It turns out that the transformation $A_{\mu}(x) \rightarrow A_{\mu}(x) + \partial_{\mu}f(x)$ for any twice differentiable function $f(x)$ gives the same values of the observable fields $\vec{E}(x)$ and $\vec{B}(x)$. So there is an equivalence relation

$A_{\mu}(x) \approx A_{\mu}(x) + \partial_{\mu} f(x)$.

And in general, gauge theories are theories where the observable quantities are functions on equivalence classes of some vectors in a vector space. In this case our vectors were $A_{\mu}(x)$ (these are vectors in the function space of twice differentiable functions on spacetime), and our equivalence relation was given above.
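The equivalence relation above can be spot-checked numerically. Analytically, the field strength $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ changes by $(\partial_\mu \partial_\nu - \partial_\nu \partial_\mu) f$, which vanishes for twice differentiable $f$. The Python sketch below illustrates this in 1+1 dimensions with finite differences. The potential, the gauge function, the sample points and the step size are all arbitrary choices made for illustration.

```python
import math

H = 1e-4  # finite-difference step (an arbitrary choice for this sketch)

def field_strength(a_t, a_x, t, x):
    """F_{tx} = d_t A_x - d_x A_t, estimated with central differences."""
    dt_ax = (a_x(t + H, x) - a_x(t - H, x)) / (2 * H)
    dx_at = (a_t(t, x + H) - a_t(t, x - H)) / (2 * H)
    return dt_ax - dx_at

# A made-up potential A = (A_t, A_x).
def a_t(t, x):
    return x * math.cos(t)

def a_x(t, x):
    return t * x**2

# Exact partial derivatives of a made-up gauge function f(t, x) = x^2 sin(t).
def df_dt(t, x):
    return x**2 * math.cos(t)

def df_dx(t, x):
    return 2 * x * math.sin(t)

# The gauge-transformed potential A + grad f.
def a_t_new(t, x):
    return a_t(t, x) + df_dt(t, x)

def a_x_new(t, x):
    return a_x(t, x) + df_dx(t, x)

points = [(0.3, -1.2), (1.0, 0.5), (2.5, 2.0)]
differences = [
    abs(field_strength(a_t, a_x, t, x) - field_strength(a_t_new, a_x_new, t, x))
    for t, x in points
]
print(max(differences))  # small: only finite-difference and rounding error remain
```

The two potentials differ substantially at every sample point, yet they give the same field strength up to numerical error, so they belong to the same equivalence class.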

As to your final question, whether things like the total energy of a system being determined only up to an additive constant in any reference frame make Newtonian dynamics a gauge theory: the answer is no, not really. Basically, if you're not talking about a field theory, a physicist won't call the thing a gauge theory.

$\endgroup$
2
  • 1
    $\begingroup$ Nice answer, but perhaps it would be more precise to say that observables in a gauge theory are functions on a set of equivalence classes of [things like connections and bundle sections] mod gauge equivalence. The frustration of gauge theory is that we don't know of many cases where we can describe these functions except by giving functions on the connections and sections. $\endgroup$
    – user1504
    Commented Jul 10, 2016 at 1:02
  • $\begingroup$ You are right, my language is a bit sloppy. It should read something like "observables are functions on the equivalence classes of some vector space." $\endgroup$ Commented Jul 13, 2016 at 7:53
12
$\begingroup$

Gauge invariance is simply a redundancy in the description of a physical system. That is, in E&M we can choose from an infinite number of vector potentials.

For example, any two vector potentials related by the transformation below describe the same electromagnetic field:

$$A_\mu(x) \to A_\mu(x) + \partial_\mu \alpha(x)$$

Choosing a specific gauge (gauge fixing) can make solving a physical problem much easier than it would be if you did not fix a gauge.

A common choice is the Coulomb gauge: $\nabla \cdot A = 0$.
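Reaching this gauge is always possible under mild assumptions. As a sketch (the primed symbol and the boundary conditions are assumptions made here, and the sign of $\alpha$ depends on index conventions), start from any $\mathbf A$ and look for an $\alpha$ that makes the transformed potential divergence-free:

$$ \mathbf A' = \mathbf A + \nabla \alpha \,, \qquad \nabla \cdot \mathbf A' = \nabla \cdot \mathbf A + \nabla^2 \alpha = 0 \quad \Longleftrightarrow \quad \nabla^2 \alpha = -\nabla \cdot \mathbf A $$

This is a Poisson equation for $\alpha$, which has a solution for reasonably well-behaved $\mathbf A$ (for instance, potentials that fall off at infinity). The measurable fields are unchanged along the way, provided the scalar potential is transformed together with $\mathbf A$.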

It should be stressed that gauge invariance is NOT a symmetry of nature and you cannot measure anything associated with it.

Gauge invariance is most useful in quantum field theory and is crucial in proving renormalizability. Additionally S-matrix elements in QFT require a local Lagrangian and hence gauge invariance.

As an example of why we would introduce the vector potential $A^\mu$, consider the Aharonov-Bohm effect, which arises due to global topological properties of the vector potential. There are still other reasons gauge invariance makes life easy: reducing the degrees of freedom of the photon in the so-called covariant or $R_\xi$ gauge, causality, etc. Essentially the utility of gauge invariance doesn't become entirely evident until one starts trying to work through quantum field theory. :D

$\endgroup$
8
  • 3
    $\begingroup$ @user122066 For future reference, if you need to look up a symbol, see this tex.SE question. But only certain (La)TeX commands are supported in MathJax. See the MathJax documentation for a list. $\endgroup$
    – David Z
    Commented Jul 8, 2016 at 17:18
  • 2
    $\begingroup$ For all MathJax reference, check this: MathJax basic tutorial and quick reference $\endgroup$
    – user36790
    Commented Jul 8, 2016 at 17:18
  • 2
    $\begingroup$ @user122066: you wrote: "Now it is an utterly crucial property of modern physics and we may very well be lost without it!" I think you exaggerate here and this is what makes such a phrase "frightening". There is no proof that we must only work with "gauge theories". Other approaches are just unexplored. $\endgroup$ Commented Jul 8, 2016 at 19:04
  • 1
    $\begingroup$ @VladimirKalitvianski fair enough. There are recursion relations related to the S matrix that avoid gauges, but it's very hard to imagine something being discovered that makes computation easier than gauge invariance. You're absolutely right though. I'll delete this part $\endgroup$
    – user122066
    Commented Jul 8, 2016 at 19:08
  • 2
    $\begingroup$ (Also useful for TeX symbol look up - Detexify.) $\endgroup$ Commented Jul 10, 2016 at 18:47
12
$\begingroup$

These calculations very often depend only on the difference between two values, not the concrete values themselves. You are therefore free to choose a zero to your liking. Is this an example of gauge invariance in the same sense as the graduate examples above?

Yes, indeed it is, in the most general definition of gauge invariance; it's what physicists call a global gauge invariance. More on that below.

If I had to write a one sentence answer to your title, it would be this:

Gauge invariance is the well-definedness of physical law under a quotient map that condenses a configuration space / parameter space / set of co-ordinates for a physical system into a set of equivalence classes of physically equivalent configurations.

This is in the same sense that, for example, the coset product is well defined under the map that quotients away a group's normal subgroup. The physics of a configuration is independent of the choice of equivalence class member.

In its barest terms, gauge invariance is simply an assertion that there is redundancy in a mathematical description of a physical system. Otherwise put, the system has a symmetry, an invariance with respect to a group of transformations.

A global gauge symmetry is one where the configuration space is a simple Cartesian product (i.e. a trivial fiber bundle) of the set of physically distinct equivalence classes and a redundant parameter, as with your difference between two values example. If the physical description is a Lagrangian description, then this is where Noether's theorem comes to the fore and identifies conserved quantities, one for each such redundant parameter. The gauge group, i.e. group of symmetries, affects all equivalence classes (fibers) equally. Subtraction of a constant potential from an electrostatic potential is such a symmetry, and a huge advance for Corvid Civilization, as it lets crows sit on high tension powerlines and happily shoot the breeze together, discussing their latest thoughts on gauge theories, and declaring that "Nevermore!" shall we fear that the global addition of 22 kV to the electrostatic potential can change the physics of the system we belong to.

However, usually when physicists speak of a gauge theory, they mean one where the symmetry group can act in a more general way, with a different group member acting at each point on the configuration space. The corresponding fiber bundle is no longer trivial. Although you wanted a simpler example than electrodynamics, I don't think there is one. The phase added to the electron wavefunction can be any smooth function of co-ordinates, and the extra terms that arise from the Leibniz rule applied to the derivatives in the wavefunction's equation of motion (Dirac, Schrödinger) are exactly soaked up into the closed part of the EM potential one-form. Incidentally, as an aside, I always like to visualize the EM potential in Fourier space, which we can do with reasonable restrictions (e.g. a postulate that we're only going to think about tempered distributions), because the spatial part of the redundant part of the four-potential is then its component along the wavevector (i.e. thought of as a 3-vector), and only the component normal to the wavevector matters physically: it is the only part that survives $A\mapsto \mathrm{d} A = F$.
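The Fourier-space picture can be made concrete for the spatial part of the potential. The lines below are a sketch under an assumed Fourier convention, $\mathbf A(\mathbf x) = \int \mathrm d^3 k \, \tilde{\mathbf A}(\mathbf k)\, e^{\mathrm i \mathbf k \cdot \mathbf x}$, with $\lambda$ denoting the gauge function (a symbol introduced here, not used in the text above):

$$ \mathbf A(\mathbf x) \to \mathbf A(\mathbf x) + \nabla \lambda(\mathbf x) \quad \Longrightarrow \quad \tilde{\mathbf A}(\mathbf k) \to \tilde{\mathbf A}(\mathbf k) + \mathrm i\, \mathbf k\, \tilde\lambda(\mathbf k) $$

$$ \tilde{\mathbf B}(\mathbf k) = \mathrm i\, \mathbf k \times \tilde{\mathbf A}(\mathbf k) \to \mathrm i\, \mathbf k \times \tilde{\mathbf A}(\mathbf k) - (\mathbf k \times \mathbf k)\, \tilde\lambda(\mathbf k) = \tilde{\mathbf B}(\mathbf k) $$

The gauge transformation only ever shifts the component of $\tilde{\mathbf A}$ along $\mathbf k$, and the cross product discards exactly that component.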

There are two things I believe you should take from the EM example:

  1. Even though practically it leads to quite a bit of further complexity, conceptually, it is only a small jump from your simple global gauge symmetric example; we simply allow the symmetries to act locally instead of acting on all configuration space points equally;

  2. Taking a lead from the experimentally real electromagnetism, we postulate that this gauge invariance might be relevant more generally, and so we look for its presence in other physical phenomena. This is nothing more than a deed motivated by a hunch. Experimentally, we find that this is a fruitful thing to do. In physics, there is no deeper insight than experimental results.

Lastly I should mention that gauge / fiber bundle notions are also useful when we artificially declare equivalence classes of configurations grounded on the needs of our problem, even if there is a physical difference between equivalence class members. One of the loveliest examples of this way of thinking is Montgomery's "Gauge Theory of the Falling Cat". We study equivalence classes of cat configuration that are equivalent modulo proper Euclidean isometry to formulate a cat shape space, which, in the standard treatment where the cat is thought of as a two-section robot with twist-free ball-and-socket joint turns out to be the real projective plane $\mathbb{RP}^2$. The whole configuration space is then a fiber bundle with the shape space $\mathbb{RP}^2$ as base and the group $SO(3)$ defining orientations as fiber. The cat can flip whilst conserving angular momentum using cyclic deformations of its own shape owing to the curvature of the connexion that arises from the notion of parallel transport that is implied by angular momentum conservation.

$\endgroup$
11
$\begingroup$

Here's the most elementary example of a gauge symmetry I can think of.


Suppose you want to discuss some ants walking around on a Möbius band. To describe the positions of the ants, it's convenient to imagine cutting the band along its width, so it becomes a rectangle. Then you can tell me where an ant is by telling me three things:

  • Her latitude—her position along the width of the rectangle.
  • Her longitude—her position along the length of the rectangle.
  • Her orientation—whether she's clinging to the top or the bottom surface of the rectangle.

The meaning of longitude depends on the location of that imaginary cut. If you move the cut, all the ants' longitudes change. There can't be any physical reason to prefer one cut over another, because you can slide the band along its length without changing its shape or affecting the ants' behavior. In other words, there can't be any physically meaningful notion of absolute longitude, because the band has a translation symmetry.

Similarly, the meaning of orientation depends on how you label the surfaces of the rectangle as top and bottom. There can't be any physical reason to prefer one labeling over another, because you can exchange the two surfaces of the band without changing its shape or affecting the ants' behavior. That exchange is an example of a gauge symmetry. It has some striking features that aren't shared by ordinary symmetries. Let's take a look at one of them.


For every symmetry of a situation, there's some aspect of the situation that can be described in multiple ways, with no physical grounds for choosing between them. Sometimes, though, it's useful to make a choice and stick to it, even though the choice is physically meaningless. In discussions about people sailing around on the surface of the Earth, for example, pretty much everyone I know defines longitude using a cut that goes through Greenwich, London, mostly because some people who lived around there took over the world and printed a lot of nautical charts.

If we'd gone ant-watching on an ordinary cylindrical band, we could've settled on a notion of orientation just as easily. We'd paint one side of the band turquoise for "top" and the other side blue for "bottom," and that would be that. On a Möbius band, things are more complicated, because a Möbius band only has one side! If you try to paint one surface turquoise and the opposite surface blue, starting in a small region of the band and moving outward, the turquoise and blue areas will inevitably collide. (In our earlier discussion, the collision was hidden along the longitude cut.)

In a situation with an ordinary symmetry, like a translation symmetry, you can't choose between possible descriptions in a way that's physically meaningful. In a situation with a gauge symmetry, you may not even be able to choose between possible descriptions in a way that's globally consistent! You can always, however, choose consistent descriptions in small regions of space. That's why gauge symmetries are often called local symmetries.
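The clash between local and global choices can be imitated with a toy model. The Python sketch below is an assumption-laden illustration, not part of the argument above: it covers the band with a ring of overlapping patches (a hypothetical discretisation), records for each pair of neighbouring patches whether their "top" labels agree (+1) or disagree (-1), and treats swapping the labels on one patch as the gauge transformation.

```python
import random

def holonomy(transitions):
    """Product of the +1/-1 transition signs once around the band."""
    product = 1
    for sign in transitions:
        product *= sign
    return product

def relabel(transitions, patch):
    """Swap "top" and "bottom" on one patch.

    transitions[i] compares patch i with patch i+1 (cyclically), so the swap
    flips the two transitions that touch the chosen patch.
    """
    n = len(transitions)
    flipped = list(transitions)
    flipped[patch] *= -1
    flipped[(patch - 1) % n] *= -1
    return flipped

N_PATCHES = 8
cylinder = [1] * N_PATCHES               # labels agree on every overlap
moebius = [1] * (N_PATCHES - 1) + [-1]   # one mismatch, hidden at the cut

# Apply many random relabelings to the Moebius band.
rng = random.Random(0)
scrambled = moebius
for _ in range(100):
    scrambled = relabel(scrambled, rng.randrange(N_PATCHES))

print(holonomy(cylinder), holonomy(moebius), holonomy(scrambled))  # prints: 1 -1 -1
```

Each relabeling flips two signs at once, so the product around the band never changes. On the cylinder it is +1 and a globally consistent labeling exists; on the Möbius band it is -1, so at least one mismatch always survives somewhere, no matter how the patches are relabeled.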


Having attempted a long, elementary description of what a gauge symmetry is, I'd also like to offer a short, sophisticated one. In our simplest physical models, events take place on a smooth manifold called space or spacetime. An ordinary symmetry is a diffeomorphism of spacetime that preserves the physical possibility of events. In more sophisticated models, events take place on a fiber bundle over spacetime. A gauge symmetry is an automorphism of the fiber bundle which preserves the physical possibility of events.

In our elementary example, the Möbius band plays the role of space, and the ants are walking around in the band's orientation bundle. The orientation bundle has an automorphism which exchanges the two surfaces of the band.

In classical electromagnetism, Minkowski spacetime or some other Lorentzian manifold plays the role of spacetime, and the electromagnetic field is represented by a connection on a circle bundle over spacetime. In the Kaluza-Klein picture, charged particles move around in the circle bundle, flying in straight lines whose "shadows" in spacetime are the spiraling paths we see. The circle bundle has a family of automorphisms that rotate the circle fibers, which fancy people call a $\operatorname{U}(1)$ gauge symmetry. This picture generalizes to all classical Yang-Mills theories.

In the Palatini picture of general relativity, a smooth $4$-dimensional manifold plays the role of spacetime, and the gravitational field is represented by an $\operatorname{SO}(3,1)$ connection on the manifold's frame bundle. I suspect that the gauge symmetries of linearised gravity that you mentioned are automorphisms of the frame bundle.

In Einstein's picture of general relativity, the symmetries are diffeomorphisms of spacetime. I classify these as ordinary symmetries, rather than gauge symmetries. As tparker mentioned, however, not everyone uses the term "gauge symmetry" in the same way.

$\endgroup$
3
  • $\begingroup$ Wonderful! The Möbius band idea is just beautiful, and it really captures all the essence of much more complicated ideas. What I also like about it is how the flow of ideas shows how the simple seamlessly generalizes. $\endgroup$ Commented Jul 11, 2016 at 10:32
  • 1
    $\begingroup$ Hey, what's with the three votes? Dunno what's wrong with the lurkers on this site, this is the best answer to this question so far, given the OP's requirements. Anyhow, one of the votes is mine. $\endgroup$ Commented Jul 14, 2016 at 16:01
  • 2
    $\begingroup$ @WetSavannaAnimalakaRodVance, I wouldn't worry about the number of votes. If you meet someone who might benefit from this answer, you can just link them to it directly. As a reference, it works just as well at the bottom of the vote-sorted answer list as at the top. $\endgroup$
    – Vectornaut
    Commented Jul 14, 2016 at 17:20
3
$\begingroup$

There is a very interesting physical interpretation of gauge invariance in the case of $U(1)$ symmetry. Gauge symmetry is the only way to obtain a Lorentz-invariant interaction between matter (in the wide sense: fields of arbitrary spin) and photons (massless particles with helicity 1) that falls off as $\frac{1}{r^{2}}$ at large distances (this statement is nothing but Coulomb's law). Briefly, the 4-potential $A_{\mu}$, which provides the inverse-square law of EM interactions, isn't Lorentz covariant, and demanding Lorentz invariance of the interaction leads to local charge conservation.

Indeed, it can be shown from very general considerations, based on the symmetries of our space-time, that photons are represented by the antisymmetric 4-tensor $F_{\mu\nu}$, called the EM field strength tensor. It is Lorentz covariant both formally (by naive manipulation of tensor indices) and by construction (as the field which represents particles with helicity 1), i.e., under a Lorentz transformation given by the matrix $\Lambda_{\mu}^{\ \nu}$ it transforms as $$ F_{\mu\nu} \to \Lambda_{\mu}^{\ \alpha}\Lambda_{\nu}^{\ \beta}F_{\alpha\beta} $$ Next, suppose we have matter fields $\psi$ and want to discuss an interaction of matter with photons. The most obvious way to get such an interaction is to construct all possible contractions of $F_{\mu\nu}$ with matter fields and Lorentz-covariant objects (Dirac matrices, the Levi-Civita symbol, etc.). Suppose also that we know from experiment that the interaction falls off as $\frac{1}{r^{2}}$ at large distances. Unfortunately, this is impossible if we use $F_{\mu\nu}$. The formal reason is that the propagator of this field, which determines the interaction law, falls off faster than $\frac{1}{r^{2}}$. This is because of the two indices and the antisymmetry of $F_{\mu\nu}$.

We can get around this by introducing an object $A_{\mu}$ with one index, called the 4-potential: $$ F_{\mu\nu} = \partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu} $$ Interactions are now constructed by contracting $A_{\mu}$ with matter fields and other covariant objects.

Of course, we require that $A_{\mu}$ represent massless helicity 1 particles, just as $F_{\mu\nu}$ does. Unfortunately, this requirement leads to the statement that the 4-potential isn't Lorentz covariant (although formally it is, of course). Precisely, under a Lorentz transformation the field $A_{\mu}$, which is assumed to represent helicity 1 massless particles, changes as $$ \tag 1 A_{\mu} \to \Lambda_{\mu}^{\ \nu}A_{\nu} + \partial_{\mu}\varphi $$ We see that it is not Lorentz covariant. The free Lagrangian for $A_{\mu}$, which is just $$ L = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu}, $$ is nevertheless Lorentz invariant.

But there is one way to preserve the Lorentz invariance of interactions: construct them to be invariant under the transformation $A_{\mu} \to A_{\mu}+\partial_{\mu}\varphi$. Precisely, the interaction amplitude $M_{\mu_{1}...\mu_{n}}(p_{i},\epsilon_{j}(k_{j}))$ (where $\epsilon$ are the photon helicity (polarization) vectors, $p_{i}$ are the momenta of all interacting particles and $k_{j}$ are the momenta of the photons) must be invariant under the transformation $$ \tag 2 \epsilon_{\mu}(p) \to \epsilon_{\mu}(p) + \alpha p_{\mu} $$ In formal language, as can be shown by treating processes with emission of soft photons (photons with almost zero momenta), this means that there must be a conservation law for the matter couplings $g_{i}$: $$ g_{1}+g_{2}+... = \text{const} $$ This is nothing but the charge conservation law. Together with $(2)$ this is nothing but $U(1)$ gauge symmetry.
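Condition $(2)$ can be restated compactly. As a sketch, single out one photon with momentum $p$ and write the amplitude as $M = \epsilon^{\mu}(p) M_{\mu}$, suppressing all other indices and arguments (a shorthand assumed here for illustration). Invariance under $(2)$ for arbitrary $\alpha$ then reads

$$ \left(\epsilon^{\mu}(p) + \alpha p^{\mu}\right) M_{\mu} = \epsilon^{\mu}(p) M_{\mu} \quad \Longleftrightarrow \quad p^{\mu} M_{\mu} = 0 $$

which is the statement usually called the Ward identity: the part of the polarization vector proportional to the photon's own momentum decouples from every amplitude.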

So, we see that Lorentz invariance of the interactions of photons with matter through an inverse-square law leads to gauge invariance. The equivalence principle can be argued for analogously in the case of gravitons interacting with all fields.

$\endgroup$
2
$\begingroup$

Gauge theories describe the connectivity of a space with small, symmetric extra dimensions

Start with an infinite cylinder (the direct product of a line and a small circle). The cylinder can be twisted. To avoid appealing to concepts that I'm trying to explain, I'll just say that the cylinder is made of wire mesh: evenly spaced circles soldered to wires running the length of it. The long wires can rotate as a unit, introducing an angular twist between each pair of adjacent circles. It's clear that any such configuration can be continuously deformed into any other: all such cylinders are equivalent from the perspective of the proverbial ant crawling on them.

Replace the line with a closed loop, so that the product is a torus (and think of the torus as a mesh doughnut, even though varying the plane of the small circles like that technically breaks the analogy). Any portion of the doughnut short of the whole thing can be deformed into the same portion of any other doughnut, but the doughnuts as a whole sometimes can't be, because the net twist around the doughnut can't be altered. The classes of equivalent doughnuts are completely characterized by this net twist, which is inherently nonlocal.

Replace the loop (not the small circle) with a manifold of two or more dimensions. It's true, though not obvious, that the physical part of the connection is completely given by the integrated twist around all closed loops (Wilson loops).

$A$ and $F$ quantify the connectivity

In the discrete case, the connection can be described most simply by giving the twist between adjacent circles. In the continuum limit, this becomes a "twist gradient" at each circle. This is $A_\mu$, the so-called vector potential.

Any continuous deformation can be described by a scalar field $\phi$ representing the amount that each circle is twisted (relative to wherever it was before). This alters $A_\mu$ by the gradient of $\phi$, but doesn't change any physical quantity (loop integral).
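
Here is a minimal numerical sketch of that statement for the mesh doughnut, assuming a closed ring of `N` circles where `twists[n]` is the twist between circle `n` and circle `n+1`. The names `net_twist` and `deform` are made up for illustration. Rotating each circle by its own angle `phi[n]` changes every individual twist, but the sum around the ring stays put.

```python
import math
import random

def net_twist(twists):
    """Total twist accumulated going once around the closed ring."""
    return sum(twists)

def deform(twists, phi):
    """Rotate circle n by phi[n]; the twist on link n -> n+1 shifts by phi[n+1] - phi[n]."""
    n = len(twists)
    return [a + (phi[(i + 1) % n] - phi[i]) for i, a in enumerate(twists)]

random.seed(0)
N = 12
twists = [random.uniform(-0.5, 0.5) for _ in range(N)]      # arbitrary starting mesh
phi = [random.uniform(-math.pi, math.pi) for _ in range(N)]  # arbitrary deformation

new_twists = deform(twists, phi)
print(net_twist(twists), net_twist(new_twists))  # equal up to rounding
```

The shifts `phi[n+1] - phi[n]` telescope to zero around a closed ring, which is the discrete version of "the loop integral of a gradient vanishes".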

The description in terms of Wilson loops, $\oint_\gamma A \cdot \, \mathrm dx$, is more elegant because it includes only physically meaningful quantities, but it's nonlocal and highly redundant. If the space is simply connected, you can avoid the redundancy and nonlocality by specifying the twist only around differential loops, since larger loops can be built from them. The so-called field tensor, $\partial_\nu A_\mu - \partial_\mu A_\nu = F_{\mu\nu}$, gives you exactly that.
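
To illustrate "larger loops can be built from them", here is a rough sketch on a flat square mesh (a simply connected patch). The layout is my assumption: `ax[i][j]` is the twist on the link from site `(i, j)` to `(i+1, j)` and `ay[i][j]` the twist on the link from `(i, j)` to `(i, j+1)`. The twists around the unit squares play the role of $F_{\mu\nu}$, and they add up to the twist around the boundary because every interior link is traversed once in each direction.

```python
import random

def plaquette_twist(ax, ay, i, j):
    """Twist around the unit square with lower-left corner (i, j), counterclockwise."""
    return ax[i][j] + ay[i + 1][j] - ax[i][j + 1] - ay[i][j]

def boundary_twist(ax, ay, n):
    """Twist around the boundary of the n-by-n block of squares, counterclockwise."""
    bottom = sum(ax[i][0] for i in range(n))
    right = sum(ay[n][j] for j in range(n))
    top = sum(ax[i][n] for i in range(n))
    left = sum(ay[0][j] for j in range(n))
    return bottom + right - top - left

random.seed(1)
n = 4
# Random link twists: n x (n+1) horizontal links, (n+1) x n vertical links.
ax = [[random.uniform(-1, 1) for j in range(n + 1)] for i in range(n)]
ay = [[random.uniform(-1, 1) for j in range(n)] for i in range(n + 1)]

total_from_small_loops = sum(
    plaquette_twist(ax, ay, i, j) for i in range(n) for j in range(n)
)
print(total_from_small_loops, boundary_twist(ax, ay, n))  # equal up to rounding
```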

(If the space is not simply connected, you can still get away with the differential loops plus one net twist for each element of a generating set of the fundamental group. The torus was of course a simple example of this.)

The force comes from the Aharonov–Bohm effect

Consider a scalar field defined over the entire space (unlike the earlier fields, this one takes a value at each point on each circle). The field is zero everywhere except for two narrow beams which diverge from a point and reconverge somewhere else. (Maybe they're reflected by mirrors; maybe the space is positively curved; it doesn't matter.)

Unless the field is constant across the circles, the interference behavior of the beams will depend on the difference in the twist along the two paths. This difference is just the integral around the closed loop formed by the paths.

This is the (generalized) Aharonov–Bohm effect. If you restrict it to differentially differing paths and use $F_{\mu\nu}$ to calculate the effect on the interference, you get the electromagnetic force law.

You can decompose the field into Fourier components. The Fourier spectrum is discrete in the small dimension. The zeroth (constant) harmonic is not affected by the twisting. The second harmonic is affected twice as much as the first. These are the electric charges.
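
A toy version of the interference argument, under the assumption that the `q`-th harmonic of a unit-amplitude beam simply picks up a phase `q * twist` along its path (the function name and the numbers are made up for illustration):

```python
import cmath
import math

def interference_intensity(q, twist_path_1, twist_path_2):
    """Intensity where two unit-amplitude beams of harmonic q recombine."""
    beam_1 = cmath.exp(1j * q * twist_path_1)
    beam_2 = cmath.exp(1j * q * twist_path_2)
    return abs(beam_1 + beam_2) ** 2

loop_twist = 0.7  # hypothetical net twist around the closed loop, in radians

# Two different ways of splitting the same loop twist between the two paths.
i_a = interference_intensity(1, 1.0, 1.0 - loop_twist)
i_b = interference_intensity(1, -2.3, -2.3 - loop_twist)
print(i_a, i_b)  # the same: only the difference between the paths matters

# Harmonic 0 ignores the twist; harmonic 2 sees twice the phase.
i_neutral = interference_intensity(0, 1.0, 1.0 - loop_twist)
i_double = interference_intensity(2, 1.0, 1.0 - loop_twist)
```

In this toy model the intensity is $2 + 2\cos(q \cdot \text{loop twist})$, so the harmonic number `q` enters exactly the way a charge would.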

In reality, for unknown reasons, only certain extra-dimensional harmonics seem to exist. If only the first harmonic exists, there's an equivalent description of the field as a single complex amplitude+phase at each point of the large dimensions. The phase is relative to an arbitrary local zero point which is also used by the vector potential. When you compare the phase to the phase at a nearby point, and there is a vector-potential twist of $\mathrm d\theta$ between them, you need to adjust the field value by $i \, \mathrm d\theta$. This is the origin of the gauge covariant derivative.
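
A discrete sketch of that last step, assuming a ring of sites with one complex value `psi[n]` per site and a twist `twists[n]` on the link from `n` to `n+1`. The sign conventions and the names `covariant_difference` and `change_zero_points` are my own choices. Before subtracting neighbours, the value at `n+1` is rotated back by the link twist; to first order in a small twist $\mathrm d\theta$ that rotation is the $i \, \mathrm d\theta$ adjustment. The resulting difference transforms like the field itself when the local zero points are moved.

```python
import cmath
import math
import random

def covariant_difference(psi, twists, q=1):
    """D psi[n] = exp(-i q a[n]) * psi[n+1] - psi[n] on a closed ring."""
    n = len(psi)
    return [cmath.exp(-1j * q * twists[i]) * psi[(i + 1) % n] - psi[i] for i in range(n)]

def change_zero_points(psi, twists, phi, q=1):
    """Move the local zero point at site n by phi[n]; field phases and link twists both shift."""
    n = len(psi)
    new_psi = [cmath.exp(1j * q * phi[i]) * psi[i] for i in range(n)]
    new_twists = [twists[i] + phi[(i + 1) % n] - phi[i] for i in range(n)]
    return new_psi, new_twists

random.seed(2)
N = 8
psi = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
twists = [random.uniform(-0.3, 0.3) for _ in range(N)]
phi = [random.uniform(-math.pi, math.pi) for _ in range(N)]

d_old = covariant_difference(psi, twists)
new_psi, new_twists = change_zero_points(psi, twists, phi)
d_new = covariant_difference(new_psi, new_twists)
# d_new[n] equals exp(i * phi[n]) * d_old[n]: the difference rotates with the local phase.
```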

Circles generalize to other shapes

If you replace the circles with 2-spheres, you get an $\mathrm{SU}(2)$ gauge theory. It is nastier numerically: the symmetry group is noncommutative, so you have to bring in the machinery of Lie algebra. Geometrically, though, nothing much has changed. The connectivity is still described by a net twist around loops.

One unfortunate difference is that the description of charge as extra-dimensional harmonics doesn't quite work any more. Spherical harmonics give you only the integer-spin representations, and all known particles are in the spin-0 or spin-½ representations of the standard model $\mathrm{SU}(2)$, so the particles that are affected by the $\mathrm{SU}(2)$ force at all can't be described this way. There may be a way to work around this problem with a more exotic type of field.

I have nothing insightful to say about the $\mathrm{SU}(3)$ part of the Standard Model gauge group except to point out that the whole SM gauge group can be embedded in $\mathrm{Spin}(10)$, and I think it's easier to visualize a 9-sphere than a shape with $\mathrm{SU}(3)$ symmetry.

General relativity is similar

In general relativity, the Riemann curvature tensor is analogous to the field tensor; it represents the angular rotation of a vector transported around a differential loop. The Aharonov-Bohm effect is analogous to the angular deficit around a cosmic string. Kaluza-Klein theory originally referred to a specific way of getting electromagnetism from general relativity in five dimensions; now it often refers to the broad idea that the Standard Model gauge forces and general relativity are likely to be different aspects of the same thing.

$\endgroup$
1
$\begingroup$

In Classical Electrodynamics (CED), gauge invariance means that the electric and magnetic fields are independent of the particular "choice" of the potentials $\varphi$ and $\bf{A}$. The equations for the potentials depend, of course, on the particular choice of "gauge", and they give different solutions for different gauges.
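
To make this concrete (a sketch in SI units, with an arbitrary smooth gauge function that I will call $\chi$), the change of potentials is $$ \varphi \to \varphi - \frac{\partial \chi}{\partial t}, \qquad \mathbf{A} \to \mathbf{A} + \nabla\chi $$ and the fields do not notice it: $$ \mathbf{E} = -\nabla\varphi - \frac{\partial \mathbf{A}}{\partial t} \to \mathbf{E} + \nabla\frac{\partial \chi}{\partial t} - \frac{\partial}{\partial t}\nabla\chi = \mathbf{E}, \qquad \mathbf{B} = \nabla\times\mathbf{A} \to \mathbf{B} + \nabla\times\nabla\chi = \mathbf{B} $$ The equation for $\varphi$, on the other hand, does depend on the gauge. In the Coulomb gauge, $\nabla\cdot\mathbf{A} = 0$, it is $$ \nabla^{2}\varphi = -\frac{\rho}{\varepsilon_{0}} $$ while in the Lorenz gauge, $\nabla\cdot\mathbf{A} + \frac{1}{c^{2}}\frac{\partial \varphi}{\partial t} = 0$, it is $$ \nabla^{2}\varphi - \frac{1}{c^{2}}\frac{\partial^{2} \varphi}{\partial t^{2}} = -\frac{\rho}{\varepsilon_{0}} $$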

In QM and QED, gauge invariance also means "invariance" of the form of the equations (the solutions still being different, but physically equivalent).

But one should keep in mind that any helpful change of variables is equally acceptable if the corresponding results remain physically the same. For that, the form of the equations need not be "invariant" at all.

$\endgroup$
