
Now the default answer to this may be, "It has no origin because it's a definition", but let me just try to justify my concern here.

On page 842, equation (22.91) of "Modern Electrodynamics" by Andrew Zangwill (Cambridge University Press, 2013), it is written that "$[\cdots]$ **if we can define** the four-vector $A^\mu=\left(\vec A,\, i \phi/c\right)$". On Wikipedia, meanwhile, it is simply stated that "The contravariant electromagnetic four-potential **can be defined as** $A^\mu=\left(\phi, \vec A\right)$", and that page cites the text by D.J. Griffiths, "Introduction to Electrodynamics" (3rd ed., 2007). I have that book, and what he actually writes is $A^\mu=\left(V/c,A_x,A_y,A_z\right)$ on page 541, eqn. (12.131), which is fine and a subtlety I won't dwell on. But I would like to emphasize the words in bold here, which seem to suggest that this 'definition' is not unique.

Now you may wonder why all this concern about $A^\mu$, and why I don't fixate on some other four-vector instead, like the four-current (density) $j_\mu=\left(-\rho,\vec J\right)$, for instance.

The reason I'm okay with that expression is that, unlike $A^\mu$, the four-current has an origin: it comes from the sourced Maxwell equations $$\begin{pmatrix}\nabla\cdot \vec E \\\\\frac{\partial{\vec E}}{\partial t}-\nabla \times \vec B\end{pmatrix}=-\mu_0 \begin{pmatrix}-\rho\\\\\vec J\end{pmatrix}\tag{1}$$ where I identify the vector on the RHS of $(1)$ as $j_\mu=\left(-\rho,\vec J\right)$, or in contravariant form $j^\mu=\left(\rho,\vec J\right)$, obtained by raising the zeroth component with the inverse Minkowski metric; the convention I use is $\eta=\mathrm{diag}(-1,1,1,1)$. Eqn. $(1)$ is written in natural units ($c=1$).

Equation $(1)$ tells me that the 'sourced' Maxwell equations lead to a zeroth component that is a scalar, $\rho$, while the other three components combine into a 3-vector, $\vec J$. This makes sense to me, since $\nabla\cdot \vec E=\mu_0 \rho$ is a scalar equation and $\frac{\partial{\vec E}}{\partial t}-\nabla \times \vec B=-\mu_0 \vec J$ is a 3-vector equation.
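As a consistency check on this identification (my own observation, not taken from the textbooks above): applying $\partial_t$ to the first row of $(1)$ and $\nabla\cdot$ to the second and subtracting, the mixed derivatives cancel and $\nabla\cdot(\nabla\times\vec B)=0$, leaving $$0=\partial_t\left(\nabla\cdot\vec E\right)-\nabla\cdot\left(\frac{\partial \vec E}{\partial t}-\nabla\times\vec B\right)=\mu_0\left(\frac{\partial \rho}{\partial t}+\nabla\cdot\vec J\right),$$ which is just the continuity equation $\partial_\mu j^\mu=0$ for the object identified above.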


Most textbooks normally just state $A^\mu=\left(\phi, \vec A\right)$ without any explanation as to why it takes that form. By 'form' here, I mean: why does it have a scalar component, $\phi$, and a 3-vector component, $\vec A$? Why not $A^\mu=\left(\phi_1,\phi_2, A_1, A_2\right)$, for instance? Also, why is there not some other choice of sign, say $A^\mu=\left(-\phi, \vec A\right)$ or $A^\mu=\left(\phi, -\vec A\right)$ or even $A^\mu=\left(-\phi, -\vec A\right)$?

I know there may not be a simple answer to this question, but to summarize, can anyone please explain to me where the four-vector potential originates?

  • This is not an answer, but: "can" and "if" mean that we are proposing a definition, but in this case there is only one sensible definition, which is the only one that plays well with the Lorentz transformations.
    – Javier
    Commented May 2 at 22:03
  • $(\rho,\vec j)$ and $(\phi,\vec A)$ can be shown to Lorentz-transform in exactly the same way as $(t,\vec x)$ does. This is why we write them as $j^\mu$, $A^\mu$, and $x^\mu$.
    – Ghoster
    Commented May 2 at 23:00
  • It wouldn’t make any sense to use four-vector notation for them unless they did indeed transform as four-vectors when one changes to another inertial frame.
    – Ghoster
    Commented May 2 at 23:10
  • What are your $\phi_1$ and $\phi_2$ supposed to mean? There is only $\phi$. And why would it make sense to define something that involved $A_1$ and $A_2$ but not $A_3$? This would be privileging the $x$ and $y$ directions and ignoring the $z$ direction, which is unphysical.
    – Ghoster
    Commented May 2 at 23:19
  • @Ghoster That was just a random example as a question of why $(\phi,\vec A)$ can't be 'something else'; at this point $\phi_1$ and $\phi_2$ have no meaning, apart from being 'components of a four-vector', and the same goes for $A_1$ and $A_2$. What I am essentially questioning here is why we must have a zeroth component that is a scalar and then a 3-vector. Why not something else?
    – Electra
    Commented May 2 at 23:27

5 Answers


I see (at least) two questions here.

  1. Why is a four-vector such as $A^\alpha$ often written as a "scalar+vector" quantity, in this case $(\phi, \vec{A})$?
  2. Where does $A_\alpha$ come from?

For (1) it is important to realize that a four-vector is defined to transform in a particular way with respect to the group $L$ of Lorentz transformations, namely in the vector representation. That is, a four-vector $V^\alpha$ transforms under a Lorentz transformation $\Lambda \in L$ as $V^\alpha \to \Lambda^{\alpha}_{\;\;\beta}V^\beta$. Now, the group of 3-dimensional rotations is embedded inside $L$ by $$ R' = \begin{pmatrix} 1 & \vec{0}^T \\ \vec{0} & R\end{pmatrix} \in L $$ where $R \in SO(3)$. Restricted to this subgroup, the vector representation of $L$ breaks down into a scalar and a vector representation, as we can see from $$ R' \begin{pmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \end{pmatrix} = \begin{pmatrix} V_1 \\ R\begin{pmatrix} V_2 \\ V_3 \\ V_4 \end{pmatrix} \end{pmatrix}$$ so we see that $V_1$ behaves as a scalar whilst $(V_2,V_3,V_4)^T$ behaves as an $SO(3)$ vector.

Thus given a four-vector such as $A^\alpha$, it is natural, from a non-relativistic point of view, to break it down as $(\phi, \vec{A})$ where $\phi$ is an $SO(3)$ scalar and $\vec{A}$ is an $SO(3)$ vector. Of course in relativistic settings this decoupling no longer holds and one should look at the entire four-vector.
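If it helps to see this concretely, here is a small numerical sketch (my own illustration, not part of the original argument): the embedded rotation $R'$ leaves the first component of a four-vector untouched, while a generic Lorentz boost mixes it with the spatial components.

```python
# Illustration only: compare an SO(3) rotation embedded in the Lorentz group
# with a boost.  The rotation leaves the "scalar" slot V_1 alone; the boost
# does not, which is why the scalar+vector split is purely non-relativistic.
import numpy as np

theta = 0.3                                   # rotation angle about the z-axis
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
R4 = np.block([[np.eye(1),        np.zeros((1, 3))],
               [np.zeros((3, 1)), R              ]])   # the R' of the text

eta = 0.5                                     # rapidity of a boost along x (c = 1)
boost = np.array([[np.cosh(eta), np.sinh(eta), 0.0, 0.0],
                  [np.sinh(eta), np.cosh(eta), 0.0, 0.0],
                  [0.0,          0.0,          1.0, 0.0],
                  [0.0,          0.0,          0.0, 1.0]])

V = np.array([1.0, 2.0, 3.0, 4.0])            # an arbitrary four-vector (V_1, ..., V_4)
print(R4 @ V)      # first entry still 1.0: V_1 behaves as an SO(3) scalar
print(boost @ V)   # first entry changes: a boost mixes V_1 with the spatial part
```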

This is very much akin to the physics of a particle orbiting within a plane, where $SO(3)$ symmetry is broken to $SO(2)$ in an analogous way.

For (2), the most concise way to answer this question is to first rewrite the Maxwell equations using differential forms. Here one realises that the electric field $\vec{E}$ is truly an $SO(3)$ vector, whereas the magnetic field $\vec{B}$ is not: by examining for example the force law $\vec{F} = q(\vec{E} + \vec{v}\times \vec{B})$ we see that $\vec{B}$ is a 'pseudovector': it picks up an extra sign ($\det R$ in the above notation) under improper rotations. In fact, in other dimensions the cross product does not even exist; the dimension-independent way to express $\vec{B}$ is as (some of) the components of a two-form. From this point on I assume some knowledge of forms; you can try to read ahead otherwise to get the general idea, but it is frankly not difficult to learn the basics of differential forms, which should ideally be (albeit seldom is) taught in undergraduate courses.

In this language one can define a two-form $F$ whose components are $F_{0i} = E_i$ and $F_{ij} = \epsilon_{ijk}B_k$. For very similar reasons to the above, this decomposition is consistent with $\vec{E}$ being an $SO(3)$ vector and $\vec{B}$ an $SO(3)$ pseudovector in the non-relativistic picture. With this notation, Maxwell's equations become very concise: $$ \begin{aligned} d\star F &= \star J \\ dF &= 0 \end{aligned} $$ where $J$ is the source current (expressed as a one-form), $d$ is the exterior derivative (a generalization of grad, div and curl) and $\star$ is the Hodge star. Since the current $J$ transforms as a four-vector, the first of the Maxwell equations tells us that $F$ transforms as a (0,2)-tensor under Lorentz transformations [see Weinberg's book Gravitation... I think for details].

Poincaré's lemma (a generalization of the statements that curl-free fields are locally gradients and divergence-free fields are locally curls) tells us that, since $dF = 0$, locally we can write $F = dA$ for some one-form $A$. From the transformation properties of $F$ we see that the components of $A$ form a Lorentz four-vector, and so it is appropriate to write them as $(\phi, \vec{A})$ for some $SO(3)$ scalar $\phi$ and $SO(3)$ vector $\vec{A}$. Which signs we give them is immaterial, since one can always redefine $\phi' = -\phi$ etc.
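For what it's worth, the component version of $F = dA$ can be checked mechanically. The sketch below is my own and assumes one particular set of conventions ($c = 1$, metric $\mathrm{diag}(-1,1,1,1)$, $A_\mu = (-\phi, \vec A)$), in which $F_{0i} = -E_i$ and $F_{ij} = \epsilon_{ijk}B_k$ with $\vec E = -\vec\nabla\phi - \partial_t\vec A$ and $\vec B = \vec\nabla\times\vec A$:

```python
# Symbolic check (illustration only) that F_munu = d_mu A_nu - d_nu A_mu
# reproduces E and B built from the potentials, in the conventions stated above.
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
X = (t, x, y, z)
phi = sp.Function('phi')(*X)
A = [sp.Function(f'A{i}')(*X) for i in 'xyz']

A_low = [-phi, *A]                                  # covariant components A_mu
F = sp.Matrix(4, 4, lambda m, n: sp.diff(A_low[n], X[m]) - sp.diff(A_low[m], X[n]))

E = [-sp.diff(phi, xi) - sp.diff(Ai, t) for xi, Ai in zip((x, y, z), A)]
Bz = sp.diff(A[1], x) - sp.diff(A[0], y)            # (curl A)_z

print(sp.simplify(F[0, 1] + E[0]))                  # 0, i.e. F_{0x} = -E_x
print(sp.simplify(F[1, 2] - Bz))                    # 0, i.e. F_{xy} = B_z
```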

By the way, this leads us to one of the most profound parts of physics. We have found a configuration space made up of (local patches of) a one-form $A$ which makes Maxwell's equations particularly tidy. However, this comes at the expense of gauge redundancy: different choices of $A$ lead to the same physical field $F$. This is one of the most important insights of 20th-century physics.

---

The four-vector ${\bf A}$ with components $A^{\mu}$ arises from the expressions for the electric field $\vec{\bf E}$ and the magnetic induction field $\vec{\bf B}$, which are three-dimensional. The spatial part of the four-vector ${\bf A}$ is obtained from the definition of the vector potential,

$$\vec{\bf B} = \vec{{\bf \nabla}}{\bf \times}\vec{\bf A}\,,$$

whilst the temporal component comes from the expression for the electric field

$$ \vec{\bf E} = -\vec{\bf\nabla}\Phi - \frac{1}{c}\frac{\partial \vec{\bf A}}{\partial t}\,, $$ where $\Phi$ is the scalar potential, $\vec{\bf A}$ is the vector potential, and I am writing $\vec{\bf E}$ and $\vec{\bf B}$ in Gaussian units.

Now, the purpose of defining the four-vector ${\bf A}$ is to write Maxwell's equations in a covariant manner. This is done by defining an anti-symmetric tensor of rank 2, commonly denoted $F^{\mu\nu}$ when the indices are raised. The tensor is given by

$$ F^{\mu\nu} = \partial^{\mu}A^{\nu} - \partial^{\nu}A^{\mu} $$

or, with the indices lowered

$$ \eta_{\mu\alpha}\eta_{\nu\beta}F^{\alpha\beta} = F_{\mu\nu} = \partial_{\mu}A_{\nu} - \partial_{\nu}A_{\mu}\,, $$

where $\eta_{\mu\alpha}$ is the metric for the space-time.

To the best of my knowledge, how $F_{\mu\nu}$ is defined depends on how the metric for the Minkowski space-time is expressed. One can write the metric in a few different ways in Cartesian co-ordinates; at least four are in common use:

$$ \eta_{\mu\alpha} = \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} $$

or $$ \eta_{\mu\alpha} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1 \end{pmatrix} $$

or

$$ \eta_{\mu\alpha} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & -1 \end{pmatrix} $$

and finally

$$ \eta_{\mu\alpha} = \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}\,. $$

And then there is the role played by the choice of co-ordinates (whether to include $c$ or to set $c$ equal to 1 when defining the temporal co-ordinate) and by the system of units employed, i.e., whether one uses S.I. units, Gaussian (c.g.s.) units, or Heaviside-Lorentz units.

So, in conclusion, the definition of the four-vector ${\bf A}$ having components $A^{\mu}$ is not unique.
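To make this convention dependence concrete, here is a trivial numerical sketch of my own (arbitrary numbers, temporal component written as $\phi/c$): the same contravariant components acquire different covariant components depending on which signature is used to lower the index.

```python
# Illustration only: lowering the index of A^mu = (phi/c, Ax, Ay, Az)
# with two different choices of Minkowski metric signature.
import numpy as np

A_up = np.array([1.0, 0.2, 0.3, 0.4])            # (phi/c, Ax, Ay, Az), arbitrary values

eta_mostly_plus  = np.diag([-1.0,  1.0,  1.0,  1.0])
eta_mostly_minus = np.diag([ 1.0, -1.0, -1.0, -1.0])

print(eta_mostly_plus  @ A_up)    # A_mu = (-phi/c,  Ax,  Ay,  Az)
print(eta_mostly_minus @ A_up)    # A_mu = ( phi/c, -Ax, -Ay, -Az)
```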

---

The only reason one would ever define something like $A^{\mu}$ is an effort to pull electrodynamics into a four-dimensional framework. In order to do that, one can notice that you have the Maxwell tensor:

$$F_{ab} = \begin{pmatrix} 0 & E_{x} & E_{y} & E_{z}\\ -E_{x} & 0 & B_{z} & -B_{y}\\ -E_{y} & -B_{z} & 0 & B_{x}\\ -E_{z} & B_{y} & -B_{x} & 0 \end{pmatrix}$$

which has the nice property that the Lorentz force (density) law is given by:

$$f^{a} = F^{a}{}_{b}\,j^{b}$$

and for which the covariant Maxwell equations become:

$$\nabla^{a}F_{ab} = j_{b}\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \nabla_{c}\left(\epsilon^{abcd}F_{ab}\right) = 0$$

So, this is a pretty and elegant way of writing these equations in terms of the fields, but what if you want to use the potentials? Well, remembering that ${\vec E} = -{\vec \nabla}\phi - \frac{\partial \vec A}{\partial t}$ and ${\vec B} = {\vec \nabla}\times {\vec A}$ define your three-dimensional potentials (in units with $c=1$), it works out that if you package $\phi$ and $\vec A$ into the four-vector $A_{a}$ as above, then $F_{ab} = \partial_{a}A_{b} - \partial_{b}A_{a}$, and so you can get rid of all references to three-dimensional things entirely; the only first-class entity you have to care about is $A_{a}$. The reason the other suggestions you make for creating an $A$ don't work is that they don't reproduce the ordinary Maxwell equations.
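As a side check of why the potentials can be packaged this way at all (my own sketch, in units with $c = 1$): once $\vec E$ and $\vec B$ are written in terms of $\phi$ and $\vec A$, the two source-free Maxwell equations hold identically, which is the three-dimensional shadow of the fact that $\nabla_{c}\left(\epsilon^{abcd}F_{ab}\right) = 0$ is automatic once $F_{ab} = \partial_{a}A_{b} - \partial_{b}A_{a}$.

```python
# Symbolic check (illustration only): with E = -grad(phi) - dA/dt and
# B = curl(A), div(B) = 0 and curl(E) + dB/dt = 0 hold identically.
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
phi = sp.Function('phi')(t, x, y, z)
A = sp.Matrix([sp.Function(f'A{i}')(t, x, y, z) for i in 'xyz'])

def grad(f):
    return sp.Matrix([sp.diff(f, v) for v in (x, y, z)])

def div(V):
    return sp.diff(V[0], x) + sp.diff(V[1], y) + sp.diff(V[2], z)

def curl(V):
    return sp.Matrix([sp.diff(V[2], y) - sp.diff(V[1], z),
                      sp.diff(V[0], z) - sp.diff(V[2], x),
                      sp.diff(V[1], x) - sp.diff(V[0], y)])

E = -grad(phi) - sp.diff(A, t)
B = curl(A)

print(sp.simplify(div(B)))                       # -> 0
print(sp.simplify(curl(E) + sp.diff(B, t)))      # -> zero vector
```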

---

It is just a historical name: $\phi$ is not a scalar, it is the zeroth component of a four-vector, so it does change if you make a boost. It looks like a scalar only if you restrict the transformation to a 3D rotation, which obviously will not affect the zeroth component. Every four-vector has this property.

---

It comes from the gauge covariant derivative in field theory.

