23
$\begingroup$

I've read the various threads on this site that talk about it being impossible for photons (or massless particles in general, really) to have a rest frame, and the answers all seem to boil down to "the existence of such a reference frame would lead to all sorts of absurd results, such as photons having zero energy in that reference frame". And OK, I can see why, logically, if defining a certain mathematical "object" leads to absurd results, the reasonable response is to say there is no such object. But I don't find any of these explanations intellectually satisfying, because they don't really explain why there is no rest frame for photons so much as they give reasons we know there isn't one.

If I understand correctly, one way to think about why special relativity gives such counterintuitive results is that it works with a hyperbolic geometry, whereas our everyday experience corresponds to how things would work in a Euclidean geometry. Indeed, it wasn't until I learned parts of SR through a geometric lens that it actually made intuitive sense why $c$ is invariant and the maximum possible speed, why length contraction and time dilation are things, etc. So I was wondering, is there also an explanation for why we can't have a reference frame for photons in terms of the geometry?

Or, if there's no good geometric explanation, is there an algebraic explanation that's more akin to a direct proof (it doesn't have to be an actual mathematical proof)? All the other explanations seem akin to proofs by contradiction -- they assume the existence of such a reference frame and show that it leads to results that contradict other things we generally assume to be true.

$\endgroup$
7
  • 1
    $\begingroup$ It's impossible to answer 'why' because it is a direct result of a postulate, like the 2nd one here: en.wikipedia.org/wiki/Postulates_of_special_relativity $\endgroup$
    – Avantgarde
    Commented Jul 30, 2023 at 23:56
  • 10
    $\begingroup$ Not sure if this constitutes an answer, but one way to think about it is that the rest frame, by definition, is the frame in which you have zero velocity. However, the speed of light must be the same in all frames and therefore a rest frame for light can't exist. $\endgroup$ Commented Jul 31, 2023 at 0:12
  • 1
    $\begingroup$ My answer to "A photon travels in a vacuum from A to B to C. From the point of view of the photon, are A, B, and C at the same location in space and time?" might help, though I am not sure it is what you are after. It explains why you can't get to a reference frame of light by an infinite series of boosts. $\endgroup$
    – mmesser314
    Commented Jul 31, 2023 at 1:37
  • 1
    $\begingroup$ Honestly, if you want physics to have intellectually satisfying explanations for phenomena, the only way to get there reliably is to do the math (with clearly and comprehensibly defined variables and operators) until it starts to form patterns that your brain can anticipate. $\endgroup$
    – g s
    Commented Jul 31, 2023 at 4:39
  • 2
    $\begingroup$ I consider a photon to be a continuous spear going from origin to annihilation. It doesn't move, it just IS. $\endgroup$
    – Stian
    Commented Jul 31, 2023 at 13:38

7 Answers

23
$\begingroup$

If you're willing to start from the Lorentzian geometry of spacetime, then the answer is nearly trivial. In special relativity, the (instantaneous) rest frame of a particle is a coordinate system in which the Minkowski metric takes the form $\mathrm{diag}(-1,1,1,1)$ and the particle is (instantaneously) at rest at the origin.

In such a frame, the 4-velocity takes the form $$u(\tau) = c\pmatrix{1\\ 0 \\0 \\0}$$ and is therefore timelike. But the worldline of a photon is lightlike, not timelike.


More concretely, we might consider a 1D example. Begin from a reference frame $(x,t)$ in which the Minkowski metric takes the form $\mathrm ds^2 = -c^2\mathrm dt^2 + \mathrm dx^2$ and a photon moves with speed $c$ in the $+\hat x$ direction. We may define new coordinates $$\matrix{X(x,t) = x-ct\\T(x,t)=t} \iff \matrix{x(X,T) = X + cT\\t(X,T) = T}$$ In this new coordinate system, the trajectory of the photon takes the form $X = 0$, and so the photon is "at rest". However, the Minkowski metric takes the form $$\mathrm ds^2 = - c^2\mathrm dT^2 + (\mathrm dX + c\mathrm dT)^2 = \mathrm dX^2 + 2c\mathrm dX \mathrm dT$$ which is not the correct form. There is no way around the fact that if the trajectory of the photon lies along the "$T$-axis", then $T$ must be a lightlike, not timelike, coordinate and the Minkowski metric cannot take the form we require.
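The change of coordinates above can be checked numerically. The following is a minimal sketch, assuming units with $c = 1$ by default; the function names are made up for illustration. It compares the interval computed in $(x,t)$ with the expression $\mathrm dX^2 + 2c\,\mathrm dX\,\mathrm dT$, and confirms that a displacement along the "$T$-axis" ($\mathrm dX = 0$) has zero interval, i.e. is lightlike.

```python
import math

def interval_xt(dx, dt, c=1.0):
    """Interval ds^2 = -c^2 dt^2 + dx^2 in the original (x, t) coordinates."""
    return -(c * dt) ** 2 + dx ** 2

def to_XT(dx, dt, c=1.0):
    """Displacement expressed in the coordinates X = x - c t, T = t."""
    return dx - c * dt, dt

def interval_XT(dX, dT, c=1.0):
    """The same interval written in (X, T): ds^2 = dX^2 + 2 c dX dT."""
    return dX ** 2 + 2 * c * dX * dT

# An arbitrary displacement gives the same interval in both coordinate systems.
dx, dt = 0.3, 0.7
same = math.isclose(interval_xt(dx, dt), interval_XT(*to_XT(dx, dt)))
print("intervals agree:", same)

# A step along the T-axis (dX = 0) has zero interval, so T is a lightlike direction.
print("interval along T-axis:", interval_XT(0.0, 1.0))
```

The second printout is the point of the answer: no rescaling of $T$ can make that direction timelike.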


We require the metric to be in this form because we want to associate some physical meaning to our coordinates. Roughly speaking, when the metric takes the usual form, the coordinates mean what they usually mean - that is, a coordinate system built from clocks and rods as per the usual Einstein construction.

If we choose a different coordinate system in which the metric takes a different form, we have to be very careful with how we interpret what the coordinates actually mean. The coordinate system I define above is perfectly legitimate, but because the metric takes a strange form, we must be careful with what $T$ and $X$ actually represent. They cannot be understood in terms of a clocks-and-rods coordinate system, which is reflected in the non-standard form of the metric, and a similar issue will plague any coordinate system in which a photon is "at rest" because the $T$ coordinate will necessarily be lightlike rather than timelike.

$\endgroup$
21
$\begingroup$

As you rotate a vector in Euclidean $n$-space, it will trace out an $(n-1)$-sphere (a circle for $n=2$, an "ordinary" sphere for $n=3$, or a higher-dimensional analog). A sphere contains one point in every direction. Thus, in Euclidean space, any vector can be rotated to point in any direction.

As you rotate a vector in Minkowski spacetime, it will instead trace out a hyperbola/hyperboloid. A hyperbola does not contain one point in every direction. There are some directions that simply cannot be reached.

[Figure: drawing of a 2D hyperbola, with two points on it and one point off it. The two points on the hyperbola can be smoothly hyperbolically rotated into each other. The point off the hyperbola is forbidden.]

This disconnectedness under rotation is a new thing in Lorentzian (mixed signature) geometry. It isn't even present in hyperbolic geometry proper, as the hyperbolic plane itself is a proper Riemannian manifold, and is locally Euclidean. In Minkowski geometry, unlike Euclidean geometry, not all directions are equivalent. They split into three (or five) groups that can never rotate into each other: (future-directed and past-directed) timelike, (future-directed and past-directed) lightlike, and spacelike.

To connect back to physics, the rotations in Minkowski spacetime look to us as combinations of spatial rotations and boosts (changes in velocity). Any observer has a trajectory through spacetime, and the tangent vector to that trajectory at any point is the velocity. Say that, initially, the velocity is timelike. Then we can normalize it so $u_\mu u^\mu=1.$ If the observer accelerates, the normalized velocity vector $u^\mu$ may rotate along the hyperbola $u_\mu u^\mu=1.$ But, from the diagram, you can see that the velocity can never be rotated to point in a lightlike or spacelike direction.
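To make the hyperbola picture concrete, here is a small sketch in 1+1 dimensions. It assumes units with $c = 1$ and the signature in which timelike vectors have positive norm (matching $u_\mu u^\mu = 1$ above), and it parametrizes boosts by the rapidity, which is an illustrative choice rather than something this answer relies on. However large the rapidity, the velocity stays on the hyperbola and the coordinate speed stays below 1.

```python
import math

def boosted_velocity(rapidity):
    """Unit timelike 2-velocity (u^t, u^x) after a boost by `rapidity`, with c = 1."""
    return math.cosh(rapidity), math.sinh(rapidity)

def minkowski_norm(u):
    """u_mu u^mu with signature (+, -)."""
    ut, ux = u
    return ut * ut - ux * ux

def coordinate_speed(u):
    """Ordinary velocity dx/dt = u^x / u^t, which equals tanh(rapidity)."""
    ut, ux = u
    return ux / ut

for phi in (0.0, 1.0, 3.0, 10.0):
    u = boosted_velocity(phi)
    print(f"rapidity={phi:5.1f}  norm={minkowski_norm(u):.6f}  speed={coordinate_speed(u):.12f}")
```

The lightlike direction corresponds to speed exactly 1, which would need infinite rapidity; no finite hyperbolic rotation gets there.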

We expect to be able to construct a reference frame for objects on trajectories whose velocities are timelike, as our trajectories are timelike. We expect our formulas to work for all timelike observers, because we can always imagine boosting an existing observer to any (future-directed) timelike velocity. But lightlike and spacelike velocities are geometrically distinct from (and inaccessible from) timelike velocities. They have significantly different properties and there's no reason to expect we can construct a reference frame around the trajectory of e.g. a photon that would resemble the experience that we're used to.

As the other answers note, you can formalize these intuitions algebraically in several ways.

$\endgroup$
8
$\begingroup$

Let's restrict ourselves to:

  • A 1+1 dimensional spacetime,
  • Lorentz boosts (as opposed to other coordinate transformations).

The reason to restrict ourselves to Lorentz boosts is essentially so that we stay within the mathematical framework of special relativity, without getting into tools that require general relativity. (J. Murray's answer nicely interprets this question using a general relativistic framework by discussing how the spacetime metric transforms.) Physically, Lorentz boosts are the transformations that relate inertial observers (in the absence of gravity), so this assumption boils down to assuming we only want to consider inertial observers and we are neglecting any gravitational effects.

The trajectory of a photon can be specified by giving $x$ as a function of $t$ (in general you'd want to give both $x$ and $t$ in terms of a parameter $\lambda$, but there's no need to make the formalism that complicated in this example). An example trajectory would be $$ x = x_0 + ct $$ for some constant $x_0$. The 3-velocity is, of course, $c$: $$ \frac{dx}{dt} = c $$

Now we perform a Lorentz transformation to move from the $(x, t)$ frame to the $(x', t')$ frame, related by a boost with velocity parameter $v$: \begin{eqnarray} t' &=& \gamma \left(t - \frac{x v}{c^2}\right) \\ x' &=& \gamma\left(x - v t\right) \end{eqnarray} What happens to our trajectory? Inverting the above transformation and plugging it into the trajectory yields \begin{eqnarray} x &=& x_0 + ct \\ \gamma\left(x' + v t'\right) &=& x_0 + c \gamma \left(t' + \frac{x' v}{c^2} \right) \\ \gamma x' \left(1 - \frac{v}{c}\right) &=& x_0 + c \gamma t' \left(1 - \frac{v}{c}\right) \\ x' &=& \frac{x_0}{\gamma\left(1 - \frac{v}{c}\right)} + c t' \end{eqnarray} In other words, the initial position has transformed, but the 3-velocity $dx'/dt'$ is still $c$.
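As a quick numerical cross-check of this result, the sketch below boosts two events on the photon's worldline and measures the slope $dx'/dt'$ in the new frame. It assumes units with $c = 1$ by default, and the helper names and the sample value of `x0` are arbitrary choices for illustration.

```python
import math

def gamma(v, c=1.0):
    """Lorentz factor for a boost with |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def boost(x, t, v, c=1.0):
    """Coordinates (x', t') of the event (x, t) in the frame moving with velocity v."""
    g = gamma(v, c)
    return g * (x - v * t), g * (t - x * v / c ** 2)

def photon_speed_in_boosted_frame(v, x0=2.0, c=1.0):
    """Slope dx'/dt' of the worldline x = x0 + c t as seen from the boosted frame."""
    xa, ta = boost(x0, 0.0, v, c)            # event on the worldline at t = 0
    xb, tb = boost(x0 + c * 1.0, 1.0, v, c)  # event on the worldline at t = 1
    return (xb - xa) / (tb - ta)

for v in (-0.5, 0.0, 0.5, 0.9, 0.99):
    print(f"v = {v:5.2f}  ->  photon speed in boosted frame = {photon_speed_in_boosted_frame(v):.12f}")
```

For every allowed boost the slope comes out as $c$ up to rounding, so no boost with $|v| < c$ brings the photon any closer to rest.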

Now, you may notice that the above derivation involves canceling factors of $1-v/c$. What happens if $v=c$? Then $1-v/c=0$, and it is not valid to divide by zero.

This is actually the question you're interested in; a Lorentz transformation to the rest frame of a photon would involve using a boost parameter equal to $c$.

In fact, the issue with taking $v=c$ already arises in the first line. The factor $\gamma=1/\sqrt{1-v^2/c^2}$ becomes infinite when $v=c$, and therefore the Lorentz transformation itself is ill-defined. (Incidentally, for $v>c$, $\gamma$ becomes imaginary, which also means the Lorentz transformations don't make physical sense for $v>c$!)
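The behaviour of $\gamma$ is easy to see numerically. This is only an illustrative sketch, assuming units with $c = 1$ by default; `describe` is a made-up helper that reports what happens when you try to evaluate the Lorentz factor.

```python
import math

def lorentz_factor(v, c=1.0):
    """gamma = 1 / sqrt(1 - v^2 / c^2); only defined for |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def describe(v, c=1.0):
    """Report gamma for velocity v, or why it cannot be computed."""
    try:
        return f"gamma = {lorentz_factor(v, c):.3f}"
    except ZeroDivisionError:
        # v == c: the denominator sqrt(1 - v^2/c^2) is exactly zero.
        return "undefined (division by zero)"
    except ValueError:
        # v > c: math.sqrt refuses a negative argument (gamma would be imaginary).
        return "undefined (square root of a negative number)"

for v in (0.6, 0.9, 0.999, 0.999999, 1.0, 2.0):
    print(f"v/c = {v}: {describe(v)}")
```

The values grow without bound as $v \to c$, and at $v = c$ there is simply no number to put into the transformation.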

A more sophisticated way to interpret this geometrically is that the inner product of two 4-vectors remains invariant under a Lorentz transformation. In particular, the sign of the inner product remains invariant. This allows us to divide separations between points in spacetime into timelike separated regions (I usually work in a convention where these have a positive norm), spacelike separated regions (which have a negative norm in my convention), and null separated regions (which have zero norm). A Lorentz transformation cannot change the sign of the inner product, so it cannot change a timelike vector into a spacelike vector, or a null vector into a timelike vector. Photon trajectories are null paths, and therefore remain null after a Lorentz transformation.
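A rough way to see this invariance in action is sketched below for 1+1 dimensions. It assumes units with $c = 1$ by default and the convention just described (timelike separations have positive norm); the function names and the tolerance `tol` are illustrative choices.

```python
import math

def interval(dt, dx, c=1.0):
    """Squared interval c^2 dt^2 - dx^2 (timelike > 0, spacelike < 0, null = 0)."""
    return (c * dt) ** 2 - dx ** 2

def classify(dt, dx, c=1.0, tol=1e-12):
    """Label a separation as 'timelike', 'spacelike' or 'null'."""
    s2 = interval(dt, dx, c)
    if abs(s2) <= tol:
        return "null"
    return "timelike" if s2 > 0 else "spacelike"

def boost(dt, dx, v, c=1.0):
    """Components (dt', dx') of the separation in a frame moving with velocity v."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (dt - v * dx / c ** 2), g * (dx - v * dt)

separations = {"timelike": (2.0, 1.0), "null": (1.0, 1.0), "spacelike": (1.0, 2.0)}
for label, (dt, dx) in separations.items():
    after = {classify(*boost(dt, dx, v)) for v in (-0.9, 0.0, 0.5, 0.99)}
    print(f"{label:9s} stays {after}")
```

Each class maps only to itself under every boost tried, which is the statement that a null photon path cannot be boosted into a timelike one.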

Finally, as stated in the comments, while this answer gives a few different perspectives as to why you cannot construct a photon rest frame within the framework of special relativity, in fact special relativity itself is built by assuming the speed of light is the same in all inertial reference frames. So logically speaking, these arguments should be considered as consistency checks, but not really a "proof".

$\endgroup$
6
$\begingroup$

OK, so I posted the question, but I just found this YouTube video by PhysicsNextBook that gives what I find to be a very satisfying geometric explanation, so I thought I'd explain based on the video for anyone else looking for a similar explanation.

The video explains that the definition of a reference frame in special relativity requires that there be at least two distinct points in time, because otherwise there's no way to make measurements. In other words, as J. Murray pointed out in his answer, the observer's worldline needs to be timelike in order to get a physically meaningful coordinate system. But a photon's worldline is lightlike, which just means the spacetime interval between any two of its points is zero, and hence no two points on the worldline are separated by any proper time. That means there's no way to make measurements in such a coordinate system. Hence, such a coordinate system isn't physically meaningful and cannot be considered a reference frame because it contradicts the definition.

$\endgroup$
5
$\begingroup$

Here's an old comment of mine from
https://www.physicsforums.com/threads/photons-perspective-of-time.107741/#post-899778

I've reproduced most of it here:


First, all terms (e.g. "reference frame", "photonic perspective", "simultaneously", etc...) need to be defined precisely. This is one important role of mathematics (namely, a mathematical model of the physics one wishes to describe).

Here are some "reasonable" properties of a "reference frame" of a massive particle in SR (whose worldline has an everywhere timelike tangent vector). [I am going to emphasize the geometric structures to avoid dealing with and trying to interpret certain algebraic formulas that break down when [probably inappropriately] applied to a massless particle.]

  • If A and B are distinct events on that worldline, either A is in the causal past of B (so that A can influence B), or vice versa.
  • Its Minkowski-arc-length along the worldline is nonzero and can be associated with a clock carried by the particle... this clock marks the particle's "proper time".
  • The hyperplane Minkowski-orthogonal to that tangent vector does not contain that tangent vector... and can be called the particle's "space at an instant" (a "moment of time").
  • The events on this hyperplane are regarded to be "simultaneous" according to this particle since:
    • these events are assigned the same time coordinate as read off by the particle's wristwatch/proper-time (e.g., by a radar method [at least for nearby events]: send off a light signal at wristwatch time t1, receive its echo off the distant event at wristwatch time t2, assign to that distant event the time-coordinate (t1+t2)/2)
    • these events are spacelike-related (and therefore not causally-related) to each other
  • For an inertial massive particle in SR, the entire Minkowski spacetime is foliated by these hyperplanes... meaning that the entire spacetime is sliced into nonintersecting hyperplanes, so that every event in spacetime is assigned a (but certainly not all the same) time-coordinate.

I think the list above seems reasonable.

What are the analogous statements for a photon (a massless particle) in SR (whose worldline has an everywhere lightlike [a.k.a. null] tangent vector)?

  • (Does "time stop" for a photon?)
    If A and B are distinct events on that worldline, either A is in the causal past of B (so that A can influence B), or vice versa. [still TRUE... so (to me) it DOES NOT make sense to say that "time stops" or that all of its events occur "simultaneously" since there is certainly a sense of causal-sequence of the photon's events.]
  • (Does it make sense to define "proper time" for the photon?)
    Its Minkowski-arc-length along the worldline is ZERO. So, there may be a problem here. Maybe one can define it...but is it useful? Don't ignore the previous point.
  • (Does it make sense to call this hyperplane "space" for the photon?)
    The hyperplane Minkowski-orthogonal to that tangent vector DOES contain that tangent vector [joining two causally-related events]... this is a feature of the Minkowskian geometry of SR.
  • The events on this hyperplane are NOT all spacelike-related to each other... There are events on this plane (namely along the tangent vector) that are causally-related to each other.
  • For an inertial massless particle in SR, the entire Minkowski spacetime is NOT foliated by these hyperplanes. So, there are many events that are not assigned a full set of coordinates. [One might argue here that one really needs a congruence of worldlines.]
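The contrast between the two lists can be illustrated with a few inner products. This is a minimal sketch, assuming units with $c = 1$ and the signature $(-,+,+,+)$ (the comment itself does not fix a convention); the specific vectors are arbitrary examples.

```python
def minkowski_dot(a, b):
    """Inner product of (t, x, y, z) vectors with signature (-, +, +, +), c = 1."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def lies_in_own_orthogonal_hyperplane(k):
    """True if k is Minkowski-orthogonal to itself, i.e. k lies in the hyperplane orthogonal to k."""
    return minkowski_dot(k, k) == 0.0

timelike_tangent = (1.0, 0.0, 0.0, 0.0)  # massive particle at rest
null_tangent = (1.0, 1.0, 0.0, 0.0)      # photon moving along +x
transverse = (0.0, 0.0, 1.0, 0.0)        # a spacelike direction orthogonal to both

print("timelike tangent in its own 'space':", lies_in_own_orthogonal_hyperplane(timelike_tangent))
print("null tangent in its own 'space':    ", lies_in_own_orthogonal_hyperplane(null_tangent))
print("transverse . null tangent =", minkowski_dot(transverse, null_tangent))
```

For the timelike tangent the orthogonal hyperplane is purely spatial, while for the null tangent it contains the photon's own direction of travel, so it cannot serve as a "space at an instant".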

So, it seems to me that there are some problems trying to define a reference frame for a photon.


$\endgroup$
1
$\begingroup$

A photon's worldline is orthogonal only to itself, and therefore can't be a component of an orthonormal frame.

$\endgroup$
-1
$\begingroup$

In a right triangle (x,y,r), we have : $r^{2}=x^{2}+y^{2}\;\;\;\;(1)$

$$x^{2}=r^{2}-y^{2}=(r-y)(r+y)=r^{2}\left(1-\frac{y}{r}\right)\left(1+\frac{y}{r}\right)$$

$$\frac{r^{2}}{x^{2}}=\frac{1}{(1-\frac{y}{r})(1+\frac{y}{r})}=\frac{1}{\frac{x^{2}}{r^{2}}}=\frac{1}{\cos^{2}(\alpha)}=\frac{1}{1-\sin^{2}(\alpha)}$$

In the right triangle : $\sin(\alpha)=\frac{y}{r}$, i.e:

$$\frac{1}{1-(\frac{y}{r})^{2}}=\frac{1}{(1-\frac{y}{r})(1+\frac{y}{r})}=\left(\frac{1}{\sqrt{1-\frac{y^{2}}{r^{2}}}}\right)^{2}$$

From figure 1 [figure not shown]: $$\begin{cases}x=ct\\y=vt'\\r=ct'\end{cases}$$

we have :$\;\;\;\frac{1}{1-(\frac{v}{c})^{2}}=\gamma^{2}\;,t'=\gamma t\;$

the relation (1) becomes:$$\gamma^{2}c^{2}t^{2}=c^{2}t^{2}+\beta^{2}\gamma^{2}c^{2}t^{2}$$

same form as the energy relation (time plays the same role as mass).

If the photon is at rest, there is a time $t$ such that:$$\gamma^{2}c^{2}t^{2}=c^{2}t^{2}+\gamma^{2}c^{2}t^{2}$$

i.e.$$\gamma^{2}=1+\gamma^{2}$$ which is mathematically false.

$\endgroup$
