6
$\begingroup$

I'm a high school student trying to understand the basics of special relativity, and I've been learning about the Lorentz Transformations. I understand that one transforms between the reference frames of two observers using the following equations: $$x'=\frac {1}{\sqrt{1-\frac{v^2}{c^2}}}(x-vt)$$ $$t'=\frac {1}{\sqrt{1-\frac{v^2}{c^2}}}\left(t-\frac{vx}{c^2}\right)$$ Assuming that: $$y'=y$$ $$z'=z$$ But what if $y'$ doesn't equal $y$, and the primed observer is moving relative to the unprimed observer in both the $x$- and $y$- directions (but still not the $z$-direction)? I couldn't find answers online, so I tried to come up with new equations to describe this. I read that motion does not cause length contraction in the direction perpendicular to the motion, so I assumed, perhaps wrongly, that the $x$ and $y$ coordinates would transform independently of each other according to their own separate velocities. For example, if two observers were moving relative to each other in the $x$ direction with velocity $v_x$, and they were moving relative to each other in the $y$ direction with velocity $v_y$, then one would use the following equations: $$x'=\frac {1}{\sqrt{1-\frac{v_x^2}{c^2}}}(x-v_xt)$$ $$y'=\frac {1}{\sqrt{1-\frac{v_y^2}{c^2}}}(y-v_yt)$$ I thought that the time dilation would still depend on the total velocity, though. I calculated that the overall relative velocity $v_{total}$ would be: $$v_{total}=\sqrt{v_x^2+v_y^2}$$ Furthermore, the difference in observed time would depend on the total distance of the event (which I guess is just the distance from the origin if on a graph). The total distance $d$, according to the unprimed observer, would be: $$d=\sqrt{x^2+y^2}$$ Therefore, to calculate $t'$, one would use the equation: $$t'=\frac {1}{\sqrt{1-\frac{v_{total}^2}{c^2}}}\left(t-\frac{v_{total}d}{c^2}\right)$$ However, when I tested out these equations, they didn't work. 
For example, when used on events with a spacetime interval of $0$, the interval did not remain $0$ after the transformation. Where did I go wrong, and what are the real equations that transform between reference frames moving relative to each other in two spatial dimensions? If the math is too complicated for a sophomore high school student to understand, I would still love a conceptual explanation about what my approach missed. I hope this question makes sense.

$\endgroup$
7
  • $\begingroup$ Have you seen that when a transformation acts on $[t, x, y]$ as a matrix (as Lorentz does) then $R M R^{-1}$ represents this transformation in a new basis? You'll want something like $R = [[1, 0, 0], [0, \cos \theta, \sin \theta], [0, -\sin \theta, \cos \theta]]$ where $\theta$ is the angle between the x-axis and the direction the thing is actually moving in. $\endgroup$ Commented Jul 2, 2021 at 0:16
  • $\begingroup$ @ConnorBehan, Thank you, but unfortunately, I haven't learned about matrices yet, and I think my trigonometry is pretty limited. Could you possibly tell me what I've missed in a more conceptual way? $\endgroup$
    – L.B.
    Commented Jul 2, 2021 at 0:28
  • 1
    $\begingroup$ It is basically not practicable to do Lorentz transformations in more than 1+1 dimensions without using matrices. $\endgroup$
    – Buzz
    Commented Jul 2, 2021 at 0:38
  • $\begingroup$ @Buzz Thanks. Maybe I'm being redundant, but I would love a brief explanation for why the math gets more complicated with the added dimension. Is there another effect of relative motion in special relativity that complicates things? Have I assumed something that is wrong? $\endgroup$
    – L.B.
    Commented Jul 2, 2021 at 0:44
  • $\begingroup$ If you are just interested in velocity boosts along a single direction, you can choose that direction to define your $x$-axis. However, as soon as you start combining Lorentz boosts along different velocity directions or boosts in combination with rotations, things get more complicated, because the transformations do not commute, and the only natural way to describe this is via matrices. $\endgroup$
    – Buzz
    Commented Jul 2, 2021 at 0:55

3 Answers

13
$\begingroup$

Since you're trying to understand, I'll show you how to figure it out, instead of just telling you the answer. You don't need to know what a matrix is, and you don't need trigonometry. Algebra is sufficient.

To avoid unenlightening distractions, I'll use units in which the speed of light is $c=1$. If you want to restore factors of $c$, just replace every $t$ in the following equations with $ct$. A Lorentz transformation about the origin is a linear transformation that leaves the quantity $$ t^2-(x^2+y^2+z^2) \tag{1} $$ unchanged. I'll show a couple of examples, and I'll show how to combine them to generate more examples, including the one you asked for.

Ordinary rotation

One example of a Lorentz transformation is an ordinary rotation that mixes two spatial coordinates: \begin{align} t &\to t\\ x &\to ax-by \\ y &\to bx+ay \\ z &\to z \tag{2} \end{align} with $$ a^2+b^2=1 \tag{3} $$ Equation (3) is the same as saying $a=\cos\theta$ and $b=\sin\theta$ for some $\theta$, but we don't need to know that. Using (3), you can confirm that the replacement (2) leaves (1) unchanged: $$ t^2-\big((ax-by)^2+(bx+ay)^2+z^2\big) = t^2-(x^2+y^2+z^2). \tag{4} $$

Boost

Another example of a Lorentz transformation is a boost that mixes one spatial coordinate with the time coordinate: \begin{align} t &\to At+Bx\\ x &\to Bt+Ax \\ y &\to y \\ z &\to z \tag{5} \end{align} with $$ A^2-B^2=1. \tag{6} $$ Equation (6) is the same as saying $A=\cosh\theta$ and $B=\sinh\theta$ for some $\theta$, but we don't need to know that. Using (6), you can confirm that the replacement (5) leaves (1) unchanged: $$ (At+Bx)^2-\big((Bt+Ax)^2+y^2+z^2\big) = t^2-(x^2+y^2+z^2). \tag{7} $$

Composing Lorentz transformations

Here's the key: given any two transformations that both leave (1) unchanged, their composition clearly also leaves (1) unchanged. "Composition" just means making one replacement after the other. In particular, making the replacement (5) followed by the replacement (2) is equivalent to making the single replacement \begin{align} t &\to At+B(ax-by)\\ x &\to Bt+A(ax-by) \\ y &\to bx+ay \\ z &\to z. \tag{8} \end{align} Even better, we can first do (2), then do (5), and then do (2) with the opposite sign for $b$. The result is \begin{align} t &\to At+B(ax+by)\\ x &\to a(Bt+A(ax+by))-b(ay-bx) \\ y &\to b(Bt+A(ax+by))+a(ay-bx) \\ z &\to z \tag{9} \end{align} There you go. That's a Lorentz boost along an arbitrary direction in the $x$-$y$ plane. Notice that it leaves two of the original spatial directions unchanged, namely $z$ and $ay-bx$, so it qualifies as a boost in the $ax+by$ direction.
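If you'd like to check (9) numerically, here is a minimal Python sketch. The function names `interval` and `boost_xy` and the sample numbers are my own choices for illustration; the only inputs are coefficients satisfying (3) and (6).

```python
import math

def interval(t, x, y, z):
    """The quantity (1): t^2 - (x^2 + y^2 + z^2), in units with c = 1."""
    return t * t - (x * x + y * y + z * z)

def boost_xy(event, A, B, a, b):
    """Apply replacement (9) to an event (t, x, y, z).

    Assumes A^2 - B^2 = 1 and a^2 + b^2 = 1.
    """
    t, x, y, z = event
    along = a * x + b * y    # coordinate along the boost direction
    across = a * y - b * x   # perpendicular coordinate, left unchanged
    new_along = B * t + A * along
    return (A * t + B * along,
            a * new_along - b * across,
            b * new_along + a * across,
            z)

# Illustrative values only.
A, B = 5 / 4, 3 / 4   # 25/16 - 9/16 = 1
a, b = 3 / 5, 4 / 5   # 9/25 + 16/25 = 1
event = (2.0, 0.5, -1.5, 0.25)
moved = boost_xy(event, A, B, a, b)

print(interval(*event), interval(*moved))  # the two values should agree
```

The two printed numbers should agree up to floating-point rounding, which is just (1) being left unchanged.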

Relating $A,B$ to velocity

I used a notation that highlights how simple the concepts are. To relate my quantities $A,B$ to the velocity of the boost, use $v=B/A$. In the case (9), the components of the velocity are $v_x=Ba/A$ and $v_y=Bb/A$. If you want to use SI units, just multiply these velocities by $c$.
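Going the other way, from a desired velocity to $A,B,a,b$, can be sketched like this. The helper name `params_from_velocity` is hypothetical, and I'm assuming the sign convention $v=B/A$ stated above, with velocities given as fractions of $c$.

```python
import math

def params_from_velocity(vx, vy):
    """Return (A, B, a, b) for a boost with velocity components (vx, vy).

    Velocities are in units with c = 1, and the speed must satisfy 0 < speed < 1.
    """
    speed = math.hypot(vx, vy)
    if not 0 < speed < 1:
        raise ValueError("need 0 < speed < 1 in units with c = 1")
    A = 1 / math.sqrt(1 - speed ** 2)  # so that A^2 - B^2 = 1
    B = speed * A                      # so that B / A = speed
    a = vx / speed                     # so that a^2 + b^2 = 1
    b = vy / speed
    return A, B, a, b

A, B, a, b = params_from_velocity(0.3, 0.4)
print(A * A - B * B, a * a + b * b)  # both should be 1 up to rounding
print(B * a / A, B * b / A)          # should recover 0.3 and 0.4 up to rounding
```

Here $A$ is what is usually called $\gamma$, and $(a,b)$ is just the unit vector pointing along the velocity.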

Perspective

At the beginning of this answer, I said that you don't need to know what a matrix is, and you don't need trigonometry. Maybe I should have said that you already know everything you need to know about matrices and trigonometry! The idea of composing two linear transformations to get another linear transformation is what matrix algebra is all about. Going from (2) and (5) to (8) is an example of matrix multiplication, even though we didn't use matrix notation. Equation (3) is the foundation for ordinary trigonometry, and equation (6) is the foundation for hyperbolic trigonometry.

Regarding Lorentz boosts in arbitrary directions, you might be interested in the related question "Lorentz boost matrix for an arbitrary direction in terms of rapidity", but you already know everything you need to know: start with the boost (5), and compose it with whatever rotation(s) you want to point the velocity in the desired direction.

$\endgroup$
4
$\begingroup$

The very first step, where you assume that you can apply the transform simultaneously in two directions, is wrong. If you first transformed from $(t,x,y,z)$ to $(t',x',y',z')$, and then applied a second boost with the new $t'$, then at least you'd know that your result was physical, even if the overall boost you get is incorrect. But you tried to smoosh the two boosts together into one step and there's no guarantee that what you got has any physical meaning. Now, even if you do the two boosts separately according to your plan, you end up in a frame moving at the wrong velocity, but at least it is a well-defined frame, so we can see what went wrong.
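To see the "no physical meaning" problem concretely, here is a small Python sketch of the equations proposed in the question, as I read them, with $c=1$. The function names and the sample numbers are mine, chosen only for illustration.

```python
import math

def attempted_transform(t, x, y, vx, vy):
    """The question's proposed equations, in units with c = 1."""
    gamma_x = 1 / math.sqrt(1 - vx ** 2)
    gamma_y = 1 / math.sqrt(1 - vy ** 2)
    v_total = math.hypot(vx, vy)
    gamma_total = 1 / math.sqrt(1 - v_total ** 2)
    d = math.hypot(x, y)  # distance of the event from the origin
    new_x = gamma_x * (x - vx * t)
    new_y = gamma_y * (y - vy * t)
    new_t = gamma_total * (t - v_total * d)
    return new_t, new_x, new_y

def interval(t, x, y):
    """Spacetime interval from the origin, t^2 - (x^2 + y^2)."""
    return t ** 2 - (x ** 2 + y ** 2)

# A light-like event (interval 0) and a relative velocity of (0.3, 0.4).
before = interval(1.0, 1.0, 0.0)
after = interval(*attempted_transform(1.0, 1.0, 0.0, 0.3, 0.4))
print(before, after)
```

The first number is $0$ and the second is not, so a light ray in one frame would not be a light ray in the other. That is exactly the failure described in the question.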

Consider what it means to boost up to a velocity. It means that an object that previously looked to you as moving with that velocity now appears to be at rest. Say you have such a "pacemaker" set up, so that it passes the point $(t_0,x_0,y_0,z_0)=(0,0,0,0)$ in your original coordinates and later passes $(t,x,y,z)=(1\,\mathrm{s},v_x\cdot1\,\mathrm{s},v_y\cdot1\,\mathrm{s},0).$ After your two boosts, you want $(x'',y'',z'')=(0,0,0)$ (you don't care about $t''$). Ok, so perform your first boost in the $x$ direction with velocity $v_x.$ What do you see now? I get that the pacemaker event now occurs at $(t',x',y',z')=(\sqrt{1-\frac{v_x^2}{c^2}}\,\mathrm{s},0,v_y\cdot1\,\mathrm{s},0).$ Note that the reference point $(t_0,x_0,y_0,z_0)=(0,0,0,0)$ remains invariant. If you calculate the remaining velocity of the object in the $y$ direction, which is $\frac{y'}{t'},$ you see it has increased. So the issue with your original plan is that, once you perform the first boost, the time dilation of the intermediate frame causes the velocity you have to match with your second boost to increase. Perform the second boost with the adjusted velocity and you will see the object come to rest as desired.
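Here is a quick numerical version of that first step, as a sketch in units with $c=1$. The function name `boost_x` and the velocity $(0.3, 0.4)$ are just my choices for illustration.

```python
import math

def boost_x(event, v):
    """Standard boost along x with speed v (units with c = 1)."""
    t, x, y, z = event
    g = 1 / math.sqrt(1 - v * v)
    return (g * (t - v * x), g * (x - v * t), y, z)

vx, vy = 0.3, 0.4
pacemaker = (1.0, vx, vy, 0.0)   # where the object is after one unit of time

t1, x1, y1, z1 = boost_x(pacemaker, vx)
remaining_vy = y1 / t1           # velocity still left over in the y direction

print(t1, x1, y1)
print(remaining_vy, vy)          # the leftover y velocity is larger than vy
```

The object is at rest in $x'$, but because $t'$ has shrunk while $y'$ has not, its remaining $y$ velocity comes out larger than the original $v_y$.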

So what have we learned? If you want to boost up to velocity $(v_x,v_y,0),$ you can first apply a boost in the $x$ direction of $v_x$, and then apply one in the $y$ direction with the adjusted velocity $\frac{v_y}{\sqrt{1-\frac{v_x^2}{c^2}}}$. This will still not be quite right, because now your frame will suffer Wigner rotation: the $x''$ and $y''$ axes are unexpectedly no longer parallel to the original $x$ and $y$ axes. That can be avoided by using a more complicated derivation for the boost (as indicated in the comments), but if you don't care about it, then this might be fine.
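The two-step recipe can be checked with a short Python sketch, again with $c=1$. The helper names and numbers are mine; the sketch only tests whether the object ends up at rest, and says nothing about the Wigner rotation.

```python
import math

def boost_x(event, v):
    """Standard boost along x with speed v (units with c = 1)."""
    t, x, y, z = event
    g = 1 / math.sqrt(1 - v * v)
    return (g * (t - v * x), g * (x - v * t), y, z)

def boost_y(event, v):
    """Standard boost along y with speed v (units with c = 1)."""
    t, x, y, z = event
    g = 1 / math.sqrt(1 - v * v)
    return (g * (t - v * y), x, g * (y - v * t), z)

vx, vy = 0.3, 0.4
pacemaker = (1.0, vx, vy, 0.0)
adjusted_vy = vy / math.sqrt(1 - vx * vx)

naive = boost_y(boost_x(pacemaker, vx), vy)              # second boost uses plain vy
adjusted = boost_y(boost_x(pacemaker, vx), adjusted_vy)  # second boost uses adjusted vy

print(naive[1:])     # y'' is not zero: the object is still moving
print(adjusted[1:])  # x'', y'', z'' are all zero up to rounding
```

With the plain $v_y$ the object is left drifting in $y''$; with the adjusted velocity it sits at the spatial origin, as desired.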

Try your hand at extending this scheme to all three dimensions.


Here follows a somewhat overkill derivation of boosts in general directions. If what you say about your skill is true, you won't understand this right now, but just consider it something of a goal.

The matrix $$\boldsymbol\eta=\begin{bmatrix}-1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{bmatrix}$$ is of fundamental importance to special relativity. Consider a vector quantity $\mathbf{x}=\begin{bmatrix}ct&x&y&z\end{bmatrix}^T$ measured in some reference frame. The quantity $\mathbf{x}^T\boldsymbol\eta\mathbf{x}=-c^2t^2+x^2+y^2+z^2$ remains the same in every reference frame, even as the components of the vector change. Specifically, a vector quantity's components change under a Lorentz transformation as $\mathbf{x}'=\mathbf{\Lambda x},$ where $\mathbf{\Lambda}$ is a matrix associated with the transformation. Then we must have $\mathbf{x}^T\mathbf{\Lambda}^T\boldsymbol\eta\mathbf{\Lambda x}=\mathbf{x}'^T\boldsymbol\eta\mathbf{x}'=\mathbf{x}^T\boldsymbol\eta\mathbf{x},$ and so we find that Lorentz transformations are represented by matrices where $\mathbf{\Lambda}^T\boldsymbol\eta\mathbf{\Lambda}=\boldsymbol\eta.$

Now, we bring in the big guns: the theory of matrix Lie groups. We ask, what do really small Lorentz transformations look like? Well, they should look like $\mathbf{\Lambda}=\mathbf{I}+\epsilon\mathbf{H},$ where $\mathbf{I}$ is the identity matrix, which does nothing when multiplied onto anything else, $\epsilon$ is a small real number controlling the size of the transformation, and $\mathbf{H}$ represents the "kind" or "essence" of the transformation. Plug that into the defining $\mathbf{\Lambda}^T\boldsymbol\eta\mathbf{\Lambda}=\boldsymbol\eta$ and get $\epsilon\boldsymbol\eta\mathbf{H}+\epsilon\mathbf{H}^T\boldsymbol\eta+\epsilon^2\mathbf{H}^T\boldsymbol\eta\mathbf{H}=0.$ Drop the $\epsilon^2$ term (we're only interested in linear/first-order variations) and rearrange to $\mathbf{H}=-\boldsymbol\eta\mathbf{H}^T\boldsymbol\eta.$ Now eliminate as many equations as you can to arrive at $$\mathbf{H}=\begin{bmatrix}0&\xi_x&\xi_y&\xi_z\\\xi_x&0&-\theta_z&\theta_y\\\xi_y&\theta_z&0&-\theta_x\\\xi_z&-\theta_y&\theta_x&0\end{bmatrix}.$$

This is the generator of Lorentz transformations. The remaining variables are just the ones that you don't end up solving for. If you plug in values for them, you can get a Lorentz transformation with the matrix exponential $\mathbf{\Lambda}=\exp(\mathbf{H}).$ By plugging in zero for all but one of them, you can come to understand what each one means. I have thus named them according to function: $\xi_x,\xi_y,\xi_z$ are "rapidities" for boosts (related to velocities by $\xi=\operatorname{artanh}(\frac{v}{c})$), and $\theta_x,\theta_y,\theta_z$ are rotation angles.

Now we have a way to make a formula for all boosts. If you want to boost with velocity $v_x,v_y,v_z$, first calculate the total velocity $v=\sqrt{v_x^2+v_y^2+v_z^2}$ and rapidity $\xi=\operatorname{artanh}(\frac{v}{c}).$ Then find the rapidity vector $\xi_i=-v_i\frac{\xi}{v}.$ Plug these rapidities into $\mathbf{H}$ and set the rotation angles to zero. This matrix has the following properties: $$\begin{align}&\mathbf{H}^0=\mathbf{I},\\&\mathbf{H}^{2n+1}=\xi^{2n}\mathbf{H},\\&\mathbf{H}^{2n+2}=\xi^{2n}\mathbf{H}^2.\end{align}$$ These are useful because $$\begin{aligned}\mathbf{\Lambda}&=\exp(\mathbf{H})=\sum_{k=0}^\infty\frac{\mathbf{H}^k}{k!}\\&=\mathbf{I}+\sum_{k=0}^\infty\frac{\mathbf{H}^{2k+2}}{(2k+2)!}+\sum_{k=0}^\infty\frac{\mathbf{H}^{2k+1}}{(2k+1)!}\\&=\mathbf{I}+\frac{\mathbf{H}^2}{\xi^2}\sum_{k=0}^\infty\frac{\xi^{2k+2}}{(2k+2)!}+\frac{\mathbf{H}}{\xi}\sum_{k=0}^\infty\frac{\xi^{2k+1}}{(2k+1)!}\\&=\mathbf{I}+\frac{\mathbf{H}}{\xi}\sinh\xi+\frac{\mathbf{H}^2}{\xi^2}(\cosh\xi-1).\end{aligned}$$ This finally gives an explicit expression for the Lorentz transformation matrix $\mathbf{\Lambda}$ associated with a given velocity.
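If you want to convince yourself that the closed form really equals the exponential series, here is a rough pure-Python sketch that compares the two for one sample velocity. All helper names are my own, velocities are in units of $c$, and the series is simply truncated at 40 terms, which is an assumption that is comfortably enough for small rapidities.

```python
import math

def matmul(P, Q):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def boost_generator(vx, vy, vz):
    """H with all rotation angles zero; returns (H, xi). Needs 0 < speed < 1."""
    v = math.sqrt(vx * vx + vy * vy + vz * vz)
    xi = math.atanh(v)
    rx, ry, rz = (-vi * xi / v for vi in (vx, vy, vz))
    H = [[0.0, rx, ry, rz],
         [rx, 0.0, 0.0, 0.0],
         [ry, 0.0, 0.0, 0.0],
         [rz, 0.0, 0.0, 0.0]]
    return H, xi

def exp_series(H, terms=40):
    """Truncated power series for exp(H)."""
    total, power = identity(), identity()
    for k in range(1, terms):
        power = matmul(power, H)
        total = [[total[i][j] + power[i][j] / math.factorial(k)
                  for j in range(4)] for i in range(4)]
    return total

def exp_closed(H, xi):
    """I + (H/xi) sinh(xi) + (H^2/xi^2)(cosh(xi) - 1)."""
    H2, I = matmul(H, H), identity()
    return [[I[i][j] + H[i][j] * math.sinh(xi) / xi
             + H2[i][j] * (math.cosh(xi) - 1) / xi ** 2
             for j in range(4)] for i in range(4)]

H, xi = boost_generator(0.3, 0.4, 0.0)
series, closed = exp_series(H), exp_closed(H, xi)
max_diff = max(abs(series[i][j] - closed[i][j])
               for i in range(4) for j in range(4))
print(max_diff)      # should be at the level of rounding error
print(closed[0][0])  # the time-time entry, which is gamma
```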

Expanded out in terms of the coordinates and with all the hyperbolic trig simplified away, $$\begin{aligned} t'&=\gamma t-\gamma\frac{v_x}{c^2}x-\gamma\frac{v_y}{c^2}y-\gamma\frac{v_z}{c^2}z,\\ x'&=-\gamma v_xt+\left(1+(\gamma-1)\frac{v_x^2}{v^2}\right)x+(\gamma-1)\frac{v_xv_y}{v^2}y+(\gamma-1)\frac{v_xv_z}{v^2}z,\\ y'&=-\gamma v_yt+(\gamma-1)\frac{v_xv_y}{v^2}x+\left(1+(\gamma-1)\frac{v_y^2}{v^2}\right)y+(\gamma-1)\frac{v_yv_z}{v^2}z,\\ z'&=-\gamma v_zt+(\gamma-1)\frac{v_xv_z}{v^2}x+(\gamma-1)\frac{v_yv_z}{v^2}y+\left(1+(\gamma-1)\frac{v_z^2}{v^2}\right)z, \end{aligned}$$ where $\gamma=\frac{1}{\sqrt{1-\frac{v^2}{c^2}}}$.

This boost does not cause any unwanted rotations. But good luck remembering it!
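Since nobody wants to remember it, here is the expanded formula as a small Python sketch. The function names are mine, and I've grouped the terms using the dot product $v_xx+v_yy+v_zz$, which is algebraically the same thing as the four equations written out above.

```python
import math

def general_boost(event, vx, vy, vz, c=1.0):
    """Rotation-free boost of an event (t, x, y, z) by velocity (vx, vy, vz)."""
    t, x, y, z = event
    v2 = vx * vx + vy * vy + vz * vz
    if v2 == 0:
        return event
    gamma = 1 / math.sqrt(1 - v2 / c ** 2)
    dot = vx * x + vy * y + vz * z
    # Common factor multiplying (vx, vy, vz) in the x', y', z' equations.
    k = (gamma - 1) * dot / v2 - gamma * t
    return (gamma * (t - dot / c ** 2), x + k * vx, y + k * vy, z + k * vz)

def interval(event, c=1.0):
    t, x, y, z = event
    return (c * t) ** 2 - (x * x + y * y + z * z)

# Illustrative numbers, in units with c = 1.
v = (0.3, 0.4, 0.1)
event = (2.0, 0.5, -1.5, 0.25)
moved = general_boost(event, *v)
at_rest = general_boost((1.0, *v), *v)  # a point on the moving object's path

print(interval(event), interval(moved))  # should agree
print(at_rest[1:])                       # spatial part should be (0, 0, 0)
```

The first line checks that the interval is unchanged, and the second checks that an object moving with the boost velocity ends up sitting at the spatial origin.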

$\endgroup$
0
$\begingroup$

Consider the simple case of a single inertial observer, sitting alone in space without accelerating.

Coordinates like $x, y, z,$ and $t$ are simply labels for different points/events in spacetime (we call them "points" if we only care about $x, y, z$, and "events" if we care about $x,y,z,t$). Choose the labels in certain ways, and they have convenient properties: for example, the (spatial) distance between two points is $\sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}$. It is usual to pick coordinates that are "orthonormal", which basically means that each coordinate axis is perpendicular (orthogonal) to the others, and a unit distance along one axis is the same distance along another axis. It makes the distance rule I gave above work.

Since reality doesn't work in terms of coordinates, it is possible for our single observer to have and use multiple different coordinate systems, depending on what is most convenient for them. They could have two orthonormal spacetime coordinate systems with different origins, or different spatial axes (as long as in each set, they are orthogonal and normalized), or different "parity", and the two systems would be related to each other by spatial rotations, spacetime translations, and reflections. Given the coordinates $(x,y,z,t)_1$ of an event in coordinate system 1, you could easily compute the coordinates $(x,y,z,t)_2$ of the event in coordinate system 2.

What this means is that if our single observer sees an object moving with a $v_x \neq 0, v_y \neq 0$, then our single observer also has available to them a different coordinate system, gotten by rotation from the first one, where $v_x \neq 0, v_y = v_z = 0$. The equations of motion of that object are completely compatible with each other in either coordinate system.

You could also think of it as two observers motionless relative to each other who each have their own orthonormal coordinate system. If Alice sees an event at $(x,y,z,t)_{Alice}$, she can tell Adam to look at $(x,y,z,t)_{Adam}$ to see the same event.

Now consider your situation: Alice and Bob are moving relative to each other. The Lorentz transformation, as normally written with $(x,y,z,t)_{Alice'} = (\gamma (x-vt),y,z,\gamma(tc^2 - vx)/c^2)_{Bob'}$, already assumes that both Alice and Bob have applied rotations and translations (and possibly a reflection) to their own preferred coordinate systems to get a pair of coordinate systems that share an origin (in both space and time) and whose three pairs of spatial axes coincide at time 0.

It is perfectly reasonable to start with coordinates in system Alice, convert to Alice' using rotations and translations, convert that to Bob' using a Lorentz boost, and then convert finally to coordinate system Bob to see where Bob should look for the event.
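As a rough sketch of that chain in Python, for motion in the $x$-$y$ plane: the function names are hypothetical, the origins are assumed to coincide already (so no translations are needed), and I'm assuming Bob wants his final axes lined up with Alice's, which is why the last step simply undoes the first rotation.

```python
import math

def rotate_xy(event, angle):
    """Re-express an event in axes rotated by `angle` about z."""
    t, x, y, z = event
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (t, cos_a * x + sin_a * y, -sin_a * x + cos_a * y, z)

def boost_x(event, v, c=1.0):
    """The usual textbook boost along x."""
    t, x, y, z = event
    g = 1 / math.sqrt(1 - (v / c) ** 2)
    return (g * (t - v * x / c ** 2), g * (x - v * t), y, z)

def alice_to_bob(event, vx, vy, c=1.0):
    """Alice -> Alice' (rotate) -> Bob' (boost) -> Bob (rotate back)."""
    angle = math.atan2(vy, vx)           # direction of Bob's velocity
    speed = math.hypot(vx, vy)
    primed = rotate_xy(event, angle)     # x' axis now points along the velocity
    boosted = boost_x(primed, speed, c)
    return rotate_xy(boosted, -angle)

vx, vy = 0.3, 0.4
on_bobs_path = (2.0, 2.0 * vx, 2.0 * vy, 0.0)  # where Bob is at t = 2
print(alice_to_bob(on_bobs_path, vx, vy))      # spatial part should be zero
```

An event on Bob's own path comes out at his spatial origin, which is a reasonable sanity check that the three steps were chained in the right order.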

Each of those coordinate transformations is a messy calculation. To teach Special Relativity, a lot of the mess can be avoided by only working with Alice' and Bob' -- and perhaps Charlie', who happens to be on a parallel course with Alice' and Bob'. It was generally assumed that the rotations and translations were nothing new to the physicists first learning SR, and thus were unimportant to mention except in passing.

$\endgroup$
