22
$\begingroup$

It's common for physicists to say that not every 3-tuple of real numbers is a vector:

“Well, isn’t torque just a vector?” It does turn out to be a vector, but we do not know that right away without making an analysis.... because force is a vector it transforms into the new system in the same way as do $x$, $y$, and $z$, since a thing is a vector if and only if the various components transform in the same way as $x$, $y$, and $z$.

Richard Feynman

You might be inclined to say that a vector is anything that has three components that combine properly under addition. Well, how about this: We have a barrel of fruit that contains $N_x$ pears, $N_y$ apples, and $N_z$ bananas. Is $N = N_x\hat{x} + N_y\hat{y} + N_z\hat{z}$ a vector? It has three components, and when you add another barrel with $M_x$ pears, $M_y$ apples, and $M_z$ bananas the result is $(N_x + M_x)$ pears, $(N_y + M_y)$ apples, $(N_z + M_z)$ bananas. So it does add like a vector. Yet it’s obviously not a vector, in the physicist’s sense of the word, because it doesn’t really have a direction. What exactly is wrong with it?

David J. Griffiths, Introduction to Electrodynamics, 1.1.5

Interestingly, Griffiths' example of apples and bananas not being a vector is almost identical to Strang's example of what a vector is:

"You can't add apples and oranges." In a strange way, this is the reason for vectors. We have two separate numbers $v_1$ and $v_2$. That pair produces a two-dimensional vector $v$.

Gilbert Strang, Introduction to Linear Algebra, 1.1

What do physicists mean then when they say a certain $(x,y,z)$ isn't a vector? Certainly any element of $\mathbb R^3$ is a vector.

The sources talk about being the same under coordinate transforms: but of course any transform of $(x,y,z)$ yields a particular $(x', y', z')$ value. What does it mean for a 3-tuple to be invariant under transform?

Is there a way of expressing the physicists' statement clearly and precisely? How can I test if a certain tuple (or function) "is" a vector? Can this statement be put into mathematical language: $f: \mathbb R^3 \to \mathbb R^3$ is a physicists' vector if...

$\endgroup$
11
  • 9
    $\begingroup$ When physicists refer to vector/tensors transforming in certain ways under coordinate transformations, they are referring to vector/tensor fields. It is necessary for a "physical" vector field to respect symmetries of the physical world, such as rotation or translation invariance, or Lorentz transformations in relativistic spacetime. c.f. physics.stackexchange.com/questions/406408/… $\endgroup$
    – whpowell96
    Commented Dec 19, 2023 at 21:03
  • 13
    $\begingroup$ Old physics joke: “a vector is something that transforms like a vector.” $\endgroup$ Commented Dec 19, 2023 at 21:30
  • 2
    $\begingroup$ @spaceisdarkgreen Wasn't a joke when I studied physics. That was literally the definition we were given... $\endgroup$
    – Digitallis
    Commented Dec 19, 2023 at 21:35
  • $\begingroup$ Griffiths vectors are $p+q=1$ tensors that transform covariantly if $p=0,\,q=1$ or contravariantly if $p=1,\,q=0$; Strang vectors are elements of vector spaces. $\endgroup$
    – J.G.
    Commented Dec 19, 2023 at 21:54
  • $\begingroup$ @Digitallis that may explain why the professor didn’t crack a smile. $\endgroup$ Commented Dec 20, 2023 at 0:01

5 Answers

18
$\begingroup$

As was correctly indicated by @whpowell96, for a physicist a vector is not just an element of a linear space, but rather a pair $(\text{transformation group}, \text{element of a vector space on which the group acts})$, and the group must act on the space in a particular manner (mathematically speaking, I suppose, the action must be by the fundamental representation of the group). This means, in particular, that whether a physical quantity is a vector depends not only on the quantity itself, but also on the transformation group you are concerned with.

Take, for instance, proper rotations of $\mathbb{R}^3$ (i.e. the regular rotations you can perform on a physical object), which form a group called $SO(3)$. Consider the velocity $\vec{v}$ of an object with charge $q$ moving in space, the constant external magnetic field $\vec{B}$ acting on the object, and the Lorentz force ${\vec{F}}$ that results from this action. The velocity, the field and the force all have three components in a particular coordinate system, $\vec{v}=(v_1, v_2, v_3)$, $\vec{B}=(B_1, B_2, B_3)$, $\vec{F}=(F_1, F_2, F_3)$. The force is given by the equation $$\vec{F}=q(\vec{v} \times \vec{B})$$ where $\times$ is the vector product.

Suppose we rotate our frame in space, the rotation being described by an orthogonal matrix $R$ with unit determinant (i.e., a proper rotation matrix). At first we measured all quantities with respect to a set of three orthogonal vectors $\vec{e}_1, \vec{e}_2, \vec{e}_3$; now we want to take measurements with respect to a new triple of orthogonal vectors of the same length, but differently oriented: the new basis vectors are $R\vec{e}_{1}$, $R\vec{e}_2$, $R\vec{e}_3$. The velocity of an object is measured as change in position per unit time: in the old coordinates the object has moved by $\Delta x_{1} \cdot \vec{e}_{1} + \Delta x_{2} \cdot \vec{e}_{2} + \Delta x_{3} \cdot \vec{e}_{3}$.

Express the new basis vectors through the old: $${e_i}' = Re_i \iff e_{i} = R^{-1}{e_i}'$$

The key observation is that the displacement of the object hasn't changed; only the way we describe it has. That is what's meant by the phrase "a vector is invariant with respect to a transformation": the transformations describe the way we observe the universe, in this particular example, our spatial orientation. In different coordinate systems we describe the same objects differently. So the displacement vector in the old coordinates equals the displacement vector in the new coordinates:

$$\Delta x_{1}'\cdot \vec{e}'_{1} + \Delta x_{2}'\cdot \vec{e}'_{2} + \Delta x_{3}'\cdot \vec{e}'_{3} = \Delta x_{1}\cdot \vec{e}_{1} + \Delta x_{2}\cdot \vec{e}_{2} + \Delta x_{3}\cdot \vec{e}_{3}$$

Since we know how to express the new basis vectors in terms of the old, we can determine the new coordinates:

$$\sum_{i=1}^{3} \Delta x_{i} \cdot \vec{e}_{i} = \sum_{i=1}^{3} \Delta x_{i} \cdot R^{-1}\vec{e}'_{i}= \sum_{i=1}^{3} \Delta x_{i}' \cdot e'_{i}$$

Note that $R^{-1}\vec{e}_{i}' = \sum_{k} R^{-1}_{ki}\vec{e}_{k}'$, where the first index enumerates the row and the second enumerates the column. Physicists are lazy, so instead of writing down sums they usually assume that every index appearing twice in a monomial is summed over, and write such an expression as $$\sum_{k} R^{-1}_{ki}\vec{e}_{k}' := R^{-1}_{ki}\vec{e}_{k}'$$ That's called Einstein's convention, and I will use it from now on. The summation over a repeated pair of indices is called contraction.

The previous expression is then written down as $$\Delta x_i \cdot \vec{e}_{i} = \Delta x_{i} \cdot R^{-1} \vec{e}'_{i} = \Delta x_{i} \cdot R^{-1}_{ki}\vec{e}'_{k} = \Delta x_{k}' \cdot \vec{e}'_{k}$$

and now it is easy to see that the new coordinates of the displacement vector, $\Delta x_{k}'$, are given by the sum $\Delta x_{i} R_{ki}^{-1}$. First, we've expressed the new coordinates in terms of the old. Second, since the $R_{ki}^{-1}$ are just numbers, we can rewrite the sum as $R_{ki}^{-1} \Delta x_{i}$ (this is a general thing: in Einstein notation, indexed factors inside a monomial commute). Third, we note that $R_{ki}^{-1}\Delta x_{i}$ is the $k$-th component of the matrix product $R^{-1}\Delta x$. This is, of course, a familiar fact from linear algebra I: if a basis is changed by a linear transform $R$, the coordinates of vectors change by $R^{-1}$. The magnetic field components, of course, change in the same manner, as do the components of the Lorentz force, so we say that all three are $SO(3)$-vectors: their components change like the components of the coordinate vector when we rotate our coordinate system. The tuple $(\text{#bananas}, \text{#pears}, \text{#apples})$ is not an $SO(3)$-vector: the numbers of bananas, pears and apples do not change when we rotate, and bananas, apples and pears do not transform into one another. It is a tuple of $SO(3)$-scalars.
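The bookkeeping above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original argument: the helper names (`rotation_z`, `matvec`, `combine`) and the sample numbers are made up for illustration, and the rotation is assumed to be about the $z$ axis. It rotates the basis by $R$, changes the components by $R^{-1}$, and confirms that the arrow itself is the same in both descriptions.

```python
import math

def rotation_z(theta):
    """Matrix of a proper rotation by angle theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matvec(m, v):
    """Matrix-vector product for 3x3 matrices stored as lists of rows."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def combine(components, basis):
    """Return sum_i components[i] * basis[i] as a list of three numbers."""
    return [sum(components[i] * basis[i][k] for i in range(3)) for k in range(3)]

R = rotation_z(math.pi / 6)
R_inv = transpose(R)  # for an orthogonal matrix the inverse is the transpose

old_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
new_basis = [matvec(R, e) for e in old_basis]  # e_i' = R e_i

dx_old = [2.0, 1.0, 3.0]          # hypothetical displacement components
dx_new = matvec(R_inv, dx_old)    # components change by R^{-1}

# The same geometric arrow, assembled from each description.
arrow_old = combine(dx_old, old_basis)
arrow_new = combine(dx_new, new_basis)

# A barrel of fruit is described by the same counts in both frames,
# so it does not follow the R^{-1} rule.
fruit_old = [3, 5, 2]
fruit_new = [3, 5, 2]
```

Here `arrow_old` and `arrow_new` agree up to rounding although `dx_old` and `dx_new` differ, while the fruit counts stay put and so disagree with `matvec(R_inv, fruit_old)`.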

Now, suppose we look at the world through a looking glass, and want to describe the same system. The coordinate transform corresponding to this observation method is reflection with respect to a plane; such reflections are described by so-called Householder matrices. It is easy to see that the reflection with respect to a plane is linear and orthogonal (it preserves lengths of vectors), but is not described by a proper rotation: you can't rotate your right hand in such a way that it would look like a left hand. Reflections thus constitute a part of a larger group, $O(3)$, the group of all orthogonal transformations, of which $SO(3)$ is a subgroup. One can also show that every orthogonal transformation is either a proper rotation or a proper rotation combined with a reflection that flips one of the coordinate axes.

So, how do the three quantities change under reflections? Take, for instance, the reflection that flips the $x$ axis: $P_x: (\vec{e}_{1}, \vec{e}_{2}, \vec{e}_{3}) \rightarrow (-\vec{e}_{1}, \vec{e}_{2}, \vec{e}_{3})$. It is easy to see that the $x$-component of the coordinate vector changes its sign. Remember that the Lorentz force is described as $\vec{F}= q\cdot [\vec{v}\times \vec{B}]$. When we reflect two vectors $\vec{B}$ and $\vec{v}$ with respect to a plane, the orientation of the pair $(\vec{B}, \vec{v})$ changes, unless the vectors are very special (both in the reflection plane, or collinear). According to the Lorentz law, the force must now act in the direction opposite to the one really observed. Hence, we should either change the Lorentz law, or state that one of the quantities involved in its calculation picks up an extra sign when mirrored. The displacement vector, evidently, does not pick up such a sign, and, using the Coulomb law, one can check that the electric charge does not either. We must conclude that it is the magnetic field that changes its sign; to confirm this, one can consider the magnetic field generated by a loop of electric current (see the Wikipedia illustration). Therefore, we conclude that velocity and Lorentz force are $O(3)$-vectors, while the magnetic field is not, despite being an $SO(3)$-vector. $SO(3)$-vectors of this kind, which acquire an extra sign under reflection, are called "pseudovectors". In particular, the vector product of a pair of $O(3)$-vectors is always a pseudovector: take, for instance, the angular momentum vector, and observe how it changes when you look at a rotating object through a mirror.
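The sign flip of a vector product under a mirror can be seen directly in components. This is a small illustrative sketch in plain Python; the function names and the two sample vectors are hypothetical, and `mirror_x` plays the role of $P_x$ acting on components.

```python
def cross(a, b):
    """Vector product of two 3-component lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def mirror_x(v):
    """Flip the first component, as the reflection P_x does."""
    return [-v[0], v[1], v[2]]

a = [1.0, 2.0, 3.0]    # two ordinary O(3)-vectors (arbitrary sample values)
b = [-2.0, 0.5, 4.0]

# What a true O(3)-vector would do: take the product, then mirror it.
cross_then_mirror = mirror_x(cross(a, b))

# What the product actually does: mirror the factors, then take the product.
mirror_then_cross = cross(mirror_x(a), mirror_x(b))
```

The two results differ by an overall factor of $-1$, which is the determinant of the reflection. For a proper rotation the determinant is $+1$ and the two orders agree, which is why the vector product is an $SO(3)$-vector but only an $O(3)$-pseudovector.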

In special relativity, even velocity is not a vector anymore: if you want to describe the universe while moving at a speed close to the speed of light relative to it, or describe objects which are themselves moving very fast with respect to you (which is, of course, the same thing), you need to work with a four-dimensional vector called 4-velocity, which is a vector with respect to a group of transformations called the Lorentz group, or $O(3,1)$. The Lorentz group contains $O(3)$ as a subgroup, and the first component of the 4-velocity can be thought of as "the speed of moving forward in time". The Lorentz group is similar to the rotation group in the sense that it consists of transformations of spacetime that preserve a certain kind of "length" of vectors (called the interval), but this relativistic "length" can be zero or even negative, since vectors oriented "along the time coordinate" contribute to this "length" with a negative sign.

The usual rotations of coordinate frame with zero speed relative to the objects you describe are given by transformations of the form

$$1 \oplus SO(3)$$

while so-called Lorentz boosts, which describe how you observe the universe when moving with a nonzero speed with respect to it, are given by a $4 \times 4$ matrix that "mixes up" the time and the spatial components of the 4-vector. Another example of a 4-vector is the vector with components $A^{\nu} = (\phi, \vec{A})$, where $\vec{A}$ is the vector potential, a quantity such that $\nabla \times \vec{A} = \vec{B}$, and $\phi$ is the electric potential. If you want to describe the 4-potential in a moving reference frame, you multiply the 4-potential in the rest frame by the boost matrix $\Lambda^{\mu}_{ \ \nu}$, which is a function of the 3-velocity, and obtain the components of the 4-potential in the moving frame (remember Einstein's convention): $$A'^{\mu} = \Lambda^{\mu}_{ \ \nu}A^{\nu}.$$
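As an illustration of how a boost mixes time and space components while keeping the interval fixed, here is a minimal sketch in plain Python. It makes several assumptions that are not in the text above: units with $c = 1$, a boost along the $x$ axis only, the sign convention in which the time component enters the interval with a minus sign, and made-up sample components; the function names are hypothetical.

```python
import math

def boost_x(beta):
    """4x4 boost along x for speed beta (in units of c), acting on (t, x, y, z)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(lam, a):
    """A'^mu = Lambda^mu_nu A^nu, with the sum over nu written out."""
    return [sum(lam[mu][nu] * a[nu] for nu in range(4)) for mu in range(4)]

def interval(a):
    """Relativistic 'length squared'; the time component counts negatively."""
    return -a[0] ** 2 + a[1] ** 2 + a[2] ** 2 + a[3] ** 2

A = [2.0, 1.0, 0.5, -1.0]             # sample 4-vector components
A_boosted = apply(boost_x(0.6), A)    # components seen from the moving frame
```

The time and $x$ components of `A_boosted` differ from those of `A`, but `interval(A)` and `interval(A_boosted)` agree up to rounding (and are negative for these sample values).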

The electric field $\vec{E}$, an $O(3)$-vector, and the magnetic field $\vec{B}$, an $O(3)$-pseudovector, are not vectors under the Lorentz group: they are actually components of the electromagnetic field tensor, a $4 \times 4$ matrix $F_{\mu \nu}$, which changes under Lorentz transformations according to the tensorial transformation law: $$F_{\mu \nu}' = \Lambda^{\mu'}_{ \ \mu}\Lambda^{\nu'}_{ \ \nu}F_{\mu' \nu'}.$$ In general, physicists describe tensors as objects with multiple indices ("$n$-dimensional matrices"), the components of which transform under a group action by contracting one transformation matrix with each index of the tensor. Scalars are tensors of rank 0, vectors are tensors of (total) rank 1, matrices are tensors of (total) rank 2, and so on. Generally, there is more than one type of tensor for each total rank: for example, tensors on the Lorentz group can have two types of indices, "upper" and "lower", which correspond to "contravariant" and "covariant" quantities, but that is a topic for a different discussion. For $SO(3)$-tensors, there is only one type of index.

$\endgroup$
11
  • $\begingroup$ Wow, this is a very comprehensive answer, but of course involves some very advanced concepts. Yet physicists mention the concept of "being a vector" quite casually. Is there any way to explain their concept with simpler mathematical machinery? $\endgroup$ Commented Dec 20, 2023 at 2:30
  • $\begingroup$ TLDR: when you change the method of observation, your description of a system changes. If your description of a system involves a tuple of numbers, and to each change of method of observation corresponds a matrix that acts on this tuple, and you can add such tuples component-wise and multiply them by scalars, then we say that such a tuple is a physicist's vector on the group which the changes of methods of observation form. The same tuple of physical quantities can be a vector on one group and not a vector on another group. For instance, angular momentum is $SO(3)$, not $O(3)$ vector. $\endgroup$ Commented Dec 20, 2023 at 3:00
  • $\begingroup$ In fact, in mathematics there are two slightly different notions of a vector as well, one of which is a prototypical example of a physicist's vector. The one is just "an element of a vector space". The other is used when talking about tensor products of vector spaces and dual spaces. Consider a vector space $V$. If you change the basis in $V$ by a matrix $M$, the components of vectors in $V$ change by $M^{-1}$ (although the vectors themselves do not change). Consider then the tensor product $V \otimes V$. Having a basis $e_{i}$ in $V$, you can naturally choose the basis $e_{i} \otimes e_{j}$ $\endgroup$ Commented Dec 20, 2023 at 3:26
  • $\begingroup$ in $V \otimes V$. If you change the basis in $V$ with an operator $M$, the corresponding basis in $V \otimes V$ is changed by $M \otimes M$, and the coordinates change by $M^{-1} \otimes M^{-1}$. We say that objects of $V \otimes V$ are $(0, 2)-$tensors. Note that they are formally vectors over $\mathbb{K}$ as well, but in this specific sense when change of coordinates in $V \otimes V$ is defined through the change of coordinates in $V$, we say that they are not $GL(V)$-vectors, and reserve this term to the elements of $V$. Likewise, if you define the basis in the dual space $V^*$ through $\endgroup$ Commented Dec 20, 2023 at 3:34
  • 1
    $\begingroup$ @DaigakunoBaku your TLDR is often TLDR'd further as "a vector is something that transforms like a vector". $\endgroup$
    – hobbs
    Commented Dec 20, 2023 at 5:42
5
$\begingroup$

For a mathematician, linear algebra starts with a vector space. Vectors are elements of this space. Sometimes a finite basis exists, then one can consider the dual basis to it as coordinate functions or functionals and consider the coordinate vector. If the basis changes in some linear way, the dual basis and thus the coordinate vector changes in the opposite way, thus the naming of the coordinate vector as contravariant.

For a physicist the coordinate vector is the primary object, together with its changes under coordinate transformations. There is some awareness that there is a vector space in the background, but usually that is far in the background. For a single vector this is trivial, but if you take a bunch of coordinate vectors and do some construction with them, then it is not immediately clear if the result at the end is still an element of the original vector space. The test for that is the behavior under coordinate transformations, that is, one takes the original vectors, applies a coordinate change, does the same construction and checks if the result shows the same coordinate change.

Seen this way, even the elementary operations are suspect at first. Adding two vectors is a construction in this sense, and one would have to test the sum under coordinate changes (which makes for a trivial calculation, but still ...).

At the start this approach is easier than the mathematician's version and stays so for a while, but it gets more complicated if there are more than two different vector spaces in play simultaneously ("space" and two Lie algebras, for example).
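The test described above (apply the coordinate change to the inputs, redo the construction, and compare with the coordinate change applied to the output) can be sketched in plain Python. The helper `commutes_with`, the two sample constructions and the sample vectors are all hypothetical names chosen for this illustration, and a check at a single rotation and a single pair of inputs only gives evidence, not a proof.

```python
import math

def rotation_z(theta):
    """Matrix of a proper rotation by angle theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def commutes_with(change, construction, a, b, tol=1e-12):
    """True if construction(change a, change b) equals change(construction(a, b))."""
    lhs = construction(matvec(change, a), matvec(change, b))
    rhs = matvec(change, construction(a, b))
    return all(abs(x - y) < tol for x, y in zip(lhs, rhs))

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def componentwise_product(a, b):
    return [x * y for x, y in zip(a, b)]

# Components change by R^{-1} when the basis is rotated by R.
component_change = transpose(rotation_z(0.7))

a = [1.0, 2.0, 3.0]
b = [-2.0, 0.5, 4.0]

sum_passes = commutes_with(component_change, add, a, b)
product_passes = commutes_with(component_change, componentwise_product, a, b)
```

With these inputs `sum_passes` is `True` and `product_passes` is `False`: the sum of two coordinate vectors is again a coordinate vector, while the tuple of componentwise products is just three numbers that depend on the frame in the wrong way.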

$\endgroup$
1
$\begingroup$

Going even more into physics (only for flat space, which is within the scope of your question): put a pencil on your table at some angle. This is the most fundamental vector in (flat-space) physics: it has a direction and a length (and is real).

Now put an orthogonal coordinate system on your table, with the z-axis perpendicular to the table (like three rulers). Now measure that vector (the difference between the coordinates of the pencil tip and those of the pencil base). This is the coordinate representation of the pencil-vector with respect to the coordinate system.

Now take three new rulers, create another orthogonal coordinate system, get the new coordinates of the pencil-vector, and also the coordinates of the new rulers with respect to the old rulers. You find that the new pencil coordinates can be expressed with the rulers coordinates and the old pencil coordinates (you know these formulas).

So far so good: if I have a physical vector, it obeys the transformation formula.

Now I do some fancy calculations, definitions, etc. (like calculating the velocity of some real trajectory), and I get three numbers for each chosen coordinate system. Now the question is: can I put these three numbers into the coordinate system such that they form a 'real', physical, 'pencil'-like vector? Or do I have to move it with my hand whenever I change the coordinate system (in which case, of course, it is not a vector)?

For example, (0,0,1) is only a vector if it changes appropriately in each coordinate system (so that it can stay fixed in your space). Or, more instructively: angular momentum behaves like a vector under any change of coordinate system except a change of handedness (it is called a pseudovector).

$\endgroup$
1
$\begingroup$

At the request of the post creator, I compile my comments to the previous answer in a separate entry. This was originally a "TLDR" of the previous comment, to which I later added a different perspective on the topic.

When you change the method of observation, your description of a system changes. If your description of a system involves a tuple of numbers, and to each change of method of observation corresponds a matrix that acts on this tuple, and you can add such tuples component-wise and multiply them by scalars, then we say that such a tuple is a physicist's vector on the group which the changes of methods of observation form. The same tuple of physical quantities can be a vector on one group and not a vector on another group. For instance, angular momentum is an $SO(3)$-vector, but not an $O(3)$-vector.

In fact, in mathematics there are two slightly different notions of a vector, one of which is a prototypical example of a physicist's vector. The one is just "an element of a vector space". The other is used when talking about tensor products of vector spaces and dual spaces. Consider a vector space $V$. If you change the basis in $V$ by a matrix $M$, the components of vectors in $V$ change by $M^{-1}$ (although the vectors themselves do not change). Consider then the tensor product $V \otimes V$. Having a basis $e_{i}$ in $V$, you can naturally choose the basis $e_{i} \otimes e_{j}$ in $V \otimes V$. If you change the basis in $V$ with an operator $M$, the corresponding basis in $V \otimes V$ is changed by $M \otimes M$, and the coordinates change by $M^{-1} \otimes M^{-1}$. We say that objects of $V \otimes V$ are $(0,2)$-tensors. Note that they are formally vectors over $\mathbb{K}$ as well, but in this specific context when change of coordinates in $V \otimes V$ is defined through the change of coordinates in $V$, we say that they are not $GL(V)$-vectors, and reserve this term to the elements of $V$. Likewise, if you define the basis in the dual space $V^{*}$ through the basis in $V$, using an inner product, then coordinates in $V^{*}$ change by $M$ when you change the basis in $V$ by $M$. That is why we say that elements of $V^{*}$ are $GL(V)$-covectors. The coordinates in $V^{*\otimes k} \otimes V^{\otimes l}$ are then acted upon by the operator $M^{\otimes k} \otimes (M^{-1})^{\otimes l}$, and we say that elements of this space are $(k,l)$-tensors on $GL(V)$ because they satisfy this transformation law. Physicist's tensors are tensors in the same sense, but the transformation law is only demanded to be satisfied on a smaller subgroup of $GL(V)$.

For instance, an $SO(3)$-vector is a tuple of numbers which transforms by $M^{-1}$ if you change the basis of a vector space by $M \in SO(3)$. A $(1,1)$ $SO(3)$-tensor is a matrix $S$ that transforms as $S \to M^{-1}SM$ when the basis is changed by $M$ (it transforms like an operator on $V$, if the basis of $V$ is acted upon by the matrices of proper rotations, and does not have to transform in the same way if the change of basis is from $GL(\mathbb{R}^3)$, but not from $SO(3)$).
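To see why $S \to M^{-1}SM$ is the right rule for something that "transforms like an operator", one can check that applying the transformed matrix to the transformed components gives the transformed result. The sketch below is plain Python with hypothetical helper names and arbitrary sample entries; it assumes $M$ is a rotation about the $z$ axis, so that $M^{-1}$ is the transpose.

```python
import math

def rotation_z(theta):
    """Matrix of a proper rotation by angle theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

M = rotation_z(0.4)
M_inv = transpose(M)  # valid because M is orthogonal

S = [[1.0, 2.0, 0.0],
     [0.0, 3.0, 1.0],
     [4.0, 0.0, 2.0]]          # components of an operator in the old basis
x = [1.0, -1.0, 2.0]           # components of a vector in the old basis

S_new = matmul(matmul(M_inv, S), M)   # S -> M^{-1} S M
x_new = matvec(M_inv, x)              # x -> M^{-1} x

y_new_direct = matvec(S_new, x_new)             # act in the new basis
y_new_via_old = matvec(M_inv, matvec(S, x))     # act in the old basis, then convert
```

The two results coincide up to rounding, since $M^{-1}SM\,M^{-1}x = M^{-1}Sx$. A matrix of numbers that stayed fixed under the change of basis would fail this check in general.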

I am afraid the notion of fundamental representation (or at least irreducible representation) is required to define the physicist's notion of a vector strictly, because, strictly speaking, the example with bananas, pears and apples satisfies the other conditions of being a physicist's vector: you just define the trivial action of the group $SO(3)$ on this tuple, which sends every element of $SO(3)$ to the unit matrix. What physicists usually say in response is that a vector is a quantity that changes like the coordinates under a change of basis, which the quantities of bananas, pears and apples do not. Of course, to define a vector in such a way, one first needs to define the action of a group on coordinate vectors, which is usually the fundamental representation of the group.

All groups physicists actually use on the introductory level are the ones for which this action is naturally defined. When the time comes for the groups that do not have an obvious action on the coordinate space, the patient is usually familiar enough with the concept to either grasp the notion of a tensor on an arbitrary group intuitively or to understand basic representation theory.

$\endgroup$
1
$\begingroup$

A linear space is a set of elements which, very concisely, can be summed and scaled. There are of course rules regarding the summing and scaling.

Tuples are not intrinsically elements of linear spaces. There are no distinguished summing or scaling operations on tuples. We can sometimes construct linear spaces whose elements are tuples, but there is no distinguished linear space which canonically owns any particular tuple.

But we often use tuples to help us work with or describe the elements of linear spaces. When we do so, we must always be clear about the precise way we are using tuples to describe the elements of linear spaces, because without that precision and context, the tuples lose all meaning.

An ordered basis of a linear space is a minimal sequence of elements of that linear space which are linearly independent and which span the space. That means that the only sequence of coefficients scaling these elements that allows their total sum to be the zero element is the all-zeros sequence of coefficients, and that any given element of the space is a total sum of the basis elements, scaled by some sequence of scalars. An ordered basis is exactly what can allow us to use tuples to describe the elements of linear spaces. A tuple of scalars, if they are taken as a sequence of coefficients scaling the elements of a given ordered basis, uniquely identifies the resulting sum, at least with respect to that ordered basis.

The free linear space of some given sequence of underlying elements and with respect to some given field of scalars is a linear space whose elements are each some abstract thing representing a sequence of counts, one count for each of the underlying elements, and where these elements can be summed and scaled in the natural way. This is exactly the type of idea which Griffiths describes mockingly but which Strang welcomes. For example, if the underlying set is the ordered set containing apple and orange, then the free linear space is the set whose elements each consist of some number of apples and some number of oranges.

We sometimes call some linear space a vector space and its elements vectors, but there are no clear rules about this. Nevertheless, there are certain contexts where it is almost universal. In particular, we always use the word vector to denote geometrical objects forming a linear space which have geometrical magnitude and geometrical direction. For example, we use the word vector for the directed momentum of a classical particle.

In the case of a geometrical vector, we have many ways of describing it as a tuple, each way corresponding to a different choice of ordered basis. When physicists talk of vectors “transforming”, as Feynman does, they mean that there is some formula, typically a matrix, which can convert between one and another tuple describing the same geometrical vector but corresponding to different choices of ordered basis.

When a physicist says that something is a vector, they might typically mean that it is an element of a geometrical, physically-meaningful linear space, that is, that it has physically-meaningful magnitude and physically-meaningful direction within the geometrical context. A mathematician, though, might use the phrase vector space almost as a synonym for the phrase linear space.

$\endgroup$
