$\begingroup$

I am on my way to general relativity, but I am struggling with the covariant derivative.

At this point I am trying to ignore the spacetime character of the world, i.e. I am trying to understand what a covariant derivative means in an intrinsically curved space without taking into account that time is also affected by the curvature. I hope that it is possible to understand things in this simplified way, so that in the next step I can deal with time. If that isn't possible, so be it, but if you see a way to explain things without taking the curvature of time into account, it would mean a lot to me, because it seems less complicated this way.

My main problem with the covariant derivative arises when derivatives of basis vectors appear. Let's take the covariant derivative of a vector field $\vec{v}$ in the direction of the coordinate $x^i$:

$$\nabla_{\vec{e}_i}\vec{v}~=~\frac{\partial}{\partial x^i}\vec{v}~=~\frac{\partial}{\partial x^i}\left(v^j\vec{e}_j\right)~=~\frac{\partial v^j}{\partial x^i}\vec{e}_j+v^j\frac{\partial \vec{e}_j}{\partial x^i}$$

Now, what is to be understood by $\frac{\partial \vec{e}_j}{\partial x^i}$? I cannot really imagine what a change of direction in curved space looks like, because I think it is necessary to have a straight line in order to define a change in direction as differing from that straight line...

elaboration:

Maybe I have to elaborate a little to make clear what my problem is. In order to do so, I have to give some credit to eigenchris from YouTube, whose video series on tensor calculus I watched on my mission to understand the covariant derivative and whose sketches I use to formulate my question.

I already struggled with this question when thinking about it in flat space or on a curved 2-dimensional surface in flat 3-dimensional space. In those two cases I was able to understand what it means, but unfortunately my reasoning does not work anymore when space itself is curved and there is no higher-dimensional flat space to help me. To make clear what my problem is, I think it is necessary to go through my reasoning for those two cases, which I think I understood:

two dimensional flat space:

In this video the covariant derivative in flat space was explained as just taking the ordinary derivative, but doing it properly (i.e. taking into account that the derivatives of the basis vectors are not necessarily zero). For example, in Cartesian and polar coordinates:

$$\frac{\partial \vec{e}_x}{\partial x}~=~\frac{\partial \vec{e}_x}{\partial y}~=~0~~~~~\text{but}~~~~~\frac{\partial \vec{e}_\theta}{\partial \theta}~=~-r\vec{e}_r~,~\frac{\partial \vec{e}_\theta}{\partial r}~=~\frac{1}{r}\vec{e}_\theta$$

[Figure: the Cartesian and polar coordinate systems with their basis vectors]

Here the basis vectors are not normalized, so $\vec{e}_\theta = \partial \vec{R}/\partial \theta$, etc. This derivative can then be computed by expanding $\vec{e}_\theta$ in Cartesian coordinates and using the fact that the Cartesian basis vectors are constant, which leads to the results on the right. So to show that $\vec{e}_\theta$ is not constant, it was necessary to know that $\vec{e}_x$ and $\vec{e}_y$ are constant.
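
Written out, assuming the usual parametrization $\vec{R}=r\cos\theta\,\vec{e}_x+r\sin\theta\,\vec{e}_y$ of the position vector (this is my assumption, it is not spelled out above), the computation for $\vec{e}_\theta$ looks roughly like this:

$$\vec{e}_r=\frac{\partial\vec{R}}{\partial r}=\cos\theta\,\vec{e}_x+\sin\theta\,\vec{e}_y~,~~~~~\vec{e}_\theta=\frac{\partial\vec{R}}{\partial\theta}=-r\sin\theta\,\vec{e}_x+r\cos\theta\,\vec{e}_y$$

$$\frac{\partial\vec{e}_\theta}{\partial\theta}=-r\cos\theta\,\vec{e}_x-r\sin\theta\,\vec{e}_y=-r\vec{e}_r~,~~~~~\frac{\partial\vec{e}_\theta}{\partial r}=-\sin\theta\,\vec{e}_x+\cos\theta\,\vec{e}_y=\frac{1}{r}\vec{e}_\theta$$

Every step uses $\partial\vec{e}_x/\partial r=\partial\vec{e}_x/\partial\theta=0$ (and the same for $\vec{e}_y$), which is exactly where the constancy of the Cartesian basis enters.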

At first this did seem weird to me. Why can I objectively say that $\vec{e}_x$ is constant, but $\vec{e}_\theta$ is not? I can expand $\vec{e}_x$ in polar coordinates and suddenly it does not look constant at all. Now the solution to this is probably obvious: as soon as I don't think about the vectors in purely abstract terms, it is clear that $\vec{e}_\theta$ does physically change its direction, while $\vec{e}_x$ does not.

I could print out a large version of the coordinate systems in the picture above and put it on the floor of my room, with the origin in the center. Now when I start walking in the $\vec{e}_x$-direction and keep walking in a straight line, it does not really matter from what point in my room I start walking. I can start from some point A and after some time I will arrive at, say, the football stadium. The next day I can start from a different point right next to A and I will still arrive at the football stadium. The two straight lines that mark my paths on the two days are parallel. The distance between them does not change, so in the end I will arrive at points that are still right next to each other. That is not the case if I follow the directions of $\vec{e}_\theta$ at two different points close to each other. In this case, starting from point A and going straight in the direction in which $\vec{e}_\theta$ points might still take me to the football stadium, but starting from a point right next to A and following the direction of $\vec{e}_\theta$ from there might bring me to the cathedral. Basically I am saying: I can see the real difference in the change of $\vec{e}_\theta$, because I can attach a straight line and see where it leads me.

The only problem is: how do I know whether I am following a straight line while walking away from my room? In flat space and with Newtonian physics this is easy and there are many ways:

1) I can just trust my eyes: I keep the stadium in the center of my field of view. Because I know that the light coming from the stadium moves on a straight line, I know that I myself am moving on a straight line when I always see the stadium right in front of me.

2) I could use Newton’s first law of motion: if I accelerate just once at the beginning and there are no forces acting on me (neglecting friction, wind and so on), I can be sure that I will not change direction and will therefore move on a straight line.

3) I could take a string and attach one end to my room and the other one to the stadium. When the string is stretched, I know that the line is straight, because a straight line is the shortest path between two points.

To sum up: when I want to know whether a vector field is constant and I have been given the vector field in non-Cartesian coordinates, I have to take into account that my basis vectors may change their direction depending on their position in space. I can understand this because I can grasp what changing direction means. And I can understand what changing direction means because I can define straight lines.

A two dimensional curved surface in three dimensional flat space

The next step is to formulate a derivative for people living on a curved surface, e.g. the earth. What would a constant vector field look like for someone living on the surface?

Looking from space, we see that the two vectors on the left-hand side in the picture above point in the same direction (e.g. towards some fixed star). But for someone on the surface those two vectors are very different, because the one at the north pole points forward along the surface, but the one at the equator just points out of the surface. Walking down from the north pole to the equator, the vector field on the right-hand side does look way more constant than the one on the left. If the vector field is a kind of force, say the wind, it would have the same effect on the person every step of the way, namely providing some momentum as a tailwind.

The covariant derivative takes this into account by subtracting the component normal to the surface from the rate of change of the vector field:

$$\nabla_{\vec{e}_i}\vec{v}~=~\frac{\partial \vec{v}}{\partial x^i}-\left(\frac{\partial \vec{v}}{\partial x^i}\cdot\hat{n}\right)\hat{n}~=~\left[\frac{\partial v^k}{\partial x^i}+v^j\Gamma^k_{ij}\right]\vec{e}_k$$

where $\hat{n}$ is the unit normal to the surface and the $\Gamma^k_{ij}$ are the Christoffel symbols, which give the tangential part of the rate of change of the basis vectors:

$$\frac{\partial \vec{e}_j}{\partial u^i}~=~\Gamma^k_{ij}\vec{e}_k+L_{ij}\hat{n}$$

This does make sense to me. I can understand the rate of change of the basis vectors $\frac{\partial \vec{e}_j}{\partial u^i}$, because it happens in three-dimensional flat space and I can use all the reasoning from flat space.

Intrinsically curved space

Now if I don't have any extrinsic dimension from which I can look at the curved surface, my reasoning does not work anymore. I cannot understand what $\frac{\partial \vec{e}_j}{\partial x^i}$ would mean in curved space.

How would I know in curved space whether I am approaching my target on a straight line (without any change of direction on the way)? I cannot trust my eyes, because light itself travels on curved lines. I can't use Newton's laws, because in general relativity there is no force acting on the moon, but it still goes around the earth rather than travelling on a straight line away from it. I could find the shortest path, I think, but the length of a path depends on the speed with which one travels, and even if there is one invariant shortest path, why would it make sense to call this one straight and define a change of direction as not following that path?

I would not know what it means just to keep on walking in one direction in curved space. But if I cannot say what it means not to change direction, then I can't understand what it means when the basis vectors do change direction.

Any help?

EDIT:

I have learned that $\frac{\partial\vec{e}_j}{\partial x^i}$ is the rate of change of a basis vector, where a basis vector is defined to be constant if it stays tangent to the same geodesic.

My problem now is that I don't understand where that definition comes into play. I think this must happen at some point while finding the Christoffel symbols. We have:

$$\frac{\partial\vec{e}_j}{\partial x^i} \equiv \Gamma^k_{ij}\vec{e}_k$$

I am familiar with the following derivation of the Christoffel symbols $\Gamma^k_{ij}$:

$$\frac{\partial g_{ij}}{\partial u^k}~=~\frac{\partial}{\partial u^k}\left(\vec{e}_i\cdot\vec{e}_j\right)$$

$$~~~~~~~~~~~~~~~~~~~~~~=~\frac{\partial\vec{e}_i}{\partial u^k}\cdot \vec{e}_j+\vec{e}_i\cdot\frac{\partial\vec{e}_j}{\partial u^k}$$

$$~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~=~\Gamma^l_{ik}\left(\vec{e}_l\cdot\vec{e}_j\right)+\Gamma^l_{jk}\left(\vec{e}_i\cdot\vec{e}_l\right)$$

$$~~~~~~~~~~~~~=~\Gamma^l_{ik}g_{lj}+\Gamma^l_{jk}g_{il}$$

Now using the symmetry of the metric tensor and the Christoffel symbols in the lower indices one can show:

$$\Gamma^k_{ij}~=~\frac{1}{2}g^{kl}\left(\frac{\partial g_{li}}{\partial u^j }+\frac{\partial g_{jl}}{\partial u^i}-\frac{\partial g_{ij}}{\partial u^l }\right)$$
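
As a quick sanity check (a sketch, assuming the flat-plane metric in polar coordinates, $g_{rr}=1$, $g_{\theta\theta}=r^2$, $g_{r\theta}=0$, with $u^1=r$, $u^2=\theta$), this formula seems to reproduce the polar results from the flat-space section:

$$\Gamma^r_{\theta\theta}~=~\frac{1}{2}g^{rr}\left(\frac{\partial g_{r\theta}}{\partial \theta}+\frac{\partial g_{\theta r}}{\partial \theta}-\frac{\partial g_{\theta\theta}}{\partial r}\right)~=~\frac{1}{2}\cdot 1\cdot(-2r)~=~-r$$

$$\Gamma^\theta_{r\theta}~=~\frac{1}{2}g^{\theta\theta}\left(\frac{\partial g_{\theta r}}{\partial \theta}+\frac{\partial g_{\theta\theta}}{\partial r}-\frac{\partial g_{r\theta}}{\partial \theta}\right)~=~\frac{1}{2}\cdot\frac{1}{r^2}\cdot 2r~=~\frac{1}{r}$$

Together with $\Gamma^\theta_{\theta\theta}=\Gamma^r_{r\theta}=0$ this gives $\frac{\partial\vec{e}_\theta}{\partial\theta}=\Gamma^k_{\theta\theta}\vec{e}_k=-r\vec{e}_r$ and $\frac{\partial\vec{e}_\theta}{\partial r}=\Gamma^k_{r\theta}\vec{e}_k=\frac{1}{r}\vec{e}_\theta$, the same as before, but this time using only the metric and no Cartesian basis vectors.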

But I don't think that anything physically relevant is happening there. I rather feel like the choice that $\frac{\partial\vec{e}_j}{\partial x^i}$ is the rate of change relative to a geodesic has to be implemented in one of the first two steps of calculating the derivative of the metric tensor. But I don't see how.

$\endgroup$
  • $\begingroup$ Possible duplicate: physics.stackexchange.com/q/2447/2451 $\endgroup$
    – Qmechanic
    Commented May 25, 2020 at 11:31
  • $\begingroup$ I don't think my question is answered in that thread. The Christoffel symbols are explained by using them in flat space to give some intuition. But the problem is that, while I do understand what the change of basis vectors means in flat space, I cannot grasp what it means in a curved space. The transition from flat to curved space is precisely the point at which I struggle. $\endgroup$ Commented May 25, 2020 at 11:40
  • $\begingroup$ Are you looking for a mathematical definition of the deviation of the shape of a world line from the shape of a geodesic that is tangent to the world line? $\endgroup$
    – S. McGrew
    Commented May 25, 2020 at 12:17
  • $\begingroup$ I don't think so, but if it helps... Is the "deviation of the shape of a world line from the shape of a geodesic that is tangent to the world line" $\frac{\partial \vec{e}_i}{\partial x^j}$? $\endgroup$ Commented May 25, 2020 at 12:21
  • $\begingroup$ I'm not going to risk making a mathematical statement, but your statement, "I cannot really imagine, what a change of direction in curved space looks like, because I think it is necessary to have a straight line in order to define a change in direction as differing from that straight line..." would make sense in curved space if "straight line" were translated as "geodesic line". $\endgroup$
    – S. McGrew
    Commented May 25, 2020 at 12:36

2 Answers

$\begingroup$

I am going to use the technical term “geodesic” to refer to a “straight line” in a curved manifold. There are two ways to understand this. One is a global way and one is a local way.

Global

The global way may be the easiest (at least to me). Globally a geodesic is the shortest path* between two points. Once you have a geodesic, any slight deviation from that path in any direction will increase your distance. When you have a flat manifold, a geodesic is a straight line, i.e. the shortest path is a straight line. So the global notion of a geodesic in a curved manifold shares the same minimum-distance property as a straight line in a flat manifold.

For example, on a sphere the geodesics are great circles. If you pick two points on the sphere and attach a rubber band between them then that rubber band will try to minimize the distance and will naturally assume a great circle path. Similarly a rubber band stretched between two points on a flat plane will form a straight line.

*technically it extremizes the distance, so it can be a minimum or maximum

Local

The local concept is a bit more difficult, in my opinion, because it requires two new concepts. One is called parallel transport, and the other is the tangent vector.

Parallel transport is used to map vectors at one point in the manifold to vectors at another nearby point. The idea is to move the vector from one point to the next without turning it. Think about laying a piece of tape smoothly along the path (no wrinkles) and then flattening the tape and making the vector at one point on the path parallel to the vector at any other point on the path. That is the parallel in parallel transport. The mathematical function that maps vectors at one point to the parallel vector at a nearby point is called a connection.

The other concept is the tangent vector. At each point on a path you can form a vector that points along the path. It shows which direction you need to step if you want to stay on the path. Combining the ideas of parallel transport and tangent vectors, a geodesic is a curve that parallel transports its tangent vector. Intuitively, this is the concept of never turning either left or right but always stepping straight ahead.

Returning to the example of the sphere: if you walk along a great circle, then you never turn to the right or left at any point but always step straight ahead.
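
To make this concrete, here is a sketch in coordinates. It assumes the unit sphere with the usual angles $(\theta,\varphi)$, the metric $ds^2=d\theta^2+\sin^2\theta\,d\varphi^2$ and the Levi-Civita connection; the coefficients $\Gamma^k_{ij}$ and the curve parameter $s$ are notation I am introducing here. "Parallel transporting the tangent vector" then becomes the geodesic equation

$$\frac{d^2x^k}{ds^2}+\Gamma^k_{ij}\frac{dx^i}{ds}\frac{dx^j}{ds}=0.$$

For the sphere the only non-zero coefficients are $\Gamma^\theta_{\varphi\varphi}=-\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi}=\Gamma^\varphi_{\varphi\theta}=\cot\theta$, so

$$\ddot{\theta}-\sin\theta\cos\theta\,\dot{\varphi}^2=0,\qquad\ddot{\varphi}+2\cot\theta\,\dot{\theta}\,\dot{\varphi}=0.$$

A circle of latitude $\theta=\theta_0$, $\varphi=s$ satisfies the second equation trivially, but the first one reduces to $\sin\theta_0\cos\theta_0=0$, which for $0<\theta_0<\pi$ holds only at $\theta_0=\pi/2$. So among the circles of latitude only the equator, which is a great circle, goes "straight ahead"; on any other one you have to keep steering to stay on the path.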

So those are the two concepts of geodesics: geodesics minimize the path length between two points and they parallel transport their tangent vector. Those are the concepts of “the shortest distance between two points is a straight line” and “straight lines don’t turn anywhere” both applied to a curved manifold.

$\endgroup$
  • $\begingroup$ Thanks for the comment! I am familiar with the concept of parallel transport. The problem is that I don't understand what not turning left or right means in curved space. From your answer and one other comment I assume that not changing direction means, by definition, following a geodesic? So $\frac{\partial\vec{e}_i}{\partial x^j}$ is the rate of change relative to following a geodesic? $\endgroup$ Commented May 26, 2020 at 8:52
  • $\begingroup$ Now if I got that right, it leads to another question: I feel like that would be somewhat arbitrary. To me it sounds like geodesics are the closest thing we have to a straight line, so we define a change of direction as differing from a geodesic. Am I right here or does it actually follow from something else? Because I cannot see where the definition is made or implied in the process of deriving the covariant derivative... $\endgroup$ Commented May 26, 2020 at 9:21
  • $\begingroup$ The thing that seems arbitrary to you is the connection, that mapping between neighboring points. It is arbitrary. Out of all of the possible connections that could be used, we choose to use the Levi-Civita connection. This is the one that produces geodesics with both the global and the local properties described above. $\endgroup$
    – Dale
    Commented May 26, 2020 at 11:24
  • $\begingroup$ Thanks. It already helps me a lot to know that it is indeed a choice that a change of direction means differing from the geodesic. What I don't understand yet is where this choice comes into play. I edited my question at the end. If somebody could help me there, I think I can understand what the covariant derivative really does. $\endgroup$ Commented May 26, 2020 at 12:24
$\begingroup$

To elaborate on Dale's answer, which I believe does not fully address how to parallel-transport the vectors along the geodesics, I will start from a more general definition of a covariant derivative, define the Levi-Civita connection and interpret it in the light of parallel-transport.

Covariant derivatives: a general definition

Suppose that you want to take the derivative of a vector field $X$ in some direction specified by the vector $Y$, whatever this may mean. Let's agree to denote such a derivative with $D_{Y}X$. The derivative operator $D$ should have some nice properties, such as

$$ $$

(i) $D_{Y}(c_{1}X_{1}+c_{2}X_{2})=c_{1}D_{Y}X_{1}+c_{2}D_{Y}X_{2}\qquad\forall\ c_{1},c_{2}\in\Bbb{R}\qquad$ ($\Bbb{R}$-linearity with respect to the derivand),

(ii) $D_{Y}(fX)=Y(f)X+fD_{Y}X\qquad$(Leibniz rule),

$$ $$

where $X_{1},X_{2}$ are vector fields, $f$ is a function on the manifold and $Y(f)$ denotes the partial derivative of $f$ in the direction $Y$, i.e. $Y(f)=Y^{\mu}\partial_{\mu}f$. These properties are what one expects from a derivative. As you can check, they are respected by the ordinary directional derivative on flat spacetime.

From the properties (i) and (ii) it follows that, in coordinates,

$$ D_{Y}X=D_{Y}(X^{\mu}\partial_{\mu})=[Y(X^{\mu})]\partial_{\mu}+X^{\mu}[D_{Y}(\partial_{\mu})]. $$

Therefore $D_{Y}X$ is completely specified once we define how $D_{Y}$ acts on the basis vectors $\partial_{\mu}$. In this respect, we may require the derivative operator $D$ to have a third property, namely

$$ $$

(iii) $D_{fY}X=fD_{Y}X\qquad$ ($C^{\infty}$-linearity with respect to the direction of the derivative),

$$ $$

where again $f$ is a function on a manifold. This property, which again is respected by the ordinary directional derivative on flat spacetime, makes $D$ into a covariant derivative, and implies that

$$ D_{Y}(\partial_{\mu})=D_{(Y^{\nu}\partial_{\nu})}(\partial_{\mu})=Y^{\nu}D_{\nu}(\partial_{\mu})\qquad(D_{\nu}\equiv D_{\partial_{\nu}}). $$

Now the covariant derivative $D$, which we denote with $\nabla$, is completely specified once we define what $\nabla_{\nu}(\partial_{\mu})$ is. Observe that $\nabla_{\nu}(\partial_{\mu})$ is a vector field. Therefore, in general, it can be expressed as

$$ \nabla_{\nu}(\partial_{\mu})=\Gamma^{\sigma}_{\nu\mu}\partial_{\sigma}, $$

where the $\Gamma^{\sigma}_{\nu\mu}$'s - the Christoffel symbols - are functions on the manifold. The $\Gamma$'s define what is known as a connection. A connection specifies how the derivatives of vector fields are to be taken on the manifold. As you noticed, it is completely arbitrary: manifolds do not come equipped with an intrinsic definition for the derivatives of vector fields$^{(*)}$, and you need to specify a connection in order to be able to do so. The connection is extra structure on the manifold.

$$ $$

$^{(*)}$ The Lie derivative is an exception, since it can be defined on any manifold without the need of extra structure. The downside of the Lie derivative is that it does not satisfy property (iii) given above.
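
Putting the coordinate expressions above together gives the familiar component form. This is only a rewriting of the formulas already given, with $Y=Y^{\nu}\partial_{\nu}$, $X=X^{\mu}\partial_{\mu}$ and $D=\nabla$:

$$ \nabla_{Y}X=Y^{\nu}(\partial_{\nu}X^{\sigma})\,\partial_{\sigma}+X^{\mu}Y^{\nu}\Gamma^{\sigma}_{\nu\mu}\,\partial_{\sigma}=Y^{\nu}\left(\partial_{\nu}X^{\sigma}+\Gamma^{\sigma}_{\nu\mu}X^{\mu}\right)\partial_{\sigma}. $$

The second term is the only place where the connection enters. In flat space with Cartesian coordinates one can take all $\Gamma^{\sigma}_{\nu\mu}=0$, and the expression reduces to the ordinary directional derivative of the components.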

$$ $$

The Levi-Civita connection

Suppose that your manifold comes equipped with a metric $g$,

$$ g=g_{\mu\nu}dx^{\mu}dx^{\nu}. $$

On the manifold, you may want to define a connection that is compatible with the geometry specified by $g$. For instance, you may want that the derivative of the inner product $g(X,Z)$ in the direction $Y$, where $X$ and $Z$ are vector fields on the manifold with vanishing covariant derivative, $\nabla_{Y}X=\nabla_{Y}Z=0$, vanishes as well: "if $X$ and $Z$ are constant, then also $g(X,Z)$ is constant". What you need to do is first of all extend the derivative to general $(n,k)$-tensors by the Leibniz rule:

$$ \nabla_{Y}(T_{1}\otimes T_{2})=(\nabla_{Y}T_{1})\otimes T_{2}+T_{1}\otimes(\nabla_{Y}T_{2}), $$

where $T_{1}$ and $T_{2}$ are arbitrary tensors; and do the same for contractions: for instance

$$ Y[\omega(X)]=(\nabla_{Y}\omega)(X)+\omega(\nabla_{Y}X) $$

where $\omega$ is a 1-form on the manifold. Once you have done that, you find

$$ Y[g(X,Z)]=(\nabla_{Y}g)(X,Z)+g(\nabla_{Y}X,Z)+g(X,\nabla_{Y}Z)=(\nabla_{Y}g)(X,Z), $$

since we had assumed $\nabla_{Y}X=\nabla_{Y}Z=0$. If you want $Y[g(X,Z)]=0$ for every direction $Y$ and every such pair $X,Z$, then you must require that

$$ $$

(iv) $\nabla_{Y}g=0\qquad\forall\ Y$.

$$ $$

This property is known as the metric-compatibility of the connection.
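
In coordinates, and with the convention $\nabla_{\nu}\partial_{\mu}=\Gamma^{\sigma}_{\nu\mu}\partial_{\sigma}$ used above, condition (iv) can be unpacked as follows (a sketch, obtained by setting $Y=\partial_{\sigma}$, $X=\partial_{\mu}$, $Z=\partial_{\nu}$ in the Leibniz formula for $Y[g(X,Z)]$):

$$ (\nabla_{\sigma}g)_{\mu\nu}=\partial_{\sigma}g_{\mu\nu}-\Gamma^{\lambda}_{\sigma\mu}g_{\lambda\nu}-\Gamma^{\lambda}_{\sigma\nu}g_{\mu\lambda}=0. $$

Up to the order of the lower indices, this is the relation $\partial_{k}g_{ij}=\Gamma^{l}_{ik}g_{lj}+\Gamma^{l}_{jk}g_{il}$ from the derivation in the question. So, as far as I can tell, applying the product rule to $\partial_{k}(\vec{e}_{i}\cdot\vec{e}_{j})$ with $\partial_{k}\vec{e}_{i}=\Gamma^{l}_{ik}\vec{e}_{l}$ is precisely the step where metric compatibility is assumed.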

A further requirement is to ask that the torsion $T$ of the connection vanishes:

$$ $$

(v) $T(X,Z)\equiv\nabla_{X}Z-\nabla_{Z}X-[X,Z]=0\qquad\forall\ X,Z$,

$$ $$

where $[X,Z]=(X(Z^{\mu})-Z(X^{\mu}))\partial_{\mu}$ is the Lie bracket between the fields $X$ and $Z$. By translating the above equation in coordinates, one finds that the torsion-freeness of the connection amounts to the symmetry of its Christoffel symbols:

$$ \Gamma_{\mu\nu}^{\sigma}=\Gamma_{\nu\mu}^{\sigma}. $$
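
The translation into coordinates is short. Taking $X=\partial_{\mu}$, $Z=\partial_{\nu}$ and using that coordinate basis fields commute, $[\partial_{\mu},\partial_{\nu}]=0$, one gets

$$ T(\partial_{\mu},\partial_{\nu})=\nabla_{\mu}\partial_{\nu}-\nabla_{\nu}\partial_{\mu}-[\partial_{\mu},\partial_{\nu}]=\left(\Gamma^{\sigma}_{\mu\nu}-\Gamma^{\sigma}_{\nu\mu}\right)\partial_{\sigma}. $$

Since $T$ is linear over functions in both arguments, it vanishes for all $X,Z$ exactly when it vanishes on the basis fields, i.e. when the antisymmetric part of $\Gamma^{\sigma}_{\mu\nu}$ in its lower indices is zero.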

The motivation for the requirement $T=0$ is somewhat more difficult to understand; indeed, there are non-standard formulations of GR which allow for non-vanishing torsion. In the next section I will leave you a reference on torsion in the context of parallel transport. In this section, let me motivate $T=0$ as follows:

(1) In the presence of torsion, identities such as the Bianchi identities are spoiled.

(2) It can be shown that there exists only one metric-compatible, torsion-free connection.

The above referenced connection is called the Levi-Civita connection, and its Christoffel symbols are given by

$$ \Gamma_{\mu\nu}^{\sigma}=\frac{1}{2}\,g^{\sigma\tau}(\partial_{\mu}g_{\nu\tau}+\partial_{\nu}g_{\mu\tau}-\partial_{\tau}g_{\mu\nu}). $$
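
For completeness, here is a sketch of where this formula comes from. In components, with $\nabla_{\nu}\partial_{\mu}=\Gamma^{\sigma}_{\nu\mu}\partial_{\sigma}$, metric compatibility (iv) reads $\partial_{\sigma}g_{\mu\nu}=\Gamma^{\lambda}_{\sigma\mu}g_{\lambda\nu}+\Gamma^{\lambda}_{\sigma\nu}g_{\mu\lambda}$. Writing it three times with the indices permuted,

$$ \partial_{\mu}g_{\nu\tau}=\Gamma^{\lambda}_{\mu\nu}g_{\lambda\tau}+\Gamma^{\lambda}_{\mu\tau}g_{\nu\lambda}, $$

$$ \partial_{\nu}g_{\mu\tau}=\Gamma^{\lambda}_{\nu\mu}g_{\lambda\tau}+\Gamma^{\lambda}_{\nu\tau}g_{\mu\lambda}, $$

$$ \partial_{\tau}g_{\mu\nu}=\Gamma^{\lambda}_{\tau\mu}g_{\lambda\nu}+\Gamma^{\lambda}_{\tau\nu}g_{\mu\lambda}, $$

and taking the first plus the second minus the third, the symmetry of $g$ together with the torsion-free condition $\Gamma^{\lambda}_{\mu\nu}=\Gamma^{\lambda}_{\nu\mu}$ cancels four of the six terms on the right:

$$ \partial_{\mu}g_{\nu\tau}+\partial_{\nu}g_{\mu\tau}-\partial_{\tau}g_{\mu\nu}=2\,\Gamma^{\lambda}_{\mu\nu}g_{\lambda\tau}. $$

Contracting with $\frac{1}{2}g^{\sigma\tau}$ and using $g^{\sigma\tau}g_{\lambda\tau}=\delta^{\sigma}_{\lambda}$ gives the expression above. Both (iv) and (v) are used, which is why the two requirements together single out one connection.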

$$ $$

The Levi-Civita connection and the geodesics

An interpretation behind the Levi-Civita connection can be given in terms of parallel transport.

Suppose that you want to define the covariant derivative at the point $x$ by the usual limiting formula

$$ \nabla_{Y}X|_{x}=\lim_{s\to 0}\frac{X(s)|_{x}-X|_{x}}{s}, $$

where $X(s)|_{x}$ is a smooth collection of vectors at the point $x$ such that $X(0)|_{x}=X|_{x}$. How do we obtain such an $X(s)|_{x}$?

First of all, we must look for the value of $X$ at some neighboring point $x(s)$ (this is the intrinsic information that we have about the vector field $X$: we always know what $X|_{x(s)}$ is). Observe that $x(s)$ is none other than a curve on the manifold. If the covariant derivative is to be in the direction $Y$, we had better have $\dot{x}(0)=Y$, i.e. the tangent vector of such a curve at the initial point $x$ must be equal to $Y$. Second, we need to bring $X|_{x(s)}$ back to the point $x$ in order to be able to take the difference $X(s)|_{x}-X|_{x}$. This is referred to as parallel transporting the vector along the curve $x(s)$ (back to $x$).

The rule for how to parallel transport the vector back to $x$, together with the definition itself of the curve $x(s)$, entirely defines the derivative: if we denote by $P^{-1}_{s}$ the operator that brings $X|_{x(s)}$ back to $x$ then we can define $X(s)|_{x}=P^{-1}_{s}(X|_{x(s)})$, so that

$$ \nabla_{Y}X|_{x}=\lim_{s\to 0}\frac{P^{-1}_{s}(X|_{x(s)})-X|_{x}}{s}. $$

At this stage the above equation can reproduce an arbitrary connection. So how do we recover the Levi-Civita connection? The answer is as follows. We specialize to a map $P_{s}$ with the following properties:

$$ $$

(I) we require the curves $x(s)$ to be geodesics (e.g. in the global sense explained by Dale in his answer),

(II) we require the parallel transporting of a vector to be trivial on the tangent vectors to the geodesic, i.e. $P^{-1}_{s}(\dot{x}(s))=\dot{x}(0)$ for every $s$ (compatibility between parallel transport and geodesy),

(III) we require that parallel transport give rise to no torsion.

$$ $$

Requirement (III) has to do with how parallel transport behaves with respect to neighboring geodesics (rather than to a single geodesic). For more details see here.
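
To see how the limiting formula connects to the $\Gamma$'s, here is a first-order sketch. It assumes (this is my addition, although it is the standard choice) that a vector $V(s)$ is parallel along $x(s)$ when its components obey

$$ \frac{dV^{\sigma}}{ds}+\Gamma^{\sigma}_{\nu\mu}\,\dot{x}^{\nu}V^{\mu}=0. $$

Solving this backwards from $x(s)$ to $x$ for small $s$, and expanding $X^{\sigma}$ along the curve with $\dot{x}(0)=Y$, gives

$$ \left[P^{-1}_{s}(X|_{x(s)})\right]^{\sigma}=X^{\sigma}|_{x}+s\,Y^{\nu}\left(\partial_{\nu}X^{\sigma}+\Gamma^{\sigma}_{\nu\mu}X^{\mu}\right)\Big|_{x}+O(s^{2}), $$

so the limit reproduces $\nabla_{Y}X=Y^{\nu}(\partial_{\nu}X^{\sigma}+\Gamma^{\sigma}_{\nu\mu}X^{\mu})\partial_{\sigma}$, the coordinate expression that follows from properties (i)-(iii) together with $\nabla_{\nu}\partial_{\mu}=\Gamma^{\sigma}_{\nu\mu}\partial_{\sigma}$. Requirement (II) says that $V=\dot{x}$ itself solves the transport equation, which is the local form of the geodesic equation:

$$ \ddot{x}^{\sigma}+\Gamma^{\sigma}_{\nu\mu}\,\dot{x}^{\nu}\dot{x}^{\mu}=0. $$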

$$ $$

Conclusions

In order to define the covariant derivative of a vector field on a manifold you need to introduce extra structure in the form of a connection. The connection, in principle, is arbitrary. It contains information on how vectors are parallel-transported along curves.

In the presence of a metric, some connections are more well-behaved than others with respect to the geometry of the manifold. These are the metric-compatible connections, such as the Levi-Civita connection. The Levi-Civita connection can be interpreted as the torsion-free connection which parallel-transports vectors along ("global") geodesics in such a way that the tangent vectors along a geodesic are parallel to themselves.

$\endgroup$
  • $\begingroup$ Thanks for the detailed answer! This is what I was looking for, it's a huge help :) $\endgroup$ Commented May 27, 2020 at 8:40
  • $\begingroup$ You're welcome! If it answers your question consider to accept it! All the best. $\endgroup$ Commented May 27, 2020 at 8:58
