42
$\begingroup$

I have seen similar questions, but none of the answers relate to my difficulty, which I will now proceed to convey.

Let $(M,g)$ be a Riemannian manifold. The Levi-Civita connection is the unique connection that satisfies two conditions: agreeing with the metric, and being torsion-free.

Agreeing with the metric is easy to understand. It is equivalent to requiring that the parallel transport associated with the connection gives isometries between the tangent spaces at different points along a path. Makes sense.

Let's imagine for a second what happens if we stop with this condition, and take the case of $M=\mathbb{R}^2$, with $g$ being the usual metric. Then it's easy to think of non-trivial ways to define parallel transport other than the one induced by the Levi-Civita connection.

For example, imagine the following way to do parallel transport: if $\gamma$ is a path in $\mathbb{R}^2$, then the associated map from $TM_{\gamma(s)}$ to $TM_{\gamma(t)}$ will be a rotation with angle $p_2(\gamma(s))-p_2(\gamma(t))$, where $p_i$ is the projection of $\mathbb{R}^2$ onto the $i^\text{th}$ coordinate.

So I guess torsion-free-ness is supposed to rule this kind of example out.

Now I'm somewhat confused. One of the answers to a similar question claims that any two connections that agree with the metric have the same geodesics, and that in that case choosing a torsion-free one is just a way of choosing a canonical one. That seems incorrect, as $\gamma(t)=(0,t)$ is a geodesic of $\mathbb{R}^2$ with the Levi-Civita connection but not the one I just described...
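
To make the claim about $\gamma(t)=(0,t)$ concrete, here is a sketch of the computation. It assumes that "rotation with angle $\theta$" means the counterclockwise rotation $R(\theta)$ and writes $(x,y)$ for the coordinates, so $p_2=y$; the frame names $E_1,E_2$ are notation introduced only for this check. Transport is then path-independent, and the orthonormal fields obtained by applying $R(-y)$ to $\partial_x,\partial_y$ are parallel: $$ E_1=\cos y\,\partial_x-\sin y\,\partial_y,\qquad E_2=\sin y\,\partial_x+\cos y\,\partial_y,\qquad \nabla E_1=\nabla E_2=0. $$ A parallel orthonormal frame means the connection agrees with the metric. Its torsion, however, is $$ T(E_1,E_2)=\nabla_{E_1}E_2-\nabla_{E_2}E_1-[E_1,E_2]=-[E_1,E_2]=-\partial_y\neq 0. $$ Along $\gamma(t)=(0,t)$ the velocity is $\dot\gamma=\partial_y=-\sin t\,E_1+\cos t\,E_2$, so $$ \nabla_{\dot\gamma}\dot\gamma=-\cos t\,E_1-\sin t\,E_2=-\partial_x\neq 0, $$ and $\gamma$ is not a geodesic of this connection. With the opposite rotation convention the signs flip, but both conclusions are unchanged.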

Let's think from a different direction. In the case of $\mathbb{R}^2$, if $\nabla$ is the usual (and therefore Levi-Civita) connection then $\nabla_XY$ is just $XY$, and $\nabla_YX$ is just $YX$. So of course we have torsion-free-ness.

So I guess one way to think of torsion-free-ness is saying that you want the parallel transport induced by the connection to be the one associated with $\mathbb{R}^n$ via the local trivializations.

Except that this seems over-simplistic: torsion-free-ness is weaker than the condition that $\nabla_XY=XY$ and $\nabla_YX=YX$. So why this crazy weaker condition that $\nabla_XY-\nabla_YX=[X, Y]$? What does that even mean geometrically? Why is this sensible? How would one say that in words that are similar to "it means that the connection is the connection induced from the trivializations" except more correct than that?

$\endgroup$
12
  • 2
    $\begingroup$ Because it works $\endgroup$ Commented Nov 15, 2020 at 3:41
  • 13
    $\begingroup$ It works to what end? If we are proving properties of the parallel transport corresponding to the Levi-Civita connection, what is the merit of that as an object of interest? Additionally, are you claiming that it is inherently non-intuitive, and that any effort to understand it intuitively is misguided? $\endgroup$
    – Andrew NC
    Commented Nov 15, 2020 at 4:18
  • 8
    $\begingroup$ The condition that $\nabla_X Y=XY$ is not invariant under diffeomorphism. Vanishing of torsion is diffeomorphism invariant. There are no other tensor invariants of a connection which are linear and zero order and diffeomorphism invariant. $\endgroup$
    – Ben McKay
    Commented Nov 15, 2020 at 8:25
  • 2
    $\begingroup$ "...the usual (and therefore Levi-Civita) connection then $\nabla_XY$ is just $XY$" - I'm confused by your notation: what do you mean by $XY$? (the covariant derivative along a given vector (field) of a tensor field is still a tensor field (of the same type), but $XY$ seems to usually denote a second-order differential operator...) $\endgroup$
    – Qfwfq
    Commented Nov 15, 2020 at 20:09
  • 2
    $\begingroup$ @AndrewNC: I just added a remark at the end of my answer that explains this. What you may have read claiming that all the $g$-compatible connections have the same geodesics as the Levi-Civita connection turns out to be wrong. In fact, in dimension $2$, the Levi-Civita connection is the only $g$-compatible connection that has the same geodesics as the Levi-Civita connection. In higher dimensions, though, there are more $g$-compatible connections that have the same geodesics as the Levi-Civita connection of $g$. See my answer for details. $\endgroup$ Commented Nov 16, 2020 at 2:05

5 Answers

49
$\begingroup$

I think that the literal answer is that the Levi-Civita connection of $g$ is trying to describe the metric $g$ and nothing else. It is the only connection-assignment that is uniquely defined by the metric and its first derivatives and nothing else, in the sense that, if you have a diffeomorphism-equivariant assignment $g\to C(g)$ where $C(g)$ is a connection that depends only on $g$ and its first derivatives, then $C(g)$ is the Levi-Civita connection.

Note that the restriction to first derivatives is necessary. For example, there is a unique connection on $TM$ that is compatible with $g$ and satisfies $$ \nabla_XY -\nabla_YX - [X,Y] = \mathrm{d}S(X)\,Y - \mathrm{d}S(Y)\,X, $$ where $S= S(g)$ is the scalar curvature of $g$. However, this canonical connection depends on three derivatives of $g$.

Meanwhile, connections with torsion can arise naturally from other structures: For example, on a Lie group, there is a unique connection for which the left-invariant vector fields are parallel and a unique connection for which the right-invariant vector fields are parallel. When the identity component of the group is nonabelian, these are distinct connections with nonvanishing torsion, while their average is a canonical connection that is torsion-free. (This latter connection need not be metric compatible, of course.) A more well-known example is the unique connection associated to an Hermitian metric on a complex manifold that is compatible with both the metric and the complex structure and whose torsion is of type (0,2).
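
As a sketch of the Lie group example, write $\nabla^L$ and $\nabla^R$ (labels introduced here for illustration) for the connections that make the left-invariant, respectively right-invariant, vector fields parallel. For left-invariant $X$ and $Y$, and using the fact that left-invariant fields commute with right-invariant ones, one finds $$ \nabla^L_XY=0,\qquad \nabla^R_XY=[X,Y],\qquad T^{L}(X,Y)=-[X,Y],\qquad T^{R}(X,Y)=[X,Y]. $$ The average $\nabla^0=\tfrac12(\nabla^L+\nabla^R)$ then satisfies $$ \nabla^0_XY=\tfrac12[X,Y],\qquad T^{0}(X,Y)=\tfrac12[X,Y]-\tfrac12[Y,X]-[X,Y]=0. $$ Since torsion is tensorial, checking these identities on a left-invariant frame is enough, and the two torsions vanish exactly when the bracket does.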

It's not unreasonable to ask whether imposing the torsion-free condition, just because you can, right out of the gate is too restrictive. Einstein tried for years to devise a 'unified field theory' that would geometrize all of the known forces of nature by considering connections compatible with the metric (i.e., the gravitational field) that had torsion. There is a book containing the correspondence between Einstein and Élie Cartan (Letters on absolute parallelism) in which Einstein would propose a set of field equations that would constrain the torsion so that they describe the other known forces (just as the Einstein equations constrain the gravitational field) and Cartan would analyze them to determine whether they had the necessary 'flexibility' to describe the known phenomena without being so 'flexible' that they couldn't make predictions. It's very interesting reading.

This tradition of seeking a physical interpretation of torsion has continued, off and on, since then, with several attempts to generalize Einstein's theory of gravity (aka, 'general relativity'). Some of these are described in Misner, Thorne, and Wheeler, and references are given to others. In fact, quite recently, Thibault Damour (IHÉS), famous for his work on black holes, and a collaborator have been working on a gravitational theory-with-torsion, which they call 'torsion bigravity'. (See arXiv:1906.11859 [gr-qc] and arXiv:2007.08606 [gr-qc].) [To be frank, though, I'm not aware that any of these alternative theories have made any predictions that disagree with GR that have been verified by experiment. I think we all would have heard about that.]

I guess the point is that 'why impose torsion-free?' is actually a very reasonable question to ask, and, indeed, it has been asked many times. One answer is that, if you are only trying to understand the geometry of a metric, you might as well go with the most natural connection, and the Levi-Civita connection is the best one of those in many senses. Another answer is that, if you have some geometric or physical phenomenon that can be captured by a metric and another tensor that can be interpreted as (part of) the torsion of the connection, then, sure, go ahead and incorporate that information into the connection and see where it leads you.

Remark on connections with the same geodesics: I realize that I didn't respond to the OP's confusion about connections with the same geodesics vs. compatible with a metric $g$ but with torsion. (I did respond in a comment that turned out to be wrong, so I deleted it. Hopefully, this will be better.)

First, about torsion (of a connection on TM). The torsion $T^\nabla$ of a (linear) connection on $TM$ is a section of the bundle $TM\otimes\Lambda^2(T^*M)$. Here is an (augmented) Fundamental Lemma of (pseudo-)Riemannian geometry:

Lemma 1: If $g$ is a (nondegenerate) pseudo-Riemannian metric on $M$ and $\tau$ is a section of $TM\otimes\Lambda^2(T^*M)$, then there is a unique linear connection $\nabla$ on $TM$ such that $\nabla g = 0$ and $T^\nabla = \tau$.

(The usual FLRG is the special case $\tau=0$.) Note that this $\nabla$ depends algebraically on $\tau$ and the $1$-jet of $g$. The proof of Lemma 1 is the usual linear algebra.
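
To spell out "the usual linear algebra" as a sketch: expand $X\,g(Y,Z)+Y\,g(X,Z)-Z\,g(X,Y)$ using $\nabla g=0$, and replace each difference $\nabla_UV-\nabla_VU$ by $[U,V]+\tau(U,V)$. This gives a Koszul-type formula $$ \begin{aligned} 2\,g(\nabla_XY,Z) ={}& X\,g(Y,Z)+Y\,g(X,Z)-Z\,g(X,Y)\\ &+g([X,Y],Z)-g([X,Z],Y)-g([Y,Z],X)\\ &+g(\tau(X,Y),Z)-g(\tau(X,Z),Y)-g(\tau(Y,Z),X). \end{aligned} $$ The right-hand side involves only $g$, its first derivatives, and $\tau$, so nondegeneracy of $g$ gives uniqueness; conversely, one checks that the formula defines a connection with the two required properties.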

Second, if $\nabla$ and $\nabla^*$ are two linear connections on $TM$, their difference is well-defined and is a section of $TM\otimes T^*M\otimes T^*M$. Specifically $\nabla^* - \nabla:TM\times TM\to TM$ has the property that, on vector fields $X$ and $Y$, we have $$ \left({\nabla^*} - \nabla\right)(X,Y) = {\nabla^*}_XY-\nabla_XY. $$

Lemma 2: Two linear connections, $\nabla$ and $\tilde\nabla$, have the same geodesics (i.e., each curve $\gamma$ is a geodesic for one if and only if it is a geodesic for the other) if and only if $\tilde\nabla - \nabla$ is a section of the subbundle $TM\otimes\Lambda^2(T^*M)\subset TM\otimes T^*M\otimes T^*M$.

Proof: In local coordinates $x = (x^i)$, let $\Gamma^i_{jk}$ (respectively, $\tilde\Gamma^i_{jk}$) be the coefficients of $\nabla$ (respectively, $\tilde\nabla$). Then $$ \tilde\nabla-\nabla = (\tilde\Gamma^i_{jk}-\Gamma^i_{jk})\ \partial_i\otimes \mathrm{d}x^j\otimes\mathrm{d}x^k. $$ Meanwhile, a curve $\gamma$ in the $x$-coordinates is a $\nabla$-geodesic (respectively, a $\tilde\nabla$-geodesic) iff $$ \ddot x^i + \Gamma^i_{jk}(x)\,\dot x^j\dot x^k = 0\qquad (\text{respectively},\ \ddot x^i + \tilde\Gamma^i_{jk}(x)\,\dot x^j\dot x^k = 0). $$ These are the same equations iff $(\tilde\Gamma^i_{jk}(x)-\Gamma^i_{jk}(x))\,y^jy^k\equiv0$ for all $y^i$, i.e., iff $$ {\tilde\nabla}-\nabla = \tfrac12({\tilde\Gamma}^i_{jk}-\Gamma^i_{jk})\ \partial_i\otimes \mathrm{d}x^j\wedge\mathrm{d}x^k.\quad \square $$
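
The last step of this proof is a polarization argument, sketched here with the shorthand $A^i_{jk}=\tilde\Gamma^i_{jk}-\Gamma^i_{jk}$ (introduced only for this remark). If $A^i_{jk}\,y^jy^k=0$ for all $y$, then replacing $y$ by $y+z$ and discarding the two terms that vanish by hypothesis gives $$ 0=A^i_{jk}\,(y^j+z^j)(y^k+z^k)=\bigl(A^i_{jk}+A^i_{kj}\bigr)\,y^jz^k\qquad\text{for all } y,z, $$ so $A^i_{jk}=-A^i_{kj}$. The converse is immediate, since a skew-symmetric $A^i_{jk}$ contracts to zero against the symmetric $y^jy^k$.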

Finally, we examine when two $g$-compatible connections have the same geodesics:

Lemma 3: If $g$ is a nondegenerate (pseudo-)Riemannian metric, and $\nabla$ and $\nabla^*$ are linear connections on $TM$ that satisfy $\nabla g = \nabla^*g = 0$, then they have the same geodesics if and only if the expression $$ \phi(X,Y,Z) = g\bigl( X,(\nabla^*{-}\nabla)(Y,Z)\bigr) $$ is skew-symmetric in $X$, $Y$, and $Z$.

Proof: $\nabla g = \nabla^* g = 0$ implies $\phi(X,Y,Z)+\phi(Z,Y,X)=0$, while they have the same geodesics if and only if $\phi(X,Y,Z)+\phi(X,Z,Y)=0$.
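
In a little more detail (a sketch, using only the definitions above): subtracting $Y\,g(Z,X)=g(\nabla_YZ,X)+g(Z,\nabla_YX)$ from the same identity for $\nabla^*$ gives the first relation, and Lemma 2 gives the second. The two relations together force skew-symmetry in the remaining pair of slots as well: $$ \phi(Y,X,Z)=-\phi(Y,Z,X)=\phi(X,Z,Y)=-\phi(X,Y,Z), $$ where the first and third equalities use skew-symmetry in the last two slots and the middle one uses skew-symmetry in the first and third slots. Hence $\phi$ is totally skew-symmetric.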

Corollary: If $g$ is a nondegenerate (pseudo-)Riemannian metric, then the space of linear connections $\nabla$ on $TM$ that satisfy $\nabla g = 0$ and have the same geodesics as $\nabla^g$, the Levi-Civita connection of $g$, is a vector space naturally isomorphic to $\Omega^3(M)$, the space of $3$-forms on $M$.
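
As a sketch of how the isomorphism can be written down (the symbols $\eta$ and $A$ are notation introduced here): given $\eta\in\Omega^3(M)$, define $A$ by $g\bigl(X,A(Y,Z)\bigr)=\eta(X,Y,Z)$ and set $\nabla=\nabla^g+A$. Then $$ T^\nabla(Y,Z)=A(Y,Z)-A(Z,Y)=2A(Y,Z),\qquad g\bigl(X,T^\nabla(Y,Z)\bigr)=2\,\eta(X,Y,Z). $$ In dimension $2$ there are no nonzero $3$-forms, so $\nabla^g$ is the only such connection. On an oriented Riemannian $3$-manifold, taking $\eta$ to be $c$ times the volume form gives $A(Y,Z)=c\,Y\times Z$.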

$\endgroup$
6
  • 1
    $\begingroup$ Robert, I think what you say in your first paragraph is how the Levi-Civita connection should be introduced, rather than having the symmetry condition be pulled out of thin air. To me it better motivates the definition. $\endgroup$
    – Deane Yang
    Commented Nov 15, 2020 at 16:34
  • 8
    $\begingroup$ Regarding the first paragraph: It seems to me that, on an oriented Riemannian $3$-fold, there is another natural construction that only uses the metric and its first derivative (and the orientation). Using the orientation and the metric, we can define a cross product on each tangent space. For any scalar $c$, we can define $\nabla_X(Y) = \nabla^{LC}_X(Y) + c X \times Y$, where $\nabla^{LC}$ is the Levi-Civita connection. (This nitpick aside, great answer!) $\endgroup$ Commented Nov 15, 2020 at 18:14
  • 7
    $\begingroup$ @DavidESpeyer: Using an orientation is using more than the metric. I think my answer stands correct as it is. Also, even if you add orientation, you won't get any new examples other than in dimension $3$. $\endgroup$ Commented Nov 15, 2020 at 18:43
  • 7
    $\begingroup$ @DavidESpeyer: I realize, on re-reading it, that my first response to your comment was not the best. I should have acknowledged that your construction in dimension 3 (with an orientation added to the data) is an interesting one, and one that I hadn't considered. Thanks for pointing it out, and thanks for your kind words! $\endgroup$ Commented Nov 15, 2020 at 20:53
  • 5
    $\begingroup$ This was a beautiful answer. When is Bryant's intro to Riemannian (and other) geometry coming out? $\endgroup$
    – Tom Mrowka
    Commented Nov 17, 2020 at 0:38
28
$\begingroup$

I will try to help with the title question. I think that the real motivation for the Levi-Civita connection comes from looking at surfaces in Euclidean 3-space. Differentiate one tangent vector field $Y$ along another $X$ by first extending them to be defined in the ambient space, and then taking the tangential projection of $XY$, i.e. tangential projection of the Euclidean connection. Levi-Civita discovered that this process is intrinsic, i.e. invariant under isometry of surfaces without carrying along the ambient space, and described precisely by torsion freedom. This was clearly a long and difficult process. Dirac uses this view in his book General Theory of Relativity, and this is how I introduce the Levi-Civita connection in my lectures.

I have to agree that there is something missing in the textbook discussions of torsion. I have not found an intuitive understanding of torsion. Maybe the physicists can help.

$\endgroup$
7
  • 8
    $\begingroup$ @Gro-Tsen: yes, we can. Roughly speaking, taking one derivative and then projecting doesn't ever use second derivatives, so we never feel curvature, so everything looks the same in Euclidean space and in any Riemannian manifold. $\endgroup$
    – Ben McKay
    Commented Nov 15, 2020 at 10:54
  • 2
    $\begingroup$ @Gro-Tsen: oddly enough, that is exactly what Dirac does, embedding space time into a pseudo-Euclidean space (see page 10). He never worries about the physical interpretation of that pseudo-Euclidean space, since it is just a crutch to get the definitions going. $\endgroup$
    – Ben McKay
    Commented Nov 15, 2020 at 11:11
  • 4
    $\begingroup$ @AndrewNC, you don't actually need to extend $X$ and $Y$ to the ambient Euclidean space. Given $p \in M$, $XY(p)$ is the directional derivative of $Y$ in the direction $X$, which can be calculated by choosing any curve $c$ such that $c(0) = p$ and $c'(0) = X(p)$. In particular, $c$ can be chosen to lie in $M$. Then $$XY(p) = \left.\frac{d}{dt}Y(c(t))\right|_{t=0}.$$ Then $\nabla_XY$ is the orthogonal projection of $XY$ onto $T_pM$. $\endgroup$
    – Deane Yang
    Commented Nov 15, 2020 at 22:03
  • 2
    $\begingroup$ In the presence of torsion, can the $\nabla$ still be seen as euclidean covariant differentiation from the ambient followed by not necessarily orthogonal projection to the tangent space? $\endgroup$
    – Qfwfq
    Commented Nov 16, 2020 at 13:50
  • 2
    $\begingroup$ @Qfwfq I doubt it. If $P:TM|_\Sigma\rightarrow T\Sigma$ is an idempotent vertical morphism (i.e. a tangential projection), $D$ is the projected connection and $\nabla$ is the ambient connection, we get $T^D(X,Y)=D_XY-D_YX-[X,Y]=P(\nabla_XY)-P(\nabla_YX)-[X,Y]=P(\nabla_XY)-P(\nabla_YX)-P[X,Y]=PT^\nabla(X,Y)=0$, where we have used the fact that for tangential vector fields $X,Y$ the bracket $[X,Y]$ is also tangential, therefore $[X,Y]=P[X,Y]$. $\endgroup$ Commented Dec 13, 2020 at 19:25
18
$\begingroup$

First, you should not dismiss the uniqueness of the connection too lightly. If you want to study a Riemannian metric per se, then you want to find invariants of it, things that are uniquely determined by the metric. Without the torsion-free assumption, there are many possible connections, and any properties derived from them will not be invariants of the metric. With the torsion-free assumption, the Levi-Civita connection is unique, so everything it implies is a property of the metric alone.

The next question is: why not some other condition that might imply uniqueness of the connection? The torsion-free condition arises in enough settings to make it the natural choice. The most important is that, on a submanifold of Euclidean space, the flat connection on Euclidean space naturally induces a connection on the submanifold, and that connection is indeed torsion-free. Another is that the Hessian of a function is always symmetric if and only if the connection is torsion-free.
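
The Hessian statement can be checked in one line, assuming the usual definition $\operatorname{Hess}f(X,Y)=X(Yf)-(\nabla_XY)f$ and writing $T$ for the torsion: $$ \operatorname{Hess}f(X,Y)-\operatorname{Hess}f(Y,X)=[X,Y]f-(\nabla_XY-\nabla_YX)f=-\,\mathrm{d}f\bigl(T(X,Y)\bigr). $$ This vanishes for every function $f$ and all $X$, $Y$ exactly when $T=0$.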

Note also that when we study any mathematical object, we choose which properties we want to hold and that choice often depends on the depth and impact of the theory developed. Why do we assume that a Riemannian metric is symmetric? Why do we use an inner product metric and not a norm on the tangent space? When Anton says "it works", he is not talking specifically about parallel translation. He is referring to the entire rich subject of Riemannian geometry. People have studied connections that are not torsion-free, but so far the theory developed in that direction has not paid off nearly as much as Riemannian geometry has.

$\endgroup$
6
  • $\begingroup$ I think your answer might work better with (alternatively, I personally would appreciate) an example of the "payoff" that works with a torsion-free connection but not a torsion connection; is there some "gateway" theorem that only works with torsion-free connections, behind which you find the richness of torsion-free geometry? $\endgroup$
    – user44191
    Commented Nov 15, 2020 at 4:49
  • $\begingroup$ @user44191, I can only repeat what I said. If you want to study properties of a Riemannian metric together with a connection, then you can omit the torsion-free assumption and investigate the properties of the pair. If you want to study the properties of only the Riemannian metric itself, you could still study the Riemannian metric with a connection (possibly with torsion) and then identify which of the results do not depend on the connection chosen. It, however, is far easier to use a connection that is uniquely determined by the metric. $\endgroup$
    – Deane Yang
    Commented Nov 15, 2020 at 5:00
  • $\begingroup$ @user44191, here's an example: Note that since the Riemann curvature tensor is uniquely determined by the metric alone, you can define it without ever using the Levi-Civita connection. This is demonstrated in a little note I wrote, math.nyu.edu/~yangd/papers/riemann.pdf. However, this is all encoded more elegantly using the definition of the Levi-Civita connection and the definition of the curvature tensor using it. $\endgroup$
    – Deane Yang
    Commented Nov 15, 2020 at 5:23
  • $\begingroup$ "on a submanifold of Euclidean space, the flat connection on Euclidean space naturally induces a connection on the submanifold, and that connection is indeed torsion-free." You say Euclidean, but does this actually hold for a submanifold of any flat space, whether Riemannian or semi-Riemannian? If not, then this seems less attractive to me as a motivation. $\endgroup$
    – user21349
    Commented Nov 17, 2020 at 2:41
  • $\begingroup$ @BenCrowell, yes. Any flat Riemannian manifold is locally isometric to Euclidean space. The analogous fact is true for flat semi-Riemannian spaces, too. $\endgroup$
    – Deane Yang
    Commented Nov 17, 2020 at 3:16
18
$\begingroup$

Without loss of generality (Nash embedding theorem) we may assume the Riemannian manifold is an embedded submanifold of Euclidean space: its metric at any point is just the restriction of the Euclidean inner product to the tangent plane. Imagine we live on this submanifold (just like we live on a sphere called Earth) and we want to calculate things, such as our acceleration as we run around our planet.

Remember, the metric gives us a means of measuring distances and angles, but no direct way of computing rates-of-change of vector fields. A connection is what determines the rates-of-change of vector fields (such as acceleration, which is the rate-of-change of velocity vectors). And connections are just "infinitesimal limits" of parallel transport. So the question becomes, given a submanifold of Euclidean space, is there a canonical way of defining parallel transport which is useful in some way?

Often things are "useful" if they correspond to what happens in the real world. So how should parallel transport be defined on our planet? How is it defined on Earth?

The very first thing might be to agree on what path we would take if we are told to walk in a straight line. If we did this on Earth, we would walk along a great circle even though we think we are walking in a straight line. Why? Because after each level step we take, gravity pulls our foot back down to Earth. We think we are going straight, but gravity causes our path to curve in the ambient Euclidean space. (For what it is worth, we tend to interpret this "curve" that gravity induces in our path, as the least change required to keep us on the surface of our planet, so to speak.)

Requirement 1: When we are told to walk in a straight line, the curve we actually trace out (due to gravity, or mathematically, due to Euclidean projection back to the submanifold) should be a geodesic, i.e., have zero acceleration.

Now, imagine as we walk, we are holding a lance. Maybe the lance is pointing straight ahead, but maybe it is pointing to our left. Regardless, we are told not to move the lance as we walk in a straight line. Now, from the perspective of the ambient Euclidean space, where the lance points is going to change as we walk. But from our perspective, we are very comfortable being told to walk without moving the lance. We want the evolution of the lance's position to correspond to parallel transport. Indeed, parallel transport defines how a vector is moved along a curve, and it is quite natural/useful to define parallel transport to be what results if we are told to walk with the lance/vector in our hand without moving it at all. The curvature of the Earth causes it to move, but we believe we are not moving it.

Requirement 2: Parallel transport corresponds to carrying a "vector" with us as we walk along a path without consciously moving the vector. (This actually includes Requirement 1 as a special case when the vector is our own velocity vector.)

These requirements uniquely define the Levi-Civita connection and explain why it is natural/useful. It corresponds to the world we live in.
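
One standard way to write the two requirements as formulas, as a sketch with notation introduced here ($M\subset\mathbb{R}^N$, and $P_x$ the orthogonal projection of $\mathbb{R}^N$ onto $T_xM$): a tangent vector field $V(t)$ along a curve $\gamma(t)$ in $M$ is parallel when its ambient derivative has no tangential part, and $\gamma$ is a geodesic when its own velocity is parallel, $$ P_{\gamma(t)}\bigl(\dot V(t)\bigr)=0,\qquad P_{\gamma(t)}\bigl(\ddot\gamma(t)\bigr)=0. $$ Because $V$ is tangent, $\langle \dot V,V\rangle=\langle P(\dot V),V\rangle$, and so $$ \frac{d}{dt}\,|V|^2=2\,\langle \dot V,V\rangle=2\,\langle P_{\gamma}(\dot V),V\rangle=0, $$ which is the statement that the lance keeps its length. The same computation with two parallel fields shows that angles are preserved.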

Now, a few words can be said about the usual axioms used to define the Levi-Civita connection: metric connection with zero torsion. The metric connection means when we parallel transport vectors, their norms and the angles between them do not change. Certainly, if we are carrying two lances and told not to move them, we expect the angle between them to stay the same, and we expect the length of each lance to stay the same too. This on its own is not enough for geodesics to be the "correct" curves, i.e., those curves that result when we are told to walk in a straight line. Torsion actually decomposes into two parts (see Millman's 1971 paper "Geodesics in Metrical Connections"). One part controls what geodesics look like, and the other part determines whether parallel transport will cause a vector to spin orthogonal to the direction of motion along a geodesic. If we start holding a lance straight up (it wouldn't be in the tangent plane but ignore this technicality or think in higher dimensions), but as we walk straight ahead, we rotate the lance so it goes from pointing up to pointing right, then down, then left, then up etc, then our parallel transport has torsion. Hence, taken together, a metric connection with zero torsion gives us the definition of parallel transport corresponding to "do not move the vector as you walk along the curve". This is the Levi-Civita connection.

ps. In Appendix 1.D of the second edition of "Mathematical Methods of Classical Mechanics" by Arnold, a geometric way of constructing parallel transport to have no torsion is explained. Given a tangent vector at a point on a geodesic, the aim is to transport it without altering it any more than necessary, as explained above. Without a Euclidean embedding, this can be done intrinsically by considering families of geodesic curves (see Appendix 1.D of Arnold's book). The infinitesimal requirement reduces to the no-torsion equation $\nabla_X Y - \nabla_Y X = [X,Y]$. Thus, the geometric meaning of $\nabla_X Y - \nabla_Y X = [X,Y]$ is that parallel transport will not induce any extraneous movement of the tangent vector. (The geometric picture in Appendix 1.D of Arnold takes a few paragraphs to explain even though the concept itself is straightforward enough.)

$\endgroup$
12
$\begingroup$

The other answers give good insight. Here's another perspective.

Since the Levi-Civita connection is the unique metric and torsion-free connection, to motivate its use we need to convince ourselves that both of these properties are desirable. I'll note that there is sometimes value in considering non-metric connections, but in the question you addressed why using metric connections makes sense for studying geometry. So I guess the real issue is to tackle torsion-free-ness.

In order to address this, the first thing to do is try to understand what torsion really is anyway. There is another question on MathOverflow about torsion with some great answers, but let me try to draw some pictures. We'll start with the standard picture of the curvature tensor (for a torsion-free connection). (Edit: I got several comments about how to interpret these pictures. I'll discuss this at the end of the answer.)

[Figure: curvature diagram]

The idea is that we have three vectors $X$, $Y$ and $Z$. Starting at a point $p$ in our space, we use our connection to parallel transport $Z$ an infinitesimal amount along a geodesic in the $X$ direction and then along a curve in the $Y$ direction. We then parallel transport $Z$ an infinitesimal amount in the $Y$ direction and then in the $X$ direction. The curvature measures the difference between these two parallel transports. In the formula, the Lie bracket term is there to make sure that everything is nice and tensorial.
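
For reference, the formula in question is the standard one, with the sign convention assumed here: $$ R(X,Y)Z=\nabla_X\nabla_YZ-\nabla_Y\nabla_XZ-\nabla_{[X,Y]}Z. $$ A quick sketch of why the bracket term is needed: for a function $f$, $$ \nabla_X\nabla_Y(fZ)-\nabla_Y\nabla_X(fZ)=[X,Y](f)\,Z+f\,(\nabla_X\nabla_Y-\nabla_Y\nabla_X)Z, $$ and the stray term $[X,Y](f)\,Z$ is cancelled exactly by $\nabla_{[X,Y]}(fZ)=[X,Y](f)\,Z+f\,\nabla_{[X,Y]}Z$, so that $R(X,Y)(fZ)=f\,R(X,Y)Z$.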

What changes if the torsion is non-zero?

Torsion diagram

In this case, if we parallel transport along a geodesic in the $X$ direction and then along a geodesic in the $Y$ direction (see below for how to make this precise), we get a different point from when we parallel transport in the $Y$ direction first then in the $X$ direction. When we take the logarithm of the differences of these points, what's left is $\epsilon^2 T(X,Y)$ (modulo an error of $\approx \epsilon^3 R(X,Y)(X+Y)$, as Robert Bryant pointed out). Dividing by $\epsilon^2$ and letting $\epsilon$ go to zero, we find the picture above. Again, in the formula there is a Lie bracket term to make everything nice and tensorial.
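
Here the formula is $T(X,Y)=\nabla_XY-\nabla_YX-[X,Y]$, and the role of the bracket can be checked directly. For a function $f$, using $\nabla_Y(fX)=Y(f)\,X+f\,\nabla_YX$ and $[fX,Y]=f\,[X,Y]-Y(f)\,X$, $$ T(fX,Y)=f\,\nabla_XY-\bigl(Y(f)\,X+f\,\nabla_YX\bigr)-\bigl(f\,[X,Y]-Y(f)\,X\bigr)=f\,T(X,Y). $$ Without the bracket term, the derivative $Y(f)\,X$ would survive and the expression would not define a tensor.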

So why do we want a torsion free connection?

In my opinion, torsion is a complicated invariant and is somewhat hard to understand. For curvature, there is a very clear picture of what it means for a space to have positive versus negative curvature (infinitesimal planes coming together versus spreading apart). As such, it's possible to formulate all sorts of theorems in terms of curvature assumptions. On the other hand, torsion is this awkward vector that you get when you compute multiple derivatives. It's not really meaningful for it to be "positive" or "negative," and so it doesn't affect the analysis in predictable ways. As such, life is often a lot easier when it's not around, which is what makes the Levi-Civita connection so useful.

I should add that there are times where considering connections with torsion makes sense. For instance, on a Lie group it is possible to construct a curvature-free connection whose torsion encodes the Lie algebra. This is a very useful connection, but from an analytic perspective, it's not so clear geometrically how the respective torsions of $SO(3)$ versus the Heisenberg group (for instance) give rise to their very different geometries. Another example is in non-Kahler complex geometry, where we can study holomorphic, complex, metric connections, which must have non-zero torsion. But again, even though the torsion is present and necessary, it's often hard to really use it in a meaningful way.

How to interpret the pictures

There was a long discussion about how to interpret the pictures, so I should say a few words about what they mean. Thanks to Robert Bryant and Matt F for their helpful suggestions.

When I first learned about the concept, I found it helpful to use the diagrams as a schematic without worrying about which particular fiber everything is defined in. You can still use the diagram to see that the curvature and torsion are skew-symmetric in $X$ and $Y$ and that the curvature is a (3,1) tensor whereas the torsion is a (2,1) tensor.

To make the picture slightly more rigorous, we either parallel transport in the direction $X$ by a distance $\epsilon X$ or, (as shown in the picture) we make $X$ a tangent vector whose length is $O(\epsilon)$. We do the same thing with $Y$. On the other hand, we assume that the norm of $Z$ is $O(1)$. To obtain the diagram, we rescale the geometry by $\frac{1}{\epsilon^2}$ and let $\epsilon \to 0$. As Robert Bryant noted, for non-zero epsilon, the $XY$-parallelogram in the first picture does not fully close, but the displacement is essentially $R(X,Y)(X+Y)$, which is $O(\epsilon^3)$. When we rescale and take limits, this error vanishes, which is why the parallelogram closes in the picture. The fact that this picture is infinitesimal in $X$ and $Y$ is also the reason why the geodesics are drawn as straight lines.

If we want to make everything completely rigorous while keeping track of the various tangent spaces and making sure that the final expression lives in $T_pM$, things get more complicated. However, in order to show that this can be done, here's one way to formalize it (using a suggestion by @RobertBryant).

We define the point $q = \exp_p(\epsilon(X+Y))$ to be the opposite corner of the parallelogram. We parallel transport $Z$ along the geodesic $\exp_p(tX)$ for $t$ between $0$ and $\epsilon$ and then parallel transport along the curve $\exp_p(\epsilon X+ t Y)$ until we reach $q$. This traces out the left path around the parallelogram, but the second part of the curve is not a geodesic.

We then do the same thing except that we transport first in the $Y$ direction and then in the $X$ direction. This gives us two vectors at $q$, and we take their difference to get a vector. To bring this back to $p$, we can parallel transport the result back to our original point using the geodesic from $q$ to $p$ (whose logarithm is $\epsilon(X+Y)$). The vector that we obtain by doing this is $$\epsilon^2 R(X,Y)Z+O(\epsilon^3),$$

As such, when we renormalize by $\epsilon^2$ and let $\epsilon \to 0$, we get the desired expression. I prefer drawing the curvature at $q$, rather than $p$ because it visually shows that I am commuting two covariant derivatives.

Unfortunately, we can't use this exact idea for the second picture, because here it really matters that all of the curves are geodesics with respect to the connection $\nabla$. Instead, we travel along the geodesic $\exp_p^\nabla(tX)$ until we hit the top left corner. Then we travel along a geodesic in the "direction" $Y$ (more precisely, the parallel translate of $Y$ along the geodesic from $p$ to $\exp_p^\nabla(\epsilon X)$). We then do the same thing except that we first travel in the $Y$ direction and then the "$X$ direction" (with the same caveat as before). When we do this, the resulting "parallelogram" doesn't close up, and if we take the logarithm of the differences, what we obtain is $$\epsilon^2 T^\nabla(X,Y)+\epsilon^3 R^\nabla(X,Y)(X+Y) + \epsilon^3 T^\nabla(T^\nabla(X,Y),X+Y)+O(\epsilon^4),$$ after we parallel transport the vector from $q$ back to $p$. Normalizing by $\epsilon^2$ and letting $\epsilon \to 0$, we get the torsion exactly.

$\endgroup$
16
  • 2
    $\begingroup$ I have seen these diagrams before, and I would love for them to inform my intuition, but the vectors in these diagrams don't live in the same vector spaces, so I don't know how to understand them... $\endgroup$
    – Andrew NC
    Commented Nov 15, 2020 at 17:43
  • 5
    $\begingroup$ @GabeK: Actually, one has to be rather careful about this: Even when the torsion is zero, the ' $XY$ parallelogram' described above will not generally close up when the curvature is nonzero. Thus, one cannot literally make sense of $R(X,Y)Z$ as a vector at the putative 'fourth vertex'. What is true is that for a general connection, what is drawn as the red difference in the second picture is an actual coordinate displacement that is essentially $T(X,Y)$, but, when the torsion is zero, this (nonzero) displacement is essentially $R(X,Y)(X{+}Y)$, i.e., it is of total degree $3$ in $X$ and $Y$. $\endgroup$ Commented Nov 15, 2020 at 17:44
  • 3
    $\begingroup$ @C.F.G: As I pointed out above, you have to take Nakahara's statement with a grain of salt. It is just not true that when the torsion vanishes identically, then the parallelogram described above closes exactly. In the torsion-free case, the error that measures the failure of closure is third order in $(X,Y)$, but it is not identically zero unless $R$ also vanishes identically. You can easily check this yourself for parallel translation on the $2$-sphere (or the hyperbolic disk) using the Levi-Civita connection, where you can do the calculation very explicitly. $\endgroup$ Commented Nov 15, 2020 at 21:01
  • 4
    $\begingroup$ @GabeK: I hate to nitpick, but your final formula cannot be right. For example, given your definitions, the term $Tr(\exp_p( \epsilon X), \epsilon Y, Tr(p,\epsilon X,Z))$ does not make sense because $\epsilon Y$ is in $T_pM$, not $T_{\exp_p( \epsilon X)}M$. $\endgroup$ Commented Nov 15, 2020 at 21:18
  • 2
    $\begingroup$ @GabeK: If that's all that's bothering you, why not just compare the results of following the two ways around the boundary of the exponentiated parallelogram from $p=\exp_p(0)$ to $q = \exp_p(\epsilon(X+Y))$? I think that's still simpler than what you are trying to do. $\endgroup$ Commented Nov 16, 2020 at 14:06
