How would you discover Stokes's theorem?

Question

Let $S$ be a smooth oriented surface in $\mathbb R^3$ with boundary $C$, and let $f: \mathbb R^3 \to \mathbb R^3$ be a continuously differentiable vector field on $\mathbb R^3$.

Stokes's theorem states that $$ \int_C f \cdot dr = \int_S (\nabla \times f) \cdot dA. $$ In other words, the line integral of $f$ over the curve $C$ is equal to the integral of the curl of $f$ over the surface $S$. Here the orientation of the boundary $C$ is induced by the orientation of $S$.

Question: How might somebody have derived or discovered this formula? Where does this formula come from?

The goal is to provide an intuitive explanation of Stokes's theorem, rather than a rigorous proof.

(I'll post an answer below.)

This is too brief to be a useful answer, but all forms of Stokes's theorem can be seen as special cases of the differential form version $\int_{\Omega}d\omega = \int_{\partial \Omega} \omega$, and this formula is the natural generalization of the fundamental theorem of calculus. — quarague, Commented Dec 5, 2021 at 11:43
The drawing is a bit misleading. People could think that the blue line is C, so maybe also dash the upper blue line (different dash) or make it gray. — lalala, Commented Dec 6, 2021 at 10:48
It's helpful to remind myself these things are rarely invented in terms of the modern formalism. They start with an intuitive insight and go through a lot of iterations of refinement and generalization before we get the nice theorems and equations we have today. (Obviously they didn't start with differential forms, etc) — Justin Meiners, Commented Dec 6, 2021 at 17:14
For a nice video by Aleph 0 about Stokes' theorem from the point of view of homology theory, see youtu.be/2ptFnIj71SM — glS, Commented Dec 8, 2021 at 13:35
"How might it be discovered" (this question) is suitable for this forum. "How was it discovered" would be suitable at hsm.stackexchange.com , and probably more interesting! — GEdgar, Commented Dec 8, 2021 at 16:58

littleO · Accepted Answer · 2022-02-07 05:53:52Z

Here's an intuitive way to discover Stokes's theorem.

Imagine chopping up the surface $S$ into tiny pieces such that each tiny piece is a parallelogram (or at least, each tiny piece is approximately a parallelogram).

Let $C_i$ be the boundary of the $i$th tiny parallelogram. I'll assume each $C_i$ has the orientation induced by the orientation of $S$. Notice that $$ \tag{1} \sum_i \int_{C_i} f \cdot dr = \int_C f \cdot dr. $$ This is because the sum on the left "telescopes". Everything in the middle cancels out and we are left only with boundary terms. This beautiful step in the derivation is reminiscent of the telescoping sum that appears when deriving the fundamental theorem of calculus in single variable calculus.
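The cancellation behind equation (1) can be checked numerically. The following is a minimal sketch under stated assumptions: the surface is simply the flat unit square cut into an $n \times n$ grid, the field `f` is a hypothetical example, and every edge integral is approximated by the midpoint rule. Each interior edge is traversed once in each direction, so its two contributions cancel and only the outer boundary survives.

```python
def f(p):
    # Hypothetical example field in the plane, chosen only for illustration.
    x, y = p
    return (-y + x * y, x * x)

def edge_integral(a, b):
    """Midpoint-rule approximation of the line integral of f from a to b."""
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    fx, fy = f(mid)
    return fx * (b[0] - a[0]) + fy * (b[1] - a[1])

def loop_integral(corners):
    """Sum of edge integrals around the closed polygon with these corners."""
    n = len(corners)
    return sum(edge_integral(corners[k], corners[(k + 1) % n]) for k in range(n))

def sum_over_cells(n):
    """Add up the counterclockwise loop integrals of all n*n grid cells."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = i * h, j * h
            total += loop_integral(
                [(x0, y0), (x0 + h, y0), (x0 + h, y0 + h), (x0, y0 + h)]
            )
    return total

def outer_boundary(n):
    """Loop integral around the whole unit square, using the same small edges."""
    h = 1.0 / n
    pts = [(k * h, 0.0) for k in range(n)]
    pts += [(1.0, k * h) for k in range(n)]
    pts += [(1.0 - k * h, 1.0) for k in range(n)]
    pts += [(0.0, 1.0 - k * h) for k in range(n)]
    return loop_integral(pts)

cells = sum_over_cells(8)
boundary = outer_boundary(8)
print(cells, boundary)  # the two numbers agree up to rounding error
```

Because the same edge approximation is used on both sides, the agreement here is exact up to floating-point rounding, not merely approximate.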

To complete our derivation of Stokes's theorem, we must compute the integral of $f$ around the boundary of a tiny parallelogram. Consider one single tiny parallelogram which is based at a point $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \in \mathbb R^3$ and which is spanned by vectors $v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ and $w = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} \in \mathbb R^3$. The boundary of the parallelogram is traversed in the order $x \to x + v \to x + v + w \to x + w \to x$, and I'll number the four edges 1, 2, 3, and 4 in that order.

Since this is a very tiny parallelogram, I'll make the approximation that the integral of $f$ along edge 1 is approximately $f(x) \cdot v$, the integral of $f$ along edge 2 is approximately $f(x + v) \cdot w$, the integral of $f$ along edge 3 is approximately $f(x + w) \cdot (-v)$, and the integral of $f$ along edge 4 is approximately $f(x) \cdot (-w)$. Summing these four terms, and pairing edge 1 with edge 3 and edge 2 with edge 4, we find that the integral of $f$ along the boundary of this parallelogram is approximately \begin{align*} &\quad \langle f(x+v) - f(x), w \rangle - \langle f(x + w) - f(x), v \rangle \\ &\approx \langle f'(x) v, w \rangle - \langle f'(x) w, v \rangle \\ &= \langle v, (f'(x)^T - f'(x)) w \rangle \\ &= \left \langle v, \begin{bmatrix} 0 & \frac{\partial f_2(x)}{\partial x_1} - \frac{\partial f_1(x)}{\partial x_2} & \frac{\partial f_3(x)}{\partial x_1} - \frac{\partial f_1(x)}{\partial x_3} \\ \frac{\partial f_1(x)}{\partial x_2} - \frac{\partial f_2(x)}{\partial x_1} & 0 & \frac{\partial f_3(x)}{\partial x_2} - \frac{\partial f_2(x)}{\partial x_3} \\ \frac{\partial f_1(x)}{\partial x_3} - \frac{\partial f_3(x)}{\partial x_1} & \frac{\partial f_2(x)}{\partial x_3} - \frac{\partial f_3(x)}{\partial x_2} & 0 \end{bmatrix} w \right\rangle \\ &= \langle v, w \times (\nabla \times f) \rangle \\ &=\tag{2} \langle \nabla \times f, \underbrace{v \times w}_{\substack{\text{Area vector}\\ \text{for this tiny} \\ \text{parallelogram}}} \rangle. \end{align*}

Here $f_1, f_2$, and $f_3$ are the component functions of $f$ and $ f'(x) = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \frac{\partial f_1(x)}{\partial x_2} & \frac{\partial f_1(x)}{\partial x_3} \\ \frac{\partial f_2(x)}{\partial x_1} & \frac{\partial f_2(x)}{\partial x_2} & \frac{\partial f_2(x)}{\partial x_3} \\ \frac{\partial f_3(x)}{\partial x_1} & \frac{\partial f_3(x)}{\partial x_2} & \frac{\partial f_3(x)}{\partial x_3} \\ \end{bmatrix} $ is the Jacobian matrix of $f$ at $x$. The vector $\nabla \times f$, which is called the "curl" of $f$ at $x$, is defined by $$ \nabla \times f = \begin{bmatrix} \frac{\partial f_3(x)}{\partial x_2} - \frac{\partial f_2(x)}{\partial x_3} \\ \frac{\partial f_1(x)}{\partial x_3} - \frac{\partial f_3(x)}{\partial x_1} \\ \frac{\partial f_2(x)}{\partial x_1} - \frac{\partial f_1(x)}{\partial x_2} \end{bmatrix}. $$ This is the moment in math when we discover the curl for the first time. Technically, I should write the curl of $f$ at $x$ as $(\nabla \times f)(x)$.
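Formula (2) can be sanity-checked numerically. The sketch below assumes a hypothetical polynomial field `f` whose curl `curl_f` was worked out by hand from the definition above, and arbitrary illustrative choices of $x$, $v$, and $w$. It compares the sum of the four edge approximations with $\langle \nabla \times f, v \times w \rangle$ as the parallelogram shrinks.

```python
def f(p):
    # Hypothetical smooth field used only as a test case.
    x, y, z = p
    return (y * z, x * x, x + y * y * z)

def curl_f(p):
    # Curl of the field above, computed by hand from the definition.
    x, y, z = p
    return (2 * y * z, y - 1, 2 * x - z)

def add(a, b):
    return tuple(ai + bi for ai, bi in zip(a, b))

def scale(t, a):
    return tuple(t * ai for ai in a)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def four_edge_sum(x, v, w):
    """The four edge approximations from the text, added up."""
    return (dot(f(x), v) + dot(f(add(x, v)), w)
            - dot(f(add(x, w)), v) - dot(f(x), w))

def curl_term(x, v, w):
    """Right-hand side of formula (2): curl at x dotted with the area vector."""
    return dot(curl_f(x), cross(v, w))

X = (0.3, -0.7, 1.1)     # base point (arbitrary)
V0 = (1.0, 0.5, -0.2)    # edge directions (arbitrary)
W0 = (-0.4, 1.0, 0.6)

def relative_error(t):
    """Relative gap between the two sides for a parallelogram of size t."""
    v, w = scale(t, V0), scale(t, W0)
    lhs, rhs = four_edge_sum(X, v, w), curl_term(X, v, w)
    return abs(lhs - rhs) / abs(rhs)

for k in range(1, 5):
    t = 10.0 ** (-k)
    print(t, relative_error(t))
```

Both sides scale like $t^2$ while their difference scales like $t^3$, so the printed relative error should shrink roughly in proportion to $t$.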

The final step in our derivation of Stokes's theorem is to apply formula (2) to the sum on the left in equation (1). Let $\Delta A_i$ be the "area vector" for the $i$th tiny parallelogram. In other words, the vector $\Delta A_i$ is normal to the $i$th tiny parallelogram, pointing in the direction determined by the orientation of $S$, and the magnitude of $\Delta A_i$ is equal to the area of the $i$th tiny parallelogram. Let $x^i \in \mathbb R^3$ be the point where the $i$th tiny parallelogram is based. (The $i$ here is a superscript, not an exponent.) Combining formulas (1) and (2) reveals that \begin{align} \int_C f \cdot dr &\approx \sum_i (\nabla \times f)(x^i) \cdot \Delta A_i \\ &\approx \int_S (\nabla \times f) \cdot dA. \end{align} We have discovered the formula in Stokes's theorem. It seems plausible that we can make the approximation as accurate as we like by chopping up $S$ into sufficiently small pieces. Thus, we conclude that $$ \int_C f \cdot dr = \int_S (\nabla \times f) \cdot dA. $$
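The whole argument can be tried out on a concrete curved surface. This is only a sketch under assumptions that are not part of the derivation: the surface is taken to be the paraboloid $z = 1 - x^2 - y^2$ over the unit disk, parametrized by morphing the rectangle $[0,1] \times [0, 2\pi]$ onto it, and `f` is a hypothetical field whose curl `curl_f` was computed by hand. The code sums $(\nabla \times f)(x^i) \cdot (v \times w)$ over the tiny approximate parallelograms and compares the result with the line integral around the boundary circle.

```python
import math

def f(p):
    # Hypothetical example field, chosen only for illustration.
    x, y, z = p
    return (-y, x * z + x, x * y)

def curl_f(p):
    # Curl of the field above, worked out by hand from the definition.
    x, y, z = p
    return (0.0, -y, z + 2.0)

def surface(r, theta):
    # Assumed surface: the paraboloid z = 1 - x^2 - y^2 over the unit disk.
    return (r * math.cos(theta), r * math.sin(theta), 1.0 - r * r)

def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def surface_sum(n):
    """Sum of curl(base) . (v x w) over n*n tiny approximate parallelograms."""
    dr, dth = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            base = surface(i * dr, j * dth)
            v = sub(surface((i + 1) * dr, j * dth), base)   # edge in the r direction
            w = sub(surface(i * dr, (j + 1) * dth), base)   # edge in the theta direction
            total += dot(curl_f(base), cross(v, w))
    return total

def boundary_integral(m):
    """Midpoint-rule line integral of f around the boundary circle r = 1."""
    total = 0.0
    for k in range(m):
        a = surface(1.0, 2.0 * math.pi * k / m)
        b = surface(1.0, 2.0 * math.pi * (k + 1) / m)
        mid = tuple((ai + bi) / 2 for ai, bi in zip(a, b))
        total += dot(f(mid), sub(b, a))
    return total

for n in (25, 50, 100):
    print(n, surface_sum(n), boundary_integral(4 * n))
```

With this ordering of $v$ and $w$ the area vectors point upward, which matches a counterclockwise boundary when viewed from above. The two printed columns should approach each other as `n` grows, with the gap shrinking roughly like $1/n$ because the curl is evaluated at the base point of each piece.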

Comments:

I gave a similar derivation of Green's theorem here. I also wrote notes that attempt to give a similar derivation of the generalized Stokes's theorem here.
Physicists frequently use similar arguments when deriving Stokes's theorem. Feynman, for example, integrates a vector field around a little square in the $xy$-plane, then recognizes that the result can be expressed in terms of the curl vector. However, how did Feynman discover the curl in the first place? He did it by treating the gradient operator $\nabla$ as a vector, and symbolically computing the cross product of this "vector" with $f$. I find that to be interesting and characteristically Feynman, but I also want a more direct way to discover Stokes's theorem, the same way that we discovered Green's theorem. (See section 3-6 and section 2-5 of volume II of the Feynman Lectures on Physics for reference.)

The book Div, Grad, Curl and All That computes the three components of the curl vector by integrating a vector field around small rectangles which are parallel to either the $xy$-plane or the $xz$-plane or the $yz$-plane. The author remarks, "It turns out that these three quantities are the Cartesian components of a vector. To this vector we give the name 'curl of $\mathbf F$,' which we write $\text{curl } \mathbf F$." In other words, now paraphrasing and switching to my notation, they assume the existence of a vector $(\nabla \times f)(x)$ which satisfies $$ (\nabla \times f)(x) \cdot \Delta A \approx \int_{\partial E} f \cdot dr $$ for any tiny planar surface $E$ containing $x$ with area vector $\Delta A$. By considering the special cases where $E$ is a rectangle and $\Delta A$ is parallel to either $\hat i$ or $\hat j$ or $\hat k$, they derive the components of $(\nabla \times f)(x)$.
When deriving Green's theorem and the divergence theorem, physicists typically chop up the region that we are integrating over into tiny rectangles or tiny boxes. I think the most clear and elegant way to make this strategy work for Stokes's theorem is to chop up $S$ into tiny parallelograms. In fact, I think we should also use parallelograms or parallelepipeds when deriving Green's theorem and the divergence theorem. This strategy can even be used to derive the generalized Stokes's theorem and to discover the exterior derivative (by chopping up a smooth manifold into tiny parallelepipeds).
One way to chop up $S$ into tiny parallelograms is to start with a rectangular region $R$ that is chopped up into tiny rectangles, then smoothly morph $R$ onto $S$. If $S$ is not diffeomorphic to a rectangular region, then $S$ can at least be broken into simpler pieces, each of which is diffeomorphic to a rectangular region.
When deriving equation (2), I used the first-order Taylor approximation $$ \tag{3} f(x + v) - f(x) \approx f'(x) v. $$ The approximation is good when $v$ is small. The Jacobian matrix $f'(x)$ is also called the derivative of $f$ at $x$. The approximation (3), which Terence Tao refers to as "Newton's approximation", is the key idea of calculus. It is essentially the definition of $f'(x)$. The fundamental strategy of calculus is to take a nonlinear function $f$ (difficult) and approximate it locally by a linear function (easy). When deriving the formulas of calculus, we always find that we use the approximation (3) at the crucial moment.
It would also be ok to evaluate $f$ at the midpoints of the edges when approximating the integral of $f$ along each edge of the tiny parallelogram. So the integral of $f$ along edge 1 is approximately $f(x + v/2) \cdot v$, the integral of $f$ along edge 2 is approximately $f(x + v + w/2) \cdot w$, etc. These are typically more accurate approximations and the calculation works out equally nicely. However, since our goal is just to provide an intuitive derivation of Stokes's theorem, we might as well keep the calculation as simple as possible.
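The remark about midpoints can be illustrated with a small experiment. This is a sketch under assumptions: the field `f` is a hypothetical cubic polynomial, so Simpson's rule gives the exact integral along each straight edge and can serve as a reference, and the point and edge vectors are arbitrary. The code compares the corner-based and midpoint-based four-edge sums with that exact loop integral.

```python
def f(p):
    # Hypothetical cubic polynomial field used only as a test case.
    x, y, z = p
    return (y * z, x * x, x + y * y * z)

def add(a, b):
    return tuple(ai + bi for ai, bi in zip(a, b))

def scale(t, a):
    return tuple(t * ai for ai in a)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def exact_edge(a, d):
    """Integral of f along the segment from a to a + d.

    Simpson's rule is exact here because f is a cubic polynomial.
    """
    g0 = dot(f(a), d)
    g1 = dot(f(add(a, scale(0.5, d))), d)
    g2 = dot(f(add(a, d)), d)
    return (g0 + 4.0 * g1 + g2) / 6.0

def exact_loop(x, v, w):
    """Exact integral around x -> x+v -> x+v+w -> x+w -> x."""
    return (exact_edge(x, v) + exact_edge(add(x, v), w)
            + exact_edge(add(add(x, v), w), scale(-1.0, v))
            + exact_edge(add(x, w), scale(-1.0, w)))

def corner_loop(x, v, w):
    """Four-edge sum with f evaluated at the corners, as in the derivation."""
    return (dot(f(x), v) + dot(f(add(x, v)), w)
            - dot(f(add(x, w)), v) - dot(f(x), w))

def midpoint_loop(x, v, w):
    """Four-edge sum with f evaluated at the midpoint of each edge."""
    hv, hw = scale(0.5, v), scale(0.5, w)
    return (dot(f(add(x, hv)), v) + dot(f(add(add(x, v), hw)), w)
            - dot(f(add(add(x, w), hv)), v) - dot(f(add(x, hw)), w))

X = (0.3, -0.7, 1.1)
V0 = (1.0, 0.5, -0.2)
W0 = (-0.4, 1.0, 0.6)

def errors(t):
    """Absolute errors (corner, midpoint) for a parallelogram of size t."""
    v, w = scale(t, V0), scale(t, W0)
    exact = exact_loop(X, v, w)
    return abs(corner_loop(X, v, w) - exact), abs(midpoint_loop(X, v, w) - exact)

for t in (0.1, 0.05, 0.025):
    print(t, errors(t))
```

For this example the midpoint version should come out noticeably closer to the exact loop integral, while the corner version keeps the algebra in the derivation shorter.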

I’ve studied Stokes’ theorem several times over the last 20 years (as a hobbyist) and this answer has given me the biggest breakthrough in understanding ever. In particular, I think I understand now how differential forms work in this context. Thank you so much!! One remaining challenge for me is how to create orientation on n-dimensional parallelepipeds, but that’s not relevant here. — Todd Wilcox, Commented Dec 4, 2021 at 18:38
Interestingly, almost this exact same logic can be used to derive Cauchy's Theorem in complex analysis — BlueRaja - Danny Pflughoeft, Commented Dec 4, 2021 at 23:40
@BlueRaja-DannyPflughoeft Isn't Cauchy's theorem basically just (Green's theorem) + (Cauchy-Riemann equations)? — tparker, Commented Dec 5, 2021 at 1:47
I think what's interesting about @BlueRaja-DannyPflughoeft's comment is that we can discover Cauchy's theorem directly without needing to use either Green's theorem or the Cauchy-Riemann equations. If $f:\mathbb C \to \mathbb C$ is complex differentiable then the contour integral of $f$ around a tiny parallelogram in the complex plane is approximately $(f(x+v) - f(x))w - (f(x+w) - f(x))v \approx f'(x)vw - f'(x) wv = 0$. That's a beautiful way to discover Cauchy's theorem. — littleO, Commented Dec 6, 2021 at 10:16
@john Yes, he uses $\langle Ax,y\rangle=\langle x,A^{T}y\rangle$ for any matrix $A$ — ali, Commented Dec 7, 2021 at 11:00
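The parallelogram argument for Cauchy's theorem mentioned in the comments above can also be tried numerically. This is a rough sketch with assumed example inputs: `cmath.exp` stands in for a complex differentiable function, complex conjugation for one that is not, and the base point and edge directions are arbitrary. The corner-based contour sum is divided by $t^2$, which is proportional to the area of the parallelogram.

```python
import cmath

def parallelogram_sum(f, x, v, w):
    """Corner-based approximation of the contour integral of f around
    the parallelogram x -> x+v -> x+v+w -> x+w -> x."""
    return (f(x + v) - f(x)) * w - (f(x + w) - f(x)) * v

def conj(z):
    # Complex conjugation: a standard example of a function that is
    # not complex differentiable.
    return z.conjugate()

X = 0.4 + 0.3j     # base point (arbitrary)
V0 = 1.0 + 0.2j    # edge directions (arbitrary)
W0 = -0.3 + 0.8j

def scaled(f, t):
    """Size of the contour sum divided by t**2."""
    return abs(parallelogram_sum(f, X, t * V0, t * W0)) / t ** 2

for k in range(1, 5):
    t = 10.0 ** (-k)
    print(t, scaled(cmath.exp, t), scaled(conj, t))
```

For the exponential the scaled value should shrink toward zero as $t$ decreases, because the first-order terms $f'(x)vw - f'(x)wv$ cancel. For the conjugate it stays at a fixed nonzero value, since there is no single complex number $f'(x)$ for which that cancellation happens.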

Bruce · Accepted Answer · 2021-12-06 03:43:14Z

For me, discovering is different from proving or deriving, which seems to be the focus of some of the other responses. Of course, what is obvious is subjective. But for my part, I can say how it happened for me. That is, before I heard of Stokes's theorem, I was on my way to a similar conclusion, and the following is why.

First, I knew of the fundamental theorem of calculus for real functions of a real variable. And I also had some idea that if something is flowing, then the amount of it that passes out through a boundary has to give the change in the amount inside. That means that there is a boundary integral that relates to a rate of change. If you think about the fundamental theorem of calculus, you can see the square bracket notation (the difference in the value of a function at the end points) as a kind of directed boundary integral. Integral and sum are clearly very closely related.

Now from a theory of flow one has that $-\nabla \cdot f$ is the rate of change and $\hat{n} \cdot f$ is the flow through the boundary, where $f$ is the flow and $\hat{n}$ is the unit normal vector. So what we have is that an interior integral of some kind of derivative of a function is equal to a boundary integral of some operator on the original function.

At this point I got a bit perplexed because the boundary integral was not the function but a projection of it. And there is an indefinite number of differential operators that could be involved. Then, it dawned on me that the relation was not really about $f$, but about $-\nabla\cdot$ and $\hat{n}\cdot$.

So, if we say that $-\nabla\cdot$ is the derivative of the operator $\hat{n}\cdot$ in some sense, then we have the following thought: the boundary operator and the differential operator can be exchanged inside an integral. And that is one way of looking at Stokes's theorem.

[Yes, details skipped, but this is intended to be about discovery].

Jules · Accepted Answer · 2022-01-22 23:52:18Z

A first simplification is that in Stokes' theorem, everything is happening on the surface. So in a certain sense it is a 2D theorem, not a 3D theorem. If we restrict to the parametrisation on the surface, Stokes' theorem is precisely Green's theorem.

So: how would you discover Green's theorem?

Let us start with a more intuitive theorem: the divergence theorem. The divergence theorem says that if you have a steady flow of water with a bunch of sources and sinks, then for any volume $V$ we pick, the net amount of water created inside $V$ is equal to the net amount of water flowing through the surface of $V$.

I hope the idea behind this theorem is intuitive enough, even though translating it into formal language is another matter.

Notice now that there surely is a 2D version of the divergence theorem, where all the flowing of the water is happening in the plane. That theorem says that the net source in an area is equal to the net flow through the boundary curve of the area.

As it turns out, this is precisely Green's theorem. This is very unclear the way it is usually presented, where Green's theorem says something like "the amount of swirly in an area is equal to the line integral along the boundary", whatever that means.

Fortunately, it becomes much clearer when we rotate each vector of the vector field by $90$ degrees. Rotating the vector field by $90$ degrees turns every bit of swirly into a bit of divergence! And it turns the line integral along the curve into the flow through the curve, a much more intuitive concept.

Looking at the statement makes this clearer. The usual statement is that if you have a vector field $(L,M)$, then

$$\oint Ldx + Mdy = \iint \left(\frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}\right)\,dxdy$$

Notice that rotating by $90$ degrees gives $(L,M) \mapsto (-M,L)$ so we get

$$\oint Ldy - Mdx = \iint \left(\frac{\partial L}{\partial x} + \frac{\partial M}{\partial y}\right)\,dxdy$$

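We see the 2D divergence $\frac{\partial L}{\partial x} + \frac{\partial M}{\partial y}$ on the right hand side, and on the left we see $$Ldy - Mdx = (L,M) \cdot (dy,-dx) = (L,M) \cdot \hat{n}\,ds,$$ where $\hat{n}$ is the outward unit normal of the counterclockwise boundary curve and $ds$ is the element of arc length, i.e. the flow through the boundary.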
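The rotation trick can be checked on a small example. This is a sketch under assumptions: the region is the unit square, `field` is a hypothetical choice of $(L, M)$ with its divergence computed by hand, and all integrals use the midpoint rule. The code compares the circulation of the rotated field $(-M, L)$, the outward flux of $(L, M)$, and the double integral of the 2D divergence.

```python
def field(x, y):
    # Hypothetical example field (L, M), chosen only for illustration.
    return (x * x * y, x + y ** 3)

def divergence(x, y):
    # dL/dx + dM/dy for the field above, computed by hand.
    return 2 * x * y + 3 * y * y

def rotate90(F):
    """Rotate every vector of F by 90 degrees: (L, M) -> (-M, L)."""
    def rotated(x, y):
        L, M = F(x, y)
        return (-M, L)
    return rotated

def square_boundary(n):
    """Counterclockwise corner list with n small edges per side of the unit square."""
    h = 1.0 / n
    pts = [(k * h, 0.0) for k in range(n)]
    pts += [(1.0, k * h) for k in range(n)]
    pts += [(1.0 - k * h, 1.0) for k in range(n)]
    pts += [(0.0, 1.0 - k * h) for k in range(n)]
    return pts

def circulation(F, pts):
    """Midpoint-rule approximation of the closed line integral of P dx + Q dy."""
    total = 0.0
    for k in range(len(pts)):
        (x0, y0), (x1, y1) = pts[k], pts[(k + 1) % len(pts)]
        P, Q = F((x0 + x1) / 2, (y0 + y1) / 2)
        total += P * (x1 - x0) + Q * (y1 - y0)
    return total

def flux(F, pts):
    """Midpoint-rule approximation of the outward flux, the integral of L dy - M dx."""
    total = 0.0
    for k in range(len(pts)):
        (x0, y0), (x1, y1) = pts[k], pts[(k + 1) % len(pts)]
        L, M = F((x0 + x1) / 2, (y0 + y1) / 2)
        total += L * (y1 - y0) - M * (x1 - x0)
    return total

def double_integral(g, n):
    """Midpoint-rule approximation of the integral of g over the unit square."""
    h = 1.0 / n
    return sum(g((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

pts = square_boundary(100)
print(circulation(rotate90(field), pts), flux(field, pts), double_integral(divergence, 100))
```

The first two numbers are the same quantity written two ways, which is exactly the substitution $(L,M) \mapsto (-M,L)$ in Green's theorem, and the third should agree with them closely.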

OverLordGoldDragon · Accepted Answer · 2021-12-05 12:10:18Z

Stokes' Theorem arises from the need to understand the flow of a vector field along a surface. I explore several of its aspects, intuitively, with visuals and links to interactives, here.

For example, how can curl be zero everywhere, but circulation around a closed curve non-zero? Or why is circulation along a closed curve independent of the surface enclosing it? These are key to making sense of "conservation laws" in Physics, e.g. electromagnetic fields.

As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center. – Community Bot, Commented Dec 5, 2021 at 12:10