86
$\begingroup$

Integration by parts comes up a lot - for instance, it appears in the definition of a weak derivative / distributional derivative, or as a tool that one can use to turn information about higher derivatives of a function into information about an integral of that function. Concrete examples of this latter category include: proving that $f \in C^2(S^1)$ implies that the Fourier series of $f$ converges absolutely and uniformly, and the Taylor series expansion with the integral formula for remainder.

However, I don't feel like I really understand what integration by parts is doing. To me, it is just an algebraic trick that follows from the fundamental theorem of calculus and the product rule. Is there some more conceptual way to think about it?

How do you think about this useful idea?

$\endgroup$
3
  • 48
    $\begingroup$ It is the product rule. $\endgroup$ Commented Jun 9, 2014 at 7:13
  • 8
    $\begingroup$ And integral substitution corresponds to differentiation's chain rule. Same idea. $\endgroup$
    – mvw
    Commented Jun 9, 2014 at 13:26
  • 2
    $\begingroup$ It's simply an expression that's obtained by integrating the product rule for differentiation and rearranging the result. That's why the same conditions for differentiability of a real-valued map on an interval need to apply when IBP is valid. Why make something simple difficult? Isn't analysis hard enough for you? lol $\endgroup$ Commented Jul 16, 2016 at 20:11

8 Answers

112
$\begingroup$

I've always found it helpful to think about it like this: (picture source)

[Figure: plot of a curve with two gray regions, not reproduced here]

The area of the gray areas combined is $u_2v_2 - u_1 v_1$, which is where the $uv$ term comes from.
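
The area statement can be checked numerically. The following is only a sketch: the curve $(u(t), v(t))$ below is a hypothetical choice made for illustration, and `midpoint_integral` is a helper introduced here, not something from the original picture or its source.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# Hypothetical curve (u(t), v(t)) for t in [0, 1]; chosen only for illustration.
def u(t):
    return 1.0 + t

def du(t):
    return 1.0

def v(t):
    return math.exp(t)

def dv(t):
    return math.exp(t)

def gray_area_check():
    """Return (integral of u dv + integral of v du, u2*v2 - u1*v1)."""
    area_u_dv = midpoint_integral(lambda t: u(t) * dv(t), 0.0, 1.0)
    area_v_du = midpoint_integral(lambda t: v(t) * du(t), 0.0, 1.0)
    corner_difference = u(1.0) * v(1.0) - u(0.0) * v(0.0)
    return area_u_dv + area_v_du, corner_difference

if __name__ == "__main__":
    total_area, corners = gray_area_check()
    print(total_area, corners)
```

The two returned numbers should agree up to quadrature error, which is the statement $\int u\,dv + \int v\,du = u_2v_2 - u_1v_1$ for this particular curve.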

$\endgroup$
7
  • 1
    $\begingroup$ I saw something like this in my high school calc textbook, but I found this one through Google. If I were to make one like this, I'd probably use a vector graphics tool like Inkscape. $\endgroup$ Commented Jun 9, 2014 at 23:24
  • 3
    $\begingroup$ @boardbite - the linked blog shows the entire python source of that plot. $\endgroup$
    – Bach
    Commented Jun 10, 2014 at 9:08
  • 5
    $\begingroup$ I don't really understand the picture...maybe you can explain what is going on. I see how it is a mnemonic for remembering the formula. $\endgroup$
    – Eric Auld
    Commented Nov 23, 2015 at 18:11
  • 2
    $\begingroup$ @Eric Auld From the limits $a,A$ (the lower black circle) and $b,B$ (the upper black circle) we have the integral of a function and its inverse in such a way that, as integration by "parts", we have the combined area of the two integrals that can be given by the difference between the larger square ($area=bB$) and the smaller square ($area=aA$). The area of the larger square minus the smaller square ($bB-aA$) is therefore equal to the sum of the integral of $u$ and the integral of its inverse $v$. In other words, $bB-aA=\int_a^budv+\int_A^Bvdu$ or whatever equivalencies. $\endgroup$
    – user521846
    Commented Apr 10, 2018 at 12:12
  • 2
    $\begingroup$ limit points*, but you understand what I mean $\endgroup$
    – user521846
    Commented Apr 10, 2018 at 22:15
46
$\begingroup$

Integration by parts is a corollary of the product rule:

$(uv)' = uv' + u'v$

Take the integral of both sides to get $uv = \int u \ dv + \int v \ du$.

If you try to remember it separately from the product rule, it is harder to work with, because you have to guess what to assign to $u$ and what to assign to $dv$ (usually $dv = f(t)\,dt$). But if you know the product rule, you can take the integrand (in terms of $t$), call it $F(t)$, and apply the product rule to it first. Then the choices of $u$ and $dv$ become obvious.

$\endgroup$
4
  • 13
    $\begingroup$ More precisely, $uv = \int u\ dv + \int v\ du\ \mathbf{+\ C}$. Otherwise, you run into trouble. $\endgroup$ Commented Jun 9, 2014 at 18:53
  • 14
    $\begingroup$ This is the crappiest answer I've ever made and it nets me 160 points. Wtf... Just a consequence of the number of views I guess. $\endgroup$ Commented Jun 11, 2014 at 6:43
  • 14
    $\begingroup$ Welcome to economics 101. Society's utility from your answer is a function of the demand as well as the quality. ;-) $\endgroup$ Commented Jun 11, 2014 at 8:23
  • $\begingroup$ Please upvote some of my recent posts, not this one lol $\endgroup$ Commented May 1, 2022 at 5:35
38
$\begingroup$

One idea is that integration by parts expresses the fact that the adjoint of $\frac{d}{dx}$ is $-\frac{d}{dx}$ (in a setting where boundary terms vanish). In the multivariable case, integration by parts expresses the fact that the adjoint of $\nabla$ is $-\text{div}$.
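
A discrete version of the adjoint statement can be tested directly. This is a sketch under assumptions of my own choosing: a periodic grid on $[0, 2\pi)$ (so that boundary terms vanish), a central-difference approximation of $\frac{d}{dx}$, and two arbitrary sample functions. The names `central_difference`, `inner_product`, and `adjoint_check` are hypothetical helpers.

```python
import math

def central_difference(values, h):
    """Periodic central difference (f[i+1] - f[i-1]) / (2h); indices wrap around."""
    n = len(values)
    return [(values[(i + 1) % n] - values[(i - 1) % n]) / (2 * h) for i in range(n)]

def inner_product(f, g, h):
    """Discrete L^2 inner product on a uniform grid with spacing h."""
    return h * sum(a * b for a, b in zip(f, g))

def adjoint_check(n=200):
    """Return (<Df, g>, -<f, Dg>) for two sample periodic functions."""
    h = 2 * math.pi / n
    xs = [i * h for i in range(n)]
    f = [math.sin(x) + 0.5 * math.cos(3 * x) for x in xs]
    g = [math.exp(math.cos(x)) for x in xs]
    left = inner_product(central_difference(f, h), g, h)
    right = -inner_product(f, central_difference(g, h), h)
    return left, right

if __name__ == "__main__":
    print(adjoint_check())
```

On a periodic grid the two values agree up to floating-point rounding, because shifting the summation index plays the role that integration by parts plays in the continuous setting.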

$\endgroup$
2
  • 9
    $\begingroup$ +1 This is a very good way to look at it. In terms of matrices, it might be helpful to think about the (symmetric) discretized first derivative operator which looks, down the diagonal away from the edges, like $$D = \pmatrix{ 0 && \frac 1 2 && \\ -\frac 1 2 && 0 && \frac 1 2 \\ && -\frac 1 2 && 0 && \frac 1 2 \\ && && -\frac 1 2 && 0 }$$ and obviously has the adjoint $-D$, again away from the boundaries. $\endgroup$ Commented Jun 9, 2014 at 9:06
  • 1
    $\begingroup$ I like the adjoint interpretation. It encourages thinking of an integral of a product as a scalar product on $L^2$ $\endgroup$ Commented Jun 9, 2014 at 11:29
18
$\begingroup$

Might not be rigorous but this takes the cake for me:

$$\sum_{k=m}^n f_k(g_{k+1}-g_k) = [f_{n+1}g_{n+1} - f_m g_m] - \sum_{k=m}^n g_{k+1}(f_{k+1}- f_k).$$

$$\int u\,dv = uv - \int v\,du$$

The first formula takes a sum that includes a $f\Delta g$ term and transforms it to a sum containing a $g\Delta f$. Integration by parts takes an integral with a $u\,dv$ term and transforms it to an integral with a $v\,du$ term.

(Wikipedia- Summation by Parts)
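
The summation formula is easy to verify on concrete sequences. Below is a minimal sketch; the sequences `f` and `g` are arbitrary integer examples chosen here so that the comparison is exact, and `summation_by_parts_sides` is a hypothetical helper name.

```python
def summation_by_parts_sides(f, g, m, n):
    """Return both sides of the summation-by-parts identity.

    f and g must be indexable on every index from m to n + 1.
    """
    left = sum(f[k] * (g[k + 1] - g[k]) for k in range(m, n + 1))
    boundary = f[n + 1] * g[n + 1] - f[m] * g[m]
    right = boundary - sum(g[k + 1] * (f[k + 1] - f[k]) for k in range(m, n + 1))
    return left, right

# Arbitrary integer sequences, used only as an example.
f = [k * k for k in range(12)]
g = [3 * k + 1 for k in range(12)]

if __name__ == "__main__":
    print(summation_by_parts_sides(f, g, 2, 9))
```

Both entries of the returned pair are equal for any choice of sequences, since the two sums add up to the telescoping sum of $f_{k+1}g_{k+1} - f_k g_k$.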

$\endgroup$
2
  • 2
    $\begingroup$ This is not intuitive at all, which is what the OP was looking for; it is simply a statement regurgitated directly from Wikipedia. $\endgroup$
    – beep-boop
    Commented Jul 2, 2014 at 22:28
  • 11
    $\begingroup$ This shows how integration by parts and summation by parts are related using Riemann Sums. Summation by parts is easily verified, so this gives an understandable validation of integration by parts. $\endgroup$
    – robjohn
    Commented Jul 7, 2014 at 0:02
14
$\begingroup$

Consider the integral $I=\int f(x)g(x)dx$, i.e., the integral of the product of two functions.

Now imagine sliding $f(x)$ a small distance $\epsilon$ along the $x$-axis relative to $g$. With reasonable assumptions about differentiability, the integral becomes $$\int f(x+\epsilon)g(x)dx=\int (f(x)+\epsilon f'(x))g(x)dx+O(\epsilon^2)$$ which becomes $$I+\epsilon\int f'(x)g(x)dx+O(\epsilon^2).$$

So $\int f'(x)g(x)dx$ tells you how $I$ changes as you slide $f$. But sliding $f$ one way is the same as sliding $g$ the other way, which changes $I$ at the rate $-\int f(x)g'(x)dx$. So the two must be equal: $\int f'(x)g(x)dx = -\int f(x)g'(x)dx$, modulo some bits that fall off the end as you slide if your integration region has endpoints.

If you look at many applications of integration by parts, you may find this explanation fits well. It can help make the derivation of the Euler-Lagrange equations clearer and gives insight into its frequent use in domains like electromagnetism and quantum mechanics. For example it becomes completely obvious that the $p$ operator from quantum mechanics is Hermitian and it's clear how this is directly related to its role as the generator of translations.
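
The sliding picture can be tried out numerically. The sketch below assumes two rapidly decaying functions (Gaussians picked here purely for illustration) so that nothing falls off the ends, and it truncates the real line to $[-12, 12]$. It compares the rate of change of $\int f(x+\epsilon)g(x)\,dx$ with $\int f'g\,dx$ and with $\int f g'\,dx$.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# Hypothetical rapidly decaying functions and their derivatives.
def f(x):
    return math.exp(-x * x)

def df(x):
    return -2.0 * x * math.exp(-x * x)

def g(x):
    return math.exp(-0.5 * (x - 1.0) ** 2)

def dg(x):
    return -(x - 1.0) * math.exp(-0.5 * (x - 1.0) ** 2)

def slide_check(eps=1e-4, lo=-12.0, hi=12.0):
    """Return (rate of change of I under sliding f, integral of f'g, integral of fg')."""
    def shifted(e):
        return midpoint_integral(lambda x: f(x + e) * g(x), lo, hi)
    rate = (shifted(eps) - shifted(-eps)) / (2 * eps)  # symmetric difference quotient
    int_df_g = midpoint_integral(lambda x: df(x) * g(x), lo, hi)
    int_f_dg = midpoint_integral(lambda x: f(x) * dg(x), lo, hi)
    return rate, int_df_g, int_f_dg

if __name__ == "__main__":
    print(slide_check())
```

The first two numbers should agree, and the third should be their negative: sliding $f$ forward has the same effect as sliding $g$ backward.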

$\endgroup$
2
  • 1
    $\begingroup$ Suppose that my integral is computed over the interval $[a,b]$. Could you elaborate on why terms that fall off at the endpoints have the form $fg|_{a}^b$? I have half-baked ideas about that, but I'm not finding them very convincing. Otherwise I like this explanation very much, thank you for posting it. $\endgroup$
    – Elle Najt
    Commented Jun 11, 2014 at 17:58
  • 1
    $\begingroup$ Imagine we're standing still watching $f$ sliding along between $a$ and $b$, with $g$ "at rest", and calculating $I$. Now we switch to a frame of reference where $f$ is at rest and $g$ is sliding along. The situation is almost the same, and we can use what I said above, but in this new frame of reference the endpoints $a$ and $b$ are moving. So to calculate the rate at which $I$ is changing we need to include the fact that to first order, if we slide a distance $\epsilon$, $\epsilon f(a)g(a)$ drops off one end and $\epsilon f(b)g(b)$ appears at the other end. $\endgroup$
    – Dan Piponi
    Commented Jun 11, 2014 at 19:28
13
$\begingroup$

I like to think of integration by parts as Fubini's Theorem. So if $$ F(x) = \int_a^x f(y) \, dy, \quad G(x) = \int_a^x g(y) \, dy ,$$ then

$$ \int_a^b F(x) g(x) \, dx + \int_a^b f(x) G(x) \, dx $$

$$ = \int_{x=a}^b \int_{y=a}^x f(y) g(x) \, dy \, dx + \int_{x=a}^b \int_{y=a}^x f(x) g(y) \, dy \, dx $$

$$ = \int_{x=a}^b \int_{y=x}^b f(x) g(y) \, dy \, dx + \int_{x=a}^b \int_{y=a}^x f(x) g(y) \, dy \, dx $$

where in the first half I switched the roles of $x$ and $y$, and then interchanged the order of integration,

$$ = \int_{x=a}^b \int_{y=a}^b f(x) g(y) \, dy \, dx = F(b) G(b) = [ F(x)G(x) ]_{x=a}^b $$

remembering that $F(a) = G(a) = 0$. (I know in essence this is the same as Henry Swanson's answer, but this is a different perspective.)
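
As a sanity check of the chain of equalities, here is a small numerical sketch. The functions $f(y) = \cos y$ and $g(y) = 2y$ and the interval $[0, 2]$ are assumptions made for this example only. The integral over the square is computed as a product of two one-dimensional integrals, which is valid because the integrand $f(x)g(y)$ factorizes.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

A, B = 0.0, 2.0  # hypothetical interval [a, b]

def f(y):
    return math.cos(y)

def g(y):
    return 2.0 * y

def F(x):
    """Antiderivative of f that vanishes at A."""
    return math.sin(x) - math.sin(A)

def G(x):
    """Antiderivative of g that vanishes at A."""
    return x * x - A * A

def fubini_sides():
    """Return (integral of F g + f G, integral over the square, F(B) G(B))."""
    left = midpoint_integral(lambda x: F(x) * g(x) + f(x) * G(x), A, B)
    square = midpoint_integral(f, A, B) * midpoint_integral(g, A, B)
    return left, square, F(B) * G(B)

if __name__ == "__main__":
    print(fubini_sides())
```

All three values should coincide up to quadrature error: the two triangles tile the square, and the integral over the square is $F(b)G(b)$.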

$\endgroup$
1
  • 3
    $\begingroup$ This looks similar to the fact that the product rule is an instance of a higher dimensional chain rule. If we have a product of two functions, $f(x)g(x)$, then we may write that as a composition of $(x,y)\mapsto (f(x), g(y))$ followed by the map, $(x,y)\mapsto xy$. If we apply the higher dimensional chain rule to this, then we get the product rule. $\endgroup$ Commented Jun 11, 2014 at 16:13
6
$\begingroup$

A mathematical idea is useful for what you can use it for. A "conceptual" explanation may be attractive, but is relatively worthless for actually using integration by parts to do anything.

Instead, you should think of integration by parts in terms of how it lets you manipulate integrals; e.g. when you have $x$ or $\ln x$ or $\arctan x$ in an integrand, you can arrange to apply integration by parts to differentiate it away into $1$ or $\frac{1}{x}$ or $\frac{1}{1+x^2}$ respectively, which potentially gives an integrand simpler than what you started with, depending on what the antiderivative of the cofactor is.
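
For instance, with $u = \ln x$ and $dv = dx$ the logarithm is differentiated away and the remaining integrand is just $x \cdot \frac{1}{x} = 1$. The sketch below compares that route with direct numerical integration on $[1, e]$; the interval and the helper names are choices made here for illustration.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def log_integral_direct(a, b):
    """Integrate ln x over [a, b] by direct quadrature."""
    return midpoint_integral(math.log, a, b)

def log_integral_by_parts(a, b):
    """Integrate ln x over [a, b] using u = ln x, dv = dx (so du = dx / x, v = x)."""
    boundary = b * math.log(b) - a * math.log(a)                   # [u v] from a to b
    remaining = midpoint_integral(lambda x: x * (1.0 / x), a, b)   # integral of v du
    return boundary - remaining

if __name__ == "__main__":
    print(log_integral_direct(1.0, math.e), log_integral_by_parts(1.0, math.e))
```

Both routes give the same value; the by-parts route only ever integrates a constant.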

Or as another example, in the case of the distributional derivative, it lets you remove a derivative from one factor in an integrand by differentiating the other factor.

This isn't really something one understands a priori: instead, by working through problems, one gains experience and eventually an intuitive understanding of how it can be used to simplify an integrand.

$\endgroup$
6
  • 6
    $\begingroup$ I don't understand why you are being downvoted, as there is certainly some merit to an answer like this. However, I do disagree with your first paragraph - conceptual explanations can sometimes help one be creative with applications, in a way that simply getting used to integration by parts as a calculational tool does not. For instance, I think that the description of this idea as an adjoint (when the boundary terms vanish) leads very naturally to considering distributions as a dual of the space of test functions. I think that that insight took a long time to achieve, historically speaking. $\endgroup$
    – Elle Najt
    Commented Jun 9, 2014 at 13:35
  • 11
    $\begingroup$ Dear @user54092 : I dunno: invalidating the user's pursuit of a conceptual explanation and then emphasizing a focus on what a process is useful for seems like it is encouraging all the bad thought patterns of students who think mathematics is "memorization and repetition of techniques that are useful for solving math problems." I know what Hurkyl is referring to, but in this case it seems like there are plenty of useful conceptual explanations to be had, so maybe that's why someone was critical... (didn't downvote btw) $\endgroup$
    – rschwieb
    Commented Jun 9, 2014 at 13:41
  • $\begingroup$ @rschwieb That makes sense. I do think that this response conveys something of that negative attitude, but I have also found that sometimes the best way to understand an idea is to understand how it is useful to other ideas. It can sometimes be easier to understand what something does than what it is (this is especially true outside of mathematics), though I don't think one should generalize that experience to say that ideas should only be thought of as tools for other means - they have an independent life. I was asking for how each user thinks of this idea, so this is a valid response in my mind. $\endgroup$
    – Elle Najt
    Commented Jun 9, 2014 at 14:40
  • $\begingroup$ @user54092 : Right, I think so too :) $\endgroup$
    – rschwieb
    Commented Jun 9, 2014 at 14:42
  • 2
    $\begingroup$ This answer is not very useful since OP clearly understands what the superficial utility of by-parts is and wants a deeper understanding. I wouldn't downvote it though, nor upvote. $\endgroup$
    – PA6OTA
    Commented Jun 10, 2014 at 3:49
4
$\begingroup$

I learned this point of view recently:

"Integration by parts is a consequence of the translation invariance of the Lebesgue measure"

What this means is the following:

$\int_{\mathbb{R}} \frac{f(x + h) - f(x)}{h} g(x) \, dx = \frac{1}{h} \left[ \int_{\mathbb{R}} f(x + h)g(x) \, dx - \int_{\mathbb{R}} f(x) g(x) \, dx \right] = \frac{1}{h} \left[ \int_{\mathbb{R}} f(y)g(y - h) \, dy - \int_{\mathbb{R}} f(y)g(y) \, dy \right] = \int_{\mathbb{R}} f(y) \frac{ g(y - h) - g(y)}{h} \, dy$

Translation invariance appears in the second equality, where we substitute $x + h = y$ into the first integral on the RHS.

Sending $h \to 0$ gives:

$\int_{\mathbb{R}} f'(x) g(x) dx = - \int_{\mathbb{R}} f(y) g'(y) dy$.

The boundary conditions can be obtained in the following way:

$$\int_a^b \frac{f(x + h) - f(x)}{h} g(x) \, dx = \frac{1}{h} \left[ \int_a^b f(x + h)g(x) \, dx - \int_a^b f(x) g(x) \, dx \right] = \frac{1}{h} \left[ \int_{a + h}^{b + h} f(y)g(y - h) \, dy - \int_a^b f(y)g(y) \, dy \right]$$ $$= \frac{1}{h} \left[ \int_{a}^{b} f(y)g(y - h) \, dy - \int_a^{a+h} f(y)g(y -h) \, dy + \int_b^{b+h} f(y) g(y - h) \, dy - \int_a^b f(y)g(y) \, dy \right]$$ $$ = \frac{1}{h} \left[ \int_{a}^{b} f(y)[g(y - h) - g(y)] \, dy - \int_a^{a+h} f(y)g(y -h) \, dy + \int_b^{b+h} f(y) g(y - h) \, dy \right]$$

Whereupon sending $h \to 0$ gives $\int_a^b f'(x) g(x) dx = - \int_a^b f(y) g'(y) dy - f(a)g(a) + f(b) g(b) $.
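
The finite-$h$ identity holds exactly for every $h$, before any limit is taken, which makes it easy to test. The sketch below uses $f = \sin$, $g(x) = e^{-x}$, $[a, b] = [0, 1]$ and $h = 0.1$, all of which are assumptions chosen for this example. It also checks the limiting formula $\int_a^b f'g\,dx = -\int_a^b fg'\,dy + f(b)g(b) - f(a)g(a)$.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# Hypothetical smooth functions and their derivatives.
def f(x):
    return math.sin(x)

def df(x):
    return math.cos(x)

def g(x):
    return math.exp(-x)

def dg(x):
    return -math.exp(-x)

def finite_h_sides(a, b, h):
    """Both sides of the shifted-integral identity for a fixed step h > 0."""
    left = midpoint_integral(lambda x: (f(x + h) - f(x)) / h * g(x), a, b)
    bulk = midpoint_integral(lambda y: f(y) * (g(y - h) - g(y)), a, b)
    lost_at_a = midpoint_integral(lambda y: f(y) * g(y - h), a, a + h)
    gained_at_b = midpoint_integral(lambda y: f(y) * g(y - h), b, b + h)
    right = (bulk - lost_at_a + gained_at_b) / h
    return left, right

def limit_sides(a, b):
    """Both sides of the integration by parts formula obtained as h -> 0."""
    left = midpoint_integral(lambda x: df(x) * g(x), a, b)
    right = (-midpoint_integral(lambda y: f(y) * dg(y), a, b)
             + f(b) * g(b) - f(a) * g(a))
    return left, right

if __name__ == "__main__":
    print(finite_h_sides(0.0, 1.0, 0.1))
    print(limit_sides(0.0, 1.0))
```

The two short integrals over $[a, a+h]$ and $[b, b+h]$ are exactly the pieces that leave and enter the interval under the shift; divided by $h$, they tend to $f(a)g(a)$ and $f(b)g(b)$.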

The broader picture that this story fits into is that this shows what must change in order to obtain the integration by parts formula in the Malliavin calculus. This also shows how to obtain an integration by parts formula against a measure whose translation in a particular direction is absolutely continuous.

$\endgroup$
