4
$\begingroup$

I'm studying calculus of variations and Lagrangian mechanics, and I don't understand something about the variational operator.

Let's say for example that I have a Lagrangian $L [x(t), \dot{x}(t), t] $, which is a functional of a position function $ x(t) $, its derivative, and time.

If I take the variation of the Lagrangian, this is

$ \delta L = \frac{\partial L}{\partial x} \delta x + \frac{\partial L}{\partial \dot{x}} \delta \dot{x} $

Which is

$ \delta L = \frac{\partial L}{\partial x} \delta x + \frac{\partial L}{\partial \dot{x}} \frac{d}{dt} (\delta x) $

My question is, what is $\delta x $ ?

Is it defined, like $ \delta L $, as

$ \delta x = \frac{\partial x}{\partial t} dt $

But this is just the differential $ dx $.

What is the difference? Or are they the same thing?

$\endgroup$

4 Answers

3
$\begingroup$

In this answer I want to highlight the final paragraph of the answer to this question by contributor 'hft'.

A common approach in expositions of calculus of variations is to implement the variation with a helper function, often denoted $\eta(t)$.

This helper function $\eta(t)$ is defined exclusively between two time coordinates, $t_1$ and $t_2$. (In order to facilitate derivation of the Euler-Lagrange equation, the helper function must be such that its value is zero at those two time coordinates.)

With that setup, sweeping out variation is implemented by multiplying $\eta(t)$ by a factor, often denoted $\epsilon$.

(The setup with multiplication allows the helper function $\eta(t)$ to be an arbitrary function; the multiplication ensures that at $\epsilon = 0$ the function $\epsilon\eta(t)$ gives zero variation.)

Using $\epsilon\eta(t)$ to sweep out variation:

$$ \int_{t_1}^{t_2} F \big( \ x(t) + \epsilon \eta(t), \ x'(t) + \epsilon \eta'(t) \ \big) dt \tag{1} $$

Here $F$ is the integrand of the functional.

To obtain the information that you need: take the derivative with respect to $\epsilon$ at the point where $\epsilon$ is zero. That is, you take the derivative with respect to the variation at the point in variation space where the variation is zero.

One way to notate that is as follows:

$$ \frac{d}{d\epsilon}\int_{t_1}^{t_2} F \big( \ x(t) + \epsilon \eta(t), \ x'(t) + \epsilon \eta'(t) \ \big) dt \ \bigg\rvert_{\epsilon=0} \tag{2} $$

Notated in this way: it isn't necessary to think of this process as taking the limit of $\epsilon$ going to zero. It's just: usually when you take a derivative you are interested in that derivative for the entire domain. Here you need that derivative only for a single point in the variation space: $\epsilon = 0$.

Since the derivative with respect to $\epsilon$ is evaluated at a single value of $\epsilon$, the factor $\epsilon$ drops out: the Euler-Lagrange equation does not contain it.
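For a quick numerical sanity check of Eq. (2), here is a short sketch with made-up concrete choices of mine (integrand $F(x, x') = \tfrac{1}{2}x'^2 - x^2$, path $x(t) = t^2$, helper $\eta(t) = t(1-t)$ on $[0,1]$, none of which come from the answer itself): the central-difference derivative with respect to $\epsilon$ at $\epsilon = 0$ matches the Euler-Lagrange form obtained by integration by parts.

```python
import numpy as np

# Made-up concrete choices, purely to illustrate Eq. (2):
# integrand F(x, x') = x'**2 / 2 - x**2, path x(t) = t**2,
# helper eta(t) = t*(1 - t), which vanishes at t1 = 0 and t2 = 1.
t = np.linspace(0.0, 1.0, 20001)
dt = t[1] - t[0]
x, xdot = t**2, 2*t
eta, etadot = t*(1 - t), 1 - 2*t

def trapz(f):
    """Trapezoid-rule integral of the samples f over the grid t."""
    return float(np.sum((f[:-1] + f[1:]) * dt / 2))

def S(eps):
    """The functional of Eq. (1), evaluated on the varied path x + eps*eta."""
    xv, xvdot = x + eps*eta, xdot + eps*etadot
    return trapz(xvdot**2 / 2 - xv**2)

# Left-hand side of Eq. (2): d/d(eps) at eps = 0, via a central difference.
h = 1e-6
dS_deps = (S(h) - S(-h)) / (2*h)

# After integration by parts (eta vanishes at the endpoints) this equals
# the Euler-Lagrange form: integral of (dF/dx - d/dt dF/dx') * eta,
# with dF/dx = -2x, dF/dx' = x', and x'' = 2 for x = t**2.
el_form = trapz((-2*x - 2.0) * eta)

print(dS_deps, el_form)  # both are approximately -13/30 ≈ -0.4333
```

The agreement of the two numbers is exactly the content of Eq. (2): the derivative with respect to $\epsilon$ at $\epsilon = 0$ picks out the first-order variation.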

I prefer this notation/implementation of the process of sweeping out variation; it makes the distinction clear.

$\endgroup$
3
  • $\begingroup$ Upvoting and agreeing. I'll suggest to write the answer for a function $F(x(t), x'(t), t)$ with an explicit dependence on the independent variable $t$. This way, getting the very same Lagrange equation as the result of the stationary condition of the functional, we answer the typical and frequent question about the absence of time partial derivative of the function in Lagrange equation $\endgroup$
    – basics
    Commented Jan 20 at 23:29
  • $\begingroup$ I'll add an answer later $\endgroup$
    – basics
    Commented Jan 21 at 8:11
  • $\begingroup$ I've added an answer below. Try to have a look, if it's more clear what I meant before $\endgroup$
    – basics
    Commented Jan 21 at 9:56
3
$\begingroup$

Let's say for example that I have a Lagrangian $L [x(t), \dot{x}(t), t] $ which is a functional

No, $L$ is not a functional. In physics we reserve the term "functional" to refer to a function that takes as input an entire function and returns a number.

For example, the action $$ S[x]=\int_{t_1}^{t_2} L(x(t),\dot x(t))dt $$ is a functional of the function $x(t)$. Physicists use the square bracket, $[\ldots]$, notation to indicate a functional's argument list, whereas we use the parenthesis notation, $(\ldots)$ to indicate a function's argument list.

The Lagrangian $L(q,v)$ is still just a function of two arguments, whose arguments you can think of as being evaluated at the values $x(t)$ and $\dot x(t)$.

My question is, what is $\delta x $ ?

Here, $x(t)$ is a function of time, and $\delta x(t)$ is another function of time, but the notation $\delta x(t)$ is used because we are supposed to think of $\delta x(t)$ as "small."

Here, when we think of something as "small" all this really means is that we throw away all but the first order changes with respect to the small parameter.

For example, if $$ L_1(q,v) = q^2 v\tag{1} $$ then $$ L_1(x(t)+\delta x(t), \dot x(t) + \delta \dot x(t)) = (x(t) + \delta x(t))^2(\dot x(t) + \delta \dot x(t)) $$ and the first order change is: $$ \delta L_1 = 2x(t)\delta x(t)\dot x(t) + x(t)^2\delta\dot x(t) $$ $$ =\left.\frac{\partial L_1}{\partial q}\right|_{q=x(t),v=\dot x(t)}\delta x(t) + \left.\frac{\partial L_1}{\partial v}\right|_{q=x(t), v=\dot x(t)}\delta \dot x(t) $$
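If it helps, the first-order expansion above can be checked numerically at a single instant, with made-up sample values (my own, for illustration) for $x$, $\dot x$, $\delta x$, and $\delta \dot x$:

```python
# A small numerical check of the first-order expansion above, at a single
# instant, with made-up sample values for x, xdot and their variations.
x, xdot = 1.3, -0.7          # x(t) and dx/dt at some instant
dx, dxdot = 0.2, 0.5         # delta x(t) and delta xdot(t) there

def L1(q, v):
    return q**2 * v          # the example Lagrangian of Eq. (1)

# Predicted first-order change: (dL1/dq) * dx + (dL1/dv) * dxdot
delta_L1 = 2*x*xdot*dx + x**2 * dxdot

# Exact change for the scaled variation eps*(dx, dxdot), divided by eps:
# as eps -> 0 this ratio tends to the first-order coefficient above.
eps = 1e-6
ratio = (L1(x + eps*dx, xdot + eps*dxdot) - L1(x, xdot)) / eps
print(delta_L1, ratio)       # agree to about six significant figures
```

Making the variation smaller makes the agreement better, which is exactly what "throw away all but the first order changes" means.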

Similarly, if we are interested in the variation of the action $S$, when we treat $\delta x(t)$ as "small" we throw away all but the first order change in the action: $$ S[x +\delta x] - S[x] \equiv \delta S + O(\delta x^2)\;, $$ where my $\delta S$ means only the first order change.

We thus have: $$ \delta S = \int_{t_1}^{t_2}\underbrace{\left(\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot x}\right)}\delta x(t)dt\tag{A} $$

In analogy to partial differentiation, we call just the underbraced quantity in Eq. (A) above the "functional derivative" and thus write: $$ \frac{\delta S}{\delta x(t)} = \left(\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot x}\right)\;.\tag{B} $$

Setting the functional derivative of the action to zero results in the Euler-Lagrange equations of motion.


Is it defined like $ \delta L $ like

$ \delta x = \frac{\partial x}{\partial t} dt $

No, $\delta x(t)$ is not defined like that. It is just an (almost) arbitrary "small" function defined over the same time period that $x(t)$ is. The only constraints on $\delta x$ are usually that $\delta x(t_1)=0$ and $\delta x(t_2)=0$, which allows us to perform a trivial integration by parts and more easily write down the functional derivative.

If you would prefer to think of the "small" function as an arbitrary function $\eta(t)$ times a constant number $\epsilon$, in the limit that $\epsilon$ goes to zero, that is fine too.


Update:

To say a little more about the $\epsilon$ method, as mentioned by other answers, the functional derivative can be generated by considering the action $S[x+\epsilon\eta]=S(\epsilon)$ to be a function of epsilon and identifying: $$ \left.\frac{dS}{d\epsilon}\right|_{\epsilon=0}\equiv \int dt\left(\frac{\delta S}{\delta x(t)}\right)\eta(t)\;. $$

For example, using the example Lagrangian of Eq. (1) above, we have: $$ S_1(\epsilon)=\int_{t_1}^{t_2}dt\, L_1(x(t)+\epsilon\eta(t), \dot x(t) + \epsilon\dot\eta(t))=\int_{t_1}^{t_2}dt\, (x(t)+\epsilon\eta(t))^2(\dot x(t)+\epsilon\dot \eta(t)) $$ and then differentiating under the integral sign, applying the chain rule, and integrating by parts gives: $$ \left.\frac{dS_1}{d\epsilon}\right|_{\epsilon=0}=\int_{t_1}^{t_2}dt\left(2x(t)\dot x(t) - \frac{d x^2(t)}{dt}\right)\eta(t) $$ $$ =\int_{t_1}^{t_2}dt\left(\frac{\partial L_1}{\partial x}-\frac{d}{dt}\frac{\partial L_1}{\partial \dot x}\right)\eta(t) $$
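A side observation on this example (my own remark, consistent with the last integrand above): $2x\dot x - \frac{d x^2}{dt}$ vanishes identically, because $L_1 = x^2\dot x = \frac{d}{dt}(x^3/3)$ is a total time derivative, so $\frac{dS_1}{d\epsilon}\big|_{\epsilon=0} = 0$ for every path. A short numerical sketch, with a made-up path and helper function, confirms this:

```python
import numpy as np

# Made-up path and helper function on [0, 1]; eta vanishes at both ends.
t = np.linspace(0.0, 1.0, 20001)
dt = t[1] - t[0]
x, xdot = np.sin(3*t), 3*np.cos(3*t)
eta, etadot = t*(1 - t), 1 - 2*t

def S1(eps):
    """Action for L1 = x**2 * xdot, on the varied path x + eps*eta."""
    xv, xvdot = x + eps*eta, xdot + eps*etadot
    f = xv**2 * xvdot
    return float(np.sum((f[:-1] + f[1:]) * dt / 2))   # trapezoid rule

# dS1/d(eps) at eps = 0, via central difference.  The integrand
# 2*x*xdot - d(x**2)/dt vanishes identically, so the result should be
# ~0 for ANY path: L1 = d(x**3/3)/dt is a total time derivative.
h = 1e-5
dS1_deps = (S1(h) - S1(-h)) / (2*h)
print(dS1_deps)   # ~0, up to discretization and round-off error
```

This is the usual statement that adding a total time derivative to the Lagrangian does not change the equations of motion.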


I often find it helpful to work out simple examples like $L_1(x,\dot x) = x^2\dot x$ explicitly, since it helps see what is happening in the general case.

$\endgroup$
2
$\begingroup$

$\delta x(t) = x_2(t) - x_1(t)$ measures the difference between two functions (or paths) and is itself a function. If the two functions are very "close", we say that it is an infinitesimal variation. The sense of "close" can be defined by a "distance" between two functions. One way is to define: $$\text{distance}(x_1,x_2) = \|x_1-x_2\| := \bigg( \int_{I} dt \; |x_1(t)-x_2(t)|^{2} \bigg)^{1/2}$$ There are many ways to define such a distance.

As for $dx$, one may consider it as the difference $dx = x(t + dt) - x(t)$, to first order in the time increment $dt$. The two objects have different meanings: $\delta x$ compares two different paths at the same time, while $dx$ compares the same path at two nearby times.


Under this sense, the Euler-Lagrange equation is derived in the following steps:

  1. Assume that $S = \int dt\; L(x_{s}(t),\dot{x}_{s}(t),t)$ reaches a local minimum at $x_{s}$, the true physical path we seek.
  2. Since $S = \int dt \; L(x_{s}(t),\dot{x}_{s}(t),t)$ reaches a local minimum, for any other path $x(t)$ close enough to $x_{s}(t)$ we have: $$S[x] - S[x_s] \sim \mathcal{O}( (\delta x)^2)$$ just as in elementary calculus: if a function $f(x)$ reaches a local minimum at $x_0$, then $f'(x_0) = 0$, so $$f(x) = f(x_0) + f'(x_0) (x-x_0) + \mathcal{O}((x-x_0)^2) \rightarrow f(x) - f(x_0) \sim \mathcal{O}((x-x_0)^2)$$ As a result, $$S[x] - S[x_s] = \int dt \bigg(\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}}\bigg)\bigg|_{x = x_s} \delta x + \mathcal{O}((\delta x)^2)$$ Hence, we see that $$\bigg(\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}}\bigg)\bigg|_{x = x_s} = 0$$
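The local-minimum statement in step 2 can be illustrated numerically. As a sketch (using the free-particle Lagrangian $L = \frac{1}{2}\dot{x}^2$, a hypothetical choice of mine, whose true path between fixed endpoints is a straight line):

```python
import numpy as np

# Illustration of step 2 with the free-particle Lagrangian L = xdot**2 / 2
# (a hypothetical concrete choice).  The true path between x(0) = 0 and
# x(1) = 1 is the straight line x_s(t) = t; we compare its action against
# perturbed paths x_s + a*sin(pi*t), whose perturbation vanishes at the ends.
t = np.linspace(0.0, 1.0, 10001)
dt = t[1] - t[0]

def action(a):
    xdot = 1.0 + a*np.pi*np.cos(np.pi*t)   # d/dt of t + a*sin(pi*t)
    f = xdot**2 / 2
    return float(np.sum((f[:-1] + f[1:]) * dt / 2))   # trapezoid rule

amps = [-0.2, -0.1, 0.0, 0.1, 0.2]
actions = [action(a) for a in amps]
print(dict(zip(amps, actions)))
# The action is minimal at a = 0, and S(a) - S(0) grows quadratically in a:
# the first-order variation vanishes on the true path.
```

The quadratic growth in the perturbation amplitude $a$ is exactly the $S[x] - S[x_s] \sim \mathcal{O}((\delta x)^2)$ of step 2.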
$\endgroup$
1
$\begingroup$

Broad definition of functional

Let's write a function $F(\dot{x}(t), x(t), t)$, whose first two arguments are a function $x(t)$ and its time derivative $\dot{x}(t)$, and whose third argument is the independent variable $t$ itself.

Since the first two arguments of $F$ are functions, we could broadly call $F$ a functional.

As suggested by @hft, in Physics and in other fields, we're usually interested in functionals defined with a definite integral \begin{equation} I[x(t)] = \int_{t \in \Omega} F(\dot{x}(t), x(t), t) dt \ . \end{equation}

Why functionals and variations?

Now, many problems in Math and Physics can be formulated as a principle of stationarity of a functional with respect to the function argument $x(t)$. How do we find that function? Approximately as we do for derivatives:

  • introducing a (small) "increment"/variation function $\varepsilon \eta(t)$, satisfying all the constraints of the problem (if any); here the "small" parameter $\varepsilon$ is introduced so that it can be sent to zero later;
  • evaluating the difference between $F$ with the "varied function argument" $x(t)+\varepsilon \eta(t)$ and with the "original function argument" $x(t)$;
  • taking this difference as a function of $\varepsilon$ and performing a series expansion around $\varepsilon = 0$, \begin{equation}\begin{aligned} \delta F(\dot{x}(t), x(t), t; \varepsilon) & = F(\dot{x}(t)+\varepsilon \dot{\eta}(t), x(t)+\varepsilon\eta(t),t) - F(\dot{x}(t), x(t),t) = \\ & = \varepsilon \, \dot{\eta}(t) \, \dfrac{\partial F}{\partial \dot{x}}(\dot{x}(t), x(t),t) + \varepsilon \, \eta(t) \, \dfrac{\partial F}{\partial x}(\dot{x}(t), x(t),t) + O(\varepsilon^2) \ \ , \end{aligned}\end{equation}
  • taking the limit of the incremental ratio for $\varepsilon \rightarrow 0$ \begin{equation} \lim_{\varepsilon \rightarrow 0} \dfrac{ \delta F(\dot{x}(t), x(t), t; \varepsilon)}{ \varepsilon} = \dot{\eta}(t) \, \dfrac{\partial F}{\partial \dot{x}}(\dot{x}(t), x(t),t) + \eta(t) \, \dfrac{\partial F}{\partial x}(\dot{x}(t), x(t),t) \ . \end{equation}

Now, and only now (once you have understood what we've done so far), I'd suggest calling $\eta(t) = \delta x(t)$, to remind you that it's the function we used to vary $x(t)$, and writing $\delta F = \lim_{\varepsilon \rightarrow 0} \frac{ \delta F(\dot{x}(t), x(t), t; \varepsilon)}{ \varepsilon}$ for brevity (and to remind you that it's the result of a variation). With this notation, we can re-write \begin{equation} \delta F(\dot{x}(t), x(t), t) = \delta \dot{x}(t) \dfrac{\partial F}{\partial \dot{x}}(\dot{x}(t), x(t), t) + \delta x(t) \dfrac{\partial F}{\partial x}(\dot{x}(t), x(t), t) \ . \end{equation}

Note: variations and differentials.

This last paragraph aims at answering the last question, about the equivalence of differentials and variations, and at highlighting that they differ by the term $\frac{\partial F}{\partial t}$ (thus completing the comment under @Cleonis's answer).

If we take $F(\dot{x}(t), x(t), t)$ as a function of time, $\tilde{F}(t) := F(\dot{x}(t), x(t), t)$, its differential reads \begin{equation} dF = d \dot{x}(t) \dfrac{\partial F}{\partial \dot{x}}(\dot{x}(t), x(t), t) + d x(t) \dfrac{\partial F}{\partial x}(\dot{x}(t), x(t), t) + \dfrac{\partial F}{\partial t} (\dot{x}(t), x(t), t) \, dt \ , \end{equation} and thus, besides the different notation, it differs from the variation of $F$ by the presence of the term $\frac{\partial F}{\partial t} (\dot{x}(t), x(t), t) \, dt$.

The reason why this term doesn't appear in the variation of the functional comes from the process used to evaluate the variation, and specifically from the fact that we can only vary the unknown function argument $x(t)$, while we cannot vary the independent variable $t$.

Functionals, variations, mechanics, virtual work at fixed time. This is the very same reason why, if you have to evaluate virtual work (or use any other variational method in mechanics), you need to compute that work at "fixed time", i.e. without varying the independent variable $t$.

Lagrangian and action

Sometimes the functional is written as an integral, like the relationship between the action $S$ and the Lagrangian function $L$,

\begin{equation} S[x(t)] = \int_{t_1}^{t_2} L(\dot{x}(t), x(t), t) dt \ . \end{equation}

If you proceed in the same way as before, introducing a "small" variation $\varepsilon \eta(t)$ of the unknown function $x(t)$ and evaluating the limit of the incremental ratio for $\varepsilon \rightarrow 0$, you end up with

\begin{equation} \delta S = \int_{t_1}^{t_2} \left[ \dot{\eta}(t) \dfrac{\partial L}{\partial \dot{x}}(\dot{x}(t), x(t), t) + \eta(t) \dfrac{\partial L}{\partial x}(\dot{x}(t), x(t), t) \right] dt \ . \end{equation}

Prescribing the stationarity of the functional $S$, i.e. $\delta S = 0$, performing integration by parts, and exploiting the arbitrariness of $\eta(t)$ (subject to the conditions $\eta(t_1) = \eta(t_2) = 0$), it's now easy to get the Lagrange equations

\begin{equation} \dfrac{d}{dt} \dfrac{\partial L}{\partial \dot{x}} - \dfrac{\partial L}{\partial x} = 0 \ . \end{equation}
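As a numerical sanity check of the stationarity condition $\delta S = 0$ (using the harmonic-oscillator Lagrangian $L = \frac{1}{2}\dot{x}^2 - \frac{1}{2}x^2$, a hypothetical choice of mine, whose Lagrange equation is $\ddot{x} + x = 0$), one can verify that $\delta S$ vanishes on a solution of the Lagrange equation but not on an arbitrary path:

```python
import numpy as np

# Check of delta S = 0, assuming the harmonic-oscillator Lagrangian
# L = xdot**2/2 - x**2/2 (a hypothetical choice); its Lagrange equation
# is x'' + x = 0, solved e.g. by x(t) = sin(t).
t = np.linspace(0.0, 1.0, 20001)
dt = t[1] - t[0]
eta, etadot = t*(1 - t), 1 - 2*t          # variation vanishing at t1, t2

def dS(x, xdot):
    """First variation: integral of (dL/dxdot * etadot + dL/dx * eta)."""
    f = xdot*etadot - x*eta
    return float(np.sum((f[:-1] + f[1:]) * dt / 2))   # trapezoid rule

on_solution = dS(np.sin(t), np.cos(t))    # x'' + x = 0 holds here
off_solution = dS(t**2, 2*t)              # x'' + x = 2 + t**2 != 0

print(on_solution, off_solution)
# on_solution is ~0; off_solution is clearly nonzero.
```

The first variation vanishes for every admissible $\eta(t)$ only on paths that satisfy the Lagrange equation, which is the content of the stationarity principle.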

$\endgroup$
3
  • $\begingroup$ We have: in dynamics the objective is to end up with a function that gives the trajectory as a function of time. In a cartesian coordinate system the position coordinate axis is perpendicular to the time coordinate axis. I think of the following expression as an operator: $$ \dfrac{d}{dt} \dfrac{\partial L}{\partial \dot{x}} - \dfrac{\partial L}{\partial x} $$ I like to call it the 'Euler-Lagrange operator'. In dynamics the Euler-Lagrange operator specifies differentiation with respect to the position coordinate. Exclusively differentiation wrt to position. [continuing...] $\endgroup$
    – Cleonis
    Commented Jan 21 at 11:11
  • $\begingroup$ [...continued] With the previous in place: the following notation doesn't make sense to me: $$ S[x(t)] = \int_{t_1}^{t_2} L(\dot{x}(t), x(t), t) dt$$ The difference: the following notation does make sense to me: $$ S[x(t)] = \int_{t_1}^{t_2} L(\dot{x}(t), x(t)) dt$$ Related example: the wikipedia article about stationary action features an often used picture with multiple trajectories, among them a trajectory that loops back on itself. That loop is in violation of how the Euler-Lagrange equation is derived; Euler-Lagrange is exclusively variation of the position coordinate. $\endgroup$
    – Cleonis
    Commented Jan 21 at 11:11
  • $\begingroup$ try to write the Lagrangian of a system with a prescribed coordinate; that coordinate may depend of time, and it's not among the free generalized coordinates $x(t)$: in such a situation, an explicit dependence of time may arise in $L$. (Maybe) obviously, with such an explicit dependence on time, energy conservation is no more granted. And such a time-dependence is not related to any "loop trajectory" $\endgroup$
    – basics
    Commented Jan 21 at 11:16