Calculus of Variations

Derivatives answer a simple question¹:

How much does an output change when we perturb an input in a given direction?

That “perturb it and watch what happens” idea scales up nicely:

  • from numbers,
  • to vectors,
  • to functions (this is where calculus of variations lives).

Functions of a real variable

Let $f:\mathbb{R}\to\mathbb{R}$. The derivative at $x$ can be viewed as follows: push $x$ a tiny amount $\varepsilon$ in a direction $\delta x$, and measure the rate of change at $\varepsilon=0$:

$$f'(x)\,\delta x = \left.\frac{d}{d\varepsilon}\, f(x+\varepsilon\,\delta x)\right|_{\varepsilon=0}.$$

If you pick $\delta x = 1$, you recover the usual definition:

$$f'(x)=\left.\frac{d}{d\varepsilon}\, f(x+\varepsilon)\right|_{\varepsilon=0}.$$

So $f'(x)$ is the linear map that takes a small input change $\delta x$ and predicts the corresponding output change.
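That "perturb and measure" definition is directly checkable by computer. A minimal sketch (the function $f(x)=x^3$ and the sample point are illustrative choices, not from the text):

```python
# Sanity check (a sketch): the derivative times delta_x should match the
# finite-difference rate of change under the perturbation x + eps*delta_x.
def f(x):
    return x ** 3

def rate_of_change(f, x, delta_x, eps=1e-6):
    # Central-difference approximation of d/d(eps) f(x + eps*delta_x) at eps = 0.
    return (f(x + eps * delta_x) - f(x - eps * delta_x)) / (2 * eps)

x, dx = 2.0, 0.5
print(rate_of_change(f, x, dx))   # close to f'(2) * dx = 12 * 0.5 = 6
```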


Functions on vector spaces

Let $f:\mathbb{R}^k\to\mathbb{R}$. The right generalization of “derivative” is the differential (a linear map):

$$df(x)[\delta m] = \left.\frac{d}{d\varepsilon}\, f(x+\varepsilon\,\delta m)\right|_{\varepsilon=0}, \qquad \delta m\in\mathbb{R}^k.$$

This is the directional derivative in direction $\delta m$.

In Euclidean space, the differential is represented by the gradient:

$$df(x)[\delta m] = \nabla f(x)\cdot \delta m.$$

So:

  • $df(x)[\cdot]$ is a linear functional (it takes a vector and returns a scalar),
  • $\nabla f(x)$ is the vector that represents that functional via the dot product.
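The identity $df(x)[\delta m] = \nabla f(x)\cdot\delta m$ can also be verified numerically. A sketch, with an illustrative $f(x,y) = x^2 + \sin y$ and an arbitrary point and direction:

```python
import math

# Sketch: the directional derivative should equal the dot product of the
# gradient with the direction. Function, point, and direction are illustrative.
def f(v):
    x, y = v
    return x * x + math.sin(y)

def grad_f(v):
    x, y = v
    return (2.0 * x, math.cos(y))

def directional_derivative(f, v, dm, eps=1e-6):
    plus = f((v[0] + eps * dm[0], v[1] + eps * dm[1]))
    minus = f((v[0] - eps * dm[0], v[1] - eps * dm[1]))
    return (plus - minus) / (2 * eps)

v, dm = (1.0, 0.5), (0.3, -0.7)
dot = sum(g * d for g, d in zip(grad_f(v), dm))
print(abs(directional_derivative(f, v, dm) - dot) < 1e-6)   # True
```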

From vectors to functions: functionals

Now we level up.

A functional is a map that takes a function as input and returns a real number: $J[u]\in\mathbb{R}$.

Here $u$ might be a function $u:[a,b]\to\mathbb{R}$, or $u:\Omega\subset\mathbb{R}^n\to\mathbb{R}$.

A variation is a small perturbation of the function:

$$u_\varepsilon = u + \varepsilon v,$$

where $v$ is a “test direction” (often assumed smooth, and often with $v(a)=v(b)=0$ if boundary values are fixed).

First variation (the analogue of the directional derivative)

$$\delta J[u](v) = \left.\frac{d}{d\varepsilon}\, J[u+\varepsilon v]\right|_{\varepsilon=0}.$$

This is the calculus of variations version of $df(x)[\delta m]$.

We say $u$ is a stationary point of $J$ (a candidate minimizer/maximizer) if:

$$\delta J[u](v)=0 \quad \text{for all admissible } v.$$

That condition is the analogue of “gradient equals zero”.
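The first variation can be approximated on a grid exactly as in the scalar case: perturb $u$ by $\varepsilon v$ and difference. A sketch, using the illustrative functional $J[u]=\int_0^1 \tfrac12 (u')^2\,dx$, a straight-line $u$ (stationary) and a bent $u$ (not stationary):

```python
import math

# Sketch: approximate dJ[u](v) = d/de J[u + e v]|_{e=0} by central differences
# on a grid. The functional and the test functions below are illustrative.
N = 200
h = 1.0 / N
xs = [i * h for i in range(N + 1)]

def J(u):
    # Discretization of \int_0^1 (1/2) u'^2 dx with forward differences.
    return sum(0.5 * ((u[i + 1] - u[i]) / h) ** 2 * h for i in range(N))

def first_variation(J, u, v, eps=1e-6):
    up = [ui + eps * vi for ui, vi in zip(u, v)]
    um = [ui - eps * vi for ui, vi in zip(u, v)]
    return (J(up) - J(um)) / (2 * eps)

v = [math.sin(math.pi * x) for x in xs]          # admissible: v(0) = v(1) = 0
line = [2.0 * x + 1.0 for x in xs]               # straight line
bump = [x + math.sin(math.pi * x) for x in xs]   # bent curve

print(abs(first_variation(J, line, v)) < 1e-6)   # True: the line is stationary
print(abs(first_variation(J, bump, v)) > 0.1)    # True: the bump is not
```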


The classic setup: integral functionals

A very common functional looks like:

$$J[u] = \int_a^b L\big(x, u(x), u'(x)\big)\,dx,$$

where $L$ is called the Lagrangian.

Let’s compute the first variation.

  1. Perturb $u$ as $u_\varepsilon = u + \varepsilon v$. Then:
$$u_\varepsilon' = u' + \varepsilon v'.$$
  2. Differentiate under the integral sign:
$$\delta J[u](v) = \left.\frac{d}{d\varepsilon}\right|_{\varepsilon=0} \int_a^b L\big(x, u+\varepsilon v, u'+\varepsilon v'\big)\,dx = \int_a^b \left( \frac{\partial L}{\partial u}\,v + \frac{\partial L}{\partial u'}\,v'\right)dx.$$
  3. Remove the derivative on $v'$ using integration by parts:
$$\int_a^b \frac{\partial L}{\partial u'}\,v'\,dx = \left[\frac{\partial L}{\partial u'}\,v\right]_a^b - \int_a^b \frac{d}{dx}\left(\frac{\partial L}{\partial u'}\right)v\,dx.$$

If boundary values are fixed, we require $v(a)=v(b)=0$, so the boundary term vanishes. Then:

$$\delta J[u](v) = \int_a^b \left( \frac{\partial L}{\partial u} - \frac{d}{dx}\left(\frac{\partial L}{\partial u'}\right) \right)v\,dx.$$
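The integration-by-parts step is easy to check numerically. A sketch, with illustrative stand-ins $g = \sin$ playing the role of $\partial L/\partial u'$ and $v = x(1-x)$ as a test direction on $[0,1]$:

```python
import math

# Sketch: verify \int_a^b g v' dx = [g v]_a^b - \int_a^b g' v dx numerically.
# g and v are illustrative choices; v vanishes at both endpoints.
a, b, N = 0.0, 1.0, 20000
h = (b - a) / N

def integrate(f):
    # Midpoint rule.
    return sum(f(a + (i + 0.5) * h) for i in range(N)) * h

g, gp = math.sin, math.cos
v = lambda x: x * (1.0 - x)
vp = lambda x: 1.0 - 2.0 * x

lhs = integrate(lambda x: g(x) * vp(x))
rhs = (g(b) * v(b) - g(a) * v(a)) - integrate(lambda x: gp(x) * v(x))
print(abs(lhs - rhs) < 1e-8)   # True; the boundary term is zero since v(a)=v(b)=0
```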

For this to vanish for all admissible $v$, the fundamental lemma of the calculus of variations forces the factor multiplying $v$ to be identically zero:

$$\boxed{\ \frac{\partial L}{\partial u} - \frac{d}{dx}\left(\frac{\partial L}{\partial u'}\right) = 0\ }$$

This is the Euler–Lagrange equation.

That is the main engine of classical calculus of variations.
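To see the equation in action on a Lagrangian with both terms present, take the illustrative choice $L = \tfrac12 (u')^2 + \tfrac12 u^2$. Euler–Lagrange gives $u - u'' = 0$, and $u(x) = \cosh x$ is a solution, which we can confirm with finite differences:

```python
import math

# Sketch: for L = (1/2) u'^2 + (1/2) u^2 (an illustrative Lagrangian), the
# Euler-Lagrange equation is u - u'' = 0. Check that u(x) = cosh(x) solves it.
def u(x):
    return math.cosh(x)

def second_derivative(f, x, h=1e-4):
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

residuals = [u(x) - second_derivative(u, x) for x in (0.0, 0.5, 1.0)]
print(all(abs(r) < 1e-5 for r in residuals))   # True
```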


Boundary conditions (quick sanity)

If the endpoints are fixed ($u(a)$ and $u(b)$ prescribed), then $v(a)=v(b)=0$.

If an endpoint is free, the boundary term does not automatically vanish, and you get a natural boundary condition:

$$\left.\frac{\partial L}{\partial u'}\right|_{x=a} = 0 \quad\text{or}\quad \left.\frac{\partial L}{\partial u'}\right|_{x=b} = 0,$$

depending on which end is free (details depend on the exact constraints).


A tiny example (so this feels real)

Minimize “smoothness energy”:

$$J[u] = \int_a^b \frac{1}{2}(u')^2\,dx, \quad \text{with } u(a)=\alpha,\; u(b)=\beta.$$

Here $L = \frac12 (u')^2$. Then:

$$\frac{\partial L}{\partial u} = 0, \qquad \frac{\partial L}{\partial u'} = u'.$$

Euler–Lagrange gives:

$$0 - \frac{d}{dx}(u') = -u'' = 0 \quad\Rightarrow\quad u(x) = cx + d.$$

So the minimizer is a straight line between the boundary values. This matches the intuition: among all curves that hit the endpoints, the least “bendy” one is linear.
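The same conclusion can be reached by minimizing the discretized energy directly. A sketch: on a grid, the discrete Euler–Lagrange equation for this energy is $u_{i-1} - 2u_i + u_{i+1} = 0$, which Gauss–Seidel sweeps solve; grid size and endpoint values are illustrative.

```python
# Sketch: minimize the discretized energy sum_i (1/2)((u[i+1]-u[i])/h)^2 h with
# fixed endpoints, via Gauss-Seidel on u[i-1] - 2 u[i] + u[i+1] = 0.
N = 20
alpha, beta = 1.0, 3.0
u = [alpha] + [0.0] * (N - 1) + [beta]   # interior initialized arbitrarily

for _ in range(5000):
    for i in range(1, N):
        u[i] = 0.5 * (u[i - 1] + u[i + 1])   # local averaging sweep

# The converged minimizer is the straight line between the endpoint values.
line = [alpha + (beta - alpha) * i / N for i in range(N + 1)]
print(max(abs(p - q) for p, q in zip(u, line)) < 1e-8)   # True
```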


Multi-dimensional version (PDE form)

If $u:\Omega\subset\mathbb{R}^n\to\mathbb{R}$ and

$$J[u] = \int_\Omega L\big(x, u(x), \nabla u(x)\big)\,dx,$$

then the Euler–Lagrange equation becomes:

$$\boxed{\ \frac{\partial L}{\partial u} - \nabla\cdot\left(\frac{\partial L}{\partial (\nabla u)}\right) = 0\ }$$

So calculus of variations is one of the cleanest “factories” for producing PDEs.
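For instance, with the Dirichlet energy $L = \tfrac12 |\nabla u|^2$, the boxed equation reduces to Laplace's equation $\Delta u = 0$. A sketch checking this for the classic harmonic function $u(x,y) = x^2 - y^2$ (an illustrative choice):

```python
# Sketch: for L = (1/2)|grad u|^2, the Euler-Lagrange equation is Delta u = 0.
# Check that u(x, y) = x^2 - y^2 is harmonic via a 5-point finite-difference stencil.
def u(x, y):
    return x * x - y * y

def laplacian(f, x, y, h=1e-3):
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4.0 * f(x, y)) / (h * h)

print(abs(laplacian(u, 0.7, -0.3)) < 1e-8)   # True: u is harmonic
```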


What to remember

  • A derivative is a “rate of change under perturbation”.
  • A gradient is how that derivative looks in Euclidean coordinates.
  • A functional $J[u]$ eats a function and outputs a scalar.
  • The first variation $\delta J[u](v)$ is the functional analogue of a directional derivative.
  • Setting $\delta J[u](v)=0$ for all admissible $v$ yields the Euler–Lagrange equation.

That’s basically the whole game. Everything else is: “what kind of $J$?”, “what constraints?”, and “how do I solve the resulting ODE/PDE?”.

Footnotes

  1. Content is rephrased with ChatGPT.