14
$\begingroup$

I have been given this formula from optics here, with no background: $$\vec\nabla{n} = \frac{d(n\hat{u})}{ds}$$

Where $n$ is the refractive index and $\hat{u}$ is a unit vector tangent to the path $s$ that light takes inside a medium.

Does anyone know if this formula has a name? I am looking specifically for a derivation of it. I have looked through the optics book by Hecht with no luck; I assume it comes from Fermat's principle of least time in some form.

Any help greatly appreciated.

$\endgroup$
1
  • 1
    $\begingroup$ Check out Darryl D. Holm, Fermat’s Principle and the Geometric Mechanics of Ray Optics, Summer School Lectures, Fields Institute, Toronto, July 2012 (PDF). You get the proof also for anisotropic media. $\endgroup$
    – margareta
    Commented Dec 10, 2014 at 0:15

3 Answers

11
$\begingroup$

Lionel Brits's answer and Ondřej Černotík's answer are both correct and indeed derive the ray equation from Fermat's principle.

However, if you want to begin with Maxwell's equations, you can derive $\vec\nabla{n} = \frac{d(n\hat{u})}{ds}$ from them and then derive Fermat's principle from that ray equation. Here's how.

We begin with Maxwell's equations, where we write the electromagnetic fields in the form $\mathbf{E}\left(\mathbf{r}\right) = \mathbf{e}\left(\mathbf{r}\right) e^{i\,\varphi\left(\mathbf{r}\right)}$, $\mathbf{H}\left(\mathbf{r}\right) = \mathbf{h}\left(\mathbf{r}\right) e^{i\,\varphi\left(\mathbf{r}\right)}$ where $\mathbf{e}$ and $\mathbf{h}$ are slowly varying vectors and the phase $\varphi\left(\mathbf{r}\right)$ is real valued. Then Faraday's and Ampère's laws become:

$$ \begin{array}{lcl} \nabla \times\mathbf{e} + i\, \nabla \varphi \times\mathbf{e} &=& -\mu\,\partial_t \mathbf{h}\\ \nabla \times\mathbf{h} + i\, \nabla \varphi \times\mathbf{h} &=& \epsilon\,\partial_t \mathbf{e} \end{array} \tag{1}$$

So far there is no approximation; one then makes the slowly varying envelope approximation: that the envelopes $ \mathbf{e}$ and $ \mathbf{h}$ vary much more slowly with position than does the phase, i.e. $\left|\mathbf{e}\right|^{-1} \left|\nabla \times\mathbf{e}\right| \ll \left|\nabla \varphi_e\right| \approx \left|k\right|$ and $\left|\mathbf{h}\right|^{-1} \left|\nabla \times\mathbf{h}\right| \ll \left|\nabla \varphi_h\right| \approx \left|k\right|$ and we also assume a monochromatic (time-harmonic) field. The equations then become:

$$ \begin{array}{lcl} \nabla \varphi \times\mathbf{e} &\approx& \omega\,\mu\,\mathbf{h}\\ \nabla \varphi \times\mathbf{h} &\approx& -\omega\,\epsilon\, \mathbf{e} \end{array} \tag{2} $$

Both of these equations individually say that $\mathbf{e}$ and $\mathbf{h}$ are orthogonal. On forming $-\omega^2\,\mu\,\epsilon\, \mathbf{e}\times\mathbf{h} = \left(\nabla \varphi \times\mathbf{h}\right)\times\left(\nabla \varphi \times\mathbf{e}\right)$ and simplifying, one gets:

$$ \omega^2\,\mu\,\epsilon\, \mathbf{e}\times\mathbf{h} = \nabla \varphi \cdot \left(\mathbf{e}\times\mathbf{h}\right) \nabla \varphi \tag{3}$$

so that $\nabla \varphi$ is orthogonal to both $\mathbf{e}$ and $\mathbf{h}$ and aligned with $\mathbf{e}\times\mathbf{h}$, whence $\omega^2\,\mu\,\epsilon\, \left|\mathbf{e}\times\mathbf{h}\right| = \left|\nabla \varphi\right|^2 \left|\mathbf{e}\times\mathbf{h}\right| $, therefore $\mathbf{e}$, $\mathbf{h}$ and $\nabla \varphi$ in that order are a mutually orthogonal, right-handed triple and:

$$ \left|\nabla \varphi\right|^2 = \omega^2\,\mu\,\epsilon = \left|\mathbf{k}\right|^2 \tag{4}$$

which is the Eikonal Equation: the fundamental equation defining raytracing. More fully, we can summarise everything inferred from Eq.(2) as the "Eikonal Equations" as follows:

$$ \begin{array}{lcl} \mathbf{k}\left(\mathbf{r}\right) &\stackrel{def}{=}& \nabla \varphi\left(\mathbf{r}\right)\\ n\left(\mathbf{r}\right) &\stackrel{def}{=}& \sqrt{\frac{\mu\left(\mathbf{r}\right)\,\epsilon\left(\mathbf{r}\right)}{\mu_0\,\epsilon_0}} = \sqrt{\mu\left(\mathbf{r}\right) \epsilon\left(\mathbf{r}\right)} \;c\\ \mathcal{Z}\left(\mathbf{r}\right) &\stackrel{def}{=}& \sqrt{\frac{\mu\left(\mathbf{r}\right)}{\epsilon\left(\mathbf{r}\right)}}\\ \mathcal{Y}\left(\mathbf{r}\right) &\stackrel{def}{=}& \mathcal{Z}\left(\mathbf{r}\right)^{-1} = \sqrt{\frac{\epsilon\left(\mathbf{r}\right)}{\mu\left(\mathbf{r}\right)}}\\ k\left(\mathbf{r}\right) &=& \left|\mathbf{k}\left(\mathbf{r}\right)\right| = \frac{\omega}{c} \, n\left(\mathbf{r}\right) \\ \mathbf{\hat{k}}\left(\mathbf{r}\right) &\stackrel{def}{=}& \frac{\mathbf{k}\left(\mathbf{r}\right)}{k\left(\mathbf{r}\right)} = \frac{ \nabla \varphi\left(\mathbf{r}\right) }{\left|\nabla \varphi\left(\mathbf{r}\right)\right|}\\ \mathbf{e}\times\mathbf{h} &=& \mathcal{Y} \left|\mathbf{e}\right|^2 \mathbf{\hat{k}}= \mathcal{Z} \left|\mathbf{h}\right|^2 \mathbf{\hat{k}} \\ \mathbf{e} &=& -\mathcal{Z}\, \mathbf{\hat{k}} \times\mathbf{h} \\ \mathbf{h} &=& \mathcal{Y}\, \mathbf{\hat{k}} \times\mathbf{e} \\ \end{array} \tag{5}$$

These equations are exactly fulfilled when $\mathbf{e}$ and $\mathbf{h}$ are constant (independent of $\mathbf{r}$) vectors and $\varphi\left(\mathbf{r}\right) = \mathbf{k} \cdot \mathbf{r}$ for some constant wavevector $\mathbf{k}$, when they define a plane wave, and plane waves are the only exact solution of the Eikonal equations. The uniqueness of plane waves in fulfilling the Eikonal equations exactly is another way of stating the strict contradictory nature of a ray - a true, exact ray represents a wholly delocalised photon and only the axial component of $\mathbf{k} \cdot \mathbf{r}$ is important for setting its phase. The assertion that the Eikonal equations approximately describe more general waves is the intuitive one that $\mathbf{e}\left(\mathbf{r}\right)$ and $\mathbf{h}\left(\mathbf{r}\right)$ vary slowly enough with $\mathbf{r}$ that they locally differ only slightly from plane waves.
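
As a quick numerical illustration of the plane-wave statement above, the following sketch builds one plane wave and checks Eqs. (2), (3) and (4) against it. The material constants, the frequency, the propagation direction and the helper `close` are assumptions chosen only for the example; none of them come from the answer.

```python
import numpy as np

# Illustrative values (assumptions, not taken from the answer).
MU0 = 4e-7 * np.pi            # vacuum permeability, H/m (classical value)
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
mu, eps = MU0, 2.25 * EPS0    # non-magnetic medium, relative permittivity 2.25
omega = 2 * np.pi * 5e14      # angular frequency, rad/s

c = 1.0 / np.sqrt(MU0 * EPS0)
n = np.sqrt(mu * eps) * c     # refractive index as defined in Eq. (5)
Y = np.sqrt(eps / mu)         # admittance of the medium, Eq. (5)

k_hat = np.array([1.0, 2.0, 2.0]) / 3.0   # arbitrary unit propagation direction
k = (omega / c) * n * k_hat               # constant wavevector, i.e. grad(phi)

e = np.cross(k_hat, [0.0, 0.0, 1.0])      # any vector orthogonal to k_hat
e /= np.linalg.norm(e)
h = Y * np.cross(k_hat, e)                # last line of Eq. (5)

def close(a, b):
    """Compare two vectors with a tolerance relative to their size."""
    return np.allclose(a, b, rtol=1e-9, atol=1e-9 * np.linalg.norm(b))

# Eq. (2): the two approximate Maxwell equations hold exactly for a plane wave.
eq2_faraday = close(np.cross(k, e), omega * mu * h)
eq2_ampere = close(np.cross(k, h), -omega * eps * e)
# Eq. (3): e x h is parallel to grad(phi).
eq3 = close(omega**2 * mu * eps * np.cross(e, h),
            np.dot(k, np.cross(e, h)) * k)
# Eq. (4): the eikonal equation |grad(phi)|^2 = omega^2 mu eps.
eq4 = bool(np.isclose(np.dot(k, k), omega**2 * mu * eps, rtol=1e-12))

print(eq2_faraday, eq2_ampere, eq3, eq4)
```

The check is only a consistency test of the signs and factors in Eqs. (2) to (5) for one constant $\mathbf{k}$; it says nothing about how good the approximation is for fields that are not plane waves.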

Now we consider a flow line of the vector field defined by the Eikonal equation. Let the space curve $\mathbf{r}\left(s\right)$ define such a flow line parameterised by the arclength $s$. The flow line is governed by $\mathbf{k}\left( \mathbf{r}\left(s\right)\right) = k\left( \mathbf{r}\left(s\right)\right) \,\mathrm{d}_s \mathbf{r}\left(s\right)$; by taking the derivative with respect to arclength, and using $\nabla \times \mathbf{k} = \nabla \times \nabla \varphi = \mathbf{0}$ so that $\mathbf{k}\cdot \nabla \mathbf{k} = \frac{1}{2}\,\nabla \left(\left|\mathbf{k}\right|^2\right)$, we get $\mathrm{d}_s \mathbf{k} = \mathrm{d}_s \mathbf{r} \cdot \nabla \mathbf{k} = k^{-1}\, \mathbf{k}\cdot \nabla \mathbf{k} = \frac{1}{2}\, k^{-1}\,\nabla \left(\left|\mathbf{k}\right|^2\right) = \nabla k$ so that:

$$ \frac{\mathrm{d}}{\mathrm{d}\,s}\left(k\left(\mathbf{r}\left(s\right)\right)\,\frac{\mathrm{d}}{\mathrm{d}\,s} \mathbf{r}\left(s\right)\right) = \left.\nabla k\left( \mathbf{r}\right)\right|_{\mathbf{r}\left(s\right)};\quad \frac{\mathrm{d}}{\mathrm{d}\,s}\left(n\left( \mathbf{r}\left(s\right)\right)\,\frac{\mathrm{d}}{\mathrm{d}\,s} \mathbf{r}\left(s\right)\right) = \left.\nabla n\left( \mathbf{r}\right)\right|_{\mathbf{r}\left(s\right)} \tag{6}$$

are two equivalent forms of the ray path equation, where $n = c\,\sqrt{\mu\,\epsilon}$ is the refractive index defined in Eq.(5). These are the equations you seek.
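
To see the ray equation at work, here is a minimal sketch that integrates it numerically as the first-order system $\mathrm{d}_s \mathbf{r} = \mathbf{p}/n$, $\mathrm{d}_s \mathbf{p} = \nabla n$ with $\mathbf{p} = n\,\mathrm{d}_s\mathbf{r}$. The linear index profile $n = N_0 + G\,y$, the constants `N0` and `G`, the step size and the function names are all assumptions made for the illustration. For this particular profile a ray launched horizontally from the origin follows the catenary $y = N_0\left[\cosh\left(G x / N_0\right) - 1\right]/G$, which follows from the conserved $x$-component of $n\,\mathrm{d}_s\mathbf{r}$, so the numerical result can be compared against it.

```python
import numpy as np

N0, G = 1.5, 0.2   # hypothetical linear index profile n(x, y) = N0 + G * y

def index(r):
    """Refractive index at position r = (x, y)."""
    return N0 + G * r[1]

def grad_index(r):
    """Gradient of the refractive index (constant for a linear profile)."""
    return np.array([0.0, G])

def rhs(state):
    """Right-hand side of dr/ds = p / n, dp/ds = grad n, with p = n * dr/ds."""
    r, p = state[:2], state[2:]
    return np.concatenate([p / index(r), grad_index(r)])

def trace_ray(r0, u0, ds=1e-3, steps=1000):
    """Integrate the ray equation with classical RK4 in arclength s."""
    r0 = np.asarray(r0, dtype=float)
    u0 = np.asarray(u0, dtype=float)
    u0 = u0 / np.linalg.norm(u0)          # unit tangent at the start
    state = np.concatenate([r0, index(r0) * u0])
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(state + 0.5 * ds * k1)
        k3 = rhs(state + 0.5 * ds * k2)
        k4 = rhs(state + ds * k3)
        state = state + (ds / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return state[:2], state[2:]

r_end, p_end = trace_ray(r0=[0.0, 0.0], u0=[1.0, 0.0])

# Closed-form catenary for this profile, evaluated at the final x.
y_exact = N0 * (np.cosh(G * r_end[0] / N0) - 1.0) / G
print(r_end, y_exact)
```

The ray bends towards increasing $n$, and $\left|\mathbf{p}\right|$ stays equal to $n$ along the path, which is the eikonal equation (4) in disguise.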

Appendix: Back the other way.

These two equations imply Fermat's principle, whereas Lionel Brits's answer shows how Fermat's principle implies your equation. Therefore, Fermat's principle and your equation are logically equivalent. We show that your equation implies Fermat's principle and Snell's law as follows. The time of flight of a ray through an arbitrary medium whose refractive index has continuous first derivatives, i.e. its Lagrangian, along a space curve defined by the position vector $\mathbf{r}\left(t\right)$ parameterised by the generalised "time" $t$ is:

$$ \mathcal{L} = \frac{1}{c} \int_0^t n\left(\mathbf{r}\left(u\right)\right) \left| \frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,u}\right| \, \mathrm{d} u \tag{7}$$

so if we vary the path from $\mathbf{r}$ to $\mathbf{r} + \delta \mathbf{r}$ and work through the wonted Euler-Lagrange theory we get:

$$ \begin{array}{lcl} \delta \mathcal{L} &=& \frac{1}{c} \int_0^t \left[\nabla n\left(\mathbf{r}\right) \cdot \delta \mathbf{r} \left| \frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,u}\right| + n\left(\mathbf{r}\right) \delta \sqrt{\frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,u} \cdot \frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,u}}\right] \, \mathrm{d} u\\ &=& \frac{1}{c} \int_0^t \left[\nabla n\left(\mathbf{r}\right) \left|\frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,u}\right| -\frac{\mathrm{d}}{\mathrm{d}\,u}\left( n\left(\mathbf{r}\right) \frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,u} \left|\frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,u}\right|^{-1}\right)\right] \cdot \delta \mathbf{r} \, \mathrm{d} u \end{array} \tag{8}$$

and so the vector forming the inner product with $\delta \mathbf{r}\left(u\right)$ in the integrand must be identically nought. If we choose the generalised time $t$ to be the same as the arclength $s$ along the path, then $\left|\frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,u}\right| = \left|\frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,s}\right| = 1$ so that Eq.(6) follows straight away. If now we wish to include discontinuous interfaces between $N$ different mediums reached by the ray at generalised times $t_1,\,t_2,\,\cdots$, then Eq.(7) takes the form:

$$ \mathcal{L} = \frac{1}{c} \sum\limits_{j = 0}^{N-1}\int_{t_{j}}^{t_{j+1}} n\left(\mathbf{r}\left(u\right)\right) \left| \frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,u}\right| \, \mathrm{d} u \tag{9}$$

and Eq.(8) takes on some further terms because now $\delta \mathbf{r}\left(t_j\right)$ for $j = 1,\,2,\,\cdots,\,N-1$ are not nought and indeed can be varied in any direction within the interfaces between the mediums (here we choose the generalised time to be the arclength for simplicity):

$$ \delta \mathcal{L}= \frac{1}{c}\sum\limits_{j = 0}^{N-1} \int_{s_{j}}^{s_{j+1}} \left[\nabla n\left(\mathbf{r}\right) -\frac{\mathrm{d}}{\mathrm{d}\,s}\left( n\left(\mathbf{r}\right) \frac{\mathrm{d}\,\mathbf{r}}{\mathrm{d}\,s}\right)\right] \cdot \delta \mathbf{r} \, \mathrm{d} s \,+\, \frac{1}{c}\sum\limits_{j = 1}^{N-1} \left[n_{j-1}\left(\mathbf{r}\right) \frac{\mathrm{d}\,\mathbf{r}_{j-1}}{\mathrm{d}\,s} -n_j\left(\mathbf{r}\right) \frac{\mathrm{d}\,\mathbf{r}_j}{\mathrm{d}\,s}\right]\cdot \delta \mathbf{r}_j \tag{10}$$

Now, as above, $\delta \mathbf{r}$ is arbitrary, so the vector forming the inner product with $\delta \mathbf{r}\left(u\right)$ in the integrand must be identically nought, as before. Therefore, we still get Eq.(6) from Fermat's principle. But the perturbations $\delta \mathbf{r}_j$ are also arbitrary vectors within the tangent planes of the relevant interfaces. Therefore, further to Eq.(6), we must have:

$$ \left[n_{j-1}\left(\mathbf{r}\right) \frac{\mathrm{d}\,\mathbf{r}_{j-1}}{\mathrm{d}\,s} -n_j\left(\mathbf{r}\right) \frac{\mathrm{d}\,\mathbf{r}_j}{\mathrm{d}\,s}\right] \times\hat{\mathbf{p}}_j = \mathbf{0} \tag{11}$$

where $\hat{\mathbf{p}}_j$ is the unit normal to interface $j$. Eq.(11) is readily shown to be Snell's law at an interface. Since the Euler-Lagrange theory can make the argument reversible given suitable assumptions about the function $\mathbf{r}\left(s\right)$ we can show that the eikonal equations Eq.(5) imply and are implied by Fermat's principle.
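
A small numerical sketch of the interface condition: take two homogeneous media separated by the flat interface $y = 0$, minimise the optical path length over the crossing point by brute force, and then check that the minimiser satisfies Eq. (11). The geometry, the indices `n1` and `n2`, and the function names are assumptions made for the example. A ternary search is used because the optical path length is a convex function of the crossing point in this geometry.

```python
import numpy as np

def optical_path(x, a, b, n1, n2):
    """Optical path length from a (medium 1, y > 0) to b (medium 2, y < 0)
    through the point (x, 0) on the flat interface y = 0."""
    return n1 * np.hypot(x - a[0], a[1]) + n2 * np.hypot(b[0] - x, b[1])

def fermat_crossing(a, b, n1, n2, iterations=200):
    """Ternary search for the crossing point that minimises the optical path."""
    lo, hi = min(a[0], b[0]), max(a[0], b[0])
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if optical_path(m1, a, b, n1, n2) < optical_path(m2, a, b, n1, n2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Hypothetical geometry: start in medium 1, end in medium 2.
a = np.array([0.0, 1.0])
b = np.array([2.0, -1.0])
n1, n2 = 1.0, 1.5

x_star = fermat_crossing(a, b, n1, n2)
hit = np.array([x_star, 0.0])
d1 = (hit - a) / np.linalg.norm(hit - a)   # unit tangent before the interface
d2 = (b - hit) / np.linalg.norm(b - hit)   # unit tangent after the interface

# In 2D, (n1*d1 - n2*d2) x p with normal p = (0, 1) reduces to the x-component.
tangential_jump = (n1 * d1 - n2 * d2)[0]

# Angles are measured from the normal, so sin(theta) is the x-component.
sin1, sin2 = d1[0], d2[0]
print(x_star, tangential_jump, n1 * sin1, n2 * sin2)
```

The tangential component of $n\,\mathrm{d}_s\mathbf{r}$ is continuous across the interface, which is Snell's law $n_1 \sin\theta_1 = n_2 \sin\theta_2$ with the angles measured from the normal.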

$\endgroup$
3
  • 1
    $\begingroup$ Beautiful answer. One might as well consider the solution of Maxwell's equations with the ansatz $\mathbf{E}\left(\mathbf{r}, t\right) = \mathbf{e}\left(\mathbf{r}\right) e^{i\,\varphi\left(\mathbf{r}\right)}e^{i\omega t}$, $\mathbf{H}\left(\mathbf{r}, t\right) = \mathbf{h}\left(\mathbf{r}\right) e^{i\,\varphi\left(\mathbf{r}\right)}e^{i\omega t}$ with the temporal dependence included, so that one does not wonder how to carry out the partial time derivative on the r.h.s. of Eq.(1), right? $\endgroup$
    – gamebm
    Commented Aug 21, 2021 at 5:10
  • $\begingroup$ @gamebm Indeed. I also realize I made a slight logical error here, which I have had to justify in response to a frequently asked question about this answer: I have to add that the assumption of an affine parameter is the definition of the refractive index, the ratio of the vacuum speed $c$ to the material speed $c/n$. But there is no requirement to use an affine parameter. You must do so only if you require the parameterization to define the physical phasefronts. You must use a nonaffine parameterization for ray transfer analysis, see here physics.stackexchange.com/a/339516/26076 $\endgroup$ Commented May 21 at 2:20
  • $\begingroup$ @gamebm See also here: physics.stackexchange.com/a/337961/26076 . If you're interested in this stuff, also physics.stackexchange.com/questions/338865/… And, Randall Munroe's VERY FUN article "Fire from Moonlight" what-if.xkcd.com/145 is OBLIGATORY for anyone pondering these notions! $\endgroup$ Commented May 21 at 2:26
10
$\begingroup$

This equation is called the ray equation, and it can indeed be derived from Fermat's principle. You can find more about its derivation in, e.g., Born and Wolf's Principles of Optics or in Fundamentals of Photonics by Saleh and Teich.

$\endgroup$
0
6
$\begingroup$

The optical path length $$ S = \int\! n \, ds $$ for a ray travelling along some curve is extremized by the path that the ray actually takes (from a wave point of view, the phase is stationary, and so we get constructive interference). We can parameterize the path by some parameter $\lambda$, i.e., $\vec{x}(\lambda)$, so $$ ds = \sqrt{ \dot{x}_1^2 + \dot{x}_2^2}\, d\lambda $$

and the variation is $$ \delta S = \int\! \left[\left(\frac{\partial n}{\partial x_i} \delta x_i \right) \sqrt{ \dot{x}_1^2 + \dot{x}_2^2} + \frac{n}{\sqrt{ \dot{x}_1^2 + \dot{x}_2^2}}\, \dot{x}_i\, \delta \dot{x}_i\right] d\lambda $$ with summation over the repeated index $i$. Now we integrate by parts on the second term, to get

$$ \delta S = \int\! \left[\frac{\partial n}{\partial x_i} \sqrt{ \dot{x}_1^2 + \dot{x}_2^2} - \frac{d}{d \lambda} \left( \frac{n\, \dot{x}_i}{\sqrt{ \dot{x}_1^2 + \dot{x}_2^2}} \right) \right] \delta x_i \, d\lambda + \mathrm{boundary\ terms}. $$

This should vanish for paths that extremize the optical path length, so the coefficient of $\delta x_i$ must be zero (fundamental lemma of the calculus of variations): $$ \frac{\partial n}{\partial x_i} \sqrt{ \dot{x}_1^2 + \dot{x}_2^2} - \frac{d}{d \lambda} \left( \frac{n\, \dot{x}_i}{\sqrt{ \dot{x}_1^2 + \dot{x}_2^2}} \right) = 0 $$

At this point, we can choose $\lambda = s$, in which case $\sqrt{ \dot{x}_1^2 + \dot{x}_2^2} = 1$, and we get: $$ \frac{\partial n}{\partial x_i} - \frac{d}{d s} \left( n\, \dot{x}_i \right) = 0, $$ which is the component form of $\vec\nabla{n} = \frac{d(n\hat{u})}{ds}$, since $\hat{u} = d\vec{x}/ds$.
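
The stationarity of the optical path length can also be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the hypothetical profile $n = N_0 + G\,y$, uses $x$ as the path parameter $\lambda$, and uses the fact that for this profile the ray launched horizontally from the origin is the catenary $y = N_0\left[\cosh\left(G x/N_0\right) - 1\right]/G$. It perturbs that ray by a bump that vanishes at both endpoints and evaluates $S$ with a trapezoid rule. The constants `N0`, `G`, `L`, the bump shape and the function names are not from the answer.

```python
import numpy as np

N0, G, L = 1.5, 0.2, 1.0      # hypothetical profile n = N0 + G*y on 0 <= x <= L
x = np.linspace(0.0, L, 2001)

# Ray for this profile (catenary) and its slope dy/dx.
y_ray = N0 * (np.cosh(G * x / N0) - 1.0) / G
dy_ray = np.sinh(G * x / N0)

# Perturbation that vanishes at both endpoints, and its derivative.
bump = np.sin(np.pi * x / L)
dbump = (np.pi / L) * np.cos(np.pi * x / L)

def optical_path(eps):
    """S = integral of n * sqrt(1 + y'^2) dx along y_ray + eps * bump."""
    y = y_ray + eps * bump
    dy = dy_ray + eps * dbump
    integrand = (N0 + G * y) * np.sqrt(1.0 + dy**2)
    # Trapezoid rule written out explicitly.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)))

S0 = optical_path(0.0)
# Central difference estimate of dS/d(eps) at eps = 0: should be close to zero.
first_variation = (optical_path(1e-3) - optical_path(-1e-3)) / 2e-3
# Larger perturbations of either sign should lengthen the optical path.
S_plus, S_minus = optical_path(1e-2), optical_path(-1e-2)

print(S0, first_variation, S_plus - S0, S_minus - S0)
```

The first variation vanishes to within discretisation error, while perturbations of either sign increase $S$, so for this short path the ray is a local minimum of the optical path length.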

$\endgroup$
