
I made an implementation of Newton's method (starting at $x=\pi$) for polynomials in the programming language Uiua (here it is if you are interested, but be warned that Uiua looks weird). It works fine for most polynomials, but there are a few it gets stuck on. I tried [12 ¯9 ¯21 37 ¯41 ¯10 31 49] (a degree-7 polynomial with coefficients listed from lowest degree to highest), and the execution time ran out. Inspecting further, it seems to be oscillating somewhat randomly in the range $[0.3,0.7]$, even after 500+ iterations, when the actual answer is $-0.568$. Another online implementation also fails to converge when starting at $x=\pi$.

My question is whether Newton's method should be working in this case or if I should expect to find there is some problem in my code or in the way that Uiua handles floats. Does Newton's method always converge for polynomials?
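For reference, the behaviour is easy to reproduce outside Uiua. Here is a minimal Python sketch (my illustration, not the asker's code): it evaluates the polynomial by Horner's rule and runs plain Newton from two starting points. The claim in the comments about $x_0=\pi$ never settling is taken from the thread's own experiments.

```python
import math

# Coefficients from the question, lowest degree first.
coeffs = [12, -9, -21, 37, -41, -10, 31, 49]
# Derivative: d/dx sum(c_k x^k) = sum(k * c_k * x^(k-1)).
dcoeffs = [k * c for k, c in enumerate(coeffs)][1:]

def horner(cs, x):
    """Evaluate a polynomial (coefficients lowest degree first) at x."""
    acc = 0.0
    for c in reversed(cs):
        acc = acc * x + c
    return acc

def newton(x, steps):
    for _ in range(steps):
        x -= horner(coeffs, x) / horner(dcoeffs, x)
    return x

x_bad = newton(math.pi, 500)   # per the thread: never settles, wanders around (0.3, 0.8)
x_good = newton(-1.0, 50)      # converges quickly to the real root near -0.568
```

So the method itself behaves the same way outside Uiua: the starting point, not the implementation, is the problem.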

  • Nope. It can easily diverge if you start far from the root. The common remedy is to run it in parallel with the bijection. If the Newton iteration falls into a bijection interval, you keep it. Otherwise, you discard it and start all over with the middle of the current bisection interval. – fedja, Jan 7 at 6:48
  • You can also construct polynomials that will oscillate. For example $P(x)=-x^4+\frac{7}{3}x^2$ with an initial guess of $x=1$ will oscillate between $1$ and $-1$. – Commented Jan 7 at 6:52
  • Math YouTuber 3blue1brown made a pair of videos exploring Newton's method (first video and second video). He does address the fact that in some particular cases the method does not converge, but those cases are rare. – Arthur, Jan 7 at 8:35
  • See also math.stackexchange.com/a/3455757/589 – lhf, Jan 7 at 10:38
  • @fedja: No, for polynomials that description is wrong: far away from the roots, the step direction is always toward the root set. One may get periodic sequences in some neighborhood of the root set. // There is a difference between "bijection" (one-to-one maps) and "bisection", a bracketing or guarded interval method. – Commented Jan 8 at 7:52
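fedja's Newton-plus-bisection safeguard from the comments above can be sketched as follows (an illustrative Python version, assuming a bracket $[a,b]$ with a sign change is already known; for the question's polynomial, $f(-1)<0<f(0)$):

```python
def f(x):
    # 49x^7 + 31x^6 - 10x^5 - 41x^4 + 37x^3 - 21x^2 - 9x + 12, via Horner
    return ((((((49*x + 31)*x - 10)*x - 41)*x + 37)*x - 21)*x - 9)*x + 12

def fp(x):
    # derivative: 343x^6 + 186x^5 - 50x^4 - 164x^3 + 111x^2 - 42x - 9
    return (((((343*x + 186)*x - 50)*x - 164)*x + 111)*x - 42)*x - 9

def safeguarded_newton(f, fp, a, b, tol=1e-12, max_iter=200):
    """Newton iteration that falls back to bisection whenever the Newton
    step would leave the current sign-change bracket [a, b]."""
    fa = f(a)
    assert fa * f(b) < 0, "need a bracket with a sign change"
    x = 0.5 * (a + b)
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        # Shrink the bracket, keeping the half that still has the sign change.
        if fa * fx < 0:
            b = x
        else:
            a, fa = x, fx
        d = fp(x)
        cand = x - fx / d if d != 0 else None
        # Accept the Newton step only if it stays inside the bracket.
        x = cand if cand is not None and a < cand < b else 0.5 * (a + b)
    return x

root = safeguarded_newton(f, fp, -1.0, 0.0)
```

Unlike the raw iteration from $x_0=\pi$, this safeguarded version is guaranteed to stay inside a shrinking bracket and lands on the root near $-0.568$.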

4 Answers

Answer (score 52)

Here is a plot of the function $$f(x) = 49 x^7+31 x^6-10 x^5-41 x^4+37 x^3-21 x^2-9 x+12$$ in the region of interest. The blue curve is $f$. The orange lines represent the forward orbit of $x_0 = \pi$ under $x \mapsto x - f(x)/f'(x)$, connecting $(x_i, f(x_i))$ to the next iterate $(x_{i+1}, 0)$, then to $(x_{i+1}, f(x_{i+1}))$, and so forth. Note that the initial iterations are not shown because the value of the function at these points is large.

The thick red line shows the steady-state behavior of the orbit, and we can see this will converge to a $3$-cycle of values, which are approximately $$S = \{0.355634, 0.796309, 0.680185\}.$$

[Image: plot of $f$ near the interval of interest, with the Newton orbit in orange and the steady-state behavior in red]

This suggests that for any $x \in S$, the rational function $$g(x) = x - \frac{f(x)}{f'(x)} = \frac{294 x^7+155 x^6-40 x^5-123 x^4+74 x^3-21 x^2-12}{343 x^6+186 x^5-50 x^4-164 x^3+111 x^2-42 x-9}$$ satisfies $$g(g(g(x))) = x.$$

Indeed, we can hypothesize that whenever $f$ has a local minimum that stays above zero (a dip that never reaches a root), there is the possibility of an initial choice $x_0$ for which the forward orbit does not converge to a root.
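The cycle structure is easy to probe numerically. A quick Python check (mine, not part of the answer): one Newton step sends each listed value of $S$ very nearly to the next, but since the values are rounded to six digits the three-step return only closes approximately.

```python
def f(x):
    return ((((((49*x + 31)*x - 10)*x - 41)*x + 37)*x - 21)*x - 9)*x + 12

def fp(x):
    return (((((343*x + 186)*x - 50)*x - 164)*x + 111)*x - 42)*x - 9

def g(x):
    """One Newton step x -> x - f(x)/f'(x)."""
    return x - f(x) / fp(x)

S = [0.355634, 0.796309, 0.680185]
# How far the orbit is from closing after three steps; small but nonzero,
# consistent with the instability discussed in the comments below.
gap = abs(g(g(g(S[0]))) - S[0])
```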

  • After a bit more experimentation, it seems that this iteration does not actually converge to either of the two possible 3-cycles, as all of these points are unstable in their neighborhoods: $\left|\frac{d}{dx} g(g(g(x)))\right|>1$ for all of them. Instead it moves chaotically in the range (but not the entire range). You can even see this in your image, as points in the neighborhood of the second point are more spread out around the first after 1 iteration. – Tbw, Jan 9 at 8:26
  • @Tbw You're right: after plotting a histogram of $10^6$ iterates, it does appear to be chaotic. There is a distribution of values around each of the three neighborhoods that doesn't settle down. – heropup, Jan 9 at 10:35
Answer (score 22)

As the other answers already noted, there are cases where the Newton iteration does not converge.

One interesting question is for how "many" starting values this can occur: is the set of such values a null set, or does it have positive measure? As it turns out, there are indeed polynomials for which the set of starting values where the Newton process does not converge has positive measure.

This won't occur for polynomials of degree less than 3 (for a quadratic, the bad complex starting values form only a line, a null set), but it may occur for cubic polynomials like $$ p(z) = z^3-2z+2. $$ The dynamics of the Newton process $z\mapsto {\cal N}_p(z)$ is made visible by the graphic below, which displays the dynamics for real starting values $z\in[-4,4]$.

The image consists of 18 colored stripes, each color encoding a real number (−4=purple, −2=violet, −1=cyan, 0=black, 1=red, 2=orange, 4=yellow, ±∞=white). From top to bottom, the stripes show values of the $n$-th Newton iterate, where row 0 indicates the starting values.

[Image: the stripe diagram described above, showing the Newton iterates for real starting values in $[-4,4]$]

Starting values smaller than approximately −0.83 converge to the violet zero of $p(z)$ at $z\approx-1.76929$, but all values in some interval around $0$ are attracted by the black-red cycle $$\cdots0\mapsto 1\mapsto 0\mapsto 1\mapsto 0\mapsto1\cdots$$
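This 2-cycle is easy to verify numerically; a small Python check (my illustration, not part of the answer):

```python
def N(z):
    """Newton map for p(z) = z^3 - 2z + 2."""
    return z - (z**3 - 2*z + 2) / (3*z**2 - 2)

# The cycle 0 -> 1 -> 0 is exact, and it attracts nearby starting values:
# p''(0) = 0 makes the cycle multiplier zero (superattracting).
z = 0.1
for _ in range(40):
    z = N(z)

# A start well to the left converges to the real zero near -1.76929.
w = -2.0
for _ in range(60):
    w = N(w)
```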

But how many such "bad" polynomials are there? Are such polynomials a null set in the set of all polynomials, or is there a substantial amount of such polynomials?

For cubic polynomials one can study the behavior of the critical point of ${\cal N}_p$. For example, $0$ is a critical point of ${\cal N}_{p_\lambda}$ for every cubic from the family $$ p_\lambda(z) = z^3+(\lambda-1)z-\lambda. $$ Plotting how this critical point behaves under iteration gives the following result, where $\lambda$ ranges over the square $-2.5\leqslant \operatorname{Re}\lambda,\operatorname{Im}\lambda\leqslant 2.5$ in the complex plane.
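The underlying experiment can be reproduced on a small scale. A Python sketch (mine; the helper name `fate` is my own): it follows the critical orbit of $0$ and reports its limit if it settles. Note that $p_\lambda(1)=0$ for every $\lambda$, so $z=1$ is always a zero.

```python
def newton_map(z, lam):
    """Newton map of p_lambda(z) = z^3 + (lam - 1) z - lam."""
    return z - (z**3 + (lam - 1)*z - lam) / (3*z**2 + (lam - 1))

def fate(lam, max_iter=500, eps=1e-12):
    """Iterate the free critical point z = 0; return the limit if the
    orbit settles, or None if it is still moving after max_iter steps."""
    z = 0j
    for _ in range(max_iter):
        zn = newton_map(z, lam)
        if abs(zn - z) < eps:
            return zn
        z = zn
    return None

limit = fate(2.0 + 0j)   # for lambda = 2 the critical orbit finds the zero z = 1
# Per the answer, for lambda inside one of the black spots the orbit
# never settles and fate() would return None instead.
```

Scanning `fate` over a grid of $\lambda$ values and coloring by the returned limit (black for `None`) is essentially how the pictures above are produced.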

[Image: the $\lambda$-plane plot; black marks parameters where the critical orbit does not converge]

There are tiny black spots, sets of positive measure in the $\lambda$-plane, for which the critical point does not converge to a zero of $p_\lambda$.

Magnifying the black spot around $\lambda\approx0.3+1.64i$ looks like this:

[Image: magnification of that black spot, a Mandelbrot-like island]

There are infinitely many of them.


Answer to a comment: The hue in the last two images represents the limit $$\lim_{n\to\infty} {\cal N}_{p_\lambda}^n(0),$$ i.e. the fate of the critical point $0$ of ${\cal N}_{p_\lambda}$ under iteration, provided the iteration converges. When it does not converge, the point at $\lambda$ is colored black. There are 3 hues around the island in the 3rd image because $p_\lambda$ has 3 complex zeros that may occur as the limit. The hue indicates which zero is attained, in a non-obvious way, with the 3 hues 120° apart in color space (red, green, blue). This comes with a grain of salt: the exact locations of the zeros of $p_\lambda$ depend on $\lambda$, so this only holds approximately and only in small regions. It is the reason why the colors of the zeros in the overview picture transition smoothly over large stretches of the $\lambda$ plane.

The brightness indicates how many iterations were required to conclude that the iteration converges for that $\lambda$: brighter = more. For points close to the Mandelbrot islands, the saturation is increased again before it finally drops to white. This is for aesthetic reasons and produces the purple-white glow around the black islands.

The algorithm smooths the colors by interpolating the iteration counts (which are natural numbers) to real numbers, similar to how one smooths the potential around an electrically charged Mandelbrot set. To that end, one has to introduce a notion of distance to the closest complex root of $p_\lambda$.

It has been more than 10 years since I made these images, so bear with me. The summary of the wiki page contains and explains the relevant part of the C code I used back then. The idea for this specific family of cubics is from the book by Peitgen and Richter.

  • A Brot coming from the convergence of Newton-Raphson? Amazing... – user21820, Jan 9 at 12:26
  • @user21820: A phenomenon sometimes called Mandelbrot universality. – Commented Jan 9 at 13:51
  • Do you know any high-level explanation for that phenomenon? – user21820, Jan 9 at 15:30
  • @user21820: Something like On the dynamics of polynomial-like mappings by Douady and Hubbard? – Commented Jan 9 at 16:55
  • What a beautiful visualization! I'd like to ask exactly what the hue and luminosity represent in those Brots. – Commented Jan 10 at 1:38
Answer (score 7)

One does not have to go all the way to degree-7 polynomials to see this behaviour. Consider $f(x) = x^3-5x$. If you start the Newton-Raphson iteration from the initial guess $1$, you simply oscillate between $1$ and $-1$. This is an unstable cycle, but clearly one that you might actually encounter. (There is no rounding error to save you from it: every quantity in these two steps is represented exactly in floating point.)

For a stable cycle, consider $f(x) = x^4-6x^2-11$. If you start the Newton-Raphson iteration from the initial guess $1$, you again oscillate between $1$ and $-1$. Furthermore, if you start from an initial guess near $1$, say $1.2$ or $0.8$, the iterates will converge to the same oscillating cycle!
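Both cycles can be confirmed in a few lines of Python (my check, not part of the answer):

```python
def newton_step(f, df, x):
    return x - f(x) / df(x)

# Unstable cycle: f(x) = x^3 - 5x.  The values 1 and -1 and every
# intermediate quantity are exact in floating point, so the alternation
# 1 -> -1 -> 1 -> ... persists forever.
f1, d1 = lambda x: x**3 - 5*x, lambda x: 3*x**2 - 5

# Attracting cycle: f(x) = x^4 - 6x^2 - 11, started near (but not at) 1.
# f''(+-1) = 0, so this 2-cycle is superattracting.
f2, d2 = lambda x: x**4 - 6*x**2 - 11, lambda x: 4*x**3 - 12*x
x = 1.2
for _ in range(30):
    x = newton_step(f2, d2, x)
```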

Answer (score 0)

A simpler and wonderful example is what you get when you try to find, using Newton's method over the reals, the solution to $x^2 = -1$. (Yes, I know, what sort of dumbo tries to find $i$ in this silly way. But still...)

The iterations might go off to infinity (hitting $0$ first); they might hit a cycle of any prime period; or they might jump around forever without repeating a guess.

The cool thing is that you can express the n-th iterate in terms of the initial guess in a nice closed form! Hint: Look at the expression for $\tan (2\theta)$ in terms of $\tan \theta$ and note a similarity with the expression for the next guess in terms of the previous one. Far be it from me to deprive you of the enjoyment of working out the closed form.
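(Spoiler warning: the following numerical check, mine and not part of the answer, uses the closed form the hint points to, so skip it if you want to work it out yourself. With $x=\cot\theta$, the identity $\cot 2\theta = \frac{\cot^2\theta-1}{2\cot\theta}$ shows that one Newton step for $x^2+1=0$ is exactly angle doubling.)

```python
import math

def newton(x):
    # Newton step for f(x) = x^2 + 1:  x - (x^2 + 1)/(2x) = (x - 1/x)/2
    return (x - 1.0 / x) / 2.0

# With x_0 = cot(theta_0), each step doubles the angle: x_n = cot(2^n * theta_0).
theta0 = math.atan(0.5)   # so x_0 = cot(theta_0) = 2.0
x = 2.0
for n in range(5):
    x = newton(x)
closed_form = 1.0 / math.tan(2**5 * theta0)   # cot(2^5 * theta_0)
```

Since angle doubling modulo $\pi$ is the classic chaotic map, this conjugacy explains all three behaviors: rational multiples of $\pi$ give cycles or a crash at $0$, everything else wanders forever.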

I credit Carl Bender, who introduced me to this cute problem at the start of a perturbation theory course. (Of course, he is almost surely not the first person to come up with the observation...)

