
$f(x)=2x^2-x^3-2$. This is a cubic, and a graph of the given function is shown. The real root of this function, visible on the graph, is approximately $x=-0.839$ (the point $(-0.839,0)$).

So, the question asks me to use Newton's approximation method twice to approximate a solution to $f(x)=0$.

I use an initial starting value of $x_0=1$. My first approximation is $x_1=2$, and my second one is $x_2=1.5$. I don't seem to move any closer to the real solution as I keep iterating.

Am I misunderstanding how to use this approximation? Is the issue that my first guess was too far from the actual solution?
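
For reference, here is a minimal sketch of the two iterations just described (Python is my own choice here, not part of the original question):

# Two Newton steps x_{n+1} = x_n - f(x_n)/f'(x_n) for f(x) = 2x^2 - x^3 - 2, with x0 = 1.
def f(x):
    return 2*x**2 - x**3 - 2

def df(x):
    return 4*x - 3*x**2

x = 1.0
for n in range(2):
    x = x - f(x) / df(x)
    print(n + 1, x)   # prints 1 2.0 and then 2 1.5 -- moving away from the root near -0.839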

  • Note that from $x_{n+1}=x_n+\dfrac{x_n^3-2x_n^2+2}{4x_n-3x_n^2}$, we should obtain $x_2=1.5$, not $2.5$. Commented Aug 27, 2017 at 14:20
  • Very nice example! The method seems to fail, but eventually is successful!
    – Peter
    Commented Aug 27, 2017 at 14:23
  • @user163862 The graph shows that it is better to start from a value to the left of the root, like $-1.2$.
    – Narasimham
    Commented Aug 27, 2017 at 19:17
  • @Narasimham Being to the left of the root does not really matter; the point is that it should be to the left of that minimum at $0$.
    – Ian
    Commented Aug 27, 2017 at 22:13
  • In short: your starting point was too near to an extremum. If you can pick starting points that are as far away from extrema as possible (and of course close to the desired root!), do so. Commented Aug 28, 2017 at 4:47

7 Answers


In fact, you gave up too early; the method eventually converges:

1   2.000000000000000000000000000
2   1.500000000000000000000000000
3   0.3333333333333333333333333333
4   2.148148148148148148148148148
5   1.637079608343976160068114091
6   0.9483928480399477528436835979
7   1.910874140183680201544963299
8   1.405089904362402921055022221
9   -1.324018083676046424512855515
10  -0.9614381794507316717924414480
11  -0.8500221808505758631523579893
12  -0.8393807176849843501240483025
13  -0.8392867625049899194321196645
14  -0.8392867552141611764525252322
15  -0.8392867552141611325518525647
16  -0.8392867552141611325518525647
17  -0.8392867552141611325518525647
18  -0.8392867552141611325518525647
19  -0.8392867552141611325518525647
20  -0.8392867552141611325518525647
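
The table can be reproduced with a few lines of code; here is a minimal sketch (my own, using plain Python floats, so the trailing digits will differ from the extended-precision values above):

# Continue the same Newton iteration for 20 steps; it wanders for a while and then
# converges to the root near -0.8392867552141611.
def f(x):
    return 2*x**2 - x**3 - 2

def df(x):
    return 4*x - 3*x**2

x = 1.0
for n in range(1, 21):
    x = x - f(x) / df(x)
    print(n, x)
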
  • Oops, very fast upvote :)
    – Peter
    Commented Aug 27, 2017 at 14:17
  • Yes, I was going to transfer my previous comment to an answer, but you were a bit faster than me. Commented Aug 27, 2017 at 14:18
  • Note that the magic happens when you finally bounce around to hit ~1.4, which then sends you way over to ~-1.3.
    – Ian
    Commented Aug 27, 2017 at 14:20
  • There are collections of conditions sufficient to guarantee the convergence of Newton's method. Cubics especially can be dangerous; sometimes the method actually diverges or oscillates. In doubt, you can try the slower but more reliable numerical methods such as the bisection method or regula falsi, or you can combine approaches: first search for a rough approximation, then use Newton to get a more precise result.
    – Peter
    Commented Aug 27, 2017 at 14:36
  • I don't think this answer actually addresses any point raised in the original question. So this particular initial value does lead to a converging sequence; fine. But why? Did it have to be that way? What about other possible starting values? Commented Aug 29, 2017 at 8:51

Newton's method does not always converge. Its convergence theory is for "local" convergence, which means you should start close to the root, where "close" is relative to the function you're dealing with. Far away from the root, you can have highly nontrivial dynamics.

One qualitative property is that, in the 1D case, you should not have an extremum between the root you want and your initial guess. If you have an odd number of extrema in the way, then you will start going away from the root you want, as you see here. If you have an even number of extrema in the way, then you will start going the right way, but you may later find yourself in a spot with an odd number of extrema in the way, which leads to the same problem.

Of course you may eventually find yourself in a position where there is an even number of extrema in the way, and then you manage to skip over all of them and get to the right side. At that point things will usually work out (though not always). In this problem, with your initial guess, that is what eventually happens: the iteration finally lands just slightly to the right of the right-hand extremum, which sends it far off to the left.
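
As a quick check of this parity heuristic on the cubic from the question, here is a small sketch (my own, in Python; the critical points $0$ and $4/3$ come from $f'(x)=4x-3x^2$). An even count of in-between extrema (including zero) gives a first step toward the root, an odd count gives a first step away from it:

# Count the extrema of f(x) = 2x^2 - x^3 - 2 lying between x0 and the root, and check
# whether the first Newton step moves toward or away from the root near -0.8393.
f  = lambda x: 2*x**2 - x**3 - 2
df = lambda x: 4*x - 3*x**2
root = -0.8392867552141611
extrema = [0.0, 4.0/3.0]                      # zeros of f'(x) = x(4 - 3x)

for x0 in (-1.2, 0.5, 1.0, 2.0):
    step = x0 - f(x0) / df(x0)
    k = sum(min(x0, root) < e < max(x0, root) for e in extrema)
    direction = "toward" if abs(step - root) < abs(x0 - root) else "away from"
    print(f"x0 = {x0:5.2f}: {k} extrema in between, first step moves {direction} the root")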

  • Wow! What a great explanation. Now I totally get it.
    – user163862
    Commented Aug 27, 2017 at 14:36

The other answers are great. I'd just like to add a concrete example of the weird behavior that Ian's answer describes.

Let's consider the function $f(x) = \operatorname{sgn} x \sqrt{|x|}$. According to the algorithm, we iterate $$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$ Now for the derivative (as long as we are not differentiating at $x = 0$, the $\operatorname{sgn} x$ is just a constant, and $|x| = x \operatorname{sgn} x$): $$f'(x) = \left[\operatorname{sgn} x \sqrt{x \operatorname{sgn} x}\right]' = \operatorname{sgn} x \cdot \frac{1}{2\sqrt{x \operatorname{sgn} x}} \cdot \operatorname{sgn} x = \frac{1}{2\sqrt{|x|}}.$$ Plugging in: $$x_{n+1} = x_n - \frac{\operatorname{sgn} x_n \sqrt{|x_n|}}{1/(2\sqrt{|x_n|})} = x_n - 2 \operatorname{sgn}(x_n) \left(\sqrt{|x_n|}\right)^2 = x_n - 2\operatorname{sgn}(x_n)\,|x_n| = x_n - 2 x_n = -x_n.$$

So if we start iterating in $x = a$ (where $a \not = 0$), we get the sequence $a, -a, a, -a, \ldots$ and the method loops forever between those two points, never getting to the root $x = 0$!
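
A small numerical check of this (my own Python sketch; it only assumes floating-point rounding does not disturb the picture):

# Newton's method on f(x) = sgn(x) * sqrt(|x|): each step sends x to (approximately) -x.
import math

def f(x):
    return math.copysign(math.sqrt(abs(x)), x)

def df(x):
    return 1.0 / (2.0 * math.sqrt(abs(x)))   # valid for x != 0

x = 0.7
for _ in range(6):
    x = x - f(x) / df(x)
    print(x)   # alternates between about -0.7 and 0.7, never approaching the root 0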

Edit: Here's a gnuplotted image: [plot of the oscillating Newton iterations] (In each iteration, we draw the tangent at the current point (the blue dashed line), and the $x$ at which the tangent crosses $0$ is taken to be the next approximation; we then move along the magenta line to get the starting point for the next iteration.)

By the way, have a look at this image from Wikipedia:

[Newton-fractal image from Wikipedia]

It shows the complex plane colored with 5 colors, each color corresponding to one root of the complex equation $z^5 = 1$. Each point then has the color corresponding to the root to which Newton's method converges, if we start from that point. The "flowers" are beautiful to behold but totally abhorrent from the numerical point of view.

  • That's a nice image! Commented Aug 27, 2017 at 20:35
  • "[...] flowers are beautiful to behold but totally abhorrent from a numerical point of view" now ranks alongside "Sufficiently accurate for Poetry" as my favorite expressions ever!
    – Beyer
    Commented Aug 28, 2017 at 7:05
  • I like the "Sufficiently accurate for Poetry" much more! :-) By the way, thank you, Simply Beautiful Art, for nicely inlining the image.
    – Ramillies
    Commented Aug 28, 2017 at 10:55
  • I doubt there is a root at x = 0! = 1 Commented Aug 28, 2017 at 11:49
  • I'm sorry, but I can't see why not. $f(x) = \operatorname{sgn} x \sqrt{|x|}$, so $f(0) = \operatorname{sgn} 0 \sqrt{|0|} = 0 \cdot 0 = 0$. Hence $x = 0$ is a root. (Btw.: I say root, not stationary point.)
    – Ramillies
    Commented Aug 28, 2017 at 12:03

Newton's method has no global convergence guarantee for arbitrary functions, as you just learned.

Now, people have posted examples of where Newton's method doesn't converge, but they're all rather "unusual" functions (some being very non-smooth), so it's natural to assume they're pathological and won't happen in practice.

Your example is one where Newton just takes more iterations than expected to converge, so it's not too bad. But here is an example of a cubic polynomial for which Newton's method won't converge!

\begin{align*} f(x) &= -0.74 + 0.765 x + 1.1 x^2 - 3.55 x^3 \\ x_0 &= 5/9 \end{align*}

Not only that, but it's in fact a stable oscillation—small perturbations won't change the behavior.

And for bonus points, you can generate as many of these as you want! Just let the Newton step be $$g(x) = x - f'(x)^{-1} f(x)$$ and then you're just looking for a nontrivial solution to the equation $$x_0 = g^3(x_0) = g(g(g(x_0)))$$ where a solution $x_0$ would be trivial if $g(x_0) = x_0$ (a.k.a. a fixed point).
Notice the equation above is just another polynomial equation (although of a much higher order)—which means its solutions can be readily found numerically.
The only caveat is that these solutions might not necessarily be stable. I suspect you should be able to place a second-derivative condition to ensure stable solutions, but the exact equation is not obvious to me at the moment, so I'll leave it as an exercise for the reader. :-)
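
Before the Mathematica code, here is a quick, independent check of the claimed oscillation (a Python sketch of my own; it only assumes double precision is enough to show the cycle):

# Newton's method on f(x) = -0.74 + 0.765 x + 1.1 x^2 - 3.55 x^3, started at x0 = 5/9.
# The iterates settle into a period-3 cycle (roughly 0.106, 0.847, 0.555, ...) and never
# reach the real root, which lies near x = -0.61 (approximately).
def f(x):
    return -0.74 + 0.765*x + 1.1*x**2 - 3.55*x**3

def df(x):
    return 0.765 + 2.2*x - 10.65*x**2

x = 5.0 / 9.0
for k in range(12):
    x = x - f(x) / df(x)
    print(k + 1, x)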

Mathematica code for the plot:

Manipulate[
  With[{f = Evaluate[Rationalize[d + c # + b #^2 + a #^3]] &}, Plot[
    f[x], {x, -0.61, 1},
    PlotStyle -> {Thickness[Tiny]},
    Prolog -> {Thickness[Tiny],
      Line[Flatten[Map[
     {{#, 0}, {#, f[#]}} &,
     NestList[Compile[t, t - f[t]/f'[t]], x0, n]], 1]]}]],
  {{a, -3.55}, -4, 4},
  {{b, 1.1}, -2, 2},
  {{c, 0.765}, -1, 1},
  {{d, -0.74}, -1, 1},
  {{x0, 5/9}, 0, 1},
  {{n, 100}, 0, 1000, 1}]

Update:

I wrote some code to purposefully find both stable and unstable iterations:

Newton[f_] := t \[Function] t - f[t]/f'[t];
NewtonPlot[f_, xmin_, xmax_, x0_, n_, args___] :=
  Plot[f[x], {x, xmin, xmax}, args,
   Prolog -> {Thickness[Tiny],
     Line[Flatten[Map[
        {{#, 0}, {#, f[#]}} &,
        NestList[Compile[t, Newton[f][t]], x0, n]], 1]]}];
FindOscillatoryNewtonSolutions[f_, n_, h_](* {Stables,Unstables} *):=
  With[{step = Newton[f]},
   With[{nstep = t \[Function] Nest[step, t, n]},
    GroupBy[
     Map[#[[1]][[2]] &, 
      Solve[{nstep[t] == t, Abs[step[t] - t] >= h}, t, Reals]],
     t \[Function] 
      With[{step1hp = nstep[t + h], step1hm = nstep[t - h]}, True \[And]
        Abs[N[step1hp - t]] >= Abs[N[nstep[step1hp] - t]] \[And]
        Abs[N[step1hm - t]] >= Abs[N[nstep[step1hm] - t]]]]]];
With[{z = 400, xmin = -1.1, xmax = +0.65, h = 10^-3},
 Manipulate[
  With[{f = 
     t \[Function] Evaluate[Rationalize[d + c t + b t^2 + a t^3]], 
    n = 3, m = 8},
   With[{solcategories = FindOscillatoryNewtonSolutions[f, n, h]},
    If[Length[solcategories] >= 2,
     Map[{Transpose@{N@SortBy[NestList[Newton[f], #1, n - 1], N]},
        NewtonPlot[f, xmin, xmax, N[#1 + dx], n*m,
         PlotStyle -> {Thickness[Tiny]},
         ImageSize -> Scaled[0.2],
         PerformanceGoal -> "Quality"]} &, {First@Lookup[solcategories, True], 
       First@Lookup[solcategories, False]}],
     Throw["Did not manage to find both a stable and an unstable solution."]]]],
  {{dx, h}, -4 h, +4 h, h/4},
  {{a, -355}, -z, +z, 1}, {{b, 110}, -z, +z, 1},
  {{c, 77}, -z, +z, 1}, {{d, -74}, -z, +z, 1},
  SynchronousUpdating -> False]]

Notice, below, that the first point iterates stably, whereas the second one iterates unstably. (The iterations would diverge further, but I cut them off at some point.)

[Plots: stable and unstable Newton iterations]

  • Thank you, I was hoping someone would post an example like this.
    – user856
    Commented Aug 28, 2017 at 15:39

Newton-Raphson can behave badly even in seemingly easy situations. I am considering the use of N-R for minimization (rather than root finding, but the same applies). Even in the case of convex functions, N-R may not converge.

For example: $$f(x)=\ln(e^x+e^{-x})$$ is $C^{\infty}$, strictly convex, and admits a single (global) minimum at $0$.

[plot of $f(x)=\ln(e^x+e^{-x})$]

Yet, if we try to use N-R to find the minimum (i.e. the root of the derivative), the algorithm fails if started from a point $|x|>1.09$ (approx).

To see this, consider that $f'(x)=\tanh(x)$ and $f''(x)=\cosh^{-2}(x)$. Therefore, the N-R update rule maps the current $x$ to the new $x$ according to: $$x \leftarrow x - \frac{f'(x)}{f''(x)} = x - \frac12 \sinh(2x)$$ Whenever $|\frac12 \sinh(2x)| > |2x|$, the new iterate will be bigger in modulus than the previous one, i.e. N-R will diverge away from the solution. This happens for $|x|>1.09$ (approx).
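
To see the threshold numerically, here is a minimal sketch (my own, in Python; the cutoff $|x| > 300$ is just an arbitrary guard against overflow):

# Newton's method applied to f'(x) = tanh(x): the update is x <- x - sinh(2x)/2.
import math

def newton_tanh(x0, steps=8):
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        if abs(x) > 300:                    # sinh would overflow; clearly diverging
            break
        xs.append(x - 0.5 * math.sinh(2 * x))
    return xs

print(newton_tanh(1.08))   # magnitudes shrink: the iterates head to 0
print(newton_tanh(1.10))   # magnitudes grow: the iterates fly off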

  • Note that this happens because $\tanh$ is extremely flat for large $x$, so although you go in the right direction toward the root (since $\tanh$ has no extrema, as I said in my answer), you massively overshoot. (By the way, this is a good example, but I think it would be better to just go for root finding applied to $\tanh$ in the first place instead of talking about optimization.)
    – Ian
    Commented Aug 27, 2017 at 22:11
  • @Ian Indeed, the quadratic approximation fits locally but is very poor globally. A better approach in this case would be to use a majorization-minimization algorithm, which guarantees monotonic convergence towards a local (global in this case) minimum.
    – Luca Citi
    Commented Aug 27, 2017 at 22:15
  • I agree, it's far easier to see what's going on if you do root finding with $\tanh$ than minimization of its antiderivative.
    – user541686
    Commented Aug 28, 2017 at 4:11
  • @Ian and Mehrdad I see your point, but the reason I showed this example of minimization rather than root finding is that I wanted to contrast the apparent niceness of the function ($C^{\infty}$ and convex) with the fact that N-R does not converge. In minimization, convexity is a big deal, and I have heard many colleagues blindly assume that N-R would converge. I do not know an equivalent in the case of root finding.
    – Luca Citi
    Commented Aug 28, 2017 at 11:02
  • My other objection to such an example is that if you have a $C^2$ convex function and you use Newton's method to minimize it, then Newton's method always goes in the correct direction (in multiple dimensions, the correct direction and the Newton direction always have a positive dot product). It just might overshoot. You can't always detect overshoots, but you can detect that the objective function got bigger. If it did, then by convexity you must have overshot, and so you should trigger a line search to find a better point in between the new point and the previous point.
    – Ian
    Commented Aug 28, 2017 at 13:27

The relationship between starting point and eventual convergence point for Newton's method is complicated. Consider this image (from Wolfram's MathWorld, Newton's Method):

[Image: basins of attraction for Newton's method]

showing the basins of attraction in the complex plane for $x^3 - 1$, $x^4 - 1$, ... $x^{11} - 1$. Each colour corresponds to a root of the polynomial. Each point is colored to match the root that Newton's method eventually takes that point to.

These basins are typically quite complicated, but looking at the whole complex plane makes it a little clearer what is going on. The real line runs horizontally, halfway down these plots. If all you know is what is happening on the real axis, which is what you are seeing in your problem, it can be difficult to predict where a given point will end up.
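
To get a feel for how such labels are computed, here is a tiny sketch (my own, in Python, using $z^3 - 1$ rather than the higher-degree polynomials in the image; the 50-iteration budget and nearest-root labelling at the end are arbitrary choices):

# Label each starting point in a coarse 9x9 grid of the complex plane by the root of
# z^3 - 1 that Newton's method reaches from it.
import cmath

roots = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]   # cube roots of unity

def basin_char(z, iters=50):
    for _ in range(iters):
        if abs(z) < 1e-12:                 # derivative 3z^2 vanishes; leave unlabeled
            return "."
        z = z - (z**3 - 1) / (3 * z**2)    # Newton step for f(z) = z^3 - 1
    return str(min(range(3), key=lambda k: abs(z - roots[k])))

for row in range(9):
    im = 0.8 - 0.2 * row
    print("".join(basin_char(complex(-0.8 + 0.2 * col, im)) for col in range(9)))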


The Newton-Raphson method basically asks you to draw the tangent to the function at the point $x_0$; $x_1$ is then the point where that tangent hits the $x$-axis. This obviously works well if the function stays reasonably close to the tangent up to the point where $y = 0$, and not so well otherwise.

You can see that between $x = 0$ and $x = 4/3 \approx 1.33$ (the two extrema of $f$) the tangent actually points in the wrong direction, away from the root. And where the tangent is parallel or nearly parallel to the $x$-axis, close to $x = 0$ and $x = 4/3$, following the tangent will take you far away from the zero.

For $x_0 < -0.346$, $x_1$ will be closer to the solution than $x_0$ and will lie in the same range again, so the sequence will converge towards the solution. For $x_0 > 1.5$, $x_1$ will also be closer to the solution than $x_0$, but the iterates still have to make it across the region from $-0.346$ to $1.5$.

And at $x \approx 1.081792$ there is a point from which three iterations bring you back to where you started. :-)
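
A quick check of that three-cycle (my own Python sketch, using the rounded value quoted above, so the cycle will drift slightly after a few rounds):

# Newton's method on f(x) = 2x^2 - x^3 - 2, started at the period-3 point near 1.081792.
def f(x):
    return 2*x**2 - x**3 - 2

def df(x):
    return 4*x - 3*x**2

x = 1.081792
for k in range(6):
    x = x - f(x) / df(x)
    print(k + 1, x)   # roughly 2.2154, 1.6939, 1.0818, then the same three values again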
