
I understand the algorithms and the formulae associated with numerical methods for finding roots of functions in the real domain, such as Newton's Method, the Bisection Method, and the Secant Method. Because their formulae are constructed differently, they will naturally differ numerically at any given iteration. However, what are the exact advantages of each algorithm?

All I know about these algorithms, other than their formulae (recalled below), is:

  1. Newton's Method converges quadratically.

  2. The Secant Method bypasses the need to compute a derivative; however, it converges only superlinearly.

  3. The Bisection Method converges linearly.
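
For reference, the update rules I am referring to are

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \quad \text{(Newton)}, \qquad x_{n+1} = x_n - f(x_n)\,\frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} \quad \text{(Secant)},$$

while Bisection replaces the bracket $[a_n, b_n]$ by whichever half, $[a_n, m_n]$ or $[m_n, b_n]$ with $m_n = \tfrac{a_n + b_n}{2}$, still contains a sign change of $f$.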


1 Answer


In the real-world situations I have encountered (and I have encountered several), evaluating the function is by far the most expensive thing you can do. Even a massive amount of side calculation to avoid a single function evaluation is well worth the while. So the faster a method converges, the better a choice it can be, provided you can meet its requirements.

  1. Newton's method is great for speed, but it does require that you know the derivative, and I have yet to encounter a real-world application where this was available. That is not to say such applications don't occur; I have just not been so lucky. Another problem with Newton's method is instability: if you hit a place where the function is close to flat, it may send your next iteration out beyond Pluto. In fact, there is no guarantee of convergence at all; you can find it getting caught in a loop. (A minimal sketch appears after this list.)

  2. Secant Method. Well, if you can't find the tangent line because you don't know the derivative, estimate it with a secant line instead. There is a school of thought that this can be faster than Newton's method despite the slower convergence, because it only requires one new function evaluation per iteration, while Newton's method requires two (one for the function and one for its derivative). I am not sure I buy it, but as I said, I have no practical experience with Newton's method (though plenty of academic experience with it). This method has exactly the same instability problems as Newton's method. (It is also sketched after this list.)

  3. Bisection Method. Guaranteed convergence, provided you can straddle the root at the start. Easily understood, easily programmed, easily performed, slow as blazes. Never sends your iteration off into the wild blue yonder. But still slow as blazes. This is your fallback method when all else fails. (There is a sketch after this list, too.)

  4. [Edit:] Recently, the ITP Method has been published. At least in my limited experience, it out-performs Brent's Method (see the original item below) by about 25%. Like the Bisection Method, you must bracket a root before you start by finding values where the function is positive and negative, and it also gives guaranteed convergence. Further, you can choose the maximum number $n_0$ of additional steps, beyond those Bisection would need, that it will take before converging in a worst-case scenario. You can even choose $n_0 = 0$, though at the cost of making more common scenarios converge more slowly. In my own use cases, I've found $n_0 = 2$ works best overall. Two other parameters, $k_1$ and $k_2$, may also be adjusted to improve convergence. In my use cases, I found $k_1 = 0.001$ divided by the width of the original interval and $k_2 = 2$ worked best, but your mileage may vary. (A sketch of the iteration appears after this list.)


[Original:] 4. Brent's Method. No, you did not mention this one, but in practice some variant of Brent's Method is usually what you want to use. This method combines the Secant and Bisection Methods with a third method called "Inverse Quadratic", which is like the secant method but approximates the function with an inverse quadratic (a quadratic in $y$) instead of a line. It results in a slight improvement in convergence speed. Occasionally it fails, so the secant method is used as a back-up. Brent's Method also monitors convergence, and if it gets too slow or tries to go off into the wild, the method drops in a Bisection step instead to get things back on track. The problem I've found with Brent's Method is that one side of your interval converges quickly to a root, while the other side barely moves, because the Secant/Inverse Quadratic steps keep landing on the same side of the root; Brent's Method eventually brings in the other side by slow Bisection. A trick I've employed with some success is to intentionally double the step size of the Secant/Inverse Quadratic steps in an effort to overshoot the root, thereby bringing that side in as well. Bringing in the other side of the interval by a large amount usually improves convergence speed significantly, even on the side that was already converging. (A library-based usage example appears below.)
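
To make item 1 concrete, here is a minimal Newton sketch in Python. It is only an illustration: the function names, tolerance, and iteration cap are my own choices, and `f` and `fprime` are assumed to be supplied by you.

```
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: follow the tangent line from each iterate."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), fprime(x)
        if dfx == 0:
            raise ZeroDivisionError("f'(x) = 0: the tangent step is undefined")
        step = fx / dfx          # a nearly flat f makes this step huge ("beyond Pluto")
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("no convergence: possibly stuck in a loop or diverging")
```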
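
Item 2, under the same caveats; note the single new evaluation of `f` per pass:

```
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: use the chord through the last two iterates."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            raise ZeroDivisionError("horizontal secant line")
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)       # the one new function evaluation
    raise RuntimeError("no convergence")
```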
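
Item 3, again as a sketch; the only requirement is the initial straddle:

```
def bisect(f, a, b, tol=1e-12):
    """Bisection: keep whichever half of [a, b] still brackets a sign change."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while abs(b - a) > tol:
        m = (a + b) / 2
        fm = f(m)
        if fm == 0:
            return m
        if fa * fm < 0:          # root is in [a, m]
            b, fb = m, fm
        else:                    # root is in [m, b]
            a, fa = m, fm
    return (a + b) / 2
```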
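
For item 4 ([Edit:]), here is my reading of the published ITP (Interpolate–Truncate–Project) iteration. Treat it as a sketch to check against the paper, not a reference implementation; the defaults for `eps`, `kappa1`, `kappa2`, and `n0` below are placeholders, not the tuned values quoted above.

```
import math

def itp(f, a, b, eps=1e-10, kappa1=0.1, kappa2=2.0, n0=1):
    """ITP: bisection's worst-case guarantee, usually far fewer evaluations."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    if fa > 0:                                   # enforce f(a) < 0 < f(b)
        a, b, fa, fb = b, a, fb, fa
    n_half = math.ceil(math.log2(abs(b - a) / (2 * eps)))
    n_max = n_half + n0                          # worst-case iteration budget
    j = 0
    while abs(b - a) > 2 * eps:
        x_half = (a + b) / 2
        x_f = (fb * a - fa * b) / (fb - fa)      # interpolate (regula falsi point)
        sigma = math.copysign(1.0, x_half - x_f)
        delta = kappa1 * abs(b - a) ** kappa2
        x_t = x_f + sigma * delta if delta <= abs(x_half - x_f) else x_half   # truncate
        r = eps * 2 ** (n_max - j) - abs(b - a) / 2
        x_itp = x_t if abs(x_t - x_half) <= r else x_half - sigma * r         # project
        y = f(x_itp)
        if y > 0:
            b, fb = x_itp, y
        elif y < 0:
            a, fa = x_itp, y
        else:
            return x_itp
        j += 1
    return (a + b) / 2
```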
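
Finally, for the original item 4, you usually don't need to code Brent's Method yourself. Assuming SciPy is available (and using a made-up example function), it is a one-liner:

```
from scipy.optimize import brentq

# f(x) = x^3 - 2x - 5 changes sign on [1, 3], so the root is bracketed.
root = brentq(lambda x: x**3 - 2*x - 5, 1.0, 3.0, xtol=1e-12)
print(root)        # approximately 2.0945514815
```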

  • You might try out variants of regula falsi, like the Illinois method: easier to code than Brent's method, following the same idea of preventing stalling, and thus faster than plain regula falsi. Commented Oct 5, 2015 at 6:55
  • This is a fabulous post, and I heavily, heavily appreciate the time and effort you put into typing this. Commented Oct 6, 2015 at 0:36
  • @Paul Sinclair, do you by chance know of a function where Newton's Method would end up getting caught in a loop, as you mentioned? – Dragonite Commented Sep 29, 2017 at 15:40
  • @Dragonite - $y = 1 + |x|$ is a simple example. Of course, in this case it has no root to find and is not differentiable at one point, but the function could be modified close to $0$ to fix both issues. If your starting point is outside the modified zone, though, it will just flip back and forth between $1$ and $-1$. Commented Sep 29, 2017 at 23:21
  • The issue you are experiencing with Brent's method is actually an improper use of tolerance. After the last interpolating iteration is used, it should be applied again and rounded towards the midpoint by an amount less than the desired error (say, half the error). This will cause the result to land on the opposite side of the root and snap the interval size down to the desired error. (Note: this marginal rounding can simply be applied to every result.) (Additional note: what you are suggesting will actually slow the convergence, whereas the suggested modification doesn't.) Commented Aug 28, 2020 at 19:08
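
To see the cycling behaviour from the comments above numerically, here is a tiny check on $y = 1 + |x|$ itself (ignoring the smoothing near $0$ that the comment mentions; the starting point is arbitrary):

```
import math

f = lambda x: 1 + abs(x)
fprime = lambda x: math.copysign(1.0, x)   # derivative of |x| away from 0

x = 3.0
for i in range(6):
    x = x - f(x) / fprime(x)
    print(i, x)                            # settles into the 2-cycle -1, 1, -1, 1, ...
```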
