Root Finding #1: Bisection Method

20 January 2024

We will start this series by learning the easiest root-finding algorithm: the Bisection Method. We will explore the ins and outs of this algorithm, how it works and why it works.

What are roots?


To understand roots, it is essential to understand the definition of functions. Think of a function as a machine that follows precise rules - you put something in, and it gives you something back. For example, a simple function might take a number, double it, and subtract 4. But here's where roots come in: sometimes we're particularly interested in finding what we need to put into our function to get zero as the output. Why zero? Zero is special - it's like a balancing point where positive numbers meet negative numbers. When a function's output crosses zero, it switches from positive to negative (or vice versa), making zero a crucial threshold.

f(x) = 2x - 4

Approximated up to 0.05
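
As a quick sanity check in code (a minimal sketch; the name f is just illustrative), here is that machine and its root:

const f = (x) => 2 * x - 4; // the machine: double the input, subtract 4

console.log(f(1)); // -2, output is negative
console.log(f(2)); //  0, so x = 2 is a root
console.log(f(3)); //  2, output is positive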

These "special inputs" that give us zero are what we call roots. Finding roots might sound abstract, but they show up in surprisingly practical situations. Imagine you're tracking two cars driving toward each other on a highway - finding when they meet means finding when the distance between them becomes zero. Or picture throwing a ball into the air - finding when it hits the ground means finding when its height becomes zero. Even in finance, if you're watching your bank account gradually decrease, finding when you'll run out of money means finding when your balance hits zero. In each case, we're really hunting for roots - those special input values that make something become zero.


Different types of roots require different handling. There are three types of roots:

  1. Simple Roots: x - 2 = 0 has a root at x = 2. They converge the fastest.
  2. Multiple Roots: (x - 2)^2 = 0 has a root at x = 2 with multiplicity 2. They require modified algorithms because the function touches zero without changing sign (see the sketch after this list).
  3. Complex Roots: x^2 + 1 = 0 has roots \pm i. They exist in the complex plane in \alpha + \beta i form (real + imaginary component), always appear in conjugate pairs, and need specialized complex-number techniques.
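
To see why sign-change methods like bisection struggle with multiple roots, here is a minimal JavaScript sketch (the names simple and multiple are just illustrative):

// Simple root: x - 2 changes sign across x = 2, so a bracket exists.
const simple = (x) => x - 2;
console.log(simple(1) * simple(3) < 0); // true: [1, 3] brackets the root

// Even multiplicity: (x - 2)^2 touches zero without changing sign,
// so no interval [a, b] can satisfy f(a) * f(b) < 0.
const multiple = (x) => (x - 2) ** 2;
console.log(multiple(1) * multiple(3) < 0); // false: no bracket exists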

Why Do We Need Numerical Methods?

Remember the root-finding formula for ax^2 + bx + c = 0? It was

x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}

It was the first root-finding formula we learned, and we still use it to this day. It was easy and straightforward. However, certain equations, such as e^x = x^2 or \sin(x) = x^3, cannot be solved algebraically. These equations don't have closed-form solutions, meaning there is no straightforward formula to compute their roots. Numerical methods allow us to approximate these solutions by iteratively getting closer to the real root. This iterative process ensures that we can get as close as needed to the exact solution, even if it cannot be expressed explicitly. Exact math often gives you perfect theoretical answers (like \pi or e) that cannot be written down exactly (both are irrational), but real-world usage requires concrete numeric values. Numerical methods bridge that gap by giving us answers that are "close enough" to be practical.
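
For instance, e^x = x^2 can be recast as a root-finding problem by moving everything to one side: g(x) = e^x - x^2 = 0. A minimal JavaScript sketch (g is an illustrative name):

// No formula solves e^x = x^2, but g(x) = e^x - x^2 lets us hunt
// for the input where the two sides balance, i.e. where g(x) = 0.
const g = (x) => Math.exp(x) - x ** 2;

console.log(g(-1)); // ≈ -0.632, negative
console.log(g(0));  //    1,     positive: a root hides in [-1, 0]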

Introduction to Numerical Methods for Root Finding

Two important things to remember about numerical root finding are:

  1. In numerical root finding, we run an iterative process that approximates the root. In each iteration/step, we do some calculation and get closer to the actual root.
  2. Since numerical root finding approximates the actual root, there will be some error in our approximation; each iteration/step reduces this error and gives a more accurate approximation.

Because we are dealing with approximations, it's crucial to understand the concept of convergence. A good root-finding method will converge towards the true root, meaning that with each iteration, the approximation gets progressively better. We continue this process until the change between successive approximations becomes sufficiently small, indicating that we are close enough to the actual root for our purposes.
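
In symbols (a small notational sketch; x_n denotes the n-th approximation and r the true root):

e_n = |x_n - r| \to 0 \ \text{as} \ n \to \infty, \qquad \text{stop when} \ |x_{n+1} - x_n| < \text{tolerance}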


Now, let's take a look at the Bisection Method, the simplest algorithm for finding roots.

Bisection Method

The Bisection Method is one of the easiest, most reliable and beginner-friendly methods for finding roots of equations, and its convergence is guaranteed. It only works on continuous functions.

Core Idea (Optional)

A continuous function has a root at some x = c within an interval [a, b] if f(a) \cdot f(b) < 0. This concept comes from the Intermediate Value Theorem.

Intermediate Value Theorem: If f is a continuous function whose domain contains the interval [a, b], then it takes on any given value between f(a) and f(b) at some point within the interval.


It means that for any value L between f(a) and f(b), there's a value c in [a, b] for which f(c) = L. Bolzano's Theorem further specifies it in the context of roots.


Bolzano's Theorem: If a continuous function has values of opposite sign inside an interval, then it has a root in that interval.

In the Bisection Method, we divide the interval in half at each step and check whether the midpoint of the interval evaluates to zero. If it does, that midpoint is our root. If it doesn't, we choose one of the two halves (based on a condition discussed below) and again check whether the midpoint of that interval evaluates to zero. We keep dividing the interval until it reaches our tolerance. Tolerance is the smallest interval we are willing to check. For example, if our tolerance is 0.001, then when our interval is so small that the difference between its start and end points is 0.001 or lower, we stop and select the midpoint of that interval as our root.
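
In code, the bracketing condition is a one-liner. A minimal sketch (bracketsRoot is an illustrative helper, and f is the function used in the practical example below):

// A sign change over [a, b] guarantees a root inside, for continuous f.
const bracketsRoot = (f, a, b) => f(a) * f(b) < 0;

const f = (x) => x ** 5 - x ** 3 - 0.5 * x;
console.log(bracketsRoot(f, 1, 2)); // true: f(1) = -0.5, f(2) = 23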

Steps (Mandatory)

  1. Initial Interval: Find two values, a and b, such that f(a) and f(b) have opposite signs. This ensures a root lies within the interval [a, b]. So, we calculate and see if f(a) \cdot f(b) < 0.
  2. Midpoint: Calculate the midpoint c of the interval, c = \frac{a + b}{2}.
  3. Evaluate: Calculate the value of f(c).
  4. Set New Interval:
    • If f(c) = 0, then c is the root.
    • If f(a) \cdot f(c) < 0, then the root lies in the interval [a, c]. Set b = c.
    • If f(b) \cdot f(c) < 0, then the root lies in the interval [c, b]. Set a = c.
  5. Repeat: Repeat steps 2-4 until the interval becomes sufficiently small (|b - a| < \text{tolerance}) or f(c) is close enough to zero. (A one-step sketch in code follows this list.)
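
Here is a minimal one-step sketch in JavaScript (the helper name bisectStep is illustrative): it performs steps 2-4 once and returns the new interval.

function bisectStep(f, a, b) {
  const c = (a + b) / 2;         // step 2: midpoint
  if (f(c) === 0) return [c, c]; // step 4: exact root found
  return f(a) * f(c) < 0
    ? [a, c]                     // root in the left half, so b = c
    : [c, b];                    // root in the right half, so a = c
}

const f = (x) => x ** 5 - x ** 3 - 0.5 * x;
console.log(bisectStep(f, 1, 2)); // [1, 1.5], matching Iteration 1 below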

Practical Example

Let's find a root of the function f(x) = x^5 - x^3 - \frac{1}{2}x in the interval [1, 2]. We can see from the graph that there are multiple roots (-1.16877, 0 and 1.16877) and in our selected interval the actual root is 1.16877.

f(x) = x^5 - x^3 - \frac{1}{2}x

Notice how the interval shrinks in each iteration and gets closer to the actual root. Zoom in on the graph to get a better view of the interval as it gets smaller.

Iteration 1
a = 1.000000, \ b = 2.000000
f(a) = f(1.000000) = -0.5000000
f(b) = f(2.000000) = 23.00000
c = \frac{a+b}{2} = \frac{1.000000 + 2.000000}{2} = 1.500000
f(c) = f(1.500000) = 3.468750
\because f(a: 1.000000) \cdot f(c: 1.500000) < 0, the new interval is [1.000000, 1.500000]
Iteration 2
a = 1.000000, \ b = 1.500000
f(a) = f(1.000000) = -0.5000000
f(b) = f(1.500000) = 3.468750
c = \frac{a+b}{2} = \frac{1.000000 + 1.500000}{2} = 1.250000
f(c) = f(1.250000) = 0.4736328
\because f(a: 1.000000) \cdot f(c: 1.250000) < 0, the new interval is [1.000000, 1.250000]
Iteration 3
a = 1.000000, \ b = 1.250000
f(a) = f(1.000000) = -0.5000000
f(b) = f(1.250000) = 0.4736328
c = \frac{a+b}{2} = \frac{1.000000 + 1.250000}{2} = 1.125000
f(c) = f(1.125000) = -0.1842957
\because f(b: 1.250000) \cdot f(c: 1.125000) < 0, the new interval is [1.125000, 1.250000]
Iteration 4
a = 1.125000, \ b = 1.250000
f(a) = f(1.125000) = -0.1842957
f(b) = f(1.250000) = 0.4736328
c = \frac{a+b}{2} = \frac{1.125000 + 1.250000}{2} = 1.187500
f(c) = f(1.187500) = 0.09308147
\because f(a: 1.125000) \cdot f(c: 1.187500) < 0, the new interval is [1.125000, 1.187500]
Iteration 5
a = 1.125000, \ b = 1.187500
f(a) = f(1.125000) = -0.1842957
f(b) = f(1.187500) = 0.09308147
c = \frac{a+b}{2} = \frac{1.125000 + 1.187500}{2} = 1.156250
f(c) = f(1.156250) = -0.05732092
\because f(b: 1.187500) \cdot f(c: 1.156250) < 0, the new interval is [1.156250, 1.187500]
Iteration 6
a = 1.156250, \ b = 1.187500
f(a) = f(1.156250) = -0.05732092
f(b) = f(1.187500) = 0.09308147
c = \frac{a+b}{2} = \frac{1.156250 + 1.187500}{2} = 1.171875
f(c) = f(1.171875) = 0.01480922
\because f(a: 1.156250) \cdot f(c: 1.171875) < 0, the new interval is [1.156250, 1.171875]
Iteration 7
a = 1.156250, \ b = 1.171875
f(a) = f(1.156250) = -0.05732092
f(b) = f(1.171875) = 0.01480922
c = \frac{a+b}{2} = \frac{1.156250 + 1.171875}{2} = 1.164063
f(c) = f(1.164063) = -0.02200547
\because f(b: 1.171875) \cdot f(c: 1.164063) < 0, the new interval is [1.164063, 1.171875]
Iteration 8
a = 1.164063, \ b = 1.171875
f(a) = f(1.164063) = -0.02200547
f(b) = f(1.171875) = 0.01480922
c = \frac{a+b}{2} = \frac{1.164063 + 1.171875}{2} = 1.167969
f(c) = f(1.167969) = -0.003787778
\because f(b: 1.171875) \cdot f(c: 1.167969) < 0, the new interval is [1.167969, 1.171875]
Iteration 9
a = 1.167969, \ b = 1.171875
f(a) = f(1.167969) = -0.003787778
f(b) = f(1.171875) = 0.01480922
c = \frac{a+b}{2} = \frac{1.167969 + 1.171875}{2} = 1.169922
f(c) = f(1.169922) = 0.005463023
\because f(a: 1.167969) \cdot f(c: 1.169922) < 0, the new interval is [1.167969, 1.169922]
Iteration 10
a = 1.167969, \ b = 1.169922
f(a) = f(1.167969) = -0.003787778
f(b) = f(1.169922) = 0.005463023
c = \frac{a+b}{2} = \frac{1.167969 + 1.169922}{2} = 1.168945
f(c) = f(1.168945) = 0.0008257340
\because f(a: 1.167969) \cdot f(c: 1.168945) < 0, the new interval is [1.167969, 1.168945]
Iteration 11
a = 1.167969, \ b = 1.168945
f(a) = f(1.167969) = -0.003787778
f(b) = f(1.168945) = 0.0008257340
c = \frac{a+b}{2} = \frac{1.167969 + 1.168945}{2} = 1.168457
f(c) = f(1.168457) = -0.001483990

\because b - a \le \text{tolerance} \ (1.168945 - 1.167969 \le 0.001), we stop the iteration and the last value of c is our approximation of the actual root.

\therefore The approximated root is 1.16846, which is very close to the actual root 1.16877. The deviation is within the tolerance (0.001). By lowering the tolerance, we can get a better approximation; however, the iteration count will increase.
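
Because the interval halves each iteration, we can even estimate the iteration count in advance. Starting from an interval of width b - a, after n iterations the width is \frac{b - a}{2^n}, so we need

\frac{b - a}{2^n} \le \text{tolerance} \implies n \ge \log_2\left(\frac{b - a}{\text{tolerance}}\right) = \log_2\left(\frac{1}{0.001}\right) \approx 9.97

That is, about 10 halvings for this example, in line with the 11 iterations we observed (the exact count depends on where the stopping check happens).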


As we can see from the graphs, the interval shrinks by half in each iteration and our approximation gets closer to the actual root. In the last iteration, take a careful look at the graph and zoom in to see that the value we got (1.16846) is almost equal to the actual root (1.16877). Take a quick look at the table below for a bird's-eye view of the entire process; it contains the essential data of each iteration.


Bisection Data

Iteration |        a |        b | c = (a+b)/2 |          f(c)
----------|----------|----------|-------------|--------------
        1 | 1.000000 | 2.000000 |    1.500000 |      3.468750
        2 | 1.000000 | 1.500000 |    1.250000 |     0.4736328
        3 | 1.000000 | 1.250000 |    1.125000 |    -0.1842957
        4 | 1.125000 | 1.250000 |    1.187500 |    0.09308147
        5 | 1.125000 | 1.187500 |    1.156250 |   -0.05732092
        6 | 1.156250 | 1.187500 |    1.171875 |    0.01480922
        7 | 1.156250 | 1.171875 |    1.164063 |   -0.02200547
        8 | 1.164063 | 1.171875 |    1.167969 |  -0.003787778
        9 | 1.167969 | 1.171875 |    1.169922 |   0.005463023
       10 | 1.167969 | 1.169922 |    1.168945 |  0.0008257340
       11 | 1.167969 | 1.168945 |    1.168457 |  -0.001483990

Approximated Root: 1.168457

Implementation

Below is the pseudocode for the implementation. Try to understand the steps and implement it on your own before looking at the solution.

1. Define the function f(x) that you want to find the root of.

2. Input:
   a - Start of the interval
   b - End of the interval
   tolerance - Acceptable error in the root value

3. Repeat while the width of the interval (b - a) is greater than tolerance:
   a. Calculate the midpoint:
      c = (a + b) / 2

   b. Check if the function value at the midpoint is zero:
      If f(c) == 0:
          Return c (The exact root is found)

   c. Narrow down the interval:
      If f(a) * f(c) < 0:
          The root is in the left half, so set b = c
      Else:
          The root is in the right half, so set a = c

4. When the interval is small enough:
   Return (a + b) / 2 as the approximate root.

Here's an implementation of the Bisection Method in JavaScript:

function bisectionMethod(a, b, tolerance) {
  // Keep halving until the bracket is narrower than the tolerance.
  while (b - a > tolerance) {
      let c = (a + b) / 2;
      // Lucky case: the midpoint is exactly the root.
      if (f(c) === 0) {
          return c;
      }
      // Keep the half whose endpoints still have opposite signs.
      if (f(a) * f(c) < 0) {
          b = c; // root lies in [a, c]
      } else {
          a = c; // root lies in [c, b]
      }
  }
  // The midpoint of the final tiny interval is our approximation.
  return (a + b) / 2;
}

const f = (x) => x**5 - x**3 - 1/2 * x;
const root = bisectionMethod(1, 2, 0.001);
console.log("Root: ", root); // ≈ 1.16846
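
In practice, two safeguards are commonly added that the minimal version above omits: validating the initial bracket and capping the iteration count. Below is a sketch of such a variant (the name bisectionSafe and the maxIterations parameter are illustrative assumptions, not a standard API):

function bisectionSafe(f, a, b, tolerance, maxIterations = 100) {
  // Guard: without a sign change, no root is guaranteed in [a, b].
  if (f(a) * f(b) >= 0) {
    throw new Error("f(a) and f(b) must have opposite signs");
  }
  // The iteration cap prevents an endless loop on bad inputs.
  for (let i = 0; i < maxIterations && b - a > tolerance; i++) {
    const c = (a + b) / 2;
    if (f(c) === 0) return c; // landed exactly on the root
    if (f(a) * f(c) < 0) {
      b = c; // root in [a, c]
    } else {
      a = c; // root in [c, b]
    }
  }
  return (a + b) / 2;
}

const root2 = bisectionSafe((x) => x**5 - x**3 - 1/2 * x, 1, 2, 0.001);
console.log("Root: ", root2); // ≈ 1.16846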

Important Terms

  • Continuous Function: A continuous function is a smooth curve with no breaks, jumps, or sudden changes. It means you can draw the function without lifting your pencil from the paper.
  • For example, \frac{1}{x} is not continuous: it is undefined at x = 0, so the Bisection Method won't work on any interval containing that point (see the sketch after this list).

  • Convergence/Converging/Converge: This means an algorithm steadily approaches the true solution, systematically reducing the gap between the current estimate and the exact answer. It's like getting closer to a target with each step, where the distance between your current position and the target keeps shrinking.
  • Multiplicity: Multiplicity refers to how many times a root appears or touches the x-axis. A root with multiplicity 2 means the function just touches the x-axis without crossing it. Higher multiplicity means the function "lingers" near the x-axis more.
  • For example, f(x) = (x - 3)^2 (x + 1) has roots: x = 3 with multiplicity 2 and x = -1 with multiplicity 1.

    At x = 3, the function touches the x-axis gently. At x = -1, the function crosses the x-axis normally.

  • Complex Plane: The complex plane is a 2D graph where real numbers (\alpha) are plotted horizontally and imaginary numbers (\beta i) vertically. Each point represents a complex number as a coordinate, with \alpha representing the real part and \beta the imaginary coefficient.
  • Complex Conjugates: Complex conjugates are pairs of complex numbers that have the same real part but opposite imaginary parts. For example, 3 + 2i and 3 - 2i are complex conjugates: they're mirror images across the real axis.
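
As promised above, here is a sketch of what goes wrong on a discontinuous function. 1/x flips sign across [-1, 1], so the bracket test passes, yet there is no root; bisection homes in on the discontinuity at x = 0 instead:

// 1/x changes sign over [-1, 1], but the flip comes from the jump
// at x = 0, not from crossing zero, so there is no root to find.
const h = (x) => 1 / x;
console.log(h(-1) * h(1) < 0); // true: the bracket test is fooled

// A few bisection steps shrink the interval toward x = 0 anyway.
let a = -1, b = 1;
for (let i = 0; i < 5; i++) {
  const c = (a + b) / 2;
  if (h(a) * h(c) < 0) b = c; else a = c;
  console.log([a, b]); // narrows toward x = 0, where h is undefined
}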

Conclusion

The Bisection Method is one of the easiest numerical methods. Understanding its core concepts and implementation will help us get familiar with the world of computational mathematics. The upcoming posts in this series will refer back to this post because this is one of the foundational algorithms in numerical analysis. So, if there are still any doubts or confusions, read the steps slowly and try to understand why each step was done. Ask yourself, "If this step was not executed, how would it affect the end result?" and use the graphs to visualize how each iteration depends on the previous iteration's data. This concludes the Bisection Method. In the next post, we will explore the Newton-Raphson Method.

Part of series: Numerical Methods for Root Finding

Part 1 of 2