Sensor Fusion with the Extended Kalman Filter in ROS 2

Carlos Argueta
May 29, 2024


This is the second article in a series dedicated to introducing Gaussian Filters. Specifically, it is the second of three in-depth introductions to the Kalman Filter family. If you are not already familiar with Kalman Filters, I recommend reading the first article before continuing with this one. The subsequent article will introduce the Unscented Kalman Filter. Data and code to reproduce the results discussed here can be found towards the end of this article.

Introduction

As introduced in the previous article, successful robotic systems can perceive and manipulate the physical world to complete useful tasks. To achieve this, robots must account for the significant uncertainty in their environment. One of the most fundamental problems in modern robotics is State Estimation. State Estimation involves determining the most probable state (e.g., position, orientation, speed) of a robot and its environment (e.g., location of landmarks and other objects) based on uncertain (noisy) and possibly incomplete information.

The Kalman Filter addresses state estimation in an uncertain world. The filter models each element of the state (e.g., x, y coordinates, heading) as a Gaussian random variable. Gaussians enable us to represent the probability density over a state x using only the mean vector μ (mu) and the covariance matrix Σ (Sigma). In this parameterization, our state x is represented by its expected value μ, and Σ captures the inherent uncertainty in the state due to control, motion, and observation noise.

Under the Bayesian framework, which the Kalman Filter employs, the entire state is referred to as the belief. The advantage of using Gaussians (or Normals) lies in their mathematical properties, which simplify the Kalman Filter equations. A key property is their closure under linear transformations: when a Gaussian belief undergoes a linear transformation, the result remains a Gaussian random variable. This property ensures that the equations of the Kalman Filter remain elegant and manageable.

Linear transformation of a Gaussian variable. Source: Probabilistic Robotics slides
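To make this property concrete, here is a minimal NumPy sketch (the numbers are arbitrary illustrations): the closed-form mean and covariance of the transformed Gaussian match what we get by pushing samples through the transformation directly.

import numpy as np

rng = np.random.default_rng(42)

# Prior belief: a 2-D Gaussian with mean mu and covariance Sigma
mu = np.array([1.0, 2.0])
Sigma = np.array([[0.5, 0.1],
                  [0.1, 0.3]])

# A linear transformation y = A x + b
A = np.array([[2.0, 0.0],
              [0.5, 1.0]])
b = np.array([0.0, -1.0])

# Closed form: the output is again Gaussian, with mean A mu + b
# and covariance A Sigma A^T
mu_y = A @ mu + b
Sigma_y = A @ Sigma @ A.T

# Monte Carlo check: transform samples and compare the moments
samples = rng.multivariate_normal(mu, Sigma, size=100_000)
y = samples @ A.T + b
print(mu_y, y.mean(axis=0))              # means agree
print(Sigma_y, np.cov(y, rowvar=False))  # covariances agree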

To calculate the posterior belief, the Kalman Filter uses a motion model to propagate the prior belief forward in time. An observation model then integrates data from the robot’s sensors, updating the predicted belief and converting it into the posterior. The Kalman Filter algorithm is summarized below:

The Kalman Filter algorithm. Source: Probabilistic Robotics book.

The “linearity” of the Linear Kalman Filter is most evident in lines 2 and 5 of the algorithm. In line 2, the predicted state is a linear function of the previous state μ_t−1​ and the control input u_t​. In line 5, the predicted observation (y = Cμ) is a linear function of its argument μ^bar_t​. This linearity maintains the Gaussian properties, making the filter easy to implement. However, it also represents one of its major weaknesses.

The reason the Linear Kalman Filter is inadequate for real-world problems is that such problems are often not linear. Even the simple state estimation task introduced in my previous article, where the state was the 2D pose of a mobile robot (x, y, θ), was not exactly linear. This is easy to see from the motion model used, which is reintroduced below.

The constant velocity motion model equations
The constant velocity motion model using matrices and vectors

As you can see, the model uses trigonometric functions to update the robot's coordinates, and such functions are not linear. With this model, the Linear Kalman Filter is likely to diverge at some point. For this reason, after Dr. Kalman introduced the Linear Kalman Filter, work immediately began on devising extensions to account for the non-linear nature of most real-world phenomena.

The Extended Kalman Filter

The problem with nonlinearities

To further illustrate the problem with nonlinearities, we can look at the following animations. In the first one, we are in a world where the assumptions of the Linear Kalman Filter hold true, and the new state is simply linear in its arguments. To make the visualization easy to understand, we assume a 1-dimensional state x. The animation shows how a Gaussian, when passed through a linear function g, results in another Gaussian. In this case, g = -0.5*x+1.

Passing a Gaussian through a linear function g = -0.5*x + 1.

If we start from a Gaussian representation of x but choose a nonlinear function g, the resulting probability density function is no longer Gaussian. There is no closed-form method for computing this new density. Instead, we must sample points from the input distribution, pass them through g, and build an output histogram to construct the output distribution. It is clear from the shape of this output, shown below, that it is not Gaussian. This output distribution also violates the unimodality assumption of the Kalman Filter, which requires a single peak. A Gaussian approximation of P(y) was obtained by fitting a Gaussian to the output data. This highlights the limitations of the Linear Kalman Filter in handling nonlinearities in real-world models.

Passing a Gaussian through a nonlinear function g = cos(3 * (x/2 + 0.7)) * sin(1.3 * x) - 1.6 * x.
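The sampling procedure just described is easy to sketch in NumPy (assuming the example function from the animation above):

import numpy as np

rng = np.random.default_rng(0)

def g(x):
    # The nonlinear example function from the animation above
    return np.cos(3 * (x / 2 + 0.7)) * np.sin(1.3 * x) - 1.6 * x

# Input belief: a 1-D Gaussian
mu, sigma = 0.0, 1.0
x = rng.normal(mu, sigma, size=200_000)

# Push the samples through g; the resulting density is NOT Gaussian
y = g(x)

# Moment matching: fit a Gaussian to the output samples
mu_y, sigma_y = y.mean(), y.std()
print(f"fitted Gaussian: mu = {mu_y:.3f}, sigma = {sigma_y:.3f}")

# A histogram of y (e.g., np.histogram(y, bins=100, density=True))
# reveals the multimodal, non-Gaussian shape shown in the figure.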

In summary, when dealing with nonlinear models, the output density is not Gaussian, can’t be computed in closed form, and often has multiple peaks, rendering our Kalman Filter equations useless. To address this, the Extended Kalman Filter (EKF) drops the linearity assumption. Instead, it accepts that the state transition probability and the measurement probabilities are governed by nonlinear functions g and h, respectively.

Estimate of state x via the non-linear action model. Source: Probabilistic Robotics slides
Likelihood of a measurement z via the non-linear sensor model. Source: Probabilistic Robotics slides.

This is all good, but to leverage the elegant equations of the Kalman Filter, we still need to keep representing our belief as a Gaussian. This necessity forces us to find good Gaussian approximations of the true belief, effectively shifting the focus of the EKF from computing the exact posterior to efficiently estimating its mean and covariance. As shown in the figure below, after using Monte Carlo to compute the output distribution, we can fit a Gaussian to it and obtain the necessary parameters μ (mu) and Σ (Sigma). However, the issue remains that we cannot compute this Gaussian in closed form.

Output distribution and fitted Gaussian

Linearization via Taylor Expansion

To solve this issue, the Extended Kalman Filter applies an additional approximation called linearization. The key idea of linearization is to approximate a nonlinear function g by a linear function that is tangent to g at the point of interest. Among the various techniques used to linearize nonlinear functions, the one employed by the EKF is the first-order Taylor Expansion. The Taylor Expansion of a function is an infinite sum of terms (polynomials) expressed in terms of the function’s derivatives at a single point a.

The Taylor Expansion of a function g
Source: Wikipedia

From the image above, we can see that a higher-order Taylor Expansion gives a closer approximation of g around a point a = 0. However, as the degree of the polynomials increases, so does the required computation, making the problem quickly intractable. Fortunately, for a Kalman Filter that updates frequently (small Δt), the different points of interest a should be very close to each other. Thus, we can use a polynomial of degree 1 (a line) to obtain a linear approximation of the function g from its value at each subsequent (and very near) point a and its slope (derivative at a). The problem is essentially reduced to:

First order Taylor Expansion of g

To illustrate why this works, imagine we have a function g as follows:

The first-order Taylor expansion of g at a point a is shown below in red. It clearly does not provide a good approximation if we observe it over a large range of x values.

First order Taylor Expansion for the non-linear function g

However, since we are only concerned with the approximation at the values of the posterior, and we know that we will be recomputing such approximations for the new posteriors after a very short period of time, we can see from the graph below that our approximation is quite good in the close vicinity of the point of interest. This ensures that the first-order Taylor expansion provides a sufficient approximation for our purposes, enabling us to apply the Extended Kalman Filter effectively despite the inherent non-linearities in the system.

First order Taylor Expansion for the non-linear function g, zoomed-in near the point of interest.

Finally, the image below attempts to compare the two processes: the first starts with the original Gaussian, passes it through the non-linear function g, uses Monte Carlo to obtain the non-Gaussian output distribution, and fits a Gaussian to this output. The second process, employed by the EKF, linearizes g, passes the original Gaussian through this linear approximation, and directly obtains the output in closed form via linearization. This comparison highlights the efficiency and practicality of the EKF approach for dealing with non-linear systems.

Two output Gaussians from an original Gaussian passed through a nonlinear function g

For clarity, the output alone is shown below. As you can see, the EKF Gaussian is not exactly the same as the one fitted from the Monte Carlo simulation, but it is close enough. This small discrepancy is the price we pay to efficiently obtain a closed-form estimate of the true distribution.

Monte Carlo Gaussian vs EKF Gaussian
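This comparison can be reproduced numerically in a few lines (again assuming the example function g from above; the slope is taken numerically for brevity). With a narrow input Gaussian the two outputs should agree closely, and the gap should grow as the input uncertainty grows:

import numpy as np

rng = np.random.default_rng(0)

def g(x):
    return np.cos(3 * (x / 2 + 0.7)) * np.sin(1.3 * x) - 1.6 * x

def g_prime(x, eps=1e-6):
    # Numerical slope of g at x (an analytical derivative also works)
    return (g(x + eps) - g(x - eps)) / (2 * eps)

mu, sigma = 0.0, 0.2  # a fairly tight input belief

# Monte Carlo reference: fit a Gaussian to the true output samples
y = g(rng.normal(mu, sigma, size=200_000))
print(f"Monte Carlo: mu = {y.mean():.4f}, sigma = {y.std():.4f}")

# EKF-style closed form: propagate through the tangent line at mu,
# y ≈ g(mu) + g'(mu)(x - mu), so mu_y = g(mu) and sigma_y = |g'(mu)| * sigma
print(f"EKF approx : mu = {g(mu):.4f}, sigma = {abs(g_prime(mu)) * sigma:.4f}")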

The simple examples described above were for the scalar case, but our state is a vector. Hence, to obtain the slope, we compute the partial derivative of g with respect to the state.

The slope of g. Source: Probabilistic Robotics book.

The values of both g and its slope g′ depend on their arguments (u_t and x_t-1), which are our points of interest (as opposed to a in the scalar case). For the value of u_t, we simply use the control command provided to the robot. An obvious choice for x_t-1 is the most likely value of the state at the time of linearization. For a Gaussian, the most likely value is the expected state, i.e., the mean of the posterior computed at the previous time step, denoted by μ_t-1. This linearization works well for a rapidly updating filter (very small Δt), where the difference between μ_t-1 and the current state we are trying to estimate is not too large.

With the slope now calculated, we can approximate g as:

The approximation of g via first order Taylor Expansion. Source: Probabilistic Robotics book.

As a Gaussian, the motion model or state transition probability is written as below, with R_t as the usual process noise covariance.

Gaussian of the linearized motion model. Source: Probabilistic Robotics book.

Notice that, unlike the scalar case where the slope is a number, g′(u_t, μ_t-1), known as G_t​, is a matrix. This matrix, containing all of the first-order partial derivatives of the motion model represented by the nonlinear function g with respect to the state x, is known as the Jacobian. This Jacobian matrix has dimensions n×n, where n is the dimension of the state. Since the value of the Jacobian depends on the current control and the mean of the previous posterior, its value changes for each point in time.

The Extended Kalman Filter deals with nonlinear observation models represented by the function h in the same way. In particular, the Taylor Expansion is performed around the newly predicted belief μ^bar_t​, which is clearly the most likely state at the moment of linearizing h.

The slope of h. Source: Probabilistic Robotics book.
The approximation of h via first order Taylor Expansion. Source: Probabilistic Robotics book.

As a Gaussian, the measurement model is written as below, with Q_t as the usual measurement noise covariance. Here, the Jacobian H_t is the m×n matrix of first-order partial derivatives of the observation model represented by the nonlinear function h with respect to the state x. Hence, m is the dimension of the observations, and n is the dimension of the state.

Gaussian of the linearized observation model. Source: Probabilistic Robotics book.

The Extended Kalman Filter Algorithm

In essence, the EKF algorithm shown below is very similar to the LKF algorithm shown earlier in this article, with the main differences being the linearization of the motion and observation models at lines 2 and 5.

The Extended Kalman Filter algorithm. Source: Probabilistic Robotics book.

The image below restates the EKF algorithm side by side with the LKF to highlight the main differences. For the prediction step, the EKF uses the nonlinear function g to evolve the state in time, rather than the linear system matrices A and B. It uses the Jacobian G to evolve the uncertainty in time, rather than using A. For the update or correction step, the EKF predicts the measurement with the nonlinear function h rather than the matrix C. It also uses the Jacobian H instead of the matrix C to update the covariance.

Most important differences between the LKF and the EKF. Source: Probabilistic Robotics slides.

Implementation

To demonstrate the Extended Kalman Filter, this article will first introduce a linearized version of the models used by the Linear Kalman Filter implemented in the previous article. Next, more complex variations, including one that uses sensor fusion, will be introduced and their performances compared.

Linearizing the Models

Recall that the motion model used with the Linear Kalman Filter was a simpler constant velocity model that was not exactly linear, but was still simple enough for the LKF to handle. For reference, we show it again below. This model will serve as the basis for introducing more complex variations that will be used to demonstrate the capabilities of the Extended Kalman Filter.

A simple constant velocity motion model

These equations were converted into the linear system matrices required by the LKF as below:

The velocity motion model using matrices and vectors

To use the same model with the EKF, we need to linearize it. First, we restate the format of the state of the robot.

Vector representation of the state x of the robot

We then define the nonlinear function g.

The nonlinear motion function g

Next, we define the Jacobian matrix G, which includes the partial derivatives of the function g with respect to each of the state variables x, y, and θ. This matrix captures how changes in the state variables affect the motion model. The Jacobian matrix G will be used in the EKF algorithm to update the covariance matrix, allowing us to account for the non-linearities in the motion model while maintaining the computational efficiency of the Kalman Filter framework.

The Jacobian matrix G of the nonlinear function g

Looking at the first row of the Jacobian matrix G, which represents the derivatives of the x update equation with respect to x, y, and θ, we see the following:

  • The first entry is 1 because this is the partial derivative of x with respect to itself.
  • The second entry is 0, indicating that x does not depend on y.
  • The third entry is −v·sin(θ)·Δt, showing that the x update depends on θ through this term. This reflects the effect of changes in θ on the x coordinate.

Next, we need the measurement function h. Similar to the previous implementation of the Linear Kalman Filter, the only measurements we will use are those provided by the robot’s odometry system. This system directly gives us the estimated pose of the robot as calculated from the wheel encoders. Since the format of that data matches our state format, h(μ^bar_t) simply returns μ^bar_t.

The nonlinear observation function h

And the Jacobian matrix H, which contains the partial derivatives of the measurement function with respect to the state, is simply the identity matrix, shown below.

The Jacobian matrix H of the nonlinear function h

As shown in an earlier figure comparing the EKF with the LKF, the linearization is essentially the only difference between the two. Having determined the relevant functions and corresponding Jacobians, the implementation of the EKF closely follows that of the LKF. Below is the Python implementation of the linearized velocity motion model. It returns two functions: one that evaluates g and one that evaluates the Jacobian G.

def velocity_motion_model_linearized_1():

    def state_transition_function_g(mu=None, u=None, delta_t=None):
        x = mu[0]
        y = mu[1]
        theta = mu[2]

        v = u[0]
        w = u[1]

        g = np.array([
            x + v * np.cos(theta) * delta_t,
            y + v * np.sin(theta) * delta_t,
            theta + w * delta_t
        ])

        return g

    def jacobian_of_g_wrt_state_G(mu=None, u=None, delta_t=None):
        theta = mu[2]
        v = u[0]
        w = u[1]

        G = np.array([
            [1, 0, -v * np.sin(theta) * delta_t],
            [0, 1, v * np.cos(theta) * delta_t],
            [0, 0, 1]
        ])

        return G

    return state_transition_function_g, jacobian_of_g_wrt_state_G

Next, we implement the simple observation model as below.

def odometry_observation_model_linearized():
    def observation_function_h(mu):
        return mu

    def jacobian_of_h_wrt_state_H(mu):
        return np.eye(3)

    return observation_function_h, jacobian_of_h_wrt_state_H

Finally, the Extended Kalman Filter is implemented below. Notice how strikingly similar it is to the code for the Linear Kalman Filter introduced in the previous article.



import numpy as np

from rse_motion_models.velocity_motion_models import velocity_motion_model_linearized_1
from rse_observation_models.odometry_observation_models import odometry_observation_model_linearized

class KalmanFilter:

    def __init__(self, initial_state, initial_covariance, proc_noise_std=[0.02, 0.02, 0.01], obs_noise_std=[0.02, 0.02, 0.01]):

        self.mu = initial_state  # Initial state estimate
        self.Sigma = initial_covariance  # Initial uncertainty

        self.g, self.G = velocity_motion_model_linearized_1()  # The action model to use

        # Standard deviations of the process or action model noise
        self.proc_noise_std = np.array(proc_noise_std)
        # Process noise covariance (R)
        self.R = np.diag(self.proc_noise_std ** 2)

        self.h, self.H = odometry_observation_model_linearized()  # The observation model to use

        # Standard deviations for the observation or sensor model noise
        self.obs_noise_std = np.array(obs_noise_std)
        # Observation noise covariance (Q)
        self.Q = np.diag(self.obs_noise_std ** 2)

    def predict(self, u, dt):
        # Linearize around the previous mean (mu_{t-1}) before propagating it
        G = self.G(self.mu, u, dt)
        # Predict state estimate (mu)
        self.mu = self.g(self.mu, u, dt)
        # Predict covariance (Sigma)
        self.Sigma = G @ self.Sigma @ G.T + self.R

        return self.mu, self.Sigma

    def update(self, z, dt):
        # Compute the Kalman gain (K)
        K = self.Sigma @ self.H(self.mu).T @ np.linalg.inv(self.H(self.mu) @ self.Sigma @ self.H(self.mu).T + self.Q)

        # Update state estimate (mu)
        self.mu = self.mu + K @ (z - self.h(self.mu))

        # Update covariance (Sigma)
        I = np.eye(len(K))
        self.Sigma = (I - K @ self.H(self.mu)) @ self.Sigma

        return self.mu, self.Sigma

Performance:

To evaluate the performance of the Extended Kalman Filter (EKF), we will compare it to the performance of the Linear Kalman Filter (LKF) that uses the same constant velocity model and odometry observation model. First, we will set a large initial process noise and a very low observation noise. This setup essentially tells the filter to trust the observations more than the action model. Unsurprisingly, both versions of the Kalman Filter perform similarly by closely following the observation. Given the simplicity and linearity of the observation model, this result is expected.

Performance evaluation of the LKF (left) and the EKF (right) with noise parameters that favor the observation over the motion model

The next test is more challenging. We will set the process noise to be very low and the observation noise to be very high. This setup will cause the filter to mostly ignore the observations and rely heavily on the motion model for state estimation. Although the motion equations are the same, we should expect significantly different results due to the linearization applied by the EKF.

Performance evaluation of the LKF (left) and the EKF (right) with noise parameters that favor the motion over the observation model

As expected, the difference is substantial. The LKF performs poorly under these noise settings. When relying entirely on the nonlinear motion model, the LKF’s output deviates significantly from both the ground truth and the odometry reported by the robot’s sensors. In contrast, the EKF performs much better, staying closer to the observation data. It is also worth noting that the EKF correctly indicates high uncertainty, evident from the large ellipse in the right plot. On the other hand, the LKF reports a much lower uncertainty, which is inaccurate.

Alternative Motion Model

The motion model used by the previous two filters is fairly simple and assumes that our robot is moving in a straight line in the direction of the heading angle θ (theta). More sophisticated motion models exist, and it is worth checking if they can perform better. The new constant velocity motion model that we will explore next also allows us to control the robot through the translational (linear) and rotational (angular) velocities v and ω, respectively. Instead of assuming straight-line motion, this model assumes that the robot is moving on a circle with radius r, as depicted below.

A more sophisticated constant velocity motion model

The circular motion assumption, like the straight-line motion assumption of the simpler model, is just an approximation and will only be valid if the time delta is very small (so small that we can’t tell if the motion is on a circle or straight line). Note that as the value of ω approaches 0, the radius gets very large, almost representing motion on a straight line. The vector of equations corresponding to the nonlinear function g that will follow this model to evolve the state in time is shown below.

The nonlinear motion function g for the more sophisticated motion model
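As a quick numeric check of the ω → 0 claim above (a standalone sketch, not part of the article's packages), we can compare one step of this arc model against one step of the earlier straight-line model:

import numpy as np

def arc_step(x, y, theta, v, w, dt):
    # Circular-arc motion model with turning radius r = v / w
    return (x - v / w * np.sin(theta) + v / w * np.sin(theta + w * dt),
            y + v / w * np.cos(theta) - v / w * np.cos(theta + w * dt),
            theta + w * dt)

def line_step(x, y, theta, v, w, dt):
    # Straight-line motion model from the simpler filter
    return (x + v * np.cos(theta) * dt,
            y + v * np.sin(theta) * dt,
            theta + w * dt)

x, y, theta, v, dt = 0.0, 0.0, 0.3, 1.0, 0.1
for w in (1.0, 0.1, 0.01, 0.001):
    ax, ay, _ = arc_step(x, y, theta, v, w, dt)
    lx, ly, _ = line_step(x, y, theta, v, w, dt)
    print(f"w = {w:6.3f}  position gap = {np.hypot(ax - lx, ay - ly):.2e}")

The position gap vanishes as ω → 0, confirming that the arc model degenerates to straight-line motion. This is also why the implementation below substitutes a tiny ω when the commanded rotation is exactly zero.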

Now, to linearize this model, we need to compute the Jacobian of g, which corresponds to the partial derivative of g with respect to the state. Recall that our state is:

The state of the robot, which corresponds to its pose.

The Jacobian G is therefore represented by:

The Jacobian matrix G of the nonlinear function g

As shown earlier, after having determined the relevant matrices, the implementation is straightforward. First, we define the code for the velocity model below.

def velocity_motion_model_linearized_2():

    def state_transition_function_g(mu=None, u=None, delta_t=None):
        x = mu[0]
        y = mu[1]
        theta = mu[2]

        v = u[0]
        w = u[1]
        if w == 0:
            w = 1e-6  # Avoid division by zero for straight-line motion

        g = np.array([
            x - v/w * np.sin(theta) + v/w * np.sin(theta + w * delta_t),
            y + v/w * np.cos(theta) - v/w * np.cos(theta + w * delta_t),
            theta + w * delta_t
        ])

        return g

    def jacobian_of_g_wrt_state_G(mu=None, u=None, delta_t=None):
        theta = mu[2]
        v = u[0]
        w = u[1]
        if w == 0:
            w = 1e-6  # Avoid division by zero for straight-line motion

        G = np.array([
            [1, 0, -v / w * np.cos(theta) + v / w * np.cos(theta + w * delta_t)],
            [0, 1, -v / w * np.sin(theta) + v / w * np.sin(theta + w * delta_t)],
            [0, 0, 1]
        ])

        return G

    return state_transition_function_g, jacobian_of_g_wrt_state_G

The observation model is exactly the same as the one used in the previous example, and we restate it here for completeness.

def odometry_observation_model_linearized():
    def observation_function_h(mu):
        return mu

    def jacobian_of_h_wrt_state_H(mu):
        return np.eye(3)

    return observation_function_h, jacobian_of_h_wrt_state_H

Finally, the second Extended Kalman Filter is implemented below. The code is exactly the same as the previous one, with the only exception being the line that defines the motion model.

class KalmanFilter:

    def __init__(self, initial_state, initial_covariance, proc_noise_std=[0.02, 0.02, 0.01], obs_noise_std=[0.02, 0.02, 0.01]):

        self.mu = initial_state  # Initial state estimate
        self.Sigma = initial_covariance  # Initial uncertainty

        self.g, self.G = velocity_motion_model_linearized_2()  # The action model to use

        # Standard deviations of the process or action model noise
        self.proc_noise_std = np.array(proc_noise_std)
        # Process noise covariance (R)
        self.R = np.diag(self.proc_noise_std ** 2)

        self.h, self.H = odometry_observation_model_linearized()  # The observation model to use

        # Standard deviations for the observation or sensor model noise
        self.obs_noise_std = np.array(obs_noise_std)
        # Observation noise covariance (Q)
        self.Q = np.diag(self.obs_noise_std ** 2)

    def predict(self, u, dt):
        # Linearize around the previous mean (mu_{t-1}) before propagating it
        G = self.G(self.mu, u, dt)
        # Predict state estimate (mu)
        self.mu = self.g(self.mu, u, dt)
        # Predict covariance (Sigma)
        self.Sigma = G @ self.Sigma @ G.T + self.R

        return self.mu, self.Sigma

    def update(self, z, dt):
        # Compute the Kalman gain (K)
        K = self.Sigma @ self.H(self.mu).T @ np.linalg.inv(self.H(self.mu) @ self.Sigma @ self.H(self.mu).T + self.Q)

        # Update state estimate (mu)
        self.mu = self.mu + K @ (z - self.h(self.mu))

        # Update covariance (Sigma)
        I = np.eye(len(K))
        self.Sigma = (I - K @ self.H(self.mu)) @ self.Sigma

        return self.mu, self.Sigma

Performance:

At this point, it is clear that the EKF is superior to the LKF, and we will not continue using the LKF in our comparisons. Our interest now is in determining whether the more sophisticated motion model brings noticeable improvements. We will not perform an evaluation with noise settings where the observation model is explicitly preferred over the motion model, as we can expect the result to simply match the observation for any filter. Therefore, below I present results with noise settings that tend to mostly ignore the observation and focus more on the motion model.

Performance evaluation of the EKF 1 (left) and the EKF 2 (right) with noise parameters that favor the motion over the observation model

Surprisingly, the more advanced model did not outperform the simpler one. In the two plots above, no significant performance difference is visible to the naked eye. Even the reported uncertainty ellipses look nearly identical.

One conclusion derived from the previous article, which we also reach here, is that no matter how sophisticated and advanced your filter gets, if you feed it erroneous data, it will provide erroneous estimations. There is nothing we can do to reconcile the fact that there is a problem with our odometry observation. From all previous plots, it seems like the major issue is the heading. No motion model, no matter how advanced, will make up for the absence of correct heading data. If the robot provides alternative sources of heading (and other useful sensor data), then we can use the filter to fuse the different sources and obtain a better estimate. Let’s explore sensor fusion next.

The Power of Sensor Fusion

There are different kinds of sensor fusion, and the one we are going to apply is known as low-level fusion. The goal of low-level data fusion is to combine several sources of raw sensor data that together are more informative. The idea is to leverage the strengths of the different sources to produce better estimates.

In this article, we will study how to fuse the original odometry data with data coming from an Inertial Measurement Unit (IMU). The IMU can typically provide us with orientation, angular velocities, and linear accelerations. This is very beneficial because we suspect our inaccurate estimates are due to bad orientation information. IMUs tend to be more accurate than odometers regarding this kind of information, so let’s explore how IMU data can be fused with odometry to obtain better results.
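In a ROS 2 node, the orientation needed for the fusion typically arrives as a quaternion in a sensor_msgs/msg/Imu message, so a small conversion to yaw is needed before it can serve as θ_imu. Below is a minimal sketch of that conversion (the callback wiring in the comments is illustrative, not taken from the article's code):

import math

def yaw_from_quaternion(qx, qy, qz, qw):
    # Standard quaternion -> yaw (rotation about the z axis) conversion
    siny_cosp = 2.0 * (qw * qz + qx * qy)
    cosy_cosp = 1.0 - 2.0 * (qy * qy + qz * qz)
    return math.atan2(siny_cosp, cosy_cosp)

# Inside an Imu subscriber callback (illustrative):
# q = msg.orientation
# theta_imu = yaw_from_quaternion(q.x, q.y, q.z, q.w)
# w_imu = msg.angular_velocity.z
# a_x, a_y = msg.linear_acceleration.x, msg.linear_acceleration.y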

The first thing we will do is extend the state and the models. Previously, it made little sense to consider more than the robot’s pose because we did not have good sources of additional data, but now we have a sensor that can provide us with angular velocity and linear acceleration. Incorporating the new information into our state will help us better estimate the robot’s state. It is also necessary since we will extend our motion and sensor models. We will try two different extensions in this article, beginning with a new 7-dimensional state as follows.

The new 7-dimensional state vector

Here, x, y, and θ correspond to the familiar robot pose. Unlike past implementations, where the velocities v and ω were only obtained from the control input, we now also keep estimates of them as part of the state. The state also includes the linear acceleration in x and y. Given the extended state, the motion model must also be extended to compute each state component. A motion model incorporating acceleration can no longer be a constant velocity model. Therefore, the new assumption is that the velocity changes between estimations, while the acceleration remains constant. The new constant acceleration motion model we will use is represented by the function g and the Jacobian matrix G, as shown below.

The nonlinear motion function g for the new constant acceleration motion model
The Jacobian matrix G of the constant acceleration motion model

Now things are getting complicated. Calculating this larger Jacobian matrix can be very error-prone, and although it is fun to try by hand, it is always a good idea to use software to verify the result. Even after you are sure your calculation is correct and you have implemented it, you should have ways to verify that the implementation is correct. For every Jacobian implemented for this article, I had at least one small error that my testing scripts picked up. If you do not use software to compute the Jacobian and validate the implementation, you will almost certainly introduce bugs into your code.
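One way to do this verification, sketched below under the assumption that SymPy is available, is to derive G symbolically and to cross-check any hand-written Jacobian against finite differences:

import numpy as np
import sympy as sp

# Symbolic derivation of G for the 7-D constant acceleration model
x, y, theta, v, w, a_x, a_y, dt = sp.symbols('x y theta v w a_x a_y dt')
state = sp.Matrix([x, y, theta, v, w, a_x, a_y])
g = sp.Matrix([
    x + v * sp.cos(theta) * dt + sp.Rational(1, 2) * a_x * dt**2,
    y + v * sp.sin(theta) * dt + sp.Rational(1, 2) * a_y * dt**2,
    theta + w * dt,
    v + a_x * sp.cos(theta) * dt + a_y * sp.sin(theta) * dt,
    w,
    a_x,
    a_y,
])
sp.pprint(g.jacobian(state))  # compare against the hand-derived G

# Finite-difference cross-check for any implemented model f(mu)
def numeric_jacobian(f, mu, eps=1e-6):
    f0 = f(mu)
    J = np.zeros((len(f0), len(mu)))
    for i in range(len(mu)):
        d = np.zeros(len(mu))
        d[i] = eps
        J[:, i] = (f(mu + d) - f(mu - d)) / (2 * eps)
    return J

# e.g., J = numeric_jacobian(lambda m: g_func(m, u, dt_value), mu0)
# and compare J against jacobian_of_g_wrt_state_G(mu0, u, dt_value)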

We have successfully expanded our state and motion model. Now, to enable sensor fusion, we also need to expand the observation model. We will do so by concatenating the data from both sensors, which, in combination with the extended state and motion models, will perform the fusion. The new observation model is a combination of the usual robot pose (first three elements of the vector shown below) with the orientation θ_imu​, angular velocity ω, and accelerations a_x​ and a_y​ in the x and y components, respectively, all of which come from the IMU sensor. Notice that we do not directly observe the linear velocity v, but we can still have it in our state, and it will be inferred by our Kalman Filter through the motion model. This kind of unobserved variable is often called a latent or hidden variable.

The sensor fusion observation model

Next, the Jacobian matrix is defined below. Since there is a direct mapping from observation to state variables, when an observation (the rows) corresponds to a state variable (the columns) in the Jacobian, there will be a 1. Notice how the 3rd column, which corresponds to the state variable θ, has two 1s — one in the row for the θ from the odometry and one for the θ from the IMU. Also, the 4th column, which corresponds to the linear velocity, has all zeros as we do not directly observe this variable.

The Jacobian matrix H of the nonlinear function h

The implementations of the motion and observation models are shown below. The implementation of the filter is very similar to the previous one and will be omitted for brevity.

def acceleration_motion_model_linearized_1():

    def state_transition_function_g(mu=None, u=None, delta_t=None):

        x, y, theta, v, w, a_x, a_y = mu

        v = u[0]
        w = u[1]

        g = np.array([
            x + v * np.cos(theta) * delta_t + 0.5 * a_x * delta_t**2,
            y + v * np.sin(theta) * delta_t + 0.5 * a_y * delta_t**2,
            theta + w * delta_t,
            v + a_x * np.cos(theta) * delta_t + a_y * np.sin(theta) * delta_t,
            w,
            a_x,
            a_y
        ])

        return g

    def jacobian_of_g_wrt_state_G(mu=None, u=None, delta_t=None):
        x, y, theta, v, w, a_x, a_y = mu

        v = u[0]
        w = u[1]

        G = np.array([[1.0, 0.0, -delta_t * v * np.sin(theta), delta_t * np.cos(theta), 0.0, 0.5*delta_t**2, 0.0],
                      [0.0, 1.0, delta_t * v * np.cos(theta), delta_t * np.sin(theta), 0.0, 0.0, 0.5*delta_t**2],
                      [0.0, 0.0, 1.0, 0.0, delta_t, 0.0, 0.0],
                      [0.0, 0.0, -delta_t * a_x * np.sin(theta) + delta_t * a_y * np.cos(theta),
                       1.0, 0.0, delta_t * np.cos(theta), delta_t * np.sin(theta)],
                      [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

        return G

    return state_transition_function_g, jacobian_of_g_wrt_state_G


def odometry_imu_observation_model_with_acceleration_motion_model_linearized_1():
    def observation_function_h(mu):
        x, y, theta, v, w, ax, ay = mu
        return np.array([[x], [y], [theta], [theta], [w], [ax], [ay]])

    def jacobian_of_h_wrt_state_H():
        return np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
                         [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

    return observation_function_h, jacobian_of_h_wrt_state_H

Performance:

It is now time to evaluate the improved EKF that uses a more sophisticated motion model and fuses odometry data with IMU data. In particular, we will set the noise parameters as follows:

proc_noise_std = [0.1, 0.1, 0.05, 0.1, 0.1, 0.1, 0.1] # [x, y, theta, v, w, a_x, a_y]
obs_noise_std = [100.0, 100.0, 1000.0, 6.853891945200942e-06, 1.0966227112321507e-06, 0.0015387262937311438, 0.0015387262937311438] #[x, y, theta, theta_imu, w, a_x, a_y]

This particular setup tells the filter that we have good confidence in our motion model, but that within the observation model we trust the IMU data far more than the odometry data. In particular, we are telling the filter that the heading from the odometry is terrible whereas the heading from the IMU is very accurate, which is often the case: IMUs can be very precise, and odometry is notoriously bad. The hope is that this setup can help correct our predicted trajectory, which, as we have seen, has roughly the right shape but is very off in orientation.

From the video below and the two figures that follow, we can see that our new filter with the acceleration model and sensor fusion performs much better than the previous filter. In particular, we see that the estimated trajectory started much closer to the ground truth, thanks to the better orientation provided by the IMU. However, towards the end, it was starting to follow a zigzag pattern.

The 7-D Extended Kalman Filter with Sensor Fusion in action
Performance evaluation of the 3-D EKF (left) and the 7-D EKF with sensor fusion (right)

There is no recipe to follow in order to get better results; this is what makes designing filters an art more than a science. Some of you might feel that the zigzaggy trajectory shown above could be fixed with a better motion model, and I felt the same way when I saw this result. Let’s try improving the motion model.

Improving the Motion Model

Sensor fusion is proving to be helpful, but there is still a lot of room for improvement. Designing filters for state estimation requires making very educated guesses and following one’s intuition. As discussed earlier, the newly obtained results are encouraging and suggest that we should try to improve our model. In particular, it seems that we might be missing something by modeling only the global linear velocity in our state, instead of the velocities along the different axes, as we did for the acceleration. Let’s extend our state vector to account for that.

The new 8-dimensional state vector

The new state vector includes the x and y components of the velocity, which should provide more detailed information about the robot’s motion. Let’s see how this extension affects our filter’s performance. The extended motion model represented by the function g and its Jacobian G are shown below:

The nonlinear motion function g for the new constant acceleration motion model
The Jacobian matrix G of the constant acceleration motion model

The observation model function h remains the same, but the Jacobian H now has two columns filled with zeros corresponding to the unobserved state variables v_x​ and v_y​.

The Jacobian matrix H of the nonlinear function h

The code for the models and the noise parameters used follow. Again, the filter code is very similar to the ones introduced earlier and will be omitted.

def acceleration_motion_model_linearized_2():

    def state_transition_function_g(mu=None, u=None, delta_t=None):

        x, y, theta, v_x, v_y, w, a_x, a_y = mu

        v = u[0]
        w = u[1]

        g = np.array([
            x + v * np.cos(theta) * delta_t + 0.5 * a_x * delta_t**2,
            y + v * np.sin(theta) * delta_t + 0.5 * a_y * delta_t**2,
            theta + w * delta_t,
            v * np.cos(theta) + a_x * delta_t,
            v * np.sin(theta) + a_y * delta_t,
            w,
            a_x,
            a_y
        ])

        return g

    def jacobian_of_g_wrt_state_G(mu=None, u=None, delta_t=None):
        x, y, theta, v_x, v_y, w, a_x, a_y = mu

        v = u[0]
        w = u[1]

        G = np.array([[1.0, 0.0, -delta_t * v * np.sin(theta), 0.0, 0.0, 0.0, 0.5*delta_t**2, 0.0],
                      [0.0, 1.0, delta_t * v * np.cos(theta), 0.0, 0.0, 0.0, 0.0, 0.5*delta_t**2],
                      [0.0, 0.0, 1.0, 0.0, 0.0, delta_t, 0.0, 0.0],
                      [0.0, 0.0, -v * np.sin(theta), 0.0, 0.0, 0.0, delta_t, 0.0],
                      [0.0, 0.0, v * np.cos(theta), 0.0, 0.0, 0.0, 0.0, delta_t],
                      [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

        return G

    return state_transition_function_g, jacobian_of_g_wrt_state_G


def odometry_imu_observation_model_with_acceleration_motion_model_linearized_2():
    def observation_function_h(mu):
        x, y, theta, v_x, v_y, w, ax, ay = mu
        return np.array([[x], [y], [theta], [theta], [w], [ax], [ay]])

    def jacobian_of_h_wrt_state_H():
        return np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
                         [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

    return observation_function_h, jacobian_of_h_wrt_state_H


proc_noise_std = [0.1, 0.1, 0.05, 0.1, 0.1, 0.1, 0.1, 0.1]  # [x, y, theta, v_x, v_y, w, a_x, a_y]
obs_noise_std = [100.0, 100.0, 1000.0, 6.853891945200942e-06, 1.0966227112321507e-06, 0.0015387262937311438, 0.0015387262937311438]  # [x, y, theta, theta_imu, w, a_x, a_y]

Performance:

The 8-D Extended Kalman Filter with Sensor Fusion in action

The new filter has shown impressive performance, as evident from the previous video and figures below. The predicted trajectory is now very close to the ground truth, thanks to the power of sensor fusion and a more sophisticated model. Additionally, the final uncertainty, indicated by the ellipse at the latest pose in the figures, is much smaller than in the previous version of the filter. This demonstrates the effectiveness of a well-designed EKF with sensor fusion. It is also worth noting that tweaking the noise parameters could lead to further improvements, but this will not be explored here due to the article’s length.

Performance evaluation of the 7-D EKF (left) and the 8-D EKF (right) both with sensor fusion

Try it Yourself

As with the previous article, the full source code is provided for anyone wishing to inspect it or simply run the algorithms and see the results live. If you want to run the code, follow the instructions below.

The first thing that you need is ROS 2. You can find instructions on how to install ROS 2 Humble on Ubuntu Jammy Jellyfish (22.04) here. For other versions of Ubuntu or other operating systems, please consult the official ROS 2 documentation.

To get the data, you need to download the ROS 2 bag from this link. Make sure to decompress the file before using it.

Finally, you will need to clone and build the ROS 2 packages. You can do so by following the steps below. Make sure to replace ros2_ws with your actual ROS 2 workspace.

# install some dependency
sudo apt install python3-pykdl

# clone and build the package
cd ros2_ws/src
git clone https://github.com/carlos-argueta/rse_prob_robotics.git
cd ..
colcon build --symlink-install

To run the Extended Kalman Filter, you will need to open 3 different terminals.

In terminal 1 (replace ros2_ws with your actual workspace), run the following commands to open RViz and see what the robot is seeing.

source ~/ros2_ws/install/setup.bash
ros2 launch rse_gaussian_filters rviz_launch.launch.py

In terminal 2 run one of the following commands depending on the version of the Extended Kalman Filter you want to try. There won’t be any output at first, until you play the ROS 2 bag.

source ~/ros2_ws/install/setup.bash

# Run only one of the lines below

# 3D state, basic velocity model
ros2 run rse_gaussian_filters ekf_estimation_3d_v1

# 3D state, advanced velocity model
ros2 run rse_gaussian_filters ekf_estimation_3d_v2

# 7D state, acceleration model, sensor fusion
ros2 run rse_gaussian_filters ekf_estimation_7d

# 8D state, acceleration model, sensor fusion
ros2 run rse_gaussian_filters ekf_estimation_8d

In terminal 3, navigate to where you extracted the ROS 2 bag and play it with the following command. You may ignore some warnings like “Ignoring a topic ‘/navrelposned’, reason: package ‘ublox_msgs’ not found”.

ros2 bag play linkou-2023-12-27-2-med --clock

Following these steps will allow you to run the Extended Kalman Filter and observe the results in real time.

Practical Considerations and Final Words

This article introduced the Extended Kalman Filter (EKF) as an extension of the basic Linear Kalman Filter (LKF). The EKF handles the non-linear nature of most real-world systems through approximation: it linearizes the non-linear functions around the posterior's mean using a first-order Taylor Expansion. After linearization, the EKF operates similarly to the LKF.

As demonstrated, the EKF combined with sensor fusion can achieve remarkable results. However, designing good filters is challenging and error-prone, often requiring a balance between science and art. Finding a suitable motion and observation model is the first challenge. Even with available models for mobile robots, other scenarios might lack good models. Calculating Jacobians is also difficult and prone to mistakes, highlighting the need for software verification.

Several other considerations and decisions are crucial for effective EKF design. One important aspect is selecting the correct time delta (delta_t), which must be sufficiently small for effective linearization. In this case, a dynamic delta_t, as opposed to a fixed value, was used. Deciding when and how to perform the update step of the filter is also vital, especially when different sensors provide measurements at different rates (e.g., the IMU usually has a higher frequency than the odometry). Additionally, selecting appropriate noise parameters is a complex task with various approaches available.
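To illustrate the dynamic delta_t idea, here is a minimal sketch of a predict/update loop driven by message timestamps (the class and method names are illustrative, not the article's node code):

class FusionLoop:
    def __init__(self, kf):
        self.kf = kf
        self.last_stamp = None  # time of the previous prediction
        self.u = [0.0, 0.0]     # latest control [v, w]

    def on_control(self, v, w):
        # Remember the most recent commanded velocities
        self.u = [v, w]

    def on_measurement(self, z, stamp):
        # Predict forward by the actual elapsed time, then correct.
        # Sensors arriving at different rates (e.g., IMU faster than
        # odometry) each trigger their own predict/update cycle.
        if self.last_stamp is not None:
            dt = stamp - self.last_stamp  # dynamic delta_t
            self.kf.predict(self.u, dt)
            self.kf.update(z, dt)
        self.last_stamp = stamp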

In summary, the Extended Kalman Filter, especially when paired with sensor fusion, can provide excellent state estimation results. However, designing a robust EKF requires significant effort and practice. In the next article, I will introduce the Unscented Kalman Filter (UKF). The UKF offers several advantages over the EKF. It more accurately captures the mean and covariance of nonlinear transformations using the Unscented Transform. It does not require Jacobians, simplifying implementation. It reduces the approximation errors introduced by linearization, making it more effective for highly nonlinear systems. Additionally, the UKF handles non-Gaussian distributions better, enhancing robustness and versatility.

Reading List

The following is a list of excellent materials that I have consulted to learn about the Kalman Filter family:

  1. Optimal State Estimation: Kalman, H∞, and Nonlinear Approaches
  2. State Estimation for Robotics
  3. Kalman and Bayesian Filters in Python
  4. Probabilistic Robotics

I hope you found this article informative. If you have any feedback, please feel free to leave a comment. Additionally, I’m considering launching a series of more in-depth courses covering this and future topics, complete with videos, coding projects, and more. These courses are likely to be paid. If you’re interested in such a course, please let me know in the comments to gauge interest.

Want to connect? Find me on LinkedIn: https://www.linkedin.com/in/carlos-argueta

Want to learn Robotics with ROS 2 from me? Join my live class!
