State Estimation using Kalman Filters

I was looking into different ways of doing sensor fusion so it would be possible to get a more accurate robot pose and came across kalman filters. I understand that for autonomous, encoders are usually enough but I wanted to be able to get a somewhat accurate pose during teleop as well. It would be very much appreciated if anyone could give any help or advice.


Setting up a KF for localization this year was one of my goals. It’s still not done and probably won’t be for a while due to COVID-19, but I’ll be glad to tell you the approach I’ve taken!

If you just found out they exist, then there’s a lot of theory to learn first about state-space systems and estimation. There’s plenty of free resources in the interwebs to learn about them. While you could just know what each variable means, take measurements, and translate it into code, it is very useful to understand why it works (especially when something breaks, so you can fix it). A while ago I put some links about KFs in this thread, so you can check those out.

Once you know how they work, the first step is to model your system, in this case it’s probably going to be just the robot drive base. So you consider what states to use for the robot. I’m using x, y, yaw, left wheel velocities, right wheel velocities. You then have to consider things like what sensors will it have (encoders, vision (basically a must, or else it can’t correct for errors) maybe IMU) and how to calculate their measurements given the robot’s state.

A drive base is inherently nonlinear. The “normal” Kalman filter gives you the best result when the system is perfectly linear, so you have to think about how to deal with nonlinearities. I started off by using an Extended Kalman Filter that linearizes the system at every time step, and then switched to an Unscented Kalman Filter because it’s usually better at dealing with nonlinearities and it doesn’t require calculating Jacobians which can get messy.

Then, you need to get real measurements from your robot to plug into the system’s model, like motor or characterization constants. Luckily WPILib’s characterization tool outputs a raw JSON file that you can analyze with other programs determine how much noise the drive base has while moving, which is needed for the KF.

While it’s not necessary, I highly recommend simulating the KF. There’s a lot of places it could go wrong. I am very lucky that both my school and robotics team provide MATLAB licenses, which has made it a much smoother process since you can write and debug code without breaking a robot (also helps if you don’t have physical access to your robot right now).

Then after that, you may have to tune it. For an EKF, this usually looks like changing the update rate until it gives acceptable performance. For a UKF, there’s some tuning parameters you can slightly tweak.

Finally, you can translate your simulation code (if applicable) into your robot’s language, which involves installing libraries to deal with math. I’m using EJML for linear algebra and the Apache math library solving the differential equation that governs the robot’s motion. If you’re a C++ team, I’ve heard Jeigen is a good library for linear algebra.

There’s a very high barrier to entry to get into advanced control and estimation, but once you overcome it theoretically you can keep doing it for years to come. There’s currently a group of people working on adding a state space library to WPILib so hopefully that barrier will be much smaller next year

Here’s some threads on Chief Delphi with more info:

There’s been a lot of threads recently about advanced control and estimation. I’m glad more people are getting into it now that we all have some free time! It’s a very interesting field but is sadly so technical that few teams can get into it, but that’s no reason to get discouraged!


Oops, almost forgot: some students at WPI published a paper about localization for FRC. I found it was a useful read to understand what options are available, but I took a different approach (different system model and using a UKF instead of EKF). It also has a section defining what a successful localization accomplishes, which is definitely worth knowing since your error will never be 0 so it’s good to know what’s “acceptable”


Thank you for all the help and references! Because pretty much everything has been canceled, I have been trying to learn more about programming and control theory so we can implement it next year and I didn’t really know where to start, and this really helps with that!

1 Like

Note that the C++ library is Eigen and Jeigen is a Java wrapper for the C++ Eigen library. The upcoming WPILib state-space implementations use Eigen for C++ and EJML for Java.


I wrote an Extended Kalman Filter that we used on our robot this year to fuse vision, visual SLAM, and encoder measurements. The code is here. Note that some tuning issues prevented us from fusing everything “well” during our first competition.

Here are some charts:

That EKF will be in wpilib next year. I’m also in the process of writing (along with Prateek who’s doing the C++ versions) easy to use latency-compensated wrappers that fuse vision pose measurements and encoders to allow for full field localization. These will also be in WPILib next year.


Thank you for the clarification, and unrelated but thank you for your tutorial on trajectory following, it was really helpful for us this year!

1 Like

oh oops my bad! Thanks for correcting

Thank you for the reference! One question: What did you use for vSlam?

I wrote a custom Java wrapper for the Intel RealSense T265 camera that uses a version of librealsense which I patched to use less memory so it can run on the roboRIO. This is nice because it doesn’t require any coprocessor. I intend to release this as a vendordep (along with a better-explained code release), but my WPILib work has gotten in the way (it’s a higher priority for me.)

1 Like

That sounds really interesting, is there a specific reason you chose the T265?

Oh I just realized I linked to your own topic! I knew that username sounded familiar lol

1 Like

Haha yeah, since I have nothing to do all day I’ve just been looking into control theory applications in FRC so I’ve been on CD way too much.

1 Like

yeah I think we all have tbh :slight_smile:

1 Like

I evaluated ORB-SLAM-2 and rtabmap, and they had similar or worse accuracy characteristics and both required a coprocessor.

1 Like

If you have more questions then the FRC Discord’s #programming-discussion channel is a great place for more “interactive” discussions. Tyler (who wrote the Controls Engineering in FRC book) also hangs out there and is always there to help with all my terribly misguided questions :slight_smile:. He might be willing to help you too.


With the rate at which the error covariance increases for a drivetrain model, we didn’t notice a difference between an EKF and UKF in sim. In fact, the UKF had a bit more noise in the state estimate. This may have been due to the UKF impl not discretizing Q and R based on the previously measured sample period like the EKF impl does.

Jacobians don’t get messy if you compute them numerically. :slight_smile: That’s what my team did last year. Although, the analytical Jacobians of a 5-state drivetrain model aren’t that bad anyway. The screenshot below is with respect to x and u, to compute A and B respectively. (States are x, y, heading, left velocity, and right velocity. Inputs are left and right voltage.)


Linearizing the state dynamics is just fine, the problem was with the outputs! The LimeLight’s tx and ty have arctangents and distances in the calculations so I kept messing up the partial derivatives until I decided to use a UKF and not even deal with that lol. It didn’t occur to me to numerically calculate the Jacobians, it might be worth pursuing. But I do like that the UKF is slightly more thorough, so I’ll keep it for now.

Speaking of, why does R change with the sample time? I never understood that - it’s the same sensors and it feels like noise is inherent to the sensor

Here’s a numerical Jacobian implementation in EJML if you need it:

I haven’t measured the resource utilization of an EKF vs UKF implementation on the roboRIO yet. It’s a balance between the extra computation from the unscented transform and the numerical Jacobians required for an EKF. Their APIs are almost drop-in replacements for each other.

I thought so too originally. Why would the noise contribution change if it’s the same amount of noise being added per sensor update? The difference is that the Q and R we’re providing is continuous, not discrete, so the conversion is complicated. I’m just going to lift this from chapter 6 of for now, because I need to go to bed. I’m aware it skips over requisite concepts like “expected value” and “delta functions”, so I can explain that stuff later if you’re interested. If all you need is the formulas, here’s the overview. The discrete model is defined as follows:

\mathbf{x}_{k+1} = \mathbf{A}_d \mathbf{x}_k + \mathbf{B}_d \mathbf{u}_k + \mathbf{w}_k
\mathbf{y}_k = \mathbf{C}_d \mathbf{x}_k + \mathbf{D}_d \mathbf{u}_k + \mathbf{v}_k

where \mathbf{w} is the process noise, \mathbf{v} is the measurement noise, and both are zero-mean white noise sources with covariances of \mathbf{Q}_c and \mathbf{R}_c respectively. \mathbf{w} and \mathbf{v} are defined as normally distributed random variables.

\mathbf{w} \sim N(0, \mathbf{Q}_c)
\mathbf{v} \sim N(0, \mathbf{R}_c)

\mathbf{A}_d = e^{\mathbf{A}_c T}
\mathbf{B}_d = \int_0^T e^{\mathbf{A}_c \tau} d\tau \mathbf{B}_c = \mathbf{A}_c^{-1} (\mathbf{A}_d - \mathbf{I}) \mathbf{B}_c
\mathbf{C}_d = \mathbf{C}_c
\mathbf{D}_d = \mathbf{D}_c
\mathbf{Q}_d = \int_{\tau = 0}^{T} e^{\mathbf{A}_c\tau} \mathbf{Q}_c e^{\mathbf{A}_c^T\tau} d\tau
\mathbf{R}_d = \frac{1}{T}\mathbf{R}_c

To see why \mathbf{R}_c is being divided by T, consider the discrete white noise sequence \mathbf{v}_k and the (non-physically realizable) continuous white noise process \mathbf{v}. Whereas \mathbf{R}_{d,k} = E[\mathbf{v}_k \mathbf{v}_k^T] is a covariance matrix, \mathbf{R}_c(t) defined by E[\mathbf{v}(t) \mathbf{v}^T(\tau)] = \mathbf{R}_c(t)\delta(t - \tau) is a spectral density matrix (the Dirac function \delta(t - \tau) has units of 1/\text{sec}). The covariance matrix \mathbf{R}_c(t)\delta(t - \tau) has infinite-valued elements. The discrete white noise sequence can be made to approximate the continuous white noise process by shrinking the pulse lengths (T) and increasing their amplitude, such that \mathbf{R}_d \rightarrow \frac{1}{T}\mathbf{R}_c.

That is, in the limit as T \rightarrow 0, the discrete noise sequence tends to one of infinite-valued pulses of zero duration such that the area under the “impulse” autocorrelation function is \mathbf{R}_d T. This is equal to the area \mathbf{R}_c under the continuous white noise impulse autocorrelation function.