"Tired of using encoders for localization? Try LIDARwith deep learning"

Introducing LocalNet


Demo GIF Link

Goal
Over the past few weeks I have been developing and researching the possibility of using a deep neural net for localization, taking raw input from a LIDAR sensor and heading information from an IMU. The goal is to have a DNN process this complex and often noisy data and produce an X,Y coordinate on the field, all while being stateless, meaning it can be run from anywhere on the field at any point in time without any prior information about starting position.

Simulation
Currently we are using a Unity-based simulator to both acquire training data and run these tests. Many may believe this won't translate to IRL performance (which may be the case), but it has been a constant job for me to emulate noise and real-life behavior. Currently the simulated LIDAR models per-point noise, measurement errors, delays, and different mounting positions.
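The simulator itself is in Unity/C#, but in Python terms the per-point noise is roughly the sketch below. The noise magnitudes and dropout behavior here are illustrative, not the exact values used in the simulator:

```python
import numpy as np

def add_lidar_noise(ranges, max_range=30.0, range_noise_std=0.02,
                    dropout_prob=0.01, rng=None):
    """Apply illustrative per-point noise to one simulated LIDAR scan (ranges in meters)."""
    rng = rng or np.random.default_rng()
    noisy = ranges + rng.normal(0.0, range_noise_std, size=ranges.shape)  # Gaussian range error
    dropped = rng.random(ranges.shape) < dropout_prob                     # occasional missed returns
    noisy[dropped] = max_range                                            # report max range on a dropout
    return np.clip(noisy, 0.0, max_range)
```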

Along with this, the concern of the walls being transparent was raised, so for these tests I have mounted the simulated LIDAR on top of the bot so it gets a view of the Scale/DS. However, other positions and environments are easy to train and test. The simulator, model, and testing setup can all be retrained on new environments by just loading in new meshes, which has proven to work.

As seen in the GIF above, the bot sends this simulated data to the Jetson over UDP and gets a prediction in return, shown as the yellow post.

The Data
The training data is gathered from a version of this simulation that includes 4-6 bots driving to random points on the field. Their X/Y, LIDAR data, and heading are recorded periodically and saved to a file. This is usually run at 50x time scale. Everything is normalized as follows (a quick sketch follows the list):

  • LIDAR ranges are scaled 0-1 based on max range (20-40 m in my testing)
  • Field coordinates are scaled -1 to 1 based on field width and length
  • Heading is scaled 0-1 based on 0-360 degrees
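In Python terms, roughly (the max range and field dimensions here are placeholders, and I'm assuming the origin at field center for the -1 to 1 mapping; the real values come from the simulator settings):

```python
import numpy as np

def normalize_sample(ranges_m, x_m, y_m, heading_deg,
                     max_range=30.0, field_length=16.46, field_width=8.23):
    """Scale one recorded sample into the ranges the network trains on (illustrative values)."""
    lidar = np.clip(np.asarray(ranges_m) / max_range, 0.0, 1.0)  # LIDAR: 0-1 of max range
    fx = 2.0 * x_m / field_length                                # field coords: -1 to 1,
    fy = 2.0 * y_m / field_width                                 # assuming origin at field center
    heading = (heading_deg % 360.0) / 360.0                      # heading: 0-1 over 0-360 degrees
    return lidar, np.array([fx, fy]), heading
```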

For LIDAR we have run anywhere from 64 to 196 data points per scan with no issue besides increased training time; this is, however, one area we're still researching.

Usually we train on 3-5 million entries and test on 1-2 million.

The Model
Currently the model consists of a 1D conv net feeding into a fully connected net, which is also where the IMU data is input. Error is calculated as MSE. This was all written in Keras on top of TensorFlow.

I am still relatively new to ML and creating custom DNNs, so my explanation may be lacking.
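For the curious, a rough Keras sketch of that topology; the layer counts, widths, and tanh output here are placeholders rather than the tuned values:

```python
from tensorflow.keras import layers, Model

def build_localnet(num_lidar_points=128):
    """Illustrative sketch of the 1D-conv + fully-connected topology described above."""
    lidar_in = layers.Input(shape=(num_lidar_points, 1), name="lidar")
    heading_in = layers.Input(shape=(1,), name="heading")

    x = layers.Conv1D(32, 5, activation="relu")(lidar_in)     # 1D conv over the scan
    x = layers.Conv1D(64, 5, activation="relu")(x)
    x = layers.Flatten()(x)

    x = layers.concatenate([x, heading_in])                    # IMU heading joins at the FC stage
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dense(64, activation="relu")(x)
    xy_out = layers.Dense(2, activation="tanh", name="xy")(x)  # x, y in [-1, 1] (tanh is an assumption)

    model = Model([lidar_in, heading_in], xy_out)
    model.compile(optimizer="adam", loss="mse")                # MSE error as described
    return model
```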

I train this model on an Nvidia 1080 GPU, currently running 75 epochs with each epoch taking 10+ minutes, though this has been changing constantly with new hyperparameters and varying datasets. We have been averaging losses of around 0.002-0.003 with noise.
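To put those numbers in rough physical terms, assuming the -1 to 1 coordinates span roughly the 54 ft x 27 ft Power Up field and treating the MSE as a per-axis figure:

```python
import math

mse = 0.0025                    # middle of the 0.002-0.003 range
rmse_norm = math.sqrt(mse)      # ~0.05 in the normalized -1..1 units
half_length = 16.46 / 2         # half of ~54 ft in meters (assumed mapping)
print(rmse_norm * half_length)  # ~0.4 m typical error along the long axis
```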

Results and Plans
As seen in the GIF, the NN is currently successfully localizing based on LIDAR data, running on the Nvidia Jetson TX2 over UDP and Ethernet. The TX2 runs the model at <4 ms per inference, so I have been running my tests at 100 Hz.
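The Jetson side is essentially just a small UDP loop around model.predict. A simplified sketch follows; the port, packet layout, and weights file name are made up for illustration, not the actual protocol:

```python
import socket
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("localnet.h5")   # hypothetical weights file
num_points = 128                    # must match the trained input size

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 5800))        # placeholder port

while True:
    data, addr = sock.recvfrom(8 * (num_points + 1))        # ranges + heading as float64
    sample = np.frombuffer(data, dtype=np.float64)
    lidar = sample[:num_points].reshape(1, num_points, 1)   # batch of one scan
    heading = sample[num_points:].reshape(1, 1)
    x, y = model.predict([lidar, heading], verbose=0)[0]    # normalized field coordinates
    sock.sendto(np.array([x, y], dtype=np.float64).tobytes(), addr)
```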

This net seems to handle noise and bad LIDAR data extremely well; however, IRL testing is required, which I plan to perform extensively once our club acquires a LIDAR unit or we partner with a team that has a Jetson and LIDAR.

As for future plans, I am still actively developing and improving this model, while also stress-testing it. There are plans to add additional sensor inputs such as vision and ultrasonics. Overall this has been, and will continue to be, an active project, and hopefully it will prove to be an extremely powerful tool for FRC, with the goal of integrating it with additional NNs to bring AI into FRC.

THIS IS ALSO ALL GOING TO BE (and partly already is) OPEN SOURCE. ONCE I MAKE THE REPOS AND SITE CLEANER I WILL POST THOSE LINKS.

Special Thanks to Michael (aka Loveless / Turing’s Ego) from Innovation DX for the mentorship and Jetson donation

Looks pretty good so far! I would be curious to see what happens if you pre-processed the data before feeding it to the network. Maybe use an EKF or something to smooth out the sensor data. I've been experimenting with a similar project, but using Q-learning.

Yeah, cleaning up the data has been one idea, plus I'm using that reinforcement learning approach in other AI FRC projects. For localization specifically, I felt this was the best approach, as agreed on by my AI mentor.


What’s the train/test accuracy of pose estimation (i.e. magnitude difference between actual pose and predicted pose on a dataset the network hasn’t seen)… This is really cool btw!

The dataset, which is roughly 5 million data points, was split 70-30 for training and testing, as is standard in data science tasks. The loss is RMSE, and the network only outputs the x,y pairing of where it believes the “robot” to be, given the noisy LIDAR + heading input.

It would be trivial to adjust the dataset to add the heading as an output and then add noise to the heading input (to simulate an imperfect gyro). An update will be posted with this change, as well as some nice TensorBoard charts.
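Something along these lines, working on the normalized 0-1 heading (the noise level is just an illustrative guess at gyro error):

```python
import numpy as np

def add_gyro_noise(heading_norm, noise_std=0.01, rng=None):
    """Keep the clean heading as an extra training target and feed the network a noisy copy.

    heading_norm is 0-1 (i.e. 0-360 degrees); noise_std of 0.01 is about 3.6 degrees, an illustrative value.
    """
    rng = rng or np.random.default_rng()
    noisy_input = (heading_norm + rng.normal(0.0, noise_std, size=np.shape(heading_norm))) % 1.0
    clean_target = heading_norm          # appended alongside the x,y targets
    return noisy_input, clean_target
```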

The GIF demonstration ran entirely on data the model was not explicitly trained on. As you can see, its predictions are precise and consistent most of the time.

This is all part of a larger project to make a model (more specifically, a series of models) able to play Power Up from raw sensor data and game state data.

This is super duper cool! If it does work out, is it easily adaptable to different environments?

I’d definitely love to keep up with this… seems like a cool project, is there a github repo link somewhere?

The model and tools only require a new, to-scale mesh of the environment to be made. How the model performs depends on the environment itself, but I'm confident in its ability. I'll probably show off different environments soon.


To build on this, the GIF was entirely new data, as it was just me controlling a dummy bot with WASD.


Currently the GitHub is mostly hush-hush, just until I can properly support, document, and stand by my work. Hopefully the full project will be open and supported by the end of this month. Again, updates will be posted.


Seriously awesome work! Looks very promising so far.

When you say noise, do you mean sensor noise or environmental noise? Basically, how does it handle things if you were to include people walking around, cameramen, other robots, etc. in the Unity model?

Sensor noise. However, the training data was gathered with other bots on the field, so the model should be able to handle it. I am planning to test environmental noise once I optimize the model, which I'm currently doing.


First, on hardening your NN: I've generally seen people call this annealing in ML land. As for your NN, did you consider any other topology? Also, why LIDAR? I've been really wanting to do something similar to this for a while, but using a stereo camera and a Jetson; being a poor college mentor has made the barrier to entry too high. Not to say that LIDAR isn't the better option, just wondering what considerations were made? Have you considered a ToF sensor?

Kalman filters have been used for positional determination in everything from the Apollo program to the flight management computers of all large airplanes.

This technique also uses weighted averages and statistical modeling, and is designed to accommodate sensor noise. It leverages recursion of the prior location estimate.

By using the compass, acceleration data, and LIDAR distances to landmarks, you could easily compute field location with great accuracy.

Nothing against deep neural networks, but in this case you're trying to reinvent a solution to the already robustly solved sensor-fusion navigation problem.

This is really cool! Rapid, accurate localization has lots of great applications. How does (or, can) your model deal with environmental occlusions, for example, other robots blocking view to your target landmarks?

The purpose of this project is to have a learned agent be able to play the game on its own, without any hard-coded information about the game or the environment. You are correct that sensor fusion and methods like the perspective-n-point problem do exceptionally well at determining position within an environment; however, that isn't the exact intent of this project. This phase is merely a stepping stone toward a higher goal.

I chose LIDAR because it can see what's around the bot in 360 degrees, plus it has been easier to simulate. However, I am planning to experiment with vision, including depth input.


Intel has Unity support for all of their RealSense cameras, and I believe ZED does as well.

So, a quick update:

  • Adding multiple bots to the field is proving to really throw it off; however, I am working on fixing this, possibly by moving the LIDAR back down to ground level.
  • We have begun a website for this project; it should be up soon. It offers information (totally not copy/pasted from this thread) and links to the repos and weights.
  • We have been donated access to a new training machine, allowing for larger datasets and multiple runs at the same time. This should really aid in development, especially with increased LIDAR point counts. (Thanks, Loveless)
  • Soon, most likely along with the website update, we'll be testing multiple different environments such as previous FRC fields, real-life areas on our campus, and abstract “challenge” maps. GIFs will be posted along with their files.
  • DeepMind has released a paper on using DNNs for positioning; I'm looking into it to see if I can apply any of the research to this project.
  • Lastly, I have been spending time making all the tools and models user-friendly.

Thank you all for the support and interest! Hopefully soon I can mess with getting navigation working :wink: