Steps for Full Autonomous (Localizing Cones)

The Premise

By determining the location of five keypoints on each cone, you can accurately localize it with PnP (Perspective-n-Point), enabling automatic pickup from shelves and the ground at any angle through inverse kinematics.
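
As a rough illustration, the PnP step can be done with OpenCV's `solvePnP` given a 3D model of where the five keypoints sit on a cone. The keypoint geometry, ordering, and intrinsics below are placeholders, not real measurements:

```python
# Hypothetical sketch of cone localization via PnP.
# The 3D model points are made-up placeholder geometry, not real cone measurements.
import numpy as np
import cv2

# Keypoint positions in the cone's own frame, in meters
# (tip plus four base points) -- placeholder values.
CONE_MODEL_POINTS = np.array([
    [0.0,    0.0,   0.33],  # tip
    [0.105,  0.0,   0.0],   # base +x
    [-0.105, 0.0,   0.0],   # base -x
    [0.0,    0.105, 0.0],   # base +y
    [0.0,   -0.105, 0.0],   # base -y
], dtype=np.float64)

def localize_cone(keypoints_px, camera_matrix, dist_coeffs):
    """keypoints_px: (5, 2) pixel coordinates, same order as the model points."""
    ok, rvec, tvec = cv2.solvePnP(
        CONE_MODEL_POINTS,
        np.asarray(keypoints_px, dtype=np.float64),
        camera_matrix,
        dist_coeffs,
        flags=cv2.SOLVEPNP_SQPNP,  # handles small, non-planar point sets
    )
    if not ok:
        return None
    return rvec, tvec  # cone pose in the camera frame
```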

Keypoints and where to find them

Keypoints can be found through a few methods, including classic algorithmic ones, but to find semi- or even fully occluded keypoints for PnP you need some means of understanding where they are in relation to each other.

Heatmaps!!

Using Gaussian kernels you can create a series of heatmap targets, one per keypoint, and train a simple custom U-Net style network to generate these heatmaps from an input crop taken off an object detector.
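
A minimal sketch of generating those Gaussian heatmap targets; the heatmap resolution and sigma here are illustrative assumptions, not our training settings:

```python
# Build Gaussian heatmap training targets, one channel per keypoint.
import numpy as np

def make_heatmaps(keypoints_px, heatmap_size=(64, 64), sigma=2.0):
    """keypoints_px: (K, 2) array of (x, y) in heatmap coordinates.
    Returns a (K, H, W) tensor with a Gaussian peak at each keypoint."""
    h, w = heatmap_size
    ys, xs = np.mgrid[0:h, 0:w]
    heatmaps = np.zeros((len(keypoints_px), h, w), dtype=np.float32)
    for k, (x, y) in enumerate(keypoints_px):
        # Unit-height 2D Gaussian centered on the keypoint
        heatmaps[k] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return heatmaps
```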

Example training layer

Example input image

THE OUTPUT!!



Note: these images aggregate the five channels of a 3D tensor down to 2D without adjusting for each channel's differing noise and confidence, so the noise looks far more pronounced than it really is when viewed on a keypoint-by-keypoint basis.
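
For reference, a sketch of how each channel can be decoded independently, taking the per-channel peak as the keypoint and the peak value as its confidence (the threshold is an illustrative assumption):

```python
# Decode heatmaps channel by channel; low-confidence (likely occluded)
# keypoints are dropped rather than fed into PnP.
import numpy as np

def decode_heatmaps(heatmaps, min_confidence=0.3):
    """heatmaps: (K, H, W). Returns a list with (x, y, confidence)
    per keypoint, or None where the peak is below the threshold."""
    results = []
    for hm in heatmaps:
        y, x = np.unravel_index(np.argmax(hm), hm.shape)
        conf = float(hm[y, x])
        results.append((int(x), int(y), conf) if conf >= min_confidence else None)
    return results
```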

Why Take The Time?

This system, in combination with on-the-fly path generation, a full six-degree-of-freedom inverse kinematics system, an overall object detector, an independent cube range-finding system, and a robot collision avoidance system, lets us automate away all menial drive team tasks and shave off cycle time: we know how to pick up every piece before we're there and can find the most efficient route at the greatest speed to any point.


Do you have an example of this in practice / on the field? The most common challenge with ML is effectively integrating it into a robot, not necessarily the concept itself.

We’re working on deploying the system next week. The software side is completely done and path generation has been tested; we’re just waiting on final camera positions on the bot before setting up all the transforms so objects are world-relative instead of camera- or bot-relative. We have deployed all of our models on the bot's hardware and tested the communication between our coprocessors and the Rio for sending final position data, along with storing an active map of all game pieces and estimated robot positions.
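
That remaining transform step amounts to composing the camera-relative PnP result with the camera's mount transform and the robot's estimated field pose. A sketch using 4x4 homogeneous transforms; the names and frames are illustrative, not our actual code:

```python
# Compose camera-relative positions into world (field) coordinates.
import numpy as np

def to_world(p_camera, T_robot_camera, T_world_robot):
    """p_camera: (3,) cone position in the camera frame (e.g. tvec from PnP).
    T_robot_camera: 4x4 camera mount pose on the robot.
    T_world_robot: 4x4 robot pose on the field (from AprilTag localization).
    Returns the cone position in world coordinates."""
    p_h = np.append(p_camera, 1.0)  # homogeneous coordinates
    return (T_world_robot @ T_robot_camera @ p_h)[:3]
```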

What FPS / processing time does this take (and also what hardware are you running)? It sounds like a really cool idea.

The AprilTag system for localization runs at around 7 fps per camera over four cameras on two overclocked Pi 3Bs. The ML pipelines for the detector and keypoint heatmaps run on a Pi 4B with two Coral accelerators and get about 10 fps for the object detector and 200 fps for the keypointer, which runs on every cone detection in a frame.

