Software Update NO.1 (11/1/2024)
Early in the season we decided to go with a claw design for intaking and scoring. This post focuses on the intake side.
Compared to roller intakes, a claw intake needs to be aligned with the samples very precisely in order to pick them up effectively and consistently. Relying on the driver alone for alignment, with no automatic assistance, would be time-consuming and hurt cycle times, so a good vision system is crucial for making a claw intake work.
We have seen some amazing claw-based intake designs from Gear Wizards 16917 and 5064 Aperture Science, proving that with a good enough auto-align, claw intakes can be really reliable and fast.
Now moving on to what we did for the vision:
Vision for autoalign
For the CV part, we first tested all our pipelines on a computer with OpenCV. We built two pipelines: a color-based one and an edge-based one.
We can separate the vision process into 3 parts:
- Color masking: masking out the red, blue, and yellow game pieces.
- Processing: elaborated on in the next sections.
- Find contours and calculate position and orientation.
Color Masking
For the color masking step, we used the Hue channel of the HSV color space to effectively threshold out the samples. (To keep things simple, we focused on the blue samples; adding the other colors follows the same process.)
Contour Finding
After we get the masked frame, we find all the external contours in it and calculate the center of mass for that contour as the position, and the direction of the best-fit ellipse as the orientation.
Processing Things
Now let’s talk about the processing step: what does it do, and why do we need it?
We initially made a simple pipeline that filters out the game pieces using a Hue threshold for the 3 colors (red, yellow, blue) and then finds contours to detect game pieces.
This worked well for single pieces and pieces that were not close together, but failed when pieces were close to one another.
So we needed a method to reliably separate the game pieces when they are touching one another. For this we developed two methods:
Color-based pipeline
The color-based pipeline operates on the principle of removing shadows. During testing we observed that the top face of a sample is its brightest part, so if we remove the shadows and dark regions, we can effectively separate the samples from each other.
Our first idea was to simply mask out any pixel whose Value (in HSV color space) fell below a certain threshold, but due to the tricky shape of the samples (I swear they designed it like that on purpose!), this also masked out the dark regions on top of the samples, which made the subsequent contour finding unreliable.
So we improved it by introducing three more constants: confirmed_shadow_threshold, shadow_dist, and shadow_threshold.
The idea is to avoid removing the dark parts on top of the samples, and those parts are usually not at the edges of the color mask. So now we only remove dark patches (below shadow_threshold) if they are less than shadow_dist away from the mask edge. That alone would fail to separate two samples touching side by side, since the dark dividing region sits at the center rather than the edge, so we additionally remove any pixels darker than confirmed_shadow_threshold regardless of their distance to the edge.
Video:
Edge-based pipeline
Another approach is to detect the edges and use them to separate the pieces.
For that we use Canny edge detection on the grayscale masked frame, and then dilate the edges so they touch.
This method gave us more stable results than the color approach, but when testing on our Limelight 3 (we haven’t got a 3A yet), we found it horrendously slow, with single-digit framerates even at the lowest resolution.
We timed our code and found that the bilateral filter we used was really slow, and that we had a lot of unnecessary operations. After removing them, our framerate improved significantly to about 40 fps, even beating the color-based method.
Timing code we used:
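(Our exact snippet isn’t reproduced here; a minimal per-stage timer in the same spirit, using Python’s `time.perf_counter`, might look like this:)

```python
import time
from contextlib import contextmanager

@contextmanager
def timer(name, results):
    """Accumulate wall-clock time per pipeline stage into a dict."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[name] = results.get(name, 0.0) + time.perf_counter() - start

# Usage: wrap each pipeline stage, then print the per-stage totals.
results = {}
with timer("blur", results):
    time.sleep(0.01)  # stand-in for e.g. the bilateral filter
print(results)
```

Wrapping every stage this way is what pointed us at the bilateral filter as the bottleneck.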
Results (This one is better than the color method):
Finding and separating contours
During testing we found that even after processing, touching samples would sometimes still not separate. To fix this, we came up with our own separation algorithm (we tried watershed, but the results were not great):
We first estimate the number of game pieces in the frame: we erode the mask a little at a time until the remaining area falls below a certain threshold, and take the maximum number of contours seen during this process as the piece count. We then use the least-eroded mask that produced that maximum count for the downstream tasks.
In addition to that, we also use area thresholding to filter out the noise and any unwanted small contours.
This contour processing is used for both the color-based approach and the edge-based approach.
Code
“Talk is cheap, show me the code!”
The code is available on our GitHub: