Has anyone done it? Has anyone done it well?
I’m working on it. It’s proving more difficult than I expected. Lighting variation really causes grief, and a couple of quirks in our build shop don’t help. I’m getting lots of good results, but not good enough to field in an actual competition. Here’s my basic approach.
First, use an HSL color filter to detect bright yellow things. This is returning a lot more false positives than I expected. It seems that, at least using the lighting in our build room, plywood has pretty similar numbers to a yellow foam ball. Our robot sometimes chases our field elements, constructed of plywood. Beige carpeting also has patches that appear awfully yellow to the camera. Another problem with the HSL filter is that the bottom of the ball is always in shadow, while the top reflects a lot of light, and sometimes appears shiny white, reducing the saturation a lot.
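As a rough illustration of the color-filter step (a sketch, not the code running on our robot): the snippet below uses only Python's standard `colorsys` module to test whether a pixel falls in a yellow HSL window. The threshold values and the names `is_ball_yellow` and `yellow_mask` are made up for the example and would need tuning for a real camera and real lighting. A real pipeline would more likely do this with something like OpenCV's `cv2.inRange` on a converted image.

```python
import colorsys

# Hypothetical thresholds, for illustration only; tune for your camera and lighting.
HUE_RANGE = (40 / 360, 70 / 360)   # yellowish hues, as a fraction of the color wheel
LIGHTNESS_RANGE = (0.25, 0.85)     # reject very dark and very bright pixels
MIN_SATURATION = 0.40              # reject washed-out pixels


def is_ball_yellow(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the assumed yellow window."""
    # colorsys works on floats in [0, 1] and returns values in H, L, S order.
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return (HUE_RANGE[0] <= h <= HUE_RANGE[1]
            and LIGHTNESS_RANGE[0] <= l <= LIGHTNESS_RANGE[1]
            and s >= MIN_SATURATION)


def yellow_mask(image):
    """image: list of rows of (r, g, b) tuples. Returns a same-shaped boolean mask."""
    return [[is_ball_yellow(*px) for px in row] for row in image]
```

With these made-up numbers, a near-white highlight pixel such as (250, 250, 245) and a dark shadow pixel such as (90, 75, 15) both fall outside the window, which is the same top-versus-bottom problem described above.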
Step two: use some basic contour filtering to throw out obvious false positives, based on size and basic height-to-width ratios (i.e., get rid of long, skinny things like 2x4s and those yellow tape lines on the floor). Dilation and erosion also helped in not falling for the tape lines.
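The size and ratio checks can be sketched like this, assuming the contours have already been reduced to bounding boxes by whatever contour finder is in use. The `Box` type, the function names, and the two limits are hypothetical values picked for the example, not tuned numbers.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box of one contour, in pixels."""
    x: int
    y: int
    w: int
    h: int


# Hypothetical limits, for illustration only.
MIN_SIDE = 10      # ignore specks smaller than this in either dimension
MAX_ASPECT = 1.5   # a ball should be roughly as tall as it is wide


def plausible_ball(box):
    """Return True if the box is big enough and not long and skinny."""
    if box.w < MIN_SIDE or box.h < MIN_SIDE:
        return False
    aspect = max(box.w, box.h) / min(box.w, box.h)
    return aspect <= MAX_ASPECT


def filter_candidates(boxes):
    """Keep only the boxes that pass the size and aspect-ratio checks."""
    return [b for b in boxes if plausible_ball(b)]
```

For example, a 40x38 blob passes, while a 300x8 strip (think tape line) and a 3x3 speck are both thrown out.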
That gets me to a candidate target. I then snip the matching region out of the grayscale version of the image and run a Hough circle transform on it to keep only circular things. This, I found, really helped with the shadowed portion of the bottom of the ball. I could “loosen up” the numbers on my color filter, and then examine the Hough transform to be sure that it is detecting something circular. The bottom half of the ball is a different color than the top half, but the edge is usually still continuous.
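To show why the circle check tolerates the two-tone ball, here is a deliberately naive pure-Python version of the Hough voting idea. It is a sketch under assumptions, not what an optimized library does: OpenCV's `cv2.HoughCircles`, for instance, uses a gradient-based variant. The function name `hough_circle` and its parameters are hypothetical, and the edge points are assumed to come from an edge detector run on the cropped grayscale patch.

```python
import math
from collections import Counter


def hough_circle(edge_points, radii, angle_steps=360):
    """Naive Hough circle voting.

    edge_points: iterable of (x, y) integer edge pixels.
    radii: iterable of candidate radii to try, in pixels.
    Returns ((cx, cy, r), votes) for the best-supported circle.
    """
    votes = Counter()
    for (x, y) in edge_points:
        for r in radii:
            cells = set()
            # Every center that could put this edge pixel on a circle of radius r.
            for k in range(angle_steps):
                theta = 2 * math.pi * k / angle_steps
                cx = round(x - r * math.cos(theta))
                cy = round(y - r * math.sin(theta))
                cells.add((cx, cy, r))
            # Each edge pixel votes at most once per (center, radius) cell.
            votes.update(cells)
    best, count = votes.most_common(1)[0]
    return best, count
```

Only edge position matters here, not color, so a continuous outline still piles its votes onto one center even when the top and bottom halves of the ball look different to the color filter. The vote count relative to the number of edge points gives a crude "how circular is this" score.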
It’s not ready for prime time yet, but the numbers are improving. Processing time is an issue, but, honestly, I’m more concerned about the theoretical aspects at the moment. If I can get good recognition of my images, I can work on the fancy CUDA programming and the like on a Jetson. Until I get the recognition algorithm working well enough, that isn’t a high priority.
I see a couple of posts using deep learning/neural network approaches. I’m curious whether their performance is good enough yet to work adequately in a “production” environment (i.e., a real-world, actual competition). I’m a bit skeptical of that approach. I have a feeling it will get to a certain point and then never get any better. Still, there are some pretty amazing things being done in the real world with those approaches, so it’s certainly possible that someone will find the right combination of code to hit the sweet spot and make it work.
If I ever get it running well enough to use in our competition code, I’ll post it.