OpenCV for detecting power cells

Has anyone done it? Has anyone done it well?

I’m working on it. It’s proving more difficult than I expected. Lighting variation really causes grief, and a couple of quirks in our build shop don’t help. I’m getting lots of good results, but not good enough to field in an actual competition. Here’s my basic approach.

First, use an HSL color filter to detect bright yellow things. This is returning a lot more false positives than I expected. It seems that, at least using the lighting in our build room, plywood has pretty similar numbers to a yellow foam ball. Our robot sometimes chases our field elements, constructed of plywood. Beige carpeting also has patches that appear awfully yellow to the camera. Another problem with the HSL filter is that the bottom of the ball is always in shadow, while the top reflects a lot of light, and sometimes appears shiny white, reducing the saturation a lot.
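For anyone following along, this is roughly what I mean by the HSL filter; a minimal OpenCV sketch with placeholder thresholds (not my tuned values):

```python
import cv2
import numpy as np

# Minimal sketch of the HSL (HLS in OpenCV) yellow filter described above.
# The threshold values are placeholders and will need adjusting for your lighting.
def yellow_mask(bgr_frame):
    hls = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HLS)
    # OpenCV hue runs 0-179, so yellow sits roughly around 25-35.
    # Lightness and saturation bounds are left loose because the shaded
    # bottom of the ball is dark and the shiny top washes out.
    lower = np.array([20, 60, 80])     # H, L, S
    upper = np.array([35, 220, 255])
    return cv2.inRange(hls, lower, upper)
```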

Step two: Use some basic contour filtering to throw out some obvious false positives. Size, and basic ratios of height to width (i.e., get rid of long, skinny things like 2x4s and those yellow tape lines on the floor). Dilation and erosion also helped in not falling for the tape lines.
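That contour filtering step, sketched in OpenCV (the area and aspect limits are just illustrative guesses):

```python
import cv2

# Sketch of the contour filtering step: erosion/dilation to suppress thin tape
# lines, then reject blobs by area and height/width ratio.
def candidate_contours(mask, min_area=200, max_aspect=2.0):
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    cleaned = cv2.erode(mask, kernel)      # thin tape lines mostly disappear
    cleaned = cv2.dilate(cleaned, kernel)  # grow the surviving blobs back
    contours, _ = cv2.findContours(cleaned, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 signature
    keep = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue                       # too small to be a nearby ball
        x, y, w, h = cv2.boundingRect(c)
        ratio = max(w, h) / max(1, min(w, h))
        if ratio > max_aspect:
            continue                       # long and skinny: 2x4s, tape lines
        keep.append(c)
    return keep
```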

That gets me to a candidate target. I then crop the grayscale version of the image and run a Hough circle transform to keep only circular things. This, I found, really helped with the shadowed portion at the bottom of the ball. I could “loosen up” the numbers on my color filter, and then examine the Hough transform to be sure that it is detecting something circular. The bottom half of the ball is a different color than the top half, but the edge is usually still continuous.
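The Hough step looks roughly like this; all the parameters are placeholders that need tuning per camera:

```python
import cv2

# Sketch of the Hough-circle confirmation on a grayscale crop of the candidate
# region. Parameter values are rough guesses.
def looks_circular(gray_crop):
    blurred = cv2.medianBlur(gray_crop, 5)
    circles = cv2.HoughCircles(
        blurred,
        cv2.HOUGH_GRADIENT,
        dp=1.5,        # accumulator resolution relative to the image
        minDist=20,    # minimum spacing between detected centers
        param1=100,    # Canny high threshold used internally
        param2=30,     # accumulator threshold; lower accepts weaker circles
        minRadius=5,
        maxRadius=0)   # <= 0 means no explicit upper limit
    return circles is not None
```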

It’s not ready for prime time yet, but the numbers are improving. Processing time is an issue, but, honestly, I’m more concerned about the theoretical aspects at the moment. If I can get good recognition of my images, I can work on the fancy CUDA programming and the like on a Jetson. Until I get the recognition algorithm working well enough, that isn’t a high priority.

I see a couple of posts using deep learning/neural network approaches. I’m curious whether the performance of those is good enough yet to really make it work adequately for a “production” environment (i.e., a real-world, actual competition). I’m a bit skeptical of that approach; I have a feeling it will get to a certain point and never get any better. But there are some pretty amazing things being done in the real world with those approaches, so it’s certainly possible that someone will find the right combination of code to hit the sweet spot and make it work.

If I ever get it running well enough to use in our competition code, I’ll post it.


Are you set on OpenCV? What about using an RPi, a USB camera, and PhotonVision?

My team’s been doing some testing with this. We use a combination of GRIP-generated and manually written code for our pipeline, which runs on a Raspberry Pi with the WPILibPi image. We lean heavily on tight HSV filtering and erosion/dilation to get rid of anything other than the power cell (plywood’s been a pain for us too). This gives us a contour that’s missing chunks and is really blocky, but we then draw a bounding circle around it to get the shape right for tracking. I’m not familiar with the Hough circle technique you described; I’ll take a look at that, it sounds interesting.
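The bounding-circle part is essentially OpenCV’s minimum enclosing circle; here is a rough equivalent of what that step does (a sketch, not our generated code):

```python
import cv2

# Wrap a blocky, hole-ridden contour in its minimum enclosing circle so the
# tracked shape is round even when the HSV mask is ragged.
def draw_enclosing_circle(frame, contour):
    (cx, cy), radius = cv2.minEnclosingCircle(contour)
    center = (int(cx), int(cy))
    cv2.circle(frame, center, int(radius), (0, 255, 0), 2)  # overlay for debugging
    return center, int(radius)
```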

I’m finding at least for my objectives this can produce good enough results with fairly low effort.

I do see things like plywood, an orange bucket, our yellow tool chest, etc. that can sometimes show up. But I’m not trying to make a solution for every scenario, just to identify from the starting position.

I do see that the bottom half of the ball is darker, but it is often the lighter parts that are hard to include in the HSV thresholds without generating background artifacts.

With PhotonVision, I can somewhat clean up my viewing area by live-viewing the output and putting something in front of distracting objects to block them from view. Anyway, this has seemed to work well enough, though it would definitely fail under live competition conditions; the level of detection needed is whatever you decide your objective is.

edit: My effort was to set the cells up in front of the camera, use the color picker a few times (mostly in the mid and lower color ranges) for the HSV threshold, then set the contour values (0.1-10 for area, a range around 1 for sphere, a loose range around 80% for fullness), set up multiple targets, and that is basically it. After deciding I needed a low and a high camera, the cell camera is just a USB camera connected to the Pi, without additional LEDs.

I don’t participate in our vision code programming, but I have some idea of what we are doing. We are getting usable results with OpenCV. We work in the HSV color space. Not sure if this has any advantage over HSL, but we are able to tune a color filter to be pretty specific to a Power Cell. We can tune out plywood well enough, but like the other comments here, we find it surprisingly close to a Power Cell. We also filter with size, aspect ratio, and solidity (contour area / bounding rectangle area). Our carpet is kind of a brown color so it hasn’t been an issue. The official competition carpet color also was not an issue in the 2020 season (we were lucky enough to play).
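For reference, the solidity test as defined here (contour area over bounding rectangle area) is a one-liner in OpenCV; the 0.6 cutoff below is only an assumption (a filled circle inside its bounding square scores about pi/4, roughly 0.785):

```python
import cv2

# Solidity as defined in the post above: contour area / bounding rectangle area.
# The cutoff is an assumption; a perfect circle scores about 0.785.
def solid_enough(contour, min_solidity=0.6):
    x, y, w, h = cv2.boundingRect(contour)
    if w == 0 or h == 0:
        return False
    return cv2.contourArea(contour) / float(w * h) >= min_solidity
```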

We run on a Raspberry Pi 3B and process 320 x 240 images. I know that we stream to the driver station at 15 fps, but I think we are running at least 30 fps for calculations. The camera is capable of roughly 90 fps at the image size we use. We have two cameras on our robot and target the Power Cells with the one that has no light rings and exposure tuned more like a human-viewable image rather than a high-contrast image specific to glowing reflective tape.

Our programming team was working on Galactic Search last week and they are relying on a combination of vision and pathing rather than a pathing-only solution. Results were looking pretty good. The control adjustments based on vision still need some tuning, but the target coordinates coming in based on vision seemed to be quite solid.

Our experience suggests that OpenCV is a reasonable way to detect Power Cells. A deep learning solution should not be necessary. I think that deep learning would easily be “good enough” for this application. However, deep learning requires a ton of labeled training images to generate accurate image classification. Convolutional nets appropriate for this kind of application probably need to optimize at least 100,000 parameters, which isn’t going to happen with just a few pictures of a yellow ball on a field. It might be possible to synthetically generate images that could work, or to leverage some kind of pre-trained model and require significantly fewer training images. However, given the pretty straightforward image processing task at hand, OpenCV seems like a far less labor-intensive approach.

The biggest thing that helped me for this specific challenge that I didn’t see you mention was increasing the resolution to max. I wouldn’t normally do this (framerate goes way down), but because of the one-shot nature of Galactic Search (you can make a decision at the beginning and go with it), it helps things.

I think all of the problems you mentioned are general problems with traditional vision methods (as opposed to ML). Once colored shape detection is released, I think we will be at about the same point as a custom pipeline I would have made with GRIP or OpenCV, but that is not needed for this problem since you completely control the field of view.

I’m using a custom TensorFlow model.

PhotonVision looks like a pretty good product. I’ll definitely keep it in mind.

For my specific purposes, I don’t want to use a simplified product, because I’m a mentor who is interested in practicing solutions that work outside of FRC. I don’t think I have any students right at the moment who could implement PhotonVision, but if I can introduce the theory to them, it seems like a very good next step.

One of the problems you will find with the Hough transform method is that it is very computationally expensive. It also requires more tuning than I would like. I’m getting fairly good results with it, but only on static sample images. If I were interested in putting this into a real competition, I don’t think it would work well, at least not without a Jetson or similar high-powered processor. I haven’t measured the processing time, but once I threw that step in, my processing of sample images slowed down dramatically.
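A cheap way to put numbers on that, for anyone curious, is to wrap each pipeline stage in a timer while running over the sample images (the stage functions here are placeholders for whatever your pipeline actually calls):

```python
import time

# Time one stage of the pipeline; call it around the color filter, the contour
# filter, and the Hough step separately to see where the milliseconds go.
def time_stage(label, fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    print(f"{label}: {elapsed_ms:.1f} ms")
    return result
```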

I think for regular competition, I wouldn’t have some of the problems. The dark carpet and known starting configurations could really eliminate some of the headaches, and the authentic field elements aren’t made out of plywood. I think there would be fewer false positives on a real field.

But if you happen to be paired with the Killer Bees, I could see some unfortunate mishaps occurring. They usually have some yellow on their bot.

I went with the bounding circle method at first, and that eliminated some of the false positives, but I find that I get a lot of crescent-shaped hits from the HSL (or HSV) filter. The shaded lower portion fades out, and the balls are a bit shiny on top, so with our lighting I often lose a spot on top.


Yup.

Also Yup.

Low exposure + LEDs, disabling auto-exposure, disabling auto-white balance, and careful camera selection are all key prerequisites to OpenCV doing its job well.
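For a plain USB camera driven directly from OpenCV, locking those down looks roughly like the sketch below. Property support is very driver/backend dependent (on the Pi, v4l2-ctl or the CameraServer settings are often more reliable), so treat the values as assumptions:

```python
import cv2

# Sketch of pinning down a USB camera with plain OpenCV. Whether each property
# takes effect depends on the camera driver and capture backend.
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 320)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 240)
cap.set(cv2.CAP_PROP_AUTO_EXPOSURE, 1)   # 1 = manual mode on many V4L2 drivers
cap.set(cv2.CAP_PROP_EXPOSURE, 50)       # low fixed exposure; units vary by driver
cap.set(cv2.CAP_PROP_AUTO_WB, 0)         # turn off auto white balance if supported
ok, frame = cap.read()                   # grab a frame to confirm the settings took
```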

Also, Possibly Yup.

Camera latency is a major factor to rule out as well. Most webcams are optimized for making faces look nice to other humans, not for gathering high-fidelity data as fast as possible. The Raspberry Pi cameras tend to work substantially better.

Low resolution helps latency on all fronts.

Definitely a major step up in difficulty on all fronts - not worth biting off until you’ve exhausted the capability of LEDs on a darkened camera, or simply searching for blobs of specific colors.

This is a valid approach, but it’s worthwhile to be sure it’s tied back to the right root motivation. At a very high level, to take this approach is to say personal learning is more worthy of build-season time than robot functionality. Again, not wrong, but something to be cognizant of as you integrate your efforts into the broader goals of the team.

I state this much less for your own sake (I imagine you’ve already worked through it) and more for others when they read this thread in the future.

I agree with many of the posts above: it is entirely possible to detect the Power Cells with straightforward processing in OpenCV.

  • color filter in HSV
    • this might pick up plywood and off-white walls
  • filter on size, x/y aspect ratio, contour fill vs. bounding box size, etc.
    • this should give you mostly blobs which are circular-ish
  • because of our work area, we need to filter out contours which are too high in the image (mostly walls, etc.), but that seems perfectly legit; we are looking for objects on the floor, not hanging on the walls (a sketch of this filter follows the list)
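A sketch of that height filter; the horizon row is an assumption that depends entirely on camera mounting:

```python
import cv2

# Reject contours whose bottom edge sits above a chosen horizon row; a ball on
# the floor can't appear that high in the frame with a forward-tilted camera.
def on_the_floor(contour, horizon_row=200):
    x, y, w, h = cv2.boundingRect(contour)
    return (y + h) >= horizon_row
```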

With all that, we can find the balls with decent reliability. We are running with USB cameras at 480x848 pixels (partly because we can).

One suggestion that we have not had to do: if the walls, etc are a problem, hang a black cloth on them. That should be pretty reliable in killing noise from outside the field, and is legal (if outside the field area).

Yes. Very true. Each year I take a different approach, depending on the abilities and interests of the students I’ve got. This year, I don’t have any veteran programmers. I don’t think they would be able to handle PhotonVision, and it doesn’t really do anything for me, so I’m going a bit “theory heavy”. I think next year they will be able to handle something like PhotonVision or ChameleonVision, so I’ll probably try to introduce it to them in the off-season.

I’m looking forward to seeing some videos posted to see how well people actually did in the challenges, especially the search challenge where detection of power cells could really improve performance.


I don’t really understand that statement. To my understanding, you need the vision system to automatically find the balls. But that is only so that it can recognize which path is laid out in front of the robot. Once it has that information (hopefully correctly), it can select a pre-canned auto, and does not have to do any more vision processing.


Another trick I’ve used in the past is to filter circles by diameter, and only pass objects about the size of a ball. The apparent size varies with distance, of course, but since you’re looking down at the floor at an angle, there’s a relationship between the y-position of a ball in the image, and its distance. So, for each candidate circle, you can compute a range of acceptable radii based on the Y-coordinate of the center, and see if the measured radius falls in that range.

(This fails, of course, for balls that aren’t on the ground. But if you’re (for example) looking for the nearest ball to intake, that’s a good thing.)
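A sketch of that radius-vs-row check; the linear model and its coefficients are placeholders you would fit from a few measurements of a ball placed at known distances:

```python
# Expected ball radius grows roughly with the image row (lower in the frame =
# closer = bigger). Slope, intercept, and tolerance are placeholder values.
def radius_plausible(center_y, measured_radius,
                     slope=0.08, intercept=4.0, tolerance=0.35):
    expected = slope * center_y + intercept
    return expected * (1.0 - tolerance) <= measured_radius <= expected * (1.0 + tolerance)
```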


I see detection of each ball as an above-and-beyond task, but I think it is potentially worse for timing, given the path certainty you get from only looking at the beginning.

I’ve taken this code from 6391 to do path selection: 2021_UARobotics_Infinite_Recharge/GalacticSearch.java at master · 6391-Ursuline-Bearbotics/2021_UARobotics_Infinite_Recharge · GitHub

With PhotonVision and that code, I can check the PhotonVision dashboard to make sure the camera is seeing the correct processed output, and then I can see in Shuffleboard that the correct path is selected before running it. Both of these help reassure you that you didn’t pick up some artifact that would cause a wrong path.

About the approach: it is definitely a matter of the students working on the project. A more in-depth learning approach is valid, but for us I prefer to emphasize tools the students can use with less background knowledge and that help give the best results achievable. Mostly because our students have a bigger gap in programming experience than schools that have courses, and most students are not self-motivated to keep learning programming. I feel like seeing the blobs on the screen after just selecting a few items, and seeing it connect to the robot path, is maybe more inspiring to take up more programming than trying to build up background in the subject. But it isn’t clear which way is more likely to generate a lasting interest.

Again, I am a bit confused. How would you do Galactic Search without finding the balls with vision code? If it is “above and beyond”, you either think there is another solution (which I have not thought of) or you would just not do that challenge (which would be ok).

I assume they’ve got the same strategy we’ve been considering: the only thing vision needs to do is a “one-shot” detection of whether power cells are arranged in the A-Red, A-Blue, B-Red, or B-Blue pattern. Before the robot even starts moving, the camera system identifies the location of the nearest power cell, using some combination of position and size to determine which of the four possible paths to run.
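A purely hypothetical sketch of that kind of one-shot selection; the thresholds and the exact mapping to layouts are invented placeholders, not taken from the linked 6391 code:

```python
# Classify the Galactic Search layout from the nearest detected ball before the
# robot moves. Assumes red layouts start with a closer (larger-looking) ball;
# the pixel thresholds are hypothetical and depend on camera and mounting.
def classify_path(nearest_x, nearest_radius,
                  near_radius_px=40, left_half_px=160):
    near = nearest_radius >= near_radius_px
    left = nearest_x < left_half_px
    if near:
        return "A-Red" if left else "B-Red"
    return "A-Blue" if left else "B-Blue"
```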

I believe there are other “not in the spirit of the rules” mechanisms that were discussed at the beginning of the season which did indeed require no sensor system detecting power cells. Even within the rules, you could reasonably just go through all red/blue points, regardless of where the balls actually were on the field.

ngreen, I think, was specifically talking about tracking each ball and individually chasing them down (without a priori knowledge of the expected placement positions).

Yeah, pretty much this. The vision code is linked in my post (that is, whatever code isn’t running on the PhotonVision Pi). It is based on both the size of the cell (near or far) and is also somewhat dependent on the starting spot (combined with the location of the cells), which gives a different number of targets.

If your path-following algorithm is accurate enough, you know where the balls are, and you can just go there. In the past, I’ve never achieved enough accuracy on a path that long, but teams with more practice might be able to do it. Also, since the ball has to be collected, I know our pickup isn’t completely smooth. I think the act of picking up the ball will throw off the path, so correction using vision will be beneficial. I don’t know if it will be necessary. We haven’t run the bot yet to see how accurate we can get. Even though we are using the bot from last year’s game, we did a lot of rebuilding while working under COVID restrictions, which has slowed things down a lot, and it hasn’t been ready for tests yet. The first real tests will probably come this weekend.

Our pickup last year also wasn’t as reliable as I would like. Sometimes it took a couple of passes to get the ball. Something that detects whether the ball was actually retrieved could help…although the highest score would probably come from running lots of times and using the fastest run where it happened to get all the balls on the first try.

OK, this all makes sense. It was not clear to me that people were talking about tracking each ball while driving. From my experience, the basics of “detecting each ball” are the same for finding the starting ball pattern (and thus which path is laid out) vs. tracking while driving, although of course actually guiding the robot is very different in the two scenarios.