In the past when my team (172 NorthernForce) has tried to do vision processing, we’ve not had much luck. 2-3 years ago, we tried doing it on the cRIO, which had horrendous framerates, so last year we added a raspberry pi with a USB camera. This worked better, but was still pretty slow. This year, we have some of the raspberry pi camera modules - and so far this looks pretty promising. Interfacing with the camera is a real pain however, so I thought I’d share what I’m working on, in case anyone else is interested in using the camera module.
I’ve built a class called PiCam for interfacing with the camera. All you have to do is construct a PiCam instance, giving it a requested size for images, and a callback, which processes the frames. The callback has one parameter which is an OpenCV matrix/image of the current frame to be processed.
Please keep in mind that this is in a pretty early state, and that there are many unfinished things - for instance I plan on adding more features for controlling the camera parameters (especially turning off that pesky auto exposure!).
Our team decided to do all the computation on the driver station with Octave (a free MATLAB clone), but we are doing some fairly heavy calculations. I am a bit worried about the bandwidth limit, but my other team members seem to think it'll be fine.
Also, I can't seem to load the GitHub page, but that's probably GitHub's fault. Sometimes I get a "Unicorn!" error, although once I got an Error 500.
Mainly, I’m using the pi right now because we already have several, as well as the camera modules. It would be nice to have something a bit faster though. Thanks for pointing out alternatives though, I had done some looking but didn’t get very far. I may just have to get my hands on an ODROID.
Nice! We tried to use the Pi last year but the Java libraries couldn’t process images fast enough. (Ironically another Java library was fine, but not compatible with the vision code and then we ran out of time.) This year, we are trying in C++. We are definitely going to look at your code and see if it can help things along.
This version has been compiled for the Pi with hardware floating point enabled and some other optimizations. I measured a huge decrease in processing time (transactions that were taking 4-5 seconds are now subsecond) with the new version.
As always, your mileage may vary, but it's well worth taking the 10 minutes to download and set up the new version.
I’ve done some more testing, and the framerates so far are much much better than what we got last year. I’m doing a conversion from RGB to HSV, doing a simple threshold on each channel, and then finding the centroid and this is running at 17fps for 320x240 video and 30fps for 160x120 video. One of the main things I want to do is track the balls - which I think these resolutions would be fine for, since the ball is so big. We might need higher resolution for the vision targets, but I’m not sure.
Using the signal mode of raspistill is something I hadn’t thought of. That must be pretty slow though, since you have to wait for the image to be written out and then read it back in. I’m hoping to track the balls so I need a pretty high framerate (and as I mentioned, it looks like I’ll have it), but for identifying when the goal is hot in autonomous mode, fetching images with raspistill is probably fast enough and easier to accomplish.
I don’t know about writing but reading and processing a 552x91 image (10K file) took .4 seconds real time (.3 user). 552x91 is the approximate size of the rectangle of the goals. We haven’t been able to measure the camera write time accurately, but I’ve been told it’s less than 750 ms. Obviously for ball tracking that’s not quick enough but as you say detecting hot mode is easily done.
Just curious, how are you communicating back to the cRIO/LabVIEW?
I haven’t been worrying about that yet. We did it last year over ethernet, so I can always fall back to that. We were thinking though that either using the GPIO pins on the pi to digital in pins on the cRIO, or an I2C or SPI connection might be better.
Last year we (Team 116 Epsilon Delta) used 2 USB cameras connected to a RaspberryPi (definitely stretching the power limits).
One camera streamed video using mjpg_streamer, which barely taxed the CPU (less than 14%) and could easily stream up to 30fps with 720p images, since all the MJPG encoding work was already being done within the USB camera. We dialed it down to 416x240 at 10fps, since that was sufficient for the human driver and kept the camera's network bandwidth under 2Mbps.
The second camera was controlled by OpenCV code, which processed only a few frames per second at best. With the dedicated Pi camera there should be significantly higher data rates, but the biggest constraint will still be the OpenCV code itself. Finding the center of a single object in at most two colors (for the balls) may well be faster than finding the multiple goals like last year, but we haven't tried this yet.
The only real disadvantage I see to the Pi camera is the shorter length of the ribbon cable, so it might be more difficult to position the camera where it really needs to be on the robot.
Recognition for autonomous mode should be easy. If 160x120 can be processed at 30 fps, then assisted catching should be possible. That definitely wasn’t the case with our USB-attached cameras and the Pi.
I'm wondering if anyone has attempted to use the BeagleBone Black for OpenCV processing, and how it compares to the Raspberry Pi. I now think I'm going to have to get a couple of the dedicated cameras.
I've also been wondering about performance on the BeagleBone vs. the Raspberry Pi. I already have a BeagleBone, but haven't had time to get it set up.
And I have been able to process 160x120 at 30fps. The processing is just an rgb->hsv conversion, performing a threshold on each channel, and then centroiding the resulting binary image, but it seems to work pretty well. Haven’t been able to test on an actual competition ball yet though.
The fact that we only ever have to track one ball at a time makes the processing much easier, since we don't have to do any blob separation (though we might want to, depending on how well the threshold works).
My hope is to be able to do assisted catching, though another concern with that is the very limited field of view that the camera module has.