Yet Another Vision Processing Thread

Team 1512 will be giving vision processing its first serious attempt this year. I have been amazed by the wide variety of approaches to vision processing presented on this forum and am having trouble weighing the advantages and disadvantages of each approach. If your team has successfully implemented a vision processing system in the past, I would like to know four things:

  1. What co-processor did you use? The only information I have really been able to gather here is that Pi is too slow, but do Arduino/BeagleBone/Pandaboard/ODROID have any significant advantages over each other? Teams that used the DS, why not a co-processor?

  2. What programming language did you use? @yash101’s poll seems to indicate that OpenCV is the most popular choice for processing. Our team is using Java, and while OpenCV has Java bindings, I suspect these will be too slow for our purposes. Java teams, how did you deal with this issue?

  3. What camera did you use? I have seen mention of the Logitech C110 camera and the PS3 eye camera. Why not just use the axis camera?

  4. What communication protocols did you use? The FRC manual is pretty clear on communications restrictions:

Communication between the ROBOT and the OPERATOR CONSOLE is restricted as follows:

Network Ports:
TCP 1180: This port is typically used for camera data from the cRIO to the Driver Station (DS) when the camera is connected to port 2 on the 8-slot cRIO (P/N: cRIO-FRC). This port is bidirectional.
TCP 1735: SmartDashboard, bidirectional
UDP 1130: Dashboard-to-ROBOT control data, directional
UDP 1140: ROBOT-to-Dashboard status data, directional
HTTP 80: Camera connected via switch on the ROBOT, bidirectional
HTTP 443: Camera connected via switch on the ROBOT, bidirectional

Teams may use these ports as they wish if they do not employ them as outlined above (i.e. TCP 1180 can be used to pass data back and forth between the ROBOT and the DS if the Team chooses not to use the camera on port 2).

Bandwidth: no more than 7 Mbits/second.

Is one of these protocols best for sending images and raw data (like numerical and string results of image processing)?
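For raw numerical results, a small fixed-format UDP datagram is one common pattern. Here is a hedged sketch of what that can look like; the field layout, packet id, and function names are illustrative assumptions, not anything from the rules or a particular team's code.

```python
# Sketch: packing vision results (target offsets, distance) into a compact
# UDP datagram. The format and names here are hypothetical.
import socket
import struct

VISION_FMT = "!Bddd"  # packet id, target x offset, y offset, distance

def pack_result(packet_id, x, y, dist):
    """Serialize one vision result into a fixed-size byte string."""
    return struct.pack(VISION_FMT, packet_id, x, y, dist)

def unpack_result(data):
    """Recover (packet_id, x, y, dist) from a datagram payload."""
    return struct.unpack(VISION_FMT, data)

def send_result(sock, addr, packet_id, x, y, dist):
    """Fire one result datagram at the robot; UDP, so no handshake needed."""
    sock.sendto(pack_result(packet_id, x, y, dist), addr)
```

UDP suits per-frame results because a lost packet is simply superseded by the next frame's packet; TCP is the better fit when every byte must arrive, such as for image data.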

  1. We plan to use our DS (running RoboRealm) for vision processing this year. We’ve never tried using a co-processor, but since we don’t have very complicated uses for the vision system, we’ve had good success with doing the processing on the DS (even with the minimal lag it introduces.)

  2. We’ve used NI’s vision code (running inside the Dashboard program) in the past, but this year we’ll be using RoboRealm (as mentioned above.) In my tests I’ve found that it’s much easier to make changes on the fly and the tracking is very fast and robust.

  3. We just use the Axis camera. Since we need to get the camera feed remotely, IP cameras are pretty much the best way to go.

  4. We just use NetworkTables, since it’s efficient and easy and integrates nicely with RoboRealm. It’s also easier for new programmers on the team to understand (and we have a lot of them this year…)

Team 3334 here. We are using a custom-built computer using a dual-core AMD Athlon II with 4 GB of RAM. In order to power it, we are using a Mini-box picoPSU DC-DC converter.

On the software side, we are using C++ and OpenCV running atop Arch Linux. The capabilities of OpenCV are quite impressive, so read the tutorials!
I’m not sure about JavaCV, but a board like the one we built would definitely be fast enough to run Oracle Java 7.

Team 3574 here:

2012 - We used an Intel i3 with the hopes of leveraging the Kinect. We got this working, and in my opinion it was our most successful CV approach, though we dumped the Kinect. There were mounting and power issues. Running a system that requires a stable 12 volts off of a source that can hit 8 and regularly hits 10 was the biggest issue.

2013 - ODROID U2. CV wasn’t as important for us this year, as our autonomous didn’t need realignment like 2012. We ran into stability issues with the PS3 Eye camera and USB, which we fixed with an external powered USB hub. A super fast little (I do mean little) box. We hooked it up to an external monitor and programmed directly from the ARM desktop. Hardkernel has announced a new revision of this, the U3, which is only $65.

2014 - ODROID XU. The biggest difference here is no need for an external USB hub, since it has 5 ports. I’ve tested it with 3 USB cameras running (2 PS3 Eyes and 1 Microsoft LifeCam HD) with no issues. Ubuntu doesn’t yet support the GPU or running on all 8 cores, but a quad-core A15 running at 1.6 GHz is pretty epic. If your team is more cost-conscious, this is pretty pricey at $170. At this point the U3 is probably going to be able to keep up with it in terms of processing, and adding a powered USB hub is not too expensive.

I’ve played with both the BeagleBone Black and the PandaBoard, but with the amount of work we’re having our vision processor do this year (see ocupus) I think we’re addicted to the quad-core systems now.

Python’s OpenCV bindings. Performance won’t make that much of a difference: the way OpenCV’s various language bindings are built, most of the performance-intensive stuff happens in the native code layer.
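That point can be illustrated without OpenCV at all. In this sketch, `bytes.translate` (implemented in C) stands in for an OpenCV primitive, against a hand-written Python thresholding loop; the image buffer size and threshold are made up for illustration.

```python
# Why binding overhead rarely matters: the per-pixel work happens in native
# code either way. Threshold a fake grayscale frame two ways.
import time

WIDTH, HEIGHT, THRESH = 320, 240, 128
frame = bytes((x * 7 + y * 13) % 256 for y in range(HEIGHT) for x in range(WIDTH))

# Slow path: threshold pixel by pixel in the Python interpreter.
t0 = time.perf_counter()
slow = bytes(255 if p > THRESH else 0 for p in frame)
t_slow = time.perf_counter() - t0

# Fast path: one call into C code with a precomputed 256-entry lookup table,
# analogous to calling a single OpenCV function on the whole image.
table = bytes(255 if p > THRESH else 0 for p in range(256))
t0 = time.perf_counter()
fast = frame.translate(table)
t_fast = time.perf_counter() - t0

assert slow == fast  # identical output; the C path is orders of magnitude faster
```

The same logic applies to the Java bindings: a thin call into the same native library costs little, so language choice matters far less than keeping per-pixel loops out of your own code.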

PS3 Eye camera. We originally picked it in 2012 mostly because we thought it would be cute to have alongside the Kinect. At one of the regionals, though, we had really bad lighting conditions that had us switch over to IR for 2013, for which there are a lot of tutorials online.

As for why not the Axis camera: it’s heavy, requires separate power, isn’t easily convertible to IR, and you pay a latency cost going in and out of MJPEG format.

In 2012 we had no restrictions, so we just ran a TCP server in the cRIO’s code. In 2013 we used NetworkTables, which is nicely integrated; we’ll use that this year. In 2013 we did not send the raw camera feed. This year, we’ve put together the ocupus toolkit to support doing that. It uses OpenVPN to tunnel between the DS and robot over port 1180.
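The 2012-style approach can be sketched in a few lines: the robot side runs a TCP server and the vision co-processor connects and pushes one newline-delimited result per frame. The function names and the text-line format here are assumptions for illustration, not the team's actual code.

```python
# Sketch of a robot-side TCP result server. A co-processor connects and
# streams lines like "x=12.5,dist=3.0"; each line is one frame's result.
import socket

def serve_results(host, port, handle_line, max_lines=None):
    """Accept one client and feed newline-delimited results to handle_line."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    conn, _ = srv.accept()
    count = 0
    with conn, conn.makefile("r") as lines:
        for line in lines:
            handle_line(line.strip())
            count += 1
            if max_lines is not None and count >= max_lines:
                break
    srv.close()
    return count
```

NetworkTables removes all of this boilerplate (connection handling, parsing, reconnects), which is a big part of why teams that can use it do.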

  1. We use the driver station laptop, like 341 did in 2012
  2. LabVIEW. We took the vision example meant for the cRIO and copied and pasted it into our LabVIEW dashboard.
  3. Axis Camera
  4. Network Tables

The main advantage to this setup is ease of use. Getting a more complicated setup working can be difficult and have lots of tricky bugs to find. Using our driver station laptop (Intel Core 2 Duo @ 2.0 GHz, 3 GB RAM) gave us more than enough processing power (we could go up to 30 fps) and was the cheapest and simplest solution. The LabVIEW software is great for debugging because you can see the value of any variable/image at any time, so it’s easy to find out what isn’t working, and a pretty decent example was provided for us. The Axis camera was another easy solution because we already had one and the library to communicate/change exposure settings was already there.

The NetworkTables approach worked really well too, and we got very little latency with it. We were able to auto line up (using our drive system, not a turret) with the goal in about a second, and we had it working before the end of week one, after spending about 2 hours with two people. In the end we didn’t need it in competition; we could line up by hitting the tower.
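The drive-system line-up described above can be sketched as a simple proportional controller: steer toward the target until its centroid is centered in the image. The gain, deadband, and function name here are hypothetical tuning values, not the team's actual numbers.

```python
# Sketch of a proportional auto line-up step: turn command from the
# target's horizontal pixel position. k_p and deadband need tuning.
def align_step(target_x, image_width, k_p=0.8, deadband=0.02):
    """Return a turn command in [-1, 1] from a target centroid x pixel."""
    # Normalize the offset to [-1, 1]; 0 means the target is centered.
    error = (target_x - image_width / 2.0) / (image_width / 2.0)
    if abs(error) < deadband:
        return 0.0  # close enough; stop turning
    # Clamp so a far-off target still produces a legal motor command.
    return max(-1.0, min(1.0, k_p * error))
```

Each vision result updates the error, the drive code applies the turn command, and the loop repeats until the deadband is reached, which is why lining up in about a second is realistic even at modest frame rates.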

We’re doing the same approach this year for the hot/not hot goal. Compared to the other solutions, this is the cheapest/quickest/simplest, but you lose the advanced features of OpenCV. NI’s vision libraries are pretty good, and the Vision Assistant program works nicely too, but in the end, some people say OpenCV has more. You need to decide if the extra features are worth the extra work for your team.

You might need to upgrade your power supply to the M4. When we went with an onboard x86, we started with the 160W pico. The problem you’ll hit is that when you run all your motors at top speed, your system voltage drops down to 10 volts, which caused our computer to shut down.

In your experience, the M4 is stable? Ahh, thank goodness it’s been less than 30 days since I bought that pico :smiley:


Yes, the M4 was rock solid. With a 6-24 V input range, it should handle any battery condition. The biggest risk is wiring it up backwards. We used a Sharpie paint pen and colored the + terminal to mitigate that.

As the OP says, “Java would be slower.” This is not necessarily true: it would actually be faster than the Python bindings, although C++ code would still be fastest. The reasons I (and many others) prefer to use OpenCV with C or Python:
– It’s fast, stable, and robust; OpenCV is written in C/C++ and is easy to use and well documented.
– It’s easy to program, and easy to put together a program quickly and on short notice. You don’t need to recompile the program every time you change a little code.
– Partly it’s just preference. Java is good, but it is so similar to C that you’d probably be better off learning the better-documented C/C++ API instead of the Java one. It is personal preference, though; you are going to use similar commands either way, so C might just be easier to use. Also, with so many C compilers out there, C is actually much more portable than Java, which has a JVM for most, but not all, systems. Java is really only nice if you are programming for Android, where you need maximum portability without recompiling the code every time it is downloaded!

I am actually thinking about starting an OpenCV journal, that explains what I have done, and what not to do, to not shoot yourself in the foot! Beware! It will be long :smiley:

By the way, today, I was working on setting up the Kinect for OpenCV, which I will make a thread about in a few minutes.


  1. A custom-built computer running Ubuntu
  2. OpenCV in C
  3. Microsoft Kinect
  4. UDP

  1. ODROID X2
  2. OpenCV in C
  3. Microsoft Kinect with added illuminator
  4. UDP

  1. 3 or 4 ODROID XUs
  2. OpenCV in C++ and OpenNI
  3. Genius 120 with the IR filter removed, and the Asus Xtion for depth
  4. UDP

Having sent a number of people well on their way with computer vision, I can now offer help to more people. If you want some sample code for OpenCV in C and C++, PM me your email so I can share a Dropbox with you.

Tutorial on how to set up vision like we did:


  1. Driver Station
  2. NI Vision
  3. Axis M1011
  4. UDP

  1. Driver Station
  2. NI Vision
  3. Axis M1011
  4. Network Tables

  1. Driver Station or cRIO (TBD)
  2. NI Vision
  3. Axis M1011
  4. Network Tables

By using the provided examples and libraries, we were able to get a working solution in a minimal amount of time. The reason we use the driver station rather than an on-board processor is that it significantly simplifies the system, both by reducing part count (fewer failure points) and by using examples and libraries that already exist and have been tested by FIRST/NI.

Getting OpenCV to receive images from the Kinect is quite simple with libfreenect’s C++ interface and boost threads. If there is enough interest, I’ll post my code.
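The libfreenect-plus-threads setup described here boils down to a producer/consumer handoff: the capture callback pushes frames into a small queue while a worker thread consumes them. Here is that pattern sketched in pure Python; the queue size and function names are illustrative, and the frame payloads are placeholders for real image buffers.

```python
# Sketch of the capture-callback -> processing-thread handoff pattern.
import queue
import threading

frames = queue.Queue(maxsize=2)  # tiny buffer: drop stale frames, keep latency low

def on_frame(frame):
    """Called from the capture thread for every new frame."""
    try:
        frames.put_nowait(frame)
    except queue.Full:
        try:
            frames.get_nowait()  # discard the oldest frame
        except queue.Empty:
            pass                 # consumer drained it first; that's fine
        frames.put_nowait(frame)

def process_loop(handle, stop):
    """Worker thread body: pull frames off the queue until told to stop."""
    while not stop.is_set():
        try:
            frame = frames.get(timeout=0.1)
        except queue.Empty:
            continue
        handle(frame)  # real code would run the OpenCV pipeline here
```

The small bounded queue is the key design choice: for live tracking you want the freshest frame, so older unprocessed frames are dropped rather than queued up behind a slow processing step.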

I don’t think a co-processor is necessary; you have plenty of power to do the analysis on the cRIO. We did this on the first cRIO released in 2009, which is slower and has a much smaller processor cache than the new cRIOs, and it worked fine.

You’re not even tracking a moving target this year, so it should be even easier.

There is a very good reason to use a co-processor: any USB camera can be used. Other cameras with better framerates, resolutions, sensors, or other desirable characteristics can be used through a co-processor. That’s why my team committed to using one this season, after using cRIO-based analysis every season of our existence.

Why? You don’t need any better cameras if the ones available can do the job, as they have for us every year.

I think we are already scared by the processing power of one ODROID! Your robot is going to catch on fire with so much processing power (not literally).

Before you switch to 3 XUs, try one X2 again, but measure its processor usage, and then place your order after that. I’m pretty sure that 4 XUs will draw as much as one CIM running continuously, especially with the 3 120-degree cameras and the Kinect!
Be careful there, or everyone will wonder why the voltage drops every time you boot the XUs!

My team already has our architecture set, but I’m curious about yours.
How do you intend to coordinate the image processing over the multiple nodes? Do you already have a multi-node OpenCV library? As far as I know,
OpenCV doesn’t have an MPI-enabled version, only multithreaded.

Well, 1706 loves using the Kinect, and they say that they will use 3 cameras. I guess that there will be one for each camera and one for the kinect. Maybe one of them does the data manipulation, or maybe it is done on the cRIO.

By the way, Hunter, how do you prevent your ODROIDs from corrupting from the rapid power-down? Do you have a special mechanism to shut down each node? Also, which converter are you using to power the Kinect? I don’t think it would be wise to connect it directly to the battery/PDB, etc. You’d need some 12v-12v converter to eliminate the voltage drops/spikes!

As for your query about multi-node processing, I think you are misunderstanding what he is doing: he’s putting a computer on each of the 3-4 cameras onboard the bot. Hunter will probably just use regular UDP sockets, as he said in his post. Either one UDP connection per XU can go to the cRIO, or maybe there can be a master XU that communicates with each slave XU, processes what they see, and beams the info to the cRIO!

However, I think it is still overkill to have more than 2 onboard computers, not counting the cRIO!

Okay I see now. I was under the impression that he was using them like an HPC cluster. Thanks for clarification!

We actually aren’t using the Kinect this year (sad for me, but my mentor wanted to get away from it). The Genius 120 has a FOV of 117.5x80. Our plan is to have 360-degree vision processing. We have a mentor who is a computer vision professor at a local state university, and a kid’s dad, who is the head of the comp sci department at said university, stops by on occasion. Both of them know how to multithread, and Qt apparently has a method of doing it too.
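Some quick geometry on the 360-degree plan: with a 117.5-degree horizontal FOV, full coverage needs at least four cameras. This sketch checks the coverage math and picks which camera sees a given bearing; the mounting headings and function names are assumptions for illustration.

```python
# Coverage math for wide-FOV cameras arranged around the robot.
import math

FOV_H = 117.5  # Genius 120 horizontal field of view, degrees

def cameras_for_full_circle(fov_deg):
    """Minimum number of cameras to cover 360 degrees with no gaps."""
    return math.ceil(360.0 / fov_deg)

def camera_for_bearing(bearing_deg, headings_deg, fov_deg=FOV_H):
    """Return the index of the first camera whose FOV contains the bearing."""
    for i, heading in enumerate(headings_deg):
        # Wrap the offset into [-180, 180) so bearings near 0/360 work.
        offset = (bearing_deg - heading + 180.0) % 360.0 - 180.0
        if abs(offset) <= fov_deg / 2.0:
            return i
    return None  # a gap in coverage (possible with only 3 cameras)
```

With four cameras at 90-degree spacing the FOVs overlap by about 27.5 degrees per seam, which also gives targets near a seam a chance of being seen by two boards at once.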

The x2 is slightly slower than the xu according to our tests. A company bought us all these boards and cameras in exchange for us teaching them what we did with them. They are a biomed company and do a decent amount of biomed imaging.

We’re apprehensive about relying entirely on the Asus Xtion for ball and robot tracking because of the amount of IR light that gets put onto the field.

One task at a time. Field location and orientation is almost done. Next up is ball tracking :smiley:

We are using 3 or 4 cameras. Though the XU is powerful, I don’t want fps to drop below 20, and I think that would happen. So it’s easier for each camera to get its own board (especially when you already have the XUs on hand).

We had a voltage regulator for our X2 and Kinect, which helped. We didn’t notice a problem with just shutting off the robot as a means of turning the computer off.

Using so many boards is sort of a proof of concept. We could have an autonomous robot, but it’d be a one-man team, which isn’t how we are going to play the game. We are trying as many things as we can, while still within reason, so we learn more and can do more in the future.