What did you use for vision tracking?

This year my team didn’t have time to get into vision tracking. I’ve been trying to dive into it, but before I get started I was wondering what the best option is. I’ve heard a lot of speculation about what is good and what is bad, so I was wondering what people actually used at competition. I would love to hear feedback on what worked and what didn’t.

1261 used a Raspberry Pi with OpenCV. The vision code was written in Python and used pyNetworkTables to communicate with the roboRIO.

Thanks for the insight. How hard was it to write the code in OpenCV and use pyNetworkTables? Do you have any good resources or documentation that might help? Also, how bad was the lag?

We used a very similar setup, except written in C++. We actually had virtually no lag on the Raspberry Pi side; the only lag was the roboRIO processing the data from NetworkTables.

The setup was pretty simple to get up and running. We compiled OpenCV and NetworkTables 3 on the Pi, then wrote a simple C++ program to find the target and send the data needed to align with it back to the roboRIO. I actually followed a video tutorial here to install OpenCV on the Raspberry Pi. For NetworkTables, I downloaded the code off of GitHub, compiled it like a normal program, and added it to my library path, if I remember correctly.

2383 used a Jetson TX1, with a Kinect as an IR camera. The vision code was written in C++ using OpenCV and communicated with the roboRIO over NetworkTables.

During the offseason we will be exploring the Android-phone method that 254 used, for reliability reasons; the Jetson + Kinect combo was expensive and finicky compared to an Android phone with an integrated battery.

This year Shaker Robotics used the roboRIO with NIVision (Java) to track the targets. We analyzed frames only when we needed them, to avoid using too much of the RIO’s resources.
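That on-demand pattern (analyze a frame only when the robot code asks for a result) can be sketched roughly like this. This is an illustrative sketch in Python, not Shaker Robotics’ actual Java/NIVision code; all names here are made up:

```python
# Sketch of on-demand vision processing: frames are only analyzed when the
# robot code explicitly requests a result, so the processor isn't saturated
# running vision every loop. Names are hypothetical, for illustration only.

class OnDemandVision:
    def __init__(self, analyze):
        self.analyze = analyze        # function: frame -> result
        self.request_pending = False

    def request(self):
        """Robot code calls this (e.g. when auto-alignment starts)."""
        self.request_pending = True

    def process(self, frame):
        """Called on every camera frame; does real work only when asked."""
        if not self.request_pending:
            return None               # skip the expensive analysis
        self.request_pending = False
        return self.analyze(frame)
```

The point is simply that the expensive `analyze` call runs once per request instead of once per frame.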

1771 originally used the Axis M1011 camera and GRIP on the driver station. However, we had a problem on the field: the NetworkTables data was not being sent back, and we couldn’t figure out why.

We switched to using a PixyCam, and had much better results.

4901 used GRIP on an RPi v2 + a Pi Camera.

For more info on our implementation, see https://github.com/GarnetSquardon4901/rpi-vision-processing

We used Java and OpenCV on an NVIDIA Jetson TK1, processing images from a Microsoft Lifecam HD3000.

Like most engineering decisions, there isn’t a “good” and “bad”, but there is often a tradeoff.

We used GRIP on a laptop with an Axis camera. Why? Because our vision code was 100% student-built, and the student had never done computer vision before. GRIP on the laptop was the easiest to get going, and it only worked with an IP camera.

There are downsides to that approach. If you use OpenCV, you can write much more flexible code that does more sophisticated processing, but it’s harder to get going. By doing things the way we did, we had some latency and frame-rate issues: we couldn’t go beyond basic capture of the target, and we had to be cautious about how we drove when under camera control.

Coprocessors, such as a Raspberry Pi, TK1, or TX1 (I was sufficiently impressed with the NVIDIA products that I bought some of the company’s stock), will give you a lot more flexibility, but you have to learn to crawl before you can walk. Those products are harder to set up and have integration issues. It’s nothing dramatic, but when you have to learn the computer vision algorithms, the networking, and how to power a coprocessor, all at the same time, it gets difficult.

If you are trying to prepare for next year, or dare I say it, for a career that involves computer vision, I would recommend GRIP on the laptop as a starting point, because you can experiment with it and see what happens without even hooking up to the robot. After you have that down, port it to a Pi or an NVIDIA product. The Pi probably has the most documentation and example work, so it’s a good choice, not to mention that the whole setup, camera included, is less than 100 bucks.

Once you get that going, the sky’s the limit.

That is very true. I got GRIP working on a laptop, so I’m trying to figure out the next best place to go. I don’t know much about vision processing algorithms, Python, or OpenCV.

There is good documentation for the Raspberry Pi, which I have been working through when I can.

Thanks for the reply

The code wasn’t that hard for our developers, for either OpenCV or pyNetworkTables. They used the tutorials at PyImageSearch; it’s a great resource. I think we were getting around 20-30 fps, but I would have to review our driver station videos. We would process on the Pi and calculate a few things: the target’s (x, y), area, height, and the (x, y) of the center of the image. This information would be sent via pyNetworkTables to the robot code, which used it as input to the elevation and drivetrain PID loops.

The images were also sent via mjpg-streamer to the driver station for the drivers to see, but it’s not really needed. We just liked seeing the shooter camera.

We’d be happy to help, just PM me.


We used a modified version of TowerTracker for autonomous alignment and a flashlight for teleop alignment.

Team 987 used an onboard Jetson TK1 for our vision tracking, programmed in C++ using OpenCV. The Jetson sends target information to the roboRIO as TCP packets. From there a compressed, low-frame-rate stream was sent to the driver station for diagnostics. The low-frame-rate stream used very little bandwidth (around 500 KB/s), keeping us well under the maximum.
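Sending target data as raw TCP packets, as described above, just needs an agreed wire format on both ends. Here is a minimal sketch in Python using the standard library; the packet layout (angle, distance, valid flag) is entirely hypothetical, since 987’s actual format isn’t given in the thread:

```python
import socket
import struct

# Hypothetical wire format: little-endian float angle (deg),
# float distance, and an int valid flag. "<" disables padding,
# so both ends agree on exactly 12 bytes per packet.
PACKET_FMT = "<ffi"

def pack_target(angle, distance, valid):
    """Serialize one target report into a fixed-size byte string."""
    return struct.pack(PACKET_FMT, angle, distance, int(valid))

def unpack_target(data):
    """Inverse of pack_target; the roboRIO side would do the equivalent."""
    angle, distance, valid = struct.unpack(PACKET_FMT, data)
    return angle, distance, bool(valid)

def send_target(sock, angle, distance, valid):
    # sock is a TCP socket already connected to the roboRIO
    sock.sendall(pack_target(angle, distance, valid))
```

A fixed-size binary format like this keeps parsing trivial on the receiving side, at the cost of having to keep both programs’ struct definitions in sync.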

We ended up using the roboRIO with a USB camera, but we attempted to use OpenCV on a Jetson, a BeagleBone Black, and a Raspberry Pi 3. We scrapped the Jetson after realizing how hard we were landing after hitting a defense (but we had OpenCV working). Then we scrapped the BeagleBone after the Raspberry Pi 3 was released. We got the Pi to work after 16 hours of compiling OpenCV, but we ran out of time.

IP Camera. Custom Code on the Driver Station.

The vision program was written in C++. It took the picture off the SmartDashboard and processed it. Pretty ingenious code: he looked for “corners”, rating each pixel for the likelihood that it was a corner (top corner, bottom-left corner, bottom-right corner). The largest grouping of high-scoring pixels was declared a corner.

Our programmers want to clean things up, and then we’ll be open-sourcing our code. With pyNetworkTables, you have to use a static IP; otherwise it won’t work on the FMS.

FYI, for Python/OpenCV, installing OpenCV takes about 4 hours once you have the OS set up. We used Raspbian (Wheezy, I think). PyImageSearch has the steps for installing OpenCV and Python on Raspbian: about 2 hours to install packages, and then the final step is a 2-hour compile.

We used the Pi 2. The Pi 3 came out midway through the build season and we didn’t want to change. The Pi 3 might be faster; we’re going to test to see what the difference is.


Wow that long?

I am extremely new to OpenCV and Python. Do you have any good places to start?

The installation takes a long time because you need to build OpenCV on the Pi itself, which does take several hours. Which language are you looking to get started in?

We used an Axis IP camera and a Raspberry Pi running a modified version of Team 3019’s TowerTracker OpenCV Java program. I believe someone posted about it earlier in this thread.