Vision Tracking?

Hey CD, I was just wondering… how in the world does vision tracking work? My team attempted primitive vision tracking in the 2016 season (our second year), but with no success. I am not asking for your code, which everyone seems to cling to like the One Ring (however, I won’t turn it down), but the theory and components that make it tick. What are the best cameras? Do you write code to recognize a specific pattern of pixels (which would blow my mind), or to pick up a specific voltage value that the camera uses as a quantification of the net light picked up by the camera’s receiver? Our team did well in 2016 with a solid shooter; I can only imagine how it would have done with some assisted aiming. Thank you all, and good luck January 7th!

disclaimer: I just design and oversee final assembly; I am in no way a programmer. However, our programmers will be taking a look at this.

I’ll do a quick outline for you; I’m working on something more in-depth about it, though.

  1. Acquire Image (most any camera will work)
  2. Filter Image to just the target color (HSV filter)
  3. Identify contours (findContours in OpenCV)
  4. Eliminate extra contours (filter by aspect ratio or just size)
  5. Find Centroid of contour and compute range and angle to target
  6. Align robot

Thankfully, documentation on this stuff is better than ever! The ScreenSteps should get you started well enough, though.

(ps: Googling “FRC vision tracking” or “FRC vision processing” returns this as the first link. Being able to Google things well is an imperative skill for any profession relating even tangentially to computers, and is something that is worth developing. It saves you the time that you spend waiting for my response, and me the time that it takes to write this response.)

So you’re developing your own computer vision system? A bold undertaking. For a short answer, the hardware really is not as critical an element as the software is. As far as vision tracking goes, what specifically are you trying to track, if anything? Depending on the intended application, object tracking can range from something as simple as edge detection (think line-following robots) to more complex recognition of colors and patterns (e.g., face detection or tracking a specific object by color).

There are a few prominent vision sensor projects out there, like PixyCam and OpenMV (in fact, one of them is being discussed in another thread), that y’all could look into to see how they pull it off.

Caveat: I had nothing to do with writing any of this; it was a pair of our student members in 2013, and the code has been online ever since. We used an IP camera communicating over the on-robot network from a Raspberry Pi to the 'RIO, which then sent minimal targeting information over the network to the driver station. The code is all on our GitHub (and has been for years): it has the Raspberry Pi side of the code (written by Matt Condon), and several of our robot codes; the earliest (and probably cleanest), written by our founder and my son Gixxy, is designed to connect with the data from this Raspberry Pi.

Possibly our newer code also works, but our robot performance has not convinced me of this; we were rock solid in 2013 and quite good in 2014, but I did not see evidence of good end-to-end targeting code in 2015 (when we didn’t really try, because there were no targets in appropriate locations) or 2016.

Quite a few teams have published their vision code. This is the most complete list of FRC code I know of:

Here is a fairly simple system written in Python using OpenCV and NetworkTables that carried us to a successful 2016 season.

The physical premise of vision in most FRC games is detecting light that you send out with an LED ring, which bounces off retroreflective tape back to your camera. Retroreflective tape is a material with the property that incoming light bounces straight back toward its source, instead of reflecting off at an angle (like you’d expect from a mirror). That means no matter where you are, if you shine light at it, you get light back.

Anything with sufficient resolution and adjustable exposure is fine. Adjustable exposure matters because you need to set it low enough that the camera sensor isn’t flooded with light from the retroreflective tape.

Those are the same thing :slight_smile:
In the camera sensor, incoming light generates a signal (voltage). The array of signals is turned into an array of RGB colors (that is, an image). The premise of computer vision is detecting and tracking patterns in an image.

In the case of FRC, the retroreflective tape in the image will be much brighter and a different color (yes, you need to carefully choose your LED ring color, which depends on the game: red and blue were bad choices given the tower LED strips in Stronghold; green is a good choice), so it’s possible to detect with HSV filtering. HSV is a color space based on hue (color), saturation (grayscale to full color), and value (all black to full brightness). Using three specific filtering ranges of hue, saturation, and value, you can pick out the pixels of interest, which should be the ones from the retroreflective tape. Then you need to filter out the noise, since camera images aren’t perfect. This is usually accomplished by picking the largest continuous blob of filtered pixels.
Now you have a shape that corresponds to the target. You can use it to accomplish what you need to, e.g., a closed-loop turn until the center of the shape lines up with the center of the image frame (lining up the robot to the target).
In Stronghold we took it one step further and calculated both distance and yaw angle (using a lot of statistics and NI Vision), so the robot could quickly line up using the onboard gyroscope (a much more efficient closed-loop turn, because the gyro has faster feedback) and adjust the shooter for the distance needed to shoot. This preseason we’re taking it another step further by working on calculating the full 3D position and rotation relative to the target using OpenCV (where it’s actually a whole lot easier than in NI Vision). Hopefully the vision component in Steamworks won’t be as useless as in 2015 :rolleyes:

There are many ways to make an omelette, so I’ll give my version of this.

  1. Acquire Image
  2. Process Image to emphasize what you care about over what you don’t
  3. Make measurements to grade the potential targets
  4. Pick a target that you are going to track
  5. Make 2D measurements that will help you determine 3D location
  6. Adjust your robot
  1. Acquire Image
    This is actually where lots of teams have trouble. The images can be too bright, too dark, blurry, blocked by external or internal mechanisms, etc. It is always good to log some images to test your code against and to use the images provided at kickoff. Being able to view and calibrate your acquisition to adjust to field conditions is very important. The white paper about FRC Vision contains a number of helpful techniques regarding image acquisition.

  2. Process Image
    This is commonly a threshold filter, but can be an edge detector or any processing that declutters the image and makes it more efficient to process and easier to test. HSV, HSI, or HSL are pretty accurate ways to do this, but it can be done in RGB or even just using intensity on a black and white image. You can also use strobes, IR lighting, polarized lighting, and other tricks to do some of the processing in the analog world instead of digital.

  3. Make Measurements
    For NI Vision, this is generally a Particle Report. It can give size, location, perimeter, roundness, angles, aspect ratios, etc., for each particle. Pick the measures that can help qualify or disqualify things in the image: for example, that particle is too small, that one is too large, that one is just right. But rather than a Boolean output, I find it useful to give a score (0 to 100%, for example), and then combine those strategically in the next step. This is where the folder of sample images pays off: you get to tweak and converge so that, given the expected images, it has a reasonably predictable success rate.
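The scoring idea can be sketched in plain Python. The ideal values and tolerances below are invented placeholders; you’d converge on real ones using your folder of sample images:

```python
def score_linear(value, ideal, tolerance):
    """Map a measurement to a 0-100 score: 100 at the ideal value,
    falling off linearly to 0 at +/- tolerance."""
    return max(0.0, 100.0 * (1.0 - abs(value - ideal) / tolerance))

def grade_particle(area, aspect_ratio):
    """Combine per-measurement scores into one grade for a particle."""
    # Placeholder ideals -- tune against your sample images.
    area_score = score_linear(area, ideal=1500.0, tolerance=1500.0)
    aspect_score = score_linear(aspect_ratio, ideal=2.5, tolerance=2.0)
    # Combine strategically; a simple average is shown here.
    return (area_score + aspect_score) / 2.0
```

A weighted average (or a product, which punishes any single bad measure) is a common refinement once you see how the scores behave on real images.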

  4. Pick a Target
    Rank and select the element that your code considers the best candidate.

  5. Determine 3D Location
    The location of an edge, or the center of the particle can sometimes be enough to correlate to distance and location. Area is another decent approximation of distance. And of course if you want to, you can identify corners and use the distortion of the known shape to solve for location.
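The “known size to distance” correlation falls out of the pinhole camera model: a target of known real width that appears N pixels wide is at range real_width * focal_length / N. This sketch assumes you’ve looked up your camera’s horizontal field of view (the numbers are placeholders):

```python
import math

def focal_length_px(image_width, horizontal_fov_deg):
    """Approximate focal length in pixels from the horizontal FOV."""
    return (image_width / 2.0) / math.tan(math.radians(horizontal_fov_deg / 2.0))

def distance_from_width(pixel_width, real_width, image_width, horizontal_fov_deg):
    """Estimate range to a target of known real width from its pixel width."""
    return real_width * focal_length_px(image_width, horizontal_fov_deg) / pixel_width

def yaw_to_target(pixel_x, image_width, horizontal_fov_deg):
    """Horizontal angle (degrees) from the camera axis to an image point."""
    offset = pixel_x - image_width / 2.0
    return math.degrees(math.atan2(offset, focal_length_px(image_width,
                                                           horizontal_fov_deg)))
```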

  6. Adjust your Robot
    Use the 3D info to adjust your robot’s orientation, location, flywheel, or whatever makes sense to act on a target at that location in 3D space relative to your robot. Often this simplifies to – turn so the target is in the center of the camera image, drive forward to a known distance, or adjust the shooter to the estimated distance.
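The “turn so the target is in the center of the image” case often reduces to a proportional loop on the yaw error. A minimal sketch (the gain, clamp, and deadband are made-up placeholders you’d tune on your own drivetrain):

```python
def turn_command(yaw_error_deg, kP=0.02, max_output=0.5, deadband_deg=1.0):
    """Proportional steering output from the camera's yaw error.
    Returns a motor command in [-max_output, max_output]."""
    if abs(yaw_error_deg) < deadband_deg:
        return 0.0  # close enough; stop turning
    output = kP * yaw_error_deg
    return max(-max_output, min(max_output, output))
```

As noted below, closing this loop on the camera alone is slow; a common pattern is to take one vision measurement, then execute the turn against a gyro, which has much faster feedback.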

From my experience, #1 is hard due to changing lighting and environment, inefficient or unsuccessful calibration procedures, and lack of data to adjust the camera well. #2 through #5 have lots of example code and tools to help process reasonably good images. #6 can be hard as well and really depends on the robot construction and sensors. Closing the loop with only a camera is tricky because cameras are slow and often noisy due to mounting and measurement conditions.

So yes, vision is not an easy problem, but if you control a few key factors, it can be solved pretty well by the typical FRC team, and there are many workable solutions. That makes it a pretty good challenge for FRC, IMO.

Greg McKaskle

We used a program called TowerTracker last year to find the goal, and we modified it slightly to fit our needs. The program ran on our driver station: it received a video stream from the robot (MJPG-streamer) and sent data (angle, etc.) back to the robot over NetworkTables.

Github link

Thank you all for the helpful info! I can’t wait to show it to my teammates; our golf ball shooter for this year’s game will never miss a shot :stuck_out_tongue:

I’ll add a plug for my students’ work as well:

Hopefully this paper is a good overview of how things work without making you read the code. But you can read the code as well - links in the paper.

I highly recommend watching this video from Team 254 and reading through the presentation directly.

Thanks to Jared Russell and Tom Bottiglieri for sharing their experience!