A Step by Step Run-through of FRC Vision Processing


Hello, I was wondering about the practical use of the cv2.solvePnP() function, and didn’t manage to find a solution in your GitHub repository mentioned in the paper - what parameters does it expect? I tried to pass an [x, y] vector for image points, an [x, y, z] vector for object points and a 3x3 two dimensional list for the camera matrix, however it gave me the following error:

cv2.error: OpenCV(4.0.0) C:\projects\opencv-python\opencv\modules\calib3d\src\solvepnp.cpp:92: error: (-215:Assertion failed) ( (npoints >= 4) || (npoints == 3 && flags == SOLVEPNP_ITERATIVE && useExtrinsicGuess) ) && npoints == std::max(ipoints.checkVector(2, CV_32F), ipoints.checkVector(2, CV_64F)) in function ‘cv::solvePnP’

If it helps, this is the exact code that caused it:

two_d = cv2.UMat(np.array([x, y], dtype=np.uint8))
three_d = cv2.UMat(np.array([distance, constants.TARGET_SIZES['2019']['height'], 0], dtype=np.uint8))
camera_matrix = cv2.UMat(np.array([[constants.FOCAL_LENGTHS['realsense'], 0, frame_center_x],
                                   [0, constants.FOCAL_LENGTHS['realsense'], frame_center_y],
                                   [0, 0, 1]], dtype=np.uint8))
tvec, rvec = cv2.solvePnP(three_d, two_d, camera_matrix, None)


The coordinate arrays for solvePnP() actually need to be float values, not uint8. It looks like you have the correct idea, but you need to get the formatting just right. Also, solvePnP() requires at least 4 points in each list; the assertion in your error message allows 3 only with SOLVEPNP_ITERATIVE plus an extrinsic guess.

You don’t want to bother with UMat for this. These are not image arrays.


two_d = np.array(((x1, y1), (x2, y2), (x3, y3)))

and similar for three_d, where x and y are floats. Or you can wrap it in “np.float32(…)” if they are ints, and it will convert all of them for you.


Oh and a caution. RobotPy cscore is not compatible with OpenCV 4. You need to use OpenCV 3. 3.4.5 is fine.


A+ work on the LaTeX document, it really is the best way to build a document.



First, thanks for sharing your vision code. Section 9 on “Coordinate Systems and Geometry” in the vision paper was very useful. I’m a mentor who unexpectedly got drafted out of retirement to help our local team with camera/vision processing this season so I started at a bit of a deficit. Your source software has been very valuable to jump start my efforts.

The team is planning to use a Raspberry Pi 3 B+. That decision was made before I got onboard so I’m trying to make do with the FRC Vision RPi image as a platform. I have a few questions for the FRC Vision RPi team on the base software, but that’s a separate issue I’ll post elsewhere.

After fixing a few initialization and cut-and-paste errors in the original software posted on GitHub, I managed to get the 2877 VisionServer running on the RPi. I started with the “driver” routines that run the camera server and managed to get live video on a browser. I just cobbled together a spare roboRIO and radio and have to see if I can get the video passed to the Driver Station. My next step is to run the “rrtargetfinder” to see if it runs as-is and what it produces. The “rrtargetfinder” routines appear to run but I’m working to get to the output to see if it is finding targets.

This brings me to my line of questions associated with usage of Network Tables. Unfortunately, while somewhat familiar with Network Tables, I am not practiced at using them. I have just started to dabble with the Network Table example code using the roboRIO, but I am less clear about the usage in the VisionServer software. My understanding of the VisionServer architecture is that it acts as a client and uses Network Tables for vision processing configuration and to also post its target information to the roboRIO for navigation. I am unfamiliar with the usage of the “ntproperty()” method from “networktables.util” and I am having trouble finding documentation. There appears to be precious little VisionServer software interfacing with the Network Tables so it appears as “magic”. I would deeply appreciate any pointers to library references and an explanation of the Network Table flow in the VisionServer software.

Thanks in advance for any insights you can provide.



Documentation is here, but in summary: ntproperty() gives you an abstraction that works like a variable, but assigning to the “variable” actually updates the value of the NetworkTables entry at the path given as the first argument to ntproperty(). For example, the assignment at visionserver.py:323 updates the NetworkTables entry “/SmartDashboard/vision/target_info” based on the setup at visionserver.py:50.

You can sniff what’s in Network Tables with the OutlineViewer. If you have a roboRIO handy, which it sounds like you do, it’s probably easiest to let the roboRIO run the Network Tables server as it usually does, and connect VisionServer and OutlineViewer as clients to that server. (If you have not updated visionserver2019.py:113 to be the IP address or mDNS hostname of your roboRIO, you will need to do this as well.) This should allow you to determine whether the VisionServer is populating the target_info Network Table entry.



Thanks for your prompt response.

I surmised ntproperty() does as you described, but having the documentation reference should “calm my inquisitive mind.” :wink:

I figured I got to the point where I needed to lash the Pi to a roboRIO to get the full operation so I grabbed a spare from the shop and have all the gear to run the target processing stuff. I might have got ahead of myself asking the question but I figure I would shotgun my efforts.

I believe I have made all the necessary edits to IP addresses and the like to get things pointed to the right components. I am hoping my understanding of the base operation on the roboRIO is correct, at least in that the RIO automatically runs the Network Table Server without writing any software.

During my initial Network Table exploration I did invoke the OutlineViewer to see the example software output - and it worked as advertised. So I’m hopeful to see some output (good or bad) using the OutlineViewer as you described.

Once I get some valid output going to the correct places I will figure out what the team has to do to get the vision server output into the roboRIO and in turn process the heading info for target steering assist. If only it were as easy as that sounds…

If this all comes together I will have to go back and recap the vision processing math, logic, and architecture so I can explain it to someone. Thanks for the help, and I might be back with more questions if I hit other roadblocks.

Best Regards,


Out of additional curiosity: I know your team runs vision processing on an ODROID, so do you have to deal with uncontrolled shutdown (i.e., sudden power off) effects on the file system? If so, what approach do you take, since the software does write log files? For example, do you just turn off logging for run-time operation?

We are using a Raspberry Pi 3 B+ with the FRC Vision RPi image. It is set up to make the file system read-only unless explicitly unprotected, such as while doing development.


Wow, others are studying our code. Very cool.

@gbabecki hope @tpc’s answer is clear. He got it all right.

Note, if you are testing our code, we have a couple of modes/flags which help with testing. You can start the visionserver code with the “--test” flag, and then it starts its own NetworkTables server. That way you can run everything without a Rio. Works nicely on a test bench.

If you want to test the target-finding code (like rrtargetfinder), it has a main() routine which lets it process pictures stored on disk; that makes for easy testing of algorithm tweaks. That mode is also why you might find some odd divisions in the code: the “target finders” know nothing about robot code, in particular no NetworkTables, so they can run completely standalone.

Happy to answer other questions.


I have not done anything about that, although I know about the issue (from my work). So far, no problems. Linux is really quite resilient, and I don’t care if the logs are not perfect. I know that the RPi FRC image runs with a read-only FS, which is smart, but also kind of a pain. Note that the ODROID uses an eMMC module, so it is much faster than an SD card and writes are committed pretty quickly.


I’m not with 2877; I just have this thread bookmarked for easy reference so that I can steal ideas for our own (623) vision processing. I saw your question and could answer it, so I just jumped in.

Last year we used a BeagleBone Black (same idea as the Pi or ODROID). I wrote about how we handled shutting it down cleanly here:

Thanks for sharing!


There are also a few other flags and NetworkTables properties that can help with different aspects of testing: for example, tuning the NT values to match the properties of the target finders, recording images, etc. More information is provided in the README on GitHub. Just make sure to use the correct path to each NT property when fetching and setting values. We have had some trouble in the past with other NT properties that have the same names as ones we had created. If this occurs, don’t mistake them for the real properties; they are located at a different path and are not updated to match the true target_info.


@prensing, @tpc, @Degman:

Thanks to all for the additional thoughts.

To the 2877 members, I started (i.e., got sucked into) this effort on short notice, so I scoured the FRC sources for any effort that had already broken ground on vision processing. I was extremely lucky to find your team’s software and knew even after a cursory read that it was my best shot at getting something going. I did note the “--test” flag as well as the feature to process image files. Definitely lots of options baked into the software. Although it can be challenging following someone else’s software, especially sorting variables from object references, the software is well organized and extensible.

Being retired I’ve been circling back to re-educate myself on all things engineering, old and new. In particular I’ve been wanting to dig into OpenCV, however this effort has been a bit of trial by fire. I will go back through the fundamentals in more detail when I have more time.

As for the shutdown issues, I can see where the faster memory can help side step file system crashes. The power or shutdown button on the BeagleBone is a great feature, and I’ve seen some references to such an implementation for a Pi as well. I may look into that solution for the Pi and hope the operators follow the shutdown procedures. Of course that all assumes I wrap this up soon.

I’m about to fire up the gear on my desk now and see what shows up in the OutlineViewer. Thanks and best wishes to your teams.


Managed to see the vision server output via the OutlineViewer! It was in “driver” mode so the basic result info (frame time, finder id, etc.) was in the table. I could see the camera stream via a web browser or pull it up on the SmartDashboard. I forgot to see if I could get it on the Driver Station window as well.

I haven’t tried to set up the SmartDashboard to send the vision configuration info to the Pi. That’s another thing I haven’t mastered in this saga so I’ll have to read that FRC documentation next.

Anyway, tomorrow I’ll switch over to the target finder and see if it generates target info. I have to remember to check the settings for camera height and angle as well as the target height to have the best chance of getting meaningful output.


@prensing, @Degman:

I might be getting ahead of my efforts employing your software, but I have two additional questions.

I am about to dive into the target processing section and the following thought came to mind. The distance-to-target calculation uses trig on the angles created by the height difference between the camera and target. Are there any issues in this calculation given the relatively low height of the targets (~ 3 ft) in the field configuration? These target heights would suggest either placing the camera low (on the robot) and looking up or placing the camera high to look down at the target to create the requisite viewing angles needed for the calculation. I suspect you already thought this through and was just wondering about the geometry.

Additionally, while I was able to view the network tables via the OutlineViewer, I am having trouble setting up the SmartDashboard to control the vision processing parameters as well as view the target output information. Configuration of the SmartDashboard is supposed to be “intuitively obvious” from what I read in the documentation, however I must be missing something because I can’t figure out how to see the table data or configure operator input fields. It sounded like widgets would be automatically added for each table key flying by, but I don’t see things like “target_info” appear. Do I have to write some software on the roboRIO to post “commands” to show up on the SmartDashboard so I can input the vision control parameters, or something to that effect?

I had another question regarding the two angles produced as part of the target location processing. I understand the first angle (angle1 to target – angle displacement of robot to target) which can be used to steer a heading to the target. However, I’m curious about the usage of the second angle (angle2 of target – angle displacement of target to robot) which was not really discussed in the paper.

Thanks for any thoughts on these questions.



I noticed there were recent updates to the files on GitHub. I see it corrected the initialization issues and various typos I found and fixed to get my copy running. BTW, not that it’s a show-stopper, but the rrtarget_finder id is set to 3.0 which is also the value of the hatch_finder id. I made the rrtarget_finder id 2.0 in my copy.


@gbabecki I think you are looking at last year’s cube finding code. There, we did not feel we could find all the corners and match the geometry (cube was not fully a cube), so we only used the center point.

This year, with the full vision targets, you get 4-8 corners of the target and they are easily known coordinates in real life, so we are using solvePnP(). With solvePnP, you don’t need to know the height of the camera, but you do need an accurate camera calibration matrix. Our testing so far says it is quite accurate.


NetworkTables “should just work”. OutlineViewer is the more obvious interface and is much clearer, but is not helpful during competition. We are not using SmartDashboard; instead we are using Shuffleboard, but yes, they can be confusing. You should not need to issue any commands to get them visible, but we frequently run into confusion over the exact path of an item in NT. We definitely get confused by picking up entries which look correct but are in the wrong path.

One suspicion I have (but have not confirmed) is that the Rio saves the NT setup every time, so any kind of incorrect junk gets preserved and slowly builds into a mess. I have been meaning to see if we need to delete the save file on our Rio.


Angle2 is shown in Fig. 6 in the paper. It is the angle of the target w.r.t. the line between the robot and target. If you turn the robot by angle1, drive forward, then angle2 would be how much you need to turn to make the robot perpendicular to the wall.

Or another way of looking at it, if angle1 = - angle2, then the robot is perpendicular to the wall, but may need to drive forward and strafe sideways to line up with the target.



My bad; I think it was a “glancing glance” of the compute_output_values() routine and I mistakenly took the trig as part of the height differential calculation for distance; trying to do too much. I do see all the target corner processing to verify a good target image and subsequent calculations. Thanks for the clarification on the two derived angles as well.

After I picked up the latest software versions I used Shuffleboard to get at the controllable parameters. Oddly enough I couldn’t see the target_info there, so I resorted to the OutlineViewer for that output. Assuming I get the target acquisition part working, I’ll have to figure out a clean way of making the mode switching available to the drivers.

Speaking of target acquisition, upon being able to switch into target mode I couldn’t get it to find a target. The log was showing that it couldn’t find a second contour, bad deltax ratios, failed ratio tests, and just generally not happy.

I believe my target construction is correct but I will recheck that first. I believe the target illumination is good, but I don’t believe there is any processed-image overlay on the output video stream to verify that. I guess I could use GRIP just to verify the settings are rendering good feature extraction. Clearly the software is working for you, so any thoughts on where I may look next would be appreciated.



For camera switching, we use a button on an extra controller, and then have the robot Java code set the mode. In the past we have not used buttons, etc. in Shuffleboard/SmartDashboard for inputs, although it is certainly doable.

For targeting, you may have to tweak the settings to match your setup. In particular, the camera exposure and HSV settings are definitely specific to a camera and lighting setup. You do need to have the target illuminated with green LED rings, at least if using our setup.


Yeah, I was leaning toward a Driver Station button with roboRIO code to send the mode info via the table. The dashboards seem OK for development and testing but a little too unwieldy for operation.

Right after I posted my last response I thought “oh crap, I assumed the target illumination color was green” without checking the HSV values. Good to know at least that was not a problem.

At the risk of making another assumption, I gather there is no “live mode” to adjust the values and see the target rendering in the video stream. I know there is a “tuning” option and the image file operation, but was thinking of something real-time to see the adjustments. That’s why I’m thinking of using GRIP to emulate the front end and watch the adjustment impact. Let me know if there are any other points to probe with logging to check out the operation.

I am using a Logitech 930e as per various recommendations. It does appear to provide a nice wide field of view, especially with the 424 x 240 settings you use.

I have an awkward situation in that for some reason my laptop doesn’t respond to USB cameras. I don’t think my integrated camera even shows up as a selection. I must have some setting turned off to disable all cameras or something. Worst case, I can plug the camera into my desktop and use GRIP there.

Fun, fun…



NetworkTables will only exhibit this sort of behavior in these scenarios:

  • If you set a path to be persistent, then it will save it to a save file on your NT server (typically the rio). The default is to not have persistent keys.
  • When clients (shuffleboard, vision code, et al.) connect to the server, they will give the server all of their current keys/values, so to truly clear the NT state to empty you need to kill all clients and all servers.