Optimal board for vision processing

Hello, I’m a new (first year) member on a rookie team (this will be the 3rd year for the team). As an enthusiast developer, I will be part of the programming sub-team. We program the RoboRio in C++ and the SmartDashboard in Java or C# (using IKVM to port the Java binaries to .NET).

In this period, before the competition starts, I am learning as much material as I can. A friend of mine and I were thinking about developing a vision processing system for the robot, and we pretty much figured that using the RoboRio (or the cRio we have from last year) isn’t any good because, well, it’s just too weak for the job. We thought about sending the video live to the driver station (Classmate/another laptop), where it would be processed and the results sent back to the RoboRio. The problem is the 7Mbit/s networking bandwidth limit and, of course, the latency.
So, we thought about employing an additional board, which would connect to the RoboRio and do the image processing there. We thought about using an Arduino or a Raspberry Pi, but we are not sure they are strong enough for the task either.

So, to sum up: What is the best board to use in FRC vision systems?

Also, if we connect, for example, a Raspberry Pi to the robot’s router and the router to the IP camera, the 7Mbit/s bandwidth limit does not apply, right? (because the camera and the Pi are connected via LAN)
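
For a rough sense of why the 7Mbit/s cap matters for streaming to the driver station, here is a back-of-the-envelope sketch in Python. Only the 7Mbit/s figure comes from this post; the 640x480 resolution, 30 fps, 3 bytes per pixel and the 10:1 compression ratio are assumptions picked for illustration, not measured values.

```python
# Rough bandwidth estimate for streaming video over the field link.
# Everything except the 7 Mbit/s cap is an illustrative assumption.

FIELD_LIMIT_BITS_PER_S = 7_000_000  # the 7 Mbit/s cap mentioned above


def stream_bits_per_second(width, height, fps, bytes_per_pixel=3,
                           compression_ratio=1.0):
    """Bit rate of a video stream after optional compression."""
    raw_bytes_per_frame = width * height * bytes_per_pixel
    return raw_bytes_per_frame * 8 * fps / compression_ratio


def max_fps_within_limit(width, height, bytes_per_pixel=3,
                         compression_ratio=1.0,
                         limit=FIELD_LIMIT_BITS_PER_S):
    """Highest frame rate whose bit rate still fits under the limit."""
    bits_per_frame = width * height * bytes_per_pixel * 8 / compression_ratio
    return limit / bits_per_frame


if __name__ == "__main__":
    raw = stream_bits_per_second(640, 480, 30)
    compressed = stream_bits_per_second(640, 480, 30, compression_ratio=10)
    print(f"raw 640x480 @ 30 fps:   {raw / 1e6:.1f} Mbit/s")
    print(f"with assumed 10:1 JPEG: {compressed / 1e6:.1f} Mbit/s")
    print(f"max fps under the cap:  "
          f"{max_fps_within_limit(640, 480, compression_ratio=10):.1f}")
```

Under these assumptions even a compressed stream would have to drop its frame rate or resolution to fit, which is part of why keeping the camera-to-board traffic on the robot looks attractive.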

P.S. I am aware that this question has been asked in this forum already, but it was a year ago. So today there may be better/other options.

The most powerful board in terms of raw power is the Jetson TK1. It uses an Nvidia GPU, which is orders of magnitude more powerful than a CPU for virtually any vision processing task. And if you just want to use its CPU, it still has a quad-core 2.32GHz ARM, which to my knowledge is more than most, if not all, other SBCs on the market. It is, however, $192 and much larger than an R-Pi.


PS Here are some CD threads with more info:


So far the vision processing MORT11 has done has been done with a stripped down dual core AMD mini-laptop (bigger than a netbook) on the robot that is worth less than $200 on the common market. It has the display and keyboard removed. It has proven to be legal in the past but we have rarely relied on vision processing so it often is removed from the robot mid-season. It was also driven 200+ times over the bumps in the field with an SSD inside it and it still works fine. For cameras we used USB cameras like the PS3-Eye which has a Windows professional vision library and can handle 60 frames a second in Linux (though you hardly need that).

That laptop is heavier than the single board computers in part because of the battery. However I would suggest that battery is worth the weight. As the laptop is COTS the extra battery is legal. This means the laptop can be running while the robot is totally off.

The tricky part is not finding a single board or embedded system that can do vision processing. The tricky part is powering it reliably and the battery fixes that issue while providing enormous computing power in comparison.

Very likely none of the embedded and single board systems that will inevitably be listed in this topic will be able to compete on cost/performance with a general purpose laptop. The market forces in the general computing industry drive differently.

The cRIO gets around this issue because the cRIO gets boosted 19V from the PDU and then bucks it to the internal low voltage it needs. As the battery sags under the motor loads, dropping the 19V is no big deal if you need 3.3V. As switching regulators are generally closed loop they adapt to these changing conditions.

So just be careful. The 5V regulated outputs on the robot PDU may not operate the way you want, or may not provide the wattage you need, and then you need to think about how you intend to power this accessory.

People have worked around this in various ways: largish capacitors, COTS power supplies, just using the PDU. I figure that since electronics engineering is not really a requirement for FIRST that using a COTS computing device with a reliable and production power system is asking less.

Keep in mind that I see no reason an Apple/Android device like a tablet or cell phone would not be legal in past competitions on the robot as long as the various radio parts are properly turned off. It is possible someone could create a vision processing system in an old phone using the phone’s camera and use the phone’s: audio jack (think Square credit card reader), display (put a photo-transistor against the display and toggle the pixels) or charging/docking port (USB/debugging and with Apple be warned they have a licensed chip you might need to work around) to connect it to the rest of the system. I’ve been playing around with ways to do this since I helped create a counter-proposal against the NI RoboRio and it can and does work. In fact I can run the whole robot off an Android device itself (no cRIO or RoboRio).

Thanks for the suggestion! But it’s kind of pricey compared to the Pi. Is it worth it?
Also, is developing for CUDA any different from ‘normal’ developing?

Thanks for the detailed info! But in this case, I guess I can just use a stripped-down Classmate (we have 2 of those), or just any other mini laptop (I guess that the Atom processor is more than powerful enough in terms of computing power). Also, what platform did you use to develop the image processing code?

I would say the most bang for your buck is the BeagleBone Black. 987 used it way back in 2012 with the Kinect sensor. Very powerful, and if I remember correctly, it gets about 20 fps. Maybe somebody can give a more accurate number, but it is plenty powerful. It is the same type of computer (an RPi-style microcomputer) and has Ethernet for UDP communications.

Odroid and pcDuino are both good options too.

RPis are okay. I hear most teams get anywhere from 2 fps to 10 fps (again, all depending on what you are doing). I would say for simple target tracking, you would get about 5 fps.

I want to also start doing some vision tracking this year on another board. I would end up using the regular dashboard (or maybe a slightly modified one) with LabVIEW. I would be using a BeagleBone or maybe an RPi just to start off. I don’t know how to use Linux, which is my biggest problem. Does anyone have any information on how to auto-start and use vision tracking on Linux? I need something simple to follow.

We started off testing this idea when the COTS rules would allow a computing device several years ago (more than 3 years ago).

Our first tests were conducted on Dell Mini 9’s running Ubuntu Linux LTS version 8 which I had loaded on mine while doing development work on another unrelated project. The Dell Mini 9 is a single core Atom processor.

Using Video4Linux and OpenJDK (Java) the programming captain crafted his own recognition code. I believe that helped get him into college. It was very interesting.

We then tried a dual core Atom classmate and it worked better when his code was designed to use that extra resource.

Between years I slammed together a vision system using 2 cameras on a Lego MindStorm PTZ and used OpenCV with Python. With that you could locate yourself on the field using geometry not parallax.

Other students have since worked on other Java based and Python based solutions using custom and OpenCV code.

I have stripped parts out of OpenCV and loaded them into ARM processors to create a camera with vision processing within it. It was mentioned in the proposal I helped to submit to FIRST. I think using an old phone is probably more cost effective (they make lots of a single model of phone and when they are old they plummet in price).

OpenCV wraps Video4Linux so the real upside of OpenCV from the ‘use a USB camera perspective’ is that it will remove things like detecting the camera being attached and setting the modes. Still Video4Linux is pretty well documented and the only grey area you will find is if you pick a random camera. Every company that tries to USB interface a CMOS or CCD camera does their own little thing with the configuration values. So I suggest finding a camera you can understand (Logitech or PS3-Eye) and not worrying about the other choices. A random cheapo camera off Amazon or eBay might be a huge pain when you can buy a used PS3-Eye at GameStop.

Thanks for the info - actually, I have a friend who has an RPi (model B) lying around, and I guess he will let me test with it. If it won’t do, I’ll check out the BeagleBone.
Also, can someone answer my question about the bandwidth limit?

And I might be able to assist you with linux:
If I remember correctly, try to open the Terminal and run

sudo crontab -e

Then you will be able to edit the cron file, which is basically a file that automates tasks on Linux systems. Add the following line to it:

@reboot <your command>

The command you typed should be executed on every startup.

I have been CSA at several competitions over the years.
If you can avoid sending live video you depend on over the WiFi please do (I speak for myself not FIRST or 11/193 when I write this).

I can assure you what you think you have for bandwidth you probably do not have.
I can back that up with various experiences and evidence I have collected over the years.

If you must send something to the driver’s station send pictures one at a time over UDP if you can.
If you miss one - do not send it again.
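
A minimal sketch of that "send once, never resend" idea in Python, using only the standard library. The 8-byte header layout, the 1400-byte payload size and the names split_frame, send_frame and FrameAssembler are all made up for illustration; this is not an existing FRC library, and frame id wraparound is ignored.

```python
import socket
import struct

# Hypothetical header: frame id (uint32), chunk index and chunk count (uint16).
HEADER = struct.Struct("!IHH")
MAX_PAYLOAD = 1400  # assumed to fit in one Ethernet-sized datagram


def split_frame(frame_id, data, max_payload=MAX_PAYLOAD):
    """Split one encoded image into self-describing datagrams."""
    chunks = [data[i:i + max_payload]
              for i in range(0, len(data), max_payload)] or [b""]
    return [HEADER.pack(frame_id, idx, len(chunks)) + chunk
            for idx, chunk in enumerate(chunks)]


def send_frame(sock, address, frame_id, data):
    """Fire and forget: every datagram is sent once and never retried."""
    for datagram in split_frame(frame_id, data):
        sock.sendto(datagram, address)


class FrameAssembler:
    """Rebuilds frames on the receiving side and abandons incomplete ones."""

    def __init__(self):
        self.current_id = -1
        self.parts = {}

    def feed(self, datagram):
        """Return a complete frame's bytes, or None if nothing is ready."""
        frame_id, idx, count = HEADER.unpack_from(datagram)
        if frame_id < self.current_id:
            return None  # late chunk of an old frame: ignore it
        if frame_id > self.current_id:
            self.current_id = frame_id  # newer frame: give up on the old one
            self.parts = {}
        self.parts[idx] = datagram[HEADER.size:]
        if len(self.parts) == count:
            data = b"".join(self.parts[i] for i in range(count))
            self.parts = {}
            return data
        return None
```

On the robot side you would create the socket with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) and call send_frame once per captured image; the driver station feeds every received datagram to a FrameAssembler and simply displays whatever complete frame comes out last.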

I have no interest in hijacking this topic with any dispute over this (so if someone disagrees feel free to take this up with me in private).

It is worth it IF you want to process video and track an object continuously. As for power, the new voltage regulators’ 12 V 2A port will be more than enough. The jetson needs 12V and people have tested this thing running heavy duty vision applications on the GPU/CPU without cracking 1 amp.
It is soooooooooooooo easy to use the GPU. After the setup (installing libraries, updating, etc.) we were tracking red 2014 game pieces on the GPU within the next 30 min. We used the code from here: http://pleasingsoftware.blogspot.com/2014/06/identifying-balloons-using-computer.html Read through this and the related GitHub repository linked in the article.

OpenCV has GPU libraries that basically work automatically with the Jetson.
You can also see in the GitHub repository of the above example the different compile command for activating GPU usage.

If you ever get to use that code on the Jetson, note: the program in the above link opens up a display window for each step of the process, and closing the displays speeds up the program from 4fps with all open to 16fps with only the final output open. I presume with the final output closed and no GUI open (a.k.a. how it would be on a robot) it would be much faster. Also, we used this camera and were set to 1080p for the test: http://www.logitech.com/en-us/product/hd-pro-webcam-c920

Yeah, people have told me basically what you said here, but what I asked is whether local LAN traffic (the camera is connected by Ethernet cable to the router on the robot, which also connects by Ethernet to the RPi/NVIDIA thing/any other board) counts against the network bandwidth. (It does not seem likely, but I want to be sure about this.)

Cool, thanks! I’ll look into this. If I’m doing processing in a resolution of, let’s say, 1024*768 (more seems too much), how many FPS will I get? (approx.)

Depends on the mode of the D-Link in previous years. In bridge mode anything on the wired Ethernet will likely go on the WiFi. In routed mode the D-Link is a gateway, and therefore things not directed to it or to the broadcast address should go through the D-Link’s internal switch but not necessarily over the WiFi.

This question greatly depends on how you achieve this.
If you do it in compiled code you can achieve 5fps or more easily (with reduced color depth).
If your CPU is/are slow or your code is bad then things might not work out so well.

Anything you send over TCP/IP, TCP/IP will try to deliver and once it starts that it is hard to stop it (hence reliable transport). With UDP you control the protocol so you can choose to give up. This means with UDP you need to do more work. Really - someone should do this and just release a library then it can be tuned for FIRST specific requirements. I would rather see a good cooperative solution that people can leverage and discuss/document than a lot of people rediscovering how to do this in a vacuum over and over.

I will put an honorable mention here for VideoLAN(VLC) as far as unique and interesting ways to send video over a network.
Anyone interested might want to look it over.

When we tried going down to 480p, our frame rate did not improve per se; however, the capture time went down, which is very important for tracking. That said, our test wasn’t extensive, so there are other factors at play. It may or may not improve overall performance.

The optimal 'board' for vision processing in 2015 is very, very likely to be (a) the RoboRio or (b) your driver station laptop. No new hardware costs, no worry about powering an extra board, no extra cabling or new points of failure. FIRST and WPI will provide working example code as a starting point, and libraries to facilitate communication between the driver station laptop and the cRIO exist and are fairly straightforward to use.

In all seriousness, in the retroreflective tape-and-LED-ring era, FIRST has never given us a vision task that couldn’t be solved using either the cRIO or your driver station laptop for processing. Changing that now would result in <1% of teams actually succeeding at the vision challenge (which was about par for the course prior to the current “Vision Renaissance” era).

I am still partial to the driver station method. With sensible compression and brightness/contrast/exposure time, you can easily stream 30fps worth of data to your laptop over the field’s wifi system, process it in a few tens of milliseconds, and send back the relevant bits to your robot. Round trip latency with processing will be on the order of 30-100ms, which is more than sufficient for most tasks that track a stationary vision target (particularly if you utilize tricks like sending gyro data along with your image so you can estimate your robot’s pose at the precise moment the image was captured). Moreover, you can display exactly what your algorithm is seeing as it runs, easily build in logging for playback/testing, and even have “on-the-fly” tuning between or during matches. For example, on 341 in 2013 we found we frequently needed to adjust where in the image we should try to aim, so by clicking on our live feed where the discs were actually going we recalibrated our auto-aim control loop on the fly.
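
To make the gyro trick a bit more concrete, here is a minimal Python sketch of one way it could work: keep a short history of timestamped gyro headings, look up the heading at the moment the image was captured, and turn relative to that. The class and function names are hypothetical, the heading is assumed to be a continuous accumulated angle in degrees (no wrap at 360), and timestamps are assumed to come from one clock and to be added in order. This is not 341's actual code.

```python
import bisect


class HeadingHistory:
    """Keeps recent (timestamp, heading) gyro samples in time order."""

    def __init__(self, max_samples=200):
        self.times = []
        self.headings = []
        self.max_samples = max_samples

    def add(self, timestamp, heading_deg):
        self.times.append(timestamp)
        self.headings.append(heading_deg)
        if len(self.times) > self.max_samples:
            self.times.pop(0)
            self.headings.pop(0)

    def heading_at(self, timestamp):
        """Linearly interpolate the heading at the given time."""
        if not self.times:
            raise ValueError("no gyro samples recorded")
        i = bisect.bisect_left(self.times, timestamp)
        if i == 0:
            return self.headings[0]   # older than our history: clamp
        if i == len(self.times):
            return self.headings[-1]  # newer than our history: clamp
        t0, t1 = self.times[i - 1], self.times[i]
        h0, h1 = self.headings[i - 1], self.headings[i]
        return h0 + (h1 - h0) * (timestamp - t0) / (t1 - t0)


def target_heading(history, capture_time, target_offset_deg):
    """Absolute target heading: robot heading at capture plus image offset."""
    return history.heading_at(capture_time) + target_offset_deg


def turn_needed(history, capture_time, target_offset_deg, current_heading_deg):
    """How far the robot still has to turn, given where it points now."""
    return (target_heading(history, capture_time, target_offset_deg)
            - current_heading_deg)
```

The point of the design is that the processing result can arrive 30-100ms late without hurting accuracy, because the turn is computed against the heading the robot had when the picture was taken rather than the heading it has when the answer comes back.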

If you are talking about using vision for something besides tracking a retroreflective vision target, then an offboard solution might make sense. That said, think long and hard about the opportunity cost of pursuing such a solution, and what your goals really are. If your goal is to build the most competitive robot that you possibly can, there is almost always lower hanging fruit that is just as inspirational to your students.

So… to summarize, I will always be able to choose NOT to send data over WiFi?
Is it ‘safe’ to develop a high-res/fps vision system whose parts are all physically on the robot (i.e. the camera and the RPi)? What I mean is: could I suddenly discover at the field that all the communication actually goes through the field WiFi, making the vision system unusable (because of the limited WiFi bandwidth, which I never intended to use in the first place)?

Wow, thanks! And yes, in the beginning I intend to only develop recognition of the retroreflective strips. Well, I’ll talk with the other guys on the programming team and we’ll see about this. The major goal (at least for now) of my planned vision system is to assist the driver in scoring: it would slightly correct the position of the robot so that scoring is more precise.
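
As a starting point, the core of retroreflective strip detection can be sketched in a few lines of plain Python: threshold for the LED ring's color, then take the centroid of the matching pixels. This assumes a green LED ring, an image given as rows of (r, g, b) tuples, and threshold values picked out of thin air; a real system would more likely use OpenCV and tuned thresholds.

```python
def find_target(image, min_green=200, max_other=100):
    """Return (cx, cy, pixel_count) of bright-green pixels, or None.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    The thresholds are illustrative assumptions, not tuned values.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if g >= min_green and r <= max_other and b <= max_other:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # no target in view
    return sum_x / count, sum_y / count, count


def horizontal_error(cx, image_width):
    """Target offset from the image center, scaled to -1.0 .. 1.0.

    Assumes image_width is greater than 1.
    """
    half = (image_width - 1) / 2
    return (cx - half) / half
```

The horizontal error is the number you would hand to the robot code: negative means the target is left of center, positive means right, and zero means the robot is lined up.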

Do you have any evidence of this? Bridge mode does not mean that the D-Link acts as a hub and blasts data anywhere. It still has an internal switch and will only send the data to the appropriate port. If both connections are Ethernet, it won’t send the data through WiFi. The only exception is broadcast packets.

Even if the D-Link were to repeat the packets you send over the D-Link switch intended only for the wired ports, if the packets are UDP it has no effect on the onboard interconnection. The worst that happens is you hit the field side with UDP and frankly if they are monitoring remotely as they should they can filter that if it really came down to it.

Also you do not need to send much over the D-Link switch unless you are sending video to the driver’s station. In fact you can avoid the Ethernet entirely if you use I2C, the digital I/O or something like that.

So you should be good. Just be careful to realize that if you do use Ethernet to do this - you are using some of the bandwidth to the cRIO and RoboRio and if you do this too much you can cause issues. You do not have complete control over what the cRIO/RoboRio does on Ethernet especially when on a regulation FIRST field.

I believe there is a relevant example of what I mean in the Einstein report from years past.

Yes a properly working ARP table in a switch should work that way.

D-Link has had issues with this in the past. Hence they deprecated the bridge feature from the DIR-655.

There are hints to this floating around like this:

Also this (odd is it not that the described broadcast does not pass…)

Can you elaborate what I2C is? (again- I’m new to FRC. Sorry if this is a noob question!)

I2C is the telephone style jack on the bottom left in this picture of the digital side car.


I do not want to hijack your topic on this extensively.
So I will simply point you here:

It is basically a form of digital communication.
To use it from a laptop you would probably need a USB interface for I2C and they do make things like this COTS.

I²C (or I2C) is a really simple way to communicate. Most of the time it’s used between 2 microcontrollers because it is easy and fast to do. I hear it is pretty hard to do on the cRIO (hopefully it is easier to do on the roboRIO).

What I would do is try to get a DAC (digital to analog converter) and put it on the RPi or BeagleBone. That way you can hook it up straight to the analog in on the roboRIO and use a function to change the analog signal back into a digital value. I feel like it would be an easy thing to do, especially if you are doing an auto center/aim system: you could hook a PID loop right up to the analog signal (okay, maybe a PI loop, but that can still give you good auto-aiming).
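
Here is a rough Python sketch of that idea: the coprocessor maps the aiming error to a voltage, the robot side maps the voltage back to an error, and a PI loop turns the error into a motor command. The 0-5V range, the "mid-scale means zero error" convention, the gains and all the names are assumptions for illustration; actual robot code would be in C++/Java/LabVIEW and would read the real analog input.

```python
def error_to_voltage(error, v_max=5.0):
    """Coprocessor side: map an error in [-1, 1] to a voltage in [0, v_max].

    Mid-scale (v_max / 2) means the target is centered.
    """
    error = max(-1.0, min(1.0, error))
    return (error + 1.0) / 2.0 * v_max


def voltage_to_error(voltage, v_max=5.0):
    """Robot side: recover the error from the analog reading."""
    return voltage / v_max * 2.0 - 1.0


class PIController:
    """Plain PI loop with a clamped output (gains are placeholders)."""

    def __init__(self, kp, ki, output_limit=1.0):
        self.kp = kp
        self.ki = ki
        self.output_limit = output_limit
        self.integral = 0.0

    def update(self, error, dt):
        """Return the motor command for this control step."""
        self.integral += error * dt
        output = self.kp * error + self.ki * self.integral
        return max(-self.output_limit, min(self.output_limit, output))


if __name__ == "__main__":
    controller = PIController(kp=0.5, ki=0.1)
    reading = error_to_voltage(0.5)        # what the DAC would output
    command = controller.update(voltage_to_error(reading), dt=0.02)
    print(f"voltage={reading:.2f} V, turn command={command:.3f}")
```

One thing this scheme cannot express on its own is "no target found", so you would probably want to reserve a voltage band or a separate digital line for that.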

I also completely forgot about the laptop driver station option. Although it is not the fastest method, vision tracking on the driver station is probably the easiest method.

Also, the RoboRIO has 2 cores, so maybe you can dedicate one core to the vision tracking, and that way there is practically 0 latency (at least for communications).

I would hope the Java vision examples for the RoboRio have improved.
On Team 11 we had very little luck getting the cRIO to handle all we asked from it with Java doing vision and everything else and in some example cases even getting the examples to work.
The RoboRio is faster so that will help. It is less picky so that will also help.

I believe I asked around on ChiefDelphi in the past for Java vision examples for the cRIO that actually work.
I would love to see working Java vision examples on the RoboRio. Perhaps video of them working.

I have not involved myself with the beta work MORT is doing on the RoboRio and Java in this regard so it may exist.

Changing that now would result in <1% of teams actually succeeding at the vision challenge (which was about par for the course prior to the current “Vision Renaissance” era).

I did a survey last year at several events regarding what people were using for vision processing. It was not asked of the CSA but since I was asking other questions I asked or looked. I think you would be surprised at how many teams made the coprocessor work without major headaches.

I also spent part of each competition, at the request of the FTA, chasing around teams that were sending video back to the driver’s station in ways that messed with the field. Very competent people were having issues with this, so I do not think it is quite so cut and dried. If anyone wanted, I could toss that data together for the events at which I volunteered in MAR.


I would like to add there is a hidden cost to the embedded and single board computers.
It is the same hidden cost of the cRIO especially back 3 or more years ago when the 8 slot cRIO was the FIRST approved system.

How many of these do you have knocking around to develop on?

Think about it: general purpose laptops are plentiful and therefore anyone with a laptop (increasingly all the students in a school) could snag a USB camera for <$30 and start writing vision code. If you are using old phones you can get the whole package probably for $100 or less and probably your students are already glued to the phones they use too often every day now.

On the other hand if you buy a development board for nearly $200 how many people can actively develop/test on it?
If a student takes that board home how many other students can work and be confident that what they are doing will port to that system?
Vision is a big project and more importantly you can often play a FIRST game without it.

Is it better to use laptops you probably already have or commit to more proprietary stuff you might have to buy multiple of and then…if that product goes out of production…do that all over again if you even really use it? Is the cost justifiable?