Opencv - Java Vision Tracking

Hi guys, I am the head programmer of Team 4619. Our team has never used vision tracking before, and we are trying to get basic vision processing code working this year (or next year). However, this is my first year in programming, so vision is a really huge mountain for me to conquer. It would be really helpful if someone could summarize the code structures and concepts for vision (Java/OpenCV). That said, we are open to any method of vision processing.
Thank you so much! :slight_smile: :slight_smile: :slight_smile:

P.S.: We have an NVIDIA Jetson in stock.

Grip… on driver station laptop.

No, hear me out. It’ll get you up and running, the feedback is instant, and you avoid a lot of the big pain points. But while you’re futzing with stuff you’re learning the process of how to process an image to extract the info you need. It’s the best kind of learning.

Check out all the articles under this link. If you want more control, combine the Java output of a generated GRIP pipeline, all the resources from above, and the OpenCV API Reference, to learn more about the basics of detecting and filtering contours. As said by Andrew, GRIP is a great tool for getting something up and running relatively quickly. In most cases, you don’t need much more than this to get something to track a target.
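To make the "filtering contours" step concrete: a GRIP-generated Java pipeline hands you a list of contours, and your own code typically filters them by size and shape before picking a target. The sketch below is hypothetical and deliberately avoids OpenCV types so it stands alone (the made-up `Box` class stands in for the bounding rectangle you would get from `Imgproc.boundingRect` on each contour), but the filtering logic is the part that carries over:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an OpenCV bounding rectangle. In real code you
// would call Imgproc.boundingRect(contour) on each MatOfPoint that the
// GRIP-generated pipeline outputs.
class Box {
    final double x, y, width, height;
    Box(double x, double y, double width, double height) {
        this.x = x; this.y = y; this.width = width; this.height = height;
    }
    double area() { return width * height; }
    double aspect() { return width / height; }
}

public class ContourFilter {
    // Keep only candidates that look like the target: big enough, and with
    // a width/height ratio inside a tolerance band around the expected shape.
    public static List<Box> filter(List<Box> candidates, double minArea,
                                   double minAspect, double maxAspect) {
        List<Box> out = new ArrayList<>();
        for (Box b : candidates) {
            if (b.area() >= minArea
                    && b.aspect() >= minAspect
                    && b.aspect() <= maxAspect) {
                out.add(b);
            }
        }
        return out;
    }
}
```

GRIP can generate much of this for you, but understanding it as plain filtering over rectangles makes the generated code far less mysterious.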

Another great resource would be team 254’s 2016 presentation. Their slides are here, and the video is here. You can learn a lot from it about the general methods of processing, though you’ll have to look somewhere else for the programming side of things.


Thank you so much for the recommendation!

I echo the GRIP suggestion. It is very easy to create your pipeline and see it applied visually. We do that; then, once we have the pipeline code generated, we pop it into a Java project with a main program and build a process that runs on a Raspberry Pi 3 to process the images and produce steering information for the roboRIO.
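As a sketch of what "produce steering information" can mean: once the pipeline gives you the target's centroid in pixels, a little trigonometry converts it into a yaw error that the roboRIO can feed into a turn controller. This is a minimal, hypothetical example assuming a simple pinhole camera model; the class and method names are made up, and you would substitute your own camera's resolution and field of view:

```java
public class Steering {
    // Convert a target's x-coordinate in the image into a yaw error in
    // degrees, assuming a pinhole camera model. horizontalFovDeg is the
    // camera's horizontal field of view (roughly 60 degrees is a common
    // value for cheap USB cameras, but measure yours).
    public static double pixelToYawDegrees(double targetX, double imageWidth,
                                           double horizontalFovDeg) {
        double center = imageWidth / 2.0;
        // Focal length in pixels, derived from the field of view.
        double focalPx = center / Math.tan(Math.toRadians(horizontalFovDeg / 2.0));
        // Positive result means the target is to the right of center.
        return Math.toDegrees(Math.atan((targetX - center) / focalPx));
    }
}
```

A target dead-center gives 0 degrees, and a target at the image edge gives half the field of view, which is a quick sanity check to run on your own numbers.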

I guess the decision comes down to speed. There are three real options. At this point in build season, I highly recommend you spend your time optimizing your current code; vision takes a lot of time, whether you use GRIP or raw OpenCV. If you decide to develop vision in the off-season, which I highly recommend, I would suggest that you use a co-processor. The Pi is easy to use and pynetworktables works extremely well on it, but it can be slow if not multi-threaded. The NVIDIA Jetson Tegra K1 is extremely fast, but it takes an understanding of Linux and a competent electrical department, as it requires clean 12V power. You can always run vision on the Driver Station, but you can lose speed because it requires sending the image over Wi-Fi, which can be a severe bottleneck. I personally recommend the Tegra, as it's the fastest and most powerful, but it requires some work. If you are interested in this, I would be happy to help you.

TL;DR: A co-processor such as the Tegra or the Pi is the fastest option. I have experience with this, so if you need help, I am happy to work with you.

@Maxcr1: are you actually using the CUDA cores on your Tegra? If so, what language are you writing in?

I apologize for the extremely late response, but vision is a hot topic on CD right now, so I figured it could be useful for someone.

NVIDIA actually has a release of OpenCV that is optimized for the Tegra's CUDA cores. When you go through the JetPack utility designed to install Ubuntu on the Tegra, it gives you the option to install Jetson-optimized OpenCV. As RoyalVision (our vision software) is written against the OpenCV libraries, I was able to use these libraries seamlessly on the Jetson Tegra.

What you're probably thinking of is that some of the functions use NEON, which is ARM's version of vector instructions (think SSE or AVX on Intel). That optimization comes for free. The CUDA support in OpenCV requires explicit coding to use; it isn't really seamless at all. Plus, last time I looked, there weren't even Java bindings for the GPU functions.

Many of the CUDA functions are similar to the ones that run on the CPU, but there is some effort involved in getting them up and running. They also have different limitations, and slightly different behavior, than the CPU functions. Also, in my experience, it takes a decent bit of understanding to make some types of algorithms run faster on CUDA than on the CPU; a straightforward implementation will often yield little or no improvement.

That being said, the TK1 or TX1 CPU is fast enough for most normal vision processing tasks. Anything faster than the camera frame rate is wasted, and the typical filter/threshold/contour/track pipeline runs faster than the input rate.
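To put numbers on "anything faster than camera frame rate is wasted": at 30 fps the camera delivers a new frame roughly every 33 ms, so a pipeline that finishes inside that budget gains nothing from further speedups. A trivial illustration of the arithmetic (hypothetical names, not from any library):

```java
public class FrameBudget {
    // Milliseconds available to process each frame at a given camera rate.
    public static double msPerFrame(double fps) {
        return 1000.0 / fps;
    }

    // A pipeline "keeps up" if its per-frame cost fits inside that budget;
    // beyond this point, extra speed just means idle time between frames.
    public static boolean keepsUp(double pipelineMs, double fps) {
        return pipelineMs <= msPerFrame(fps);
    }
}
```

So a 20 ms filter/threshold/contour pipeline already keeps up with a 30 fps camera, and porting it to CUDA would not get you more usable data.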