Chief Delphi

Chief Delphi (http://www.chiefdelphi.com/forums/index.php)
-   Programming (http://www.chiefdelphi.com/forums/forumdisplay.php?f=51)
-   -   Vision: what's state of the art in FRC? (http://www.chiefdelphi.com/forums/showthread.php?t=130777)

MrRoboSteve 10-10-2014 12:37

Vision: what's state of the art in FRC?
 
Our team is wanting to get serious about vision this year, and I'm curious what people think is the state of the art in vision systems for FRC.

Questions:

1. Is it better to do vision processing onboard or with a coprocessor? What are the tradeoffs? How does the RoboRIO change the answer to this question?

2. Which vision libraries? NI Vision? OpenCV? RoboRealm? Any libraries that run on top of any of these that are useful?

3. Which teams have well developed vision codebases? I'm assuming teams are following R13 and sharing out the code.

4. Are there alternatives to the Axis cameras that should be considered? What USB camera options are viable for 2015 control system use? Is the Kinect a viable vision sensor with the RoboRIO?

billbo911 10-10-2014 13:26

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by MrRoboSteve (Post 1403753)
Our team is wanting to get serious about vision this year, and I'm curious what people think is the state of the art in vision systems for FRC.

Questions:

1. Is it better to do vision processing onboard or with a coprocessor? What are the tradeoffs? How does the RoboRIO change the answer to this question?

2. Which vision libraries? NI Vision? OpenCV? RoboRealm? Any libraries that run on top of any of these that are useful?

3. Which teams have well developed vision codebases? I'm assuming teams are following R13 and sharing out the code.

4. Are there alternatives to the Axis cameras that should be considered? What USB camera options are viable for 2015 control system use? Is the Kinect a viable vision sensor with the RoboRIO?

Solid questions Steve!

I cannot speak for all of FRC, only from the perspective of a three-time award winner for the vision/object tracking system we developed and used in 2014.

I'm going to try to answer your questions, from our perspective, in order.

1) Up through 2014, it was "better" to use a co-processor. Vision tracking on the cRIO will work, but unless it is done VERY CAREFULLY and with limited requirements, it could easily max out the processor. We did use it successfully a few years ago, but we really limited the requirements. Since 2013, we have been using a PCDuino (the link is to a retired version, but it is the board we used).
How does the roboRIO change the answer to this question? Sorry, I can't say either way. We are not a Beta team, so we have no direct experience with the roboRIO. My guess is it will have "much better" performance, but how much better remains to be seen. (I too am curious!)

2) We used OpenCV running under Ubuntu. Our scripts were written in Python. We will likely stick with this for 2015, but that determination is yet to be made. The PCDuino offers excellent access to the GPIO pins, thus allowing us to do some really neat software tricks.

3) Our code was shared in this post. That post is the main reason we won the "Gracious Professionalism Award" on Curie this year. This code is board specific, but can easily be modified to run on many different boards.

4) Sorry, without being a Beta team, we can not address this question.

faust1706 10-10-2014 13:31

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by MrRoboSteve (Post 1403753)
Our team is wanting to get serious about vision this year, and I'm curious what people think is the state of the art in vision systems for FRC.

Questions:

1. Is it better to do vision processing onboard or with a coprocessor? What are the tradeoffs? How does the RoboRIO change the answer to this question?

2. Which vision libraries? NI Vision? OpenCV? RoboRealm? Any libraries that run on top of any of these that are useful?

3. Which teams have well developed vision codebases? I'm assuming teams are following R13 and sharing out the code.

4. Are there alternatives to the Axis cameras that should be considered? What USB camera options are viable for 2015 control system use? Is the Kinect a viable vision sensor with the RoboRIO?

There are a number of threads that already talk about computer vision in FRC; however, it never hurts to talk about it once in a while, as the participants in the competition are always changing.

I'll go through your questions one at a time.

1. I do not know if this question has an objective answer. Teams have had success with the cRIO, off-board, or using a second computer like the Pi or an ODROID. The tradeoff for using an onboard computer is obviously weight, but the Pi weighs something like 200 g, so that shouldn't concern you. You have to somehow communicate what the vision system outputs to your cRIO; for the past three years we've been doing that with simple UDP.
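
A minimal sketch of that kind of UDP handoff (not the team's actual code; the port and the comma-separated "angle,distance" message format are made up, and 10.0.0.2 stands in for your robot controller's address, 10.TE.AM.2 on the field):

Code:

// Sketch: send one small UDP datagram with the vision result per processed frame.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);           // UDP socket
    if (sock < 0) { perror("socket"); return 1; }

    sockaddr_in robot = {};
    robot.sin_family = AF_INET;
    robot.sin_port = htons(5800);                        // hypothetical port
    inet_pton(AF_INET, "10.0.0.2", &robot.sin_addr);     // replace with your controller's IP

    double angleDeg = 3.2, distanceIn = 96.5;            // stand-ins for real vision output
    char msg[64];
    int len = snprintf(msg, sizeof(msg), "%.2f,%.2f", angleDeg, distanceIn);

    sendto(sock, msg, len, 0, (sockaddr*)&robot, sizeof(robot));
    close(sock);
    return 0;
}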

As for off-board, I have heard it can be slow, and with the cap on data per team, I personally think it is no longer a valid option.

Using the cRIO, and programming it in LabVIEW or something, is doable, and there are plenty of resources around to do so, but the truth is it won't be as fast as something running on an SBC.

I haven't looked at the specs of the roboRIO, so my opinion on that shouldn't be taken seriously. I have heard it is "linuxy," but I have yet to get a clear definition of what that actually means. It would be cool to make your vision program an application and run it on the roboRIO, but I don't know if that is doable.

2. As for vision libraries, any will do, but some have more support than others. I don't know the state of OpenNI, but last time I checked it was no longer funded. OpenCV has been around since... I want to say 1999. It is used in industry and academia, and it is the one I suggest using, though of course I'm biased. I played around with RoboRealm a tad, but it seems too simple; I feel someone using it wouldn't gain a fundamental understanding of what is happening. If you are just looking to figure out how far away you are from something and don't really care about understanding, then I suggest that.

There is always the option of writing your own computer vision library, of course, but I'd be willing to bet money it wouldn't be as encompassing or efficient as existing libraries like NI Vision and OpenCV.

3. I like to think computer vision is a little unique. Yes, you can develop code bases, but the vision task changes so dramatically from year to year, except for game piece detection, that I don't think it'd be worth it. Take OpenCV, for example. It is an extremely developed library that now focuses on readability and ease of development. If you really know what you're doing, you can solve the vision problem in 100 lines of code; if you don't know what you're doing, it can take upwards of 2k. You could maybe group some routine operations, such as binary image morphology, into one step, but even then you'd only be saving one line of code.

Once you get a game piece detection code, however, that can just be "pulled from the shelf" and used year after year.

4. As long as your library can get the image from the camera, it really doesn't matter which one you use. The common option is the Axis camera. We used the Kinect for two years, then this past year we used three 120-degree cameras (it was more a proof of concept for future years). If you want to do some research on hardware, look up: Asus Xtion, Pixy, PlayStation Eye, Kinect, and keep an eye out for the Kinect One to be "hacked" and thus become usable.

I feel that computer vision is a developing aspect of FRC. My freshman year it was Logo Motion; I didn't ask around in the pits, but I don't remember seeing many cameras. When 1114 and 254 were doing multiple-game-piece autonomous routines (in the Einstein finals, mind you), I think it really inspired people, as did 254's autonomous routine this year. Just look at all the path/trajectory stuff that has been posted over the past several months.

This is my 2 cents on the topic. Take it how you will. If you have any questions, or want to see some code from my previous vision solutions (from 2012-2014), I'd be more than happy to go over it with you.

I emailed a FIRST rep requesting they consider an aerial camera for future games. If they do that, or allow teams to put up their own aerial camera, then onboard vision systems become obsolete, because essentially you have an "objective" view instead of a "subjective" one.

Greg McKaskle 10-10-2014 16:10

Re: Vision: what's state of the art in FRC?
 
A few clarifications and a few of my own answers.

NI Vision isn't the same as OpenNI.

The OS on the roboRIO is described in quite a bit of detail here http://www.ni.com/white-paper/14627/en/

The code is on github.

1. There are tradeoffs here that depend on the robot, the strategy, and the team's resources. All have been made to work. The roboRIO is somewhere between 4x and 10x faster. It has two cores and includes NEON instructions for vector processing.

But vision is a hungry beast and will bring any computer to its knees if you just brute force it. The "right" solution depends on where the camera is mounted, where it is pointed, lighting, lenses, and of course processing. Makes it challenging, huh.

2. All of them are capable of solving the FRC imaging problems quite well, IMO.

3. I've seen plenty who make it work.

4. The WPI libraries will continue to support Axis and will support many UVC cameras via NI-IMAQdx drivers. This means that many USB webcams will work. Additionally, we have also tested the Basler USB3 industrial cameras. I have not tested Kinect directly on the roboRIO, though if you google myRIO and Kinect you will see others who have.

Greg McKaskle

ToddF 10-10-2014 18:13

Re: Vision: what's state of the art in FRC?
 
One word: CheesyVision

(From the perspective of a Mech E)

faust1706 10-10-2014 19:46

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Greg McKaskle (Post 1403791)

But vision is a hungry beast and will bring any computer to its knees if you just brute force it. The "right" solution depends on where the camera is mounted, where it is pointed, lighting, lenses, and of course processing. Makes it challenging, huh.

I don't think people realize this. After the image is acquired, it is literally pure mathematics. If you aren't careful, as in you don't put in a wait function, your computer will try to execute the program as fast as possible, and the program is basically in an infinite loop until you somehow close it. (I believe the majority of teams don't gracefully exit their program after a match, but instead simply power off.)

To give an example: once you have the binary image you desire, as in your targets are white (1) and everything else is black (0), you perform the operation described in this paper (http://wenku.baidu.com/view/6cb52ede...aa811dad5.html), which was published in 1983. There do exist other methods, but the computational cost is about the same. (See also: http://en.wikipedia.org/wiki/Edge_detection).
Basically, this step reduces a 640x480 (or whatever resolution you have) matrix to a sequence of curves that let you do cool things, such as recursively approximating a polygon to guess how many sides it has.
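
In OpenCV, that binary-image-to-curves step is what findContours does, and approxPolyDP handles the recursive polygon approximation. A small sketch, with a made-up input file name:

Code:

// From a thresholded binary image to polygons and side counts.
#include <opencv2/opencv.hpp>
#include <cstdio>
#include <vector>

int main()
{
    cv::Mat binary = cv::imread("thresholded.png", 0);   // 0 = load as grayscale; white = target
    if (binary.empty()) return 1;

    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(binary, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);  // note: modifies the input

    for (size_t i = 0; i < contours.size(); ++i)
    {
        std::vector<cv::Point> poly;
        double epsilon = 0.02 * cv::arcLength(contours[i], true);   // tolerance scales with the perimeter
        cv::approxPolyDP(contours[i], poly, epsilon, true);
        printf("contour %zu: %zu sides\n", i, poly.size());          // 4 sides suggests a tape rectangle
    }
    return 0;
}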

yash101 10-10-2014 21:01

Re: Vision: what's state of the art in FRC?
 
What is the purpose? Do you want to process a handful of images at the beginning of the match, or do you want to continually process images and perform AI like pathfinding? Those questions are extremely important for answering many of your questions. The first option should not require more power than the RoboRIO offers. The second option would probably benefit from a coprocessor. These tiny boards can offer a lot of power, easily many times the available oomph of the RoboRIO.

Running OpenCV on the RoboRIO is far from realistic. Storage space is quite limited, and it seems to me there will be missing pieces of Linux that will make it extremely difficult to compile all of OpenCV's dependencies.

Also, I think OpenCV can be a tad inefficient in general. I was running a small test and OpenCV's windows maxed out one of my system's cores (i3)!

I really think converting color spaces, thresholding and some of these relatively simple transformations should be much faster, especially because the formula should be constant.

faust1706 10-10-2014 22:11

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by yash101 (Post 1403817)
Also, I think OpenCV can be a tad inefficient in general. I was running a small test and OpenCV's windows maxed out one of my system's cores (i3)!

I really think converting color spaces, thresholding and some of these relatively simple transformations should be much faster, especially because the formula should be constant.

It can be inefficient, but it is great at what it does. It is an open-source library that people volunteer their time to contribute to. I think I'm going to be able to get away with taking 6 credit hours next semester for doing research in the field of computer science. That means I'll have dedicated time to work on stuff, like optimizing the OpenCV library (or just the functions I use regularly, such as erode, dilate, findContours, approxPolyDP, solvePnP, and optical flow).

I've still not looked at the source code for OpenCV in depth. I really hope some things in it aren't computed in parallel, because I love doing parallel computing and it'd give me great practice.

Like you mentioned to me... two months ago, it spawns threads when it'd be quicker not to. Maybe I could simply put in a condition that if the image resolution is < (x,y), then don't spawn another thread.
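
One way to get similar behavior without patching the library is OpenCV's runtime thread control; a tiny sketch (the 320x240 cutoff is arbitrary):

Code:

// Turn OpenCV's internal threading off when the frame is small, since the
// thread overhead dominates there.
#include <opencv2/opencv.hpp>

void configureThreading(const cv::Mat &frame)
{
    if (frame.cols * frame.rows <= 320 * 240)
        cv::setNumThreads(0);                      // 0 disables OpenCV's threading optimizations
    else
        cv::setNumThreads(cv::getNumberOfCPUs());  // otherwise let it use every core
}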

Caboose 10-10-2014 23:43

Re: Vision: what's state of the art in FRC?
 
Team 900 is currently working with an NVIDIA Tegra TK1 (quad-core ARM Cortex-A15 with a Kepler GPU with 192 CUDA cores, <$200) to tackle vision processing. So far we are seeing good results: high frame rates (>40 FPS) at large image resolutions (720p), doing just filtering at the moment, all on the GPU with OpenCV. We are working on tracking next.

Abhishek R 11-10-2014 00:02

Re: Vision: what's state of the art in FRC?
 
Answering question 4, the Kinect is a viable means of vision sensing. I'd recommend checking out this paper from Team 987, who used the Kinect very effectively as a camera in 2012's FRC challenge, Rebound Rumble. I believe one of the major advantages of the Kinect is that its depth perception is much better than a standard camera's, though I'm not really a vision expert.

faust1706 11-10-2014 00:24

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Abhishek R (Post 1403834)
I believe one of the major advantages of the Kinect is it's depth perception is much better than a standard camera, though I'm not really a vision expert.

I don't really understand what you mean by depth perception. It has a depth map, if that is what you mean. The Kinect is a great piece of hardware because it has RGB, IR, and depth cameras and a built-in IR light (though I suggest adding some IR LEDs if you're going the IR route). It is fairly cheap now, and there has been a lot of work done with it, but if you're using only the RGB camera, the only thing different is how much it distorts the image. All the cameras you will ever think about using have basically the same amount of distortion, so that shouldn't be a concern. If it is, then calibrate your camera:

http://docs.opencv.org/doc/tutorials...libration.html

But I'm going to say that it is a waste of time unless you are doing a camera pose calculation.
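
For anyone who does go down the calibration road (e.g. for pose work with solvePnP), the linked tutorial boils down to roughly this sketch; the board size, square size, and snapshot file names are made up:

Code:

// Calibrate from several chessboard views: gather 2D/3D point pairs, then solve
// for the camera matrix and distortion coefficients.
#include <opencv2/opencv.hpp>
#include <cstdio>
#include <vector>

int main()
{
    const cv::Size boardSize(9, 6);          // interior corner count of the printed chessboard
    const float squareSize = 0.025f;         // square edge length in meters (made up)

    std::vector<cv::Point3f> board3d;        // one known corner layout, reused for every view
    for (int y = 0; y < boardSize.height; ++y)
        for (int x = 0; x < boardSize.width; ++x)
            board3d.push_back(cv::Point3f(x * squareSize, y * squareSize, 0));

    std::vector<std::vector<cv::Point3f> > objectPoints;
    std::vector<std::vector<cv::Point2f> > imagePoints;
    cv::Size imageSize;

    for (int i = 0; i < 15; ++i)             // 10-20 views from different angles work well
    {
        char name[32];
        sprintf(name, "calib_%02d.png", i);  // hypothetical snapshot names
        cv::Mat img = cv::imread(name, 0);   // 0 = grayscale
        if (img.empty()) continue;
        imageSize = img.size();

        std::vector<cv::Point2f> corners;
        if (cv::findChessboardCorners(img, boardSize, corners))
        {
            imagePoints.push_back(corners);
            objectPoints.push_back(board3d);
        }
    }

    cv::Mat cameraMatrix, distCoeffs;
    std::vector<cv::Mat> rvecs, tvecs;
    double rms = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                     cameraMatrix, distCoeffs, rvecs, tvecs);
    printf("RMS reprojection error: %f\n", rms);
    return 0;
}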

The Kinect does have a subtle advantage over other cameras, like the Axis: it draws attention. Little kids love seeing something they see every day used in a way they never thought possible.

mwtidd 11-10-2014 08:29

Re: Vision: what's state of the art in FRC?
 
1. For me, I think this mostly depends on whether or not you want to process as you move. If you're able to process from a stopped position, then you should be fine processing on the cRIO. Pray for more reflective tape, because that stuff makes life 100x easier. Also make sure you grab a camera whose exposure you can modify, and ideally one that will hold those exposure settings after a reboot.

That being said, if you're looking to process while moving, I'd recommend going with a coprocessor. While it may be feasible to do it on the roboRIO, it's probably safer to do it on a coprocessor, and you'll get much better performance/resolution. Like 900, I'm very excited about the Jetson TK1, and with the new voltage regulator, getting set up with it should be simpler.

2. I have stayed away from RoboRealm. This is more of a personal preference, but any software that requires licenses immediately makes it more challenging to work in a distributed fashion. This year we utilized both NI Vision and OpenCV. The NI Vision library was used for hot/cold goal detection, and OpenCV for Java was used for distance-to-target calculations. I was fond of this setup because it allowed us to work in the same language off-board and onboard. That being said, I think the advantages the TK1 presents may outweigh the advantages of working in a single language.

NI Vision was a lot easier to get started with. Once the camera was calibrated correctly, it was relatively simple. We spent 90% of our time trying to make it more efficient (essentially a moot point). Once we landed on only doing hot/cold detection onboard, that made things easier.



OpenCV took a lot more to get set up. If you haven't done vision processing in the past and opt to go with something like OpenCV, be prepared to spend most of the season on it, and even then it may never find its way onto the robot. In the backpacking world we have an expression, "ounces make pounds," and it's definitely true in FIRST. Expect a coprocessor to be one of the first things to go when you have to shed weight.

3. They're out there, but you'll have to dig, and you'll have to dig through full code bases to find it. From what I've seen, teams often release their code base as a whole rather than as a set of libraries.

4. Like many others, I'm not on a beta team, but if you're using NI Vision, the Axis camera is actually a fairly nice solution. The big thing I've found is that WPILib is already configured for the Axis camera's specs. Also, it holds its values after a reset, so you don't have to worry about your exposure settings resetting in between matches.

All that being said, I think 900 is probably a good reference point for the state of the art solution/execution. The way they implemented their single frame vision processing was rather clever. Also, it sounds like they're following the FIRST roadmap by getting started on the Jetson TK1 early. I think that is where a lot of the beneficial state of the art stuff is going to go. Teams have been utilizing the kinect and doing object detection for several years now, but it seems to me these are more bullet points for awards as opposed to practical implementations.

For me, the state of the art stuff I'm interested in, is that which eases implementation while addressing the various constraints for the platforms we work on.

yash101 11-10-2014 19:29

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by faust1706 (Post 1403824)
It can be inefficient, but it great at what it does. It is an open source library that people volunteer their time to contribute to. I think I'm going to be able to get away with taking 6 credit hours next semester for doing research in the field of computer science. That means I'll have dedicated time to work on stuff, like optimizing the opencv library (or just the functions I use regularly, such as erode, dilate, find contours, appoxpoly, solvepnp, and optical flow)

I've still not looked at the source code for opencv in depth. I really hope some things in it aren't parallel-ly computed because I love doing parallel computing and it'd give me great practice.

Like you mentioned...2 months ago to me, it spawns threads when it'd be quicker not to do it. Maybe I could simple put a condition that if the image resolution < (x,y), then don't spawn another thread.

I've been working on a library extension for OpenCV that adds features (there's currently no vanilla way to rotate an image x number of degrees!). I am also going to use OpenMP to attempt to create extremely high-performance functions that utilize all cores on your system!

The nice thing about OpenMP is that if it is not available, everything will work properly. The function won't be multithreaded though!

OpenCV actually has a few threading backends, as I remember: it can be built with OpenMP, with Intel TBB, or with threading disabled.

I guess that you could use C++11 threading to perform many simple tasks in parallel. It is quite cool how the threading can be done with lambdas:
PHP Code:

std::thread x([](){ printf("Hello World, from a thread :D\n"); });
x.join(); // must join (or detach) before x is destroyed, or std::terminate is called

or using regular function pointers:

PHP Code:

#include <cstdio>
#include <thread>

void hello()
{
    printf("Hello World, from a thread :D\n");
}

int main()
{
    std::thread x(hello);
    x.join(); // We're not doing anything new afterwards, so the program would terminate if we did not join the thread!
    return 0;
}

Instead of just skipping threading, better threading techniques should be used! Let's take cvtColor:
If the resolution is less than 32 by 32 (let's just say), then skip threading. This cutoff should be quite low because threading is actually incredibly fast! Use libraries like pthread because of their speed. News flash: C++11 threading under POSIX uses pthread :D.
What I would do is divide the image up into equal parts, with the number of parts equal to the hardware concurrency. I would then run my own unthreaded version of cvtColor on each small image. Afterwards, I would stitch those images back together to return. Voilà! You have a cvtColor function that is optimized for your hardware concurrency, which dictates how many threads the computer can run truly in parallel (not switching back and forth between threads!).
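
A rough sketch of that divide-and-stitch idea with std::thread (not production code; it assumes a conversion with a 3-channel output such as BGR to HSV, so each strip can write straight into one shared destination buffer and no separate stitch pass is needed):

Code:

// Split the frame into horizontal strips, one per hardware thread, and convert
// each strip in its own std::thread. The strips are cv::Mat headers into the
// same destination buffer, so the results are already "stitched".
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <thread>
#include <vector>

void parallelCvtColor(const cv::Mat &src, cv::Mat &dst, int code)
{
    dst.create(src.size(), CV_8UC3);                       // assumes a 3-channel result (e.g. BGR->HSV)
    unsigned n = std::max(1u, std::thread::hardware_concurrency());

    std::vector<std::thread> workers;
    int rowsPerStrip = (src.rows + n - 1) / n;
    for (unsigned i = 0; i < n; ++i)
    {
        int y0 = i * rowsPerStrip;
        if (y0 >= src.rows) break;
        int y1 = std::min(src.rows, y0 + rowsPerStrip);
        cv::Mat srcStrip = src.rowRange(y0, y1);
        cv::Mat dstStrip = dst.rowRange(y0, y1);
        workers.push_back(std::thread([=]() mutable {
            cv::cvtColor(srcStrip, dstStrip, code);        // writes directly into dst's rows
        }));
    }
    for (size_t i = 0; i < workers.size(); ++i)
        workers[i].join();
}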

I believe dlib has some optimized color conversion code too. Just wrap the Mat in a cv_image and it should work beautifully:
cv::Mat img = cv::imread("image.png"); // keep the Mat alive; cv_image only references its pixels
dlib::cv_image<dlib::bgr_pixel> image(img);

Good luck! Maybe we can work on this stuff together!

-------------

As Lineskier mentioned, OpenCV is kind of difficult to get started with!
This is true. OpenCV is quite difficult to set up and get running. However, once the setup is complete, it is an extremely easy-to-learn library, especially because of its documentation. There's a 600-page manual that explains nearly every function. Just use [CTRL/CMD] + [F] and you should be good :D
However, I faced these challenges with OpenCV, so I have a working install script that downloads my code and everything. Feel free to use the script under Debian GNU/Linux. If you don't want my old code, just remove those lines. I might remove them myself, as they download a lot of code and fill up your drive!

If you want, go ahead and check out my GitHub (@yash101)! I have all sorts of OpenCV programs!

install script:
PHP Code:

wget https://gist.github.com/yash101/10190427/raw/2926009fcf9cb0278d523a0e374d760dbbb92bcd/install.sh && chmod +x install.sh && ./install.sh 

Also, I spent around 3 hours a day during build season last year with OpenCV, learning it and coding with it at the same time. I also hadn't coded in C++ in 4 years, so I had lost all my good coding habits and had to recode the entire app a couple of times. At least I got rid of the 1,500 lines of code in int main() :D!

I am also working on the DevServer, which hopefully should make it much less of a hassle to get data to the cRIO!

controls weenie 12-10-2014 18:10

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Caboose (Post 1403831)
Team 900 this year is currently working with a nVidia Tegra TK1(Quad ARM Cortex-A15 with Kepler GPU with 192 CUDA cores, <$200) to tackle vision processing. So far we are seeing good results with high FPS(>40) with large image resolutions(720p) doing just filtering at the moment all on the GPU with OpenCV. We are working on tracking next.

That's great that you are learning the nVidia Tegra TK1. I would like to know more about your experiences with this device. Will you post some example code? Are you using C++ or python?

We used the PCDuino last year, and we downloaded code from Billbo911's (above) website. His code is written in Python and is very easy for the kids to understand. It worked very well last year. We tracked balls, reflective tape, and bumper reflectors. We will probably use the PCDuino and Bill's code again this year.

We used a USB webcam but had to downsample to 480x320 to maintain 30 Hz with our image processing output. OpenCV and Python work very well, but you have to be careful because Python loops can slow you down.

One thing that I did not see mentioned in this thread is how to enhance the reflective tape image. We used white ring lights last year, which was very WRONG! There is so much ambient white light that we had a terrible time tracking the reflective tape. I recommend using three green ring lights, then pre-filtering for only green pixels. You can buy small, medium, and large diameters; the small fits inside the medium, and the medium fits inside the large.

marshall 12-10-2014 19:31

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by controls weenie (Post 1403991)
That's great that you are learning the nVidia Tegra TK1. I would like to know more about your experiences with this device. Will you post some example code? Are you using C++ or python?

We used the PCDuino last year and we downloaded code from Billbo911's (above) website. His code is written in python and very easy for the kids to understand. It worked very well last year. We tracked balls, reflective tape and bumper reflectors. We will probably use the PCDuino and Bill's code again this year.

We used a USB web cam but had do down sample to 480x320 to maintain 30Hz with our image processing output. OpenCV and python work very well but you have to be careful because python loops can slow you down.

One thing that I did not see mentioned in this thread is how to enhance the reflective tape image. We used white ring lights last year which is very WRONG! There is so much ambient white light that we had a terrible problem tracking the reflective tape. I recommend using 3 green ring lights. Then pre filter only green pixels. You can buy small diameter, medium and large. The small fits in the medium and the medium fits in the large.

We're using OpenCV in C++ so example code is plentiful around the web. Our students have just now started to use it so we don't have anything to share just yet. If we make progress to the point where we can share it then we will, probably towards the end of build season.

The big deal with the TK1 is that it can use the GPU to offload work. To my knowledge, there is no way to use the GPU-assisted OpenCV functions from Python currently, but that might be changing with the 3.x release around the corner. We're using the 2.4.x code right now.

C++ is what we are using for the GPU integration as of right now, because you have to manually manage the GPU's memory and shuffle images onto it and off of it as you need them. NVIDIA has a decent amount of resources out there for the Jetson, but it is definitely not a project for those unfamiliar with Linux. It's not a Raspberry Pi and not anywhere near as clean as a full laptop. To get it working you have to do a bit of assembly. It's a nice computer, just not as straightforward as a Pi or a PCDuino or any of the others that have larger user bases. There are also problems running X11 on it, so you really need to run it headless (NVIDIA writes binary-blob graphics drivers for Linux that are not super stable).
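
For reference, the shape of that upload/process/download cycle in the 2.4 gpu module looks roughly like this (a sketch, not 900's code; it assumes OpenCV was built with CUDA support, and the module was renamed to cuda:: in 3.x):

Code:

// Rough shape of the host<->device shuffling: copy in, filter on the GPU, copy back.
#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>

int main()
{
    cv::Mat frame = cv::imread("frame.png");      // hypothetical input frame
    if (frame.empty()) return 1;

    cv::gpu::GpuMat d_frame, d_gray, d_bin;
    d_frame.upload(frame);                        // host -> device copy (the part you pay for)

    cv::gpu::cvtColor(d_frame, d_gray, CV_BGR2GRAY);                // runs on the Kepler cores
    cv::gpu::threshold(d_gray, d_bin, 200, 255, cv::THRESH_BINARY);

    cv::Mat result;
    d_bin.download(result);                       // device -> host copy back for any CPU-side steps
    return 0;
}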

We're aiming for full 1080 but depending on the challenge we will likely have to down sample to 720 to get it to work with the frame rates we need.

Granted, this is all off-season right now and we have a lot of testing to do between now and the events before any of this is guaranteed to go on the robot. For all I know FIRST is going to drop vision entirely... I mean, cameras don't work under water do they?

yash101 12-10-2014 20:20

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Abhishek R (Post 1403834)
Answering question 4, the Kinect is a viable means of vision sensing. I'd recommend checking out this paper from Team 987, who used the Kinect very effectively as a camera in 2012's FRC challenge, Rebound Rumble. I believe one of the major advantages of the Kinect is it's depth perception is much better than a standard camera, though I'm not really a vision expert.

That is quite an old document. OpenKinect has changed significantly and is much harder to use now! The documentation kind of sucks as the examples are all in very difficult (for me) C!

The greatest problem with the Kinect was getting it to work. I have never succeeded in opening a Kinect stream from OpenCV!

The depth map of the Kinect is surprisingly accurate and powerful!

As of last year, thresholding was the easy part :)! Just create a simple OpenCV program to run on your PC, to connect to the camera and get video! Create sliders for each of the HSV values, and keep messing with one bar until the target starts barely fading! Do this for all three sliders. You want to end with the target as white as possible! It is OK if there are tiny holes or 1-4 pixels in the target not highlighted. Next, perform a GaussianBlur transformation. Play around with the kernel size until the target is crisp and clear!

Last year, I used std::fstream to write configuration files. It is a good idea, unless you find a library that has a much better configuration parser! Just write the HSV values to the file and push it onto your processor. Voilà! You have your perfect HSV inRange values!
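
Pulling those two paragraphs together, a sketch of the slider-plus-config-file workflow (the window name, key bindings, and hsv.cfg format are all made up):

Code:

// Live HSV calibration: adjust trackbars until the target is white, press 's'
// to dump the values to a plain text file, Esc to quit.
#include <opencv2/opencv.hpp>
#include <fstream>

int lowH = 40, highH = 80, lowS = 100, highS = 255, lowV = 100, highV = 255;

int main()
{
    cv::VideoCapture cap(0);                    // same camera the robot will use
    if (!cap.isOpened()) return 1;

    cv::namedWindow("threshold");
    cv::createTrackbar("lowH",  "threshold", &lowH,  179);
    cv::createTrackbar("highH", "threshold", &highH, 179);
    cv::createTrackbar("lowS",  "threshold", &lowS,  255);
    cv::createTrackbar("highS", "threshold", &highS, 255);
    cv::createTrackbar("lowV",  "threshold", &lowV,  255);
    cv::createTrackbar("highV", "threshold", &highV, 255);

    cv::Mat frame, hsv, mask;
    while (true)
    {
        cap >> frame;
        if (frame.empty()) break;
        cv::cvtColor(frame, hsv, CV_BGR2HSV);
        cv::inRange(hsv, cv::Scalar(lowH, lowS, lowV), cv::Scalar(highH, highS, highV), mask);
        cv::GaussianBlur(mask, mask, cv::Size(5, 5), 0);   // kernel size tuned by eye, as above
        cv::imshow("threshold", mask);

        int key = cv::waitKey(30);
        if (key == 's')                                    // save the current values
        {
            std::ofstream cfg("hsv.cfg");
            cfg << lowH << " " << highH << " " << lowS << " " << highS
                << " " << lowV << " " << highV << "\n";
        }
        else if (key == 27) break;                         // Esc quits
    }
    return 0;
}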

Hunter mentioned to me last year that at competitions you should, as soon as possible, ask field staff whether there will be time when you can calibrate your vision systems! At the Phoenix regional, this was during the first lunch break. USE THAT PERIOD! Take the robot onto the field and take a gazillion pictures USING THE VISION PROCESSOR'S CAMERA, so that later, when you aren't under as much stress, you can go through a couple of them at random locations and find the best values!

As I mentioned before, and will again in caps lock, underline and bold:
SET UP A CONFIGURATION FILE!

This way, you can change your program without actually changing code!

controls weenie 12-10-2014 20:50

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by marshall (Post 1403998)
We're using OpenCV in C++ so example code is plentiful around the web. Our students have just now started to use it so we don't have anything to share just yet. If we make progress to the point where we can share it then we will, probably towards the end of build season.

The big deal with the TK1 is that it has the ability to use the GPU to assist with offloading work. To my knowledge, there is no method to use the GPU assisted functions for OpenCV with Python currently but that might be changing with the 3.x code release around the corner. We're using the 2.4.x code right now.

C++ is what we are using for the GPU integration as of right now because you have to manually manage the memory for the GPU and shuffle images onto it and off of it as you need them. Nvidia has a decent amount of resources out there for the Jetson but it is definitely not a project for those unfamiliar with linux. It's not a Raspberry Pi and not anywhere near as clean as a full laptop. To get it working you have to do a bit of assembly. It's a nice computer, just not as straight forward as a Pi or a PCDuino or any of the others that have larger user bases. There are also problems running X11 on it so you really need to run it headless (Nvidia writes binary blob graphics drivers for linux that are not super stable).

We're aiming for full 1080 but depending on the challenge we will likely have to down sample to 720 to get it to work with the frame rates we need.

Granted, this is all off-season right now and we have a lot of testing to do between now and the events before any of this is guaranteed to go on the robot. For all I know FIRST is going to drop vision entirely... I mean, cameras don't work under water do they?

Oh yeah...I forgot about the water issue:)

I see an issue getting a USB camera driver to read images at more than 30 Hz. This was an issue with the PCDuino and our webcam last year; the Ubuntu USB driver would not feed the processor more than 30 Hz. Dumping images from RAM to the GPU could be a bottleneck because of the huge sizes of the frame buffers.

I used Python bindings at work to copy data to (and from) the GPU queue. Python might be easier for the kids to use if it is available. I wonder if you can use OpenCL on the TK1 dev kit? OpenCL might give you the OpenCV/Python bindings on that OS.

I hope FIRST continues to have image processing during the games. Some of the kids enjoy that more than any other task. Good luck with the TK1.

faust1706 12-10-2014 21:02

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by marshall (Post 1403998)
For all I know FIRST is going to drop vision entirely... I mean, cameras don't work under water do they?

AUVs (autonomous underwater vehicles) are gradually developing vision systems. A big problem is correcting the color distortion from the water. A good friend of mine is working in a lab at Cornell and is detecting, and retrieving, different-colored balls at the bottom of a swimming pool.

The task of finding your local position (aka GPS-denied) becomes exponentially more complex when you do it in 3 dimensions (think quad-copters or AUVs).

Quote:

Originally Posted by yash101 (Post 1404003)
The greatest problem with the Kinect was getting it to work. I have never succeeded in opening a kinect stream from OpenCV!

The depth map of the Kinect is surprisingly accurate and powerful!

Next, perform a GaussianBlur transformation. Play around with the kernel size until the target is crisp and clear!

Hunter mentioned to me, last year, that when at the competitions, as soon as possible, ask field staff if there will be time where you will be able to calibrate your vision systems!

Over half the battle is getting everything to work, in my opinion. You have to compile source code and sometimes change CMakeLists files (if you want to compile OpenCV with OpenNI).

For those of you interested in what the depth map looks like for the Kinect: depth map

You can do a lot of cool things with a depth map, but that's for another discussion.

I personally am not a fan of blurring an image unless I absolutely have to, or if my calculation requires a center and not corners of a contour.

You should be asking when you can calibrate vision, to the point that it is borderline harassment, until you get an answer. A lot of venues are EXTREMELY poor environments due to window locations, but there isn't much you can do about it. As an example: uhhhh
By lunch on Thursday, I got it working like it did in STL: stl

Here is a short "video" a student and I made during calibration at STL: video. We tweaked some parameters and got it to work nearly perfectly. As you can guess, we tracked the tape and not the LEDs for hot goal detection. I somewhat regret that decision, but it's whatever now.

final

Joe Ross 12-10-2014 21:57

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by MrRoboSteve (Post 1403753)
Our team is wanting to get serious about vision this year, and I'm curious what people think is the state of the art in vision systems for FRC.

Questions:

1. Is it better to do vision processing onboard or with a coprocessor? What are the tradeoffs? How does the RoboRIO change the answer to this question?

2. Which vision libraries? NI Vision? OpenCV? RoboRealm? Any libraries that run on top of any of these that are useful?

3. Which teams have well developed vision codebases? I'm assuming teams are following R13 and sharing out the code.

4. Are there alternatives to the Axis cameras that should be considered? What USB camera options are viable for 2015 control system use? Is the Kinect a viable vision sensor with the RoboRIO?

I don't think that most teams fail at vision processing because of any of the items listed. FIRST provides vision sample programs for the main vision task that generally work well. Here's what I think teams need to work on to be successful with vision processing:
  1. You need to have a method to tweak constants fairly quickly, to help with initial tuning and also to tweak based on conditions at competition.
  2. You need to have a method to view, save, and retrieve images, which can help tune and tweak the constants (see the sketch after this list).
  3. You need to have a way to use the vision data, for example accurately turn to an angle and drive to a distance.
  4. You need to understand exactly what the vision requirements are for the game. Most of the time, there are one or more assumptions you can make which will greatly simplify the task.
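
On point 2, one cheap approach is to dump every Nth frame to disk with a sequence number so the images can be pulled off later and used to re-tune the constants against real field lighting. A sketch; the path and interval are made up:

Code:

// Save roughly one frame per second so there is real match footage to tune against.
#include <opencv2/opencv.hpp>
#include <cstdio>

void maybeSaveFrame(const cv::Mat &frame)
{
    static int counter = 0;
    const int saveEvery = 30;                      // roughly once a second at 30 fps
    if (counter++ % saveEvery != 0) return;

    char path[64];
    std::sprintf(path, "/home/lvuser/images/frame_%05d.png", counter);   // hypothetical on-robot path
    cv::imwrite(path, frame);
}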

As for your third question, Team 341's 2012 vision sample program is probably the most popular: http://www.chiefdelphi.com/media/papers/2676

As for us, we've used LabVIEW/NI Vision on the dashboard PC. This makes it much easier to tweak constants and view and save images.

yash101 14-10-2014 00:07

Re: Vision: what's state of the art in FRC?
 
I have plans for an OpenCV Codegen, where I basically make a drag and drop (more like click to add) interface that writes the C++ code for you. It won't be 100% efficient because it really is just putting together known bits of code that work. It is up to you to thereafter change the variable names and optimise the code. I am trying to learn how to thread HighGUI at the moment so hopefully everything should be 100% threaded!

This will be meant to help beginners (and adept) programmers get OpenCV code down in no time!

I will also try to add two network options -- DevServer-based, and C Native socket calls (Windows TCP/UDP, UNIX TCP/UDP).

I have been slowly working on this project since last year. I am thinking about it being 100% web-based. Hopefully, this will make getting started with OpenCV a no-brainer!

It is my goal this year, to get my vision code completed as soon as possible!

Tom Bottiglieri 14-10-2014 00:23

Re: Vision: what's state of the art in FRC?
 
While CV is a neat field and is definitely worth learning more about, planning to use a solution before you know the problem is probably a bad idea. Our team looks at vision as a last resort as it introduces extra points of failure to an already complicated robot.

I recommend using simple sensors to get your robot the perception it needs, then tuning your control loops/operator interface code until you run into a brick wall. If you really can't achieve the task you want without a ton of driver practice, then look into adding a vision system to give you that last little bit of performance.

billbo911 14-10-2014 00:59

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Tom Bottiglieri (Post 1404189)
While CV is a neat field and is definitely worth learning more about, planning to use a solution before you know the problem is probably a bad idea. Our team looks at vision as a last resort as it introduces extra points of failure to an already complicated robot.

I recommend using simple sensors to get your robot the perception it needs, then tuning your control loops/operator interface code until you run into a brick wall. If you really can't achieve the task you want without a ton of driver practice, then look in to adding a vision system to give you that last little bit of performance.

Tom,
I've heard you make these comments before, and although I don't agree 100%, I fully understand your reasoning and logic. Since I like to look to the Poofs as a team to learn from, I would like to know: was there ever a game where 254 needed to use vision to overcome an obstacle?

yash101 14-10-2014 02:20

Re: Vision: what's state of the art in FRC?
 
I am doing vision less because it is so powerful and more for the experience and learning. There is a lot to learn through it. There's a good amount of math involved behind the scenes, which I am slowly catching up on. It is also dependent on your algorithmic development skills. If you are able to figure out exactly what you want out of your program and draft how you are going to do it, the code is actually quite simple. It only took me a couple of hours to write my actual vision processing code. What took me the longest was A) optimizations, B) features (yes, I overcomplicate things), and C) calibration/testing/getting people to listen and get out of your way when testing :p.

Think of the entire program like a cake. What makes a cake a cake is the bread inside. It may be covered with frosting, or it may be bare. The preparation to bake the cake is the testing protocols and your testing workbench. This step also involves ensuring that what you are trying to solve is feasible and worth the pain. The bread (cake) is your processing loop. This should be your first priority. After you have written your basic processing code, and have a test bench and some testing protocols to ensure it works, you can proceed to the decoration stage: optimizing the code and making it run with peak efficiency is what you should now work on. This stage is just like putting the frosting on the cake. Next, you can start writing on the cake, adding features and Easter eggs! Now that you have successfully made your cake, it is time to inspect it: make sure you don't have any errors or bugs. Use the testing protocols you should have created before even starting this project to ensure everything you want is working. Finally, it is time to eat the cake! Om nom nom! Eating the cake happens at competitions, when your software is working perfectly and you are doing much better than the other robots.

I came up with this model last year, when I failed to get my vision program completed in time. I started to write on the frosting before I even baked the cake, so there was nothing supporting my excessive features and everything broke. I also did not have a proper testing protocol last year, so my first demo to my mentors was a flop. I didn't know about threading back then, and the camera buffers were overflowing, so I was getting 20+ seconds of lag. That is not a very good first impression.

Because of these problems I faced last year, I have been working on some software to help me get to the cake first, which are also open source. I have some grabber templates and I am working on a small OpenCV extension, as I mentioned before.

My two cents: If you are wanting to pursue anything complex in the next build season, get started right now. Create your development/testing platform, so when you get started coding, you have one step out of your way!

MrRoboSteve 14-10-2014 10:37

Re: Vision: what's state of the art in FRC?
 
Thanks a lot everyone for your comments. Here are my notes from the thread.

General notes

Many teams are using vision, but it's not required to be successful. Oftentimes there is a simpler control strategy than vision for a particular task.

Teams using vision can't expect a lot of troubleshooting support from the CSA at the event.

Processing

There are two main strategies for performing vision processing.

1. Event -- perform a specific task
a. Aim -- e.g., robot is driven into position, and an aiming command is given.
b. Autonomous scoring -- moving robot into known position on field
c. Ramp -- robot automatically drives over 2012 ramp
2. Continuous -- the vision subsystem runs continuously, identifying one or more objects and feeding image telemetry as an input to the robot program
a. Drive to known position
b. Create HUD style display (image with overlay) to show driver
c. Indicate when robot is in scoring position to driver
Note that most of these are using telemetry as input to an autonomous process.


You have several choices of where vision processing runs, each of which has benefits and drawbacks.

Driver station
+ have full power of PC
+ NI libraries available
+ fairly easy to interface
- Communications limits between robot and driver station prevent certain algorithms from working. This can be a big limitation
+ easy to display telemetry to drive team
+ can use DS software to move telemetry to robot

cRIO
+ NI libraries available
+ simplest interfacing between vision program and robot programs
- Running vision on separate thread/process makes programming more complicated.
- Easier to crash robot program (e.g., memory management issues)
- Limited CPU power. Current WPILib with cRIO at 100% CPU exhibits unpredictable behavior
+ Easier to move images to DS than coprocessor option
- IP camera support only -- no USB camera support

roboRIO
+ NI libraries available
+ potential for OpenCV to work, but some questions about whether NI Linux Real-Time has necessary libraries
+ simplest interfacing between vision program and robot programs
- Running vision on separate thread/process makes programming more complicated.
- Easier to crash robot program (e.g., memory management issues)
+ USB support allows direct interfacing to cameras
+ Much more CPU power than cRIO
CPU 4-10x faster
Has NEON instruction support, which looks like it's supported in OpenCV. Unclear on NI Vision.

External single-board computer (SBC or coprocessor)
+ Many choices of hardware available, some more powerful than roboRIO.
Popular examples include Arduinos, Raspberry Pi, PCDuino, GHI Fez Raptor.
Nvidia Jetson TK1 looks like a monster board -- 2GB of RAM, 192 GPU cores, Tegra K1. OpenCV 2.4 doesn't appear to support the GPU, though.
SBC with a video output is easier to troubleshoot than one without.
+ Some hardware supports hardware graphics speedup (vector instructions, GPU)
+ Many SBCs have USB support, allowing direct camera interfacing
- No NI library support
- Requires ability to do UDP packet processing
- Display of image on DS is more difficult

Software

NI Vision generally considered to be easier to set up.

If you want the option of using a single board computer (vision coprocessor), you probably want to code in C++ or Java, as code can run in any of the three locations.

Running a web server on your coprocessor can make things easier. http://code.google.com/p/mongoose/ is one.
http://ndevilla.free.fr/iniparser/ is one of many free configuration file parsers written in C

Camera

Camera calibration is an essential part of the process. Ensure that the camera you select can be calibrated, and the settings persist through reboot/power cycles.
Mounting location also essential
Need to make sure your software library can acquire images from your camera. UVC is standard for USB cameras. UVC 1.5 supports H.264 video, which can be faster to process in certain ways if your vision processor supports it.
Some question about whether USB can support frame rates above 30 Hz

Cameras
Axis cameras (from KOP) are good choices for people just starting out. There is good built in WPILib support, and they maintain their settings through reboots.
Kinect works too. Depth map can be very useful. Driver support in OSS world seems rough.
Other interesting cameras: Asus Xtion, Playstation Eye
Future: Pixy
LED ring lights (typically green, don't use white) are considered essential

Vision programming tactics

. Need to be able to modify parameters at runtime
Driver station dashboard parameter setting
Config file on robot filesystem
Config file is more flexible because you could have named presets, selected via the DS dashboard, that combine several parameter settings
. OpenCV is very popular. NI Vision is also viable. No commenter supported RoboRealm; one felt it was too simple (but is that bad?!) and another was held back by fears about licensing issues
. It's debatable whether having a FRC specific library on top of a vision library has any use.
Lower the resolution if you need to run at a higher frame rate
Should have a calibration procedure that you use at competitions, which includes moving robot around competition field and taking a bunch of pictures through webcam to use back in pit for calibration purposes.
Some venues are really bad: https://www.dropbox.com/s/j8ju2ttvx7...et..png?dl=0

Resources

Team 2073 Vision Code from 2014: http://www.chiefdelphi.com/forums/sh...d.php?t=128682
pcDuino 3: http://www.pcduino.com/pcduino-v3/
roboRIO OS whitepaper: http://www.ni.com/white-paper/14627/en/
Team 987 Kinect Vision whitepaper from 2012: http://www.chiefdelphi.com/media/papers/2698
openCV camera calibration: http://docs.opencv.org/doc/tutorials...libration.html
Team 3847 Whitepaper on Raspberry Pi: http://www.chiefdelphi.com/media/papers/2709
Team 341 sample vision program from 2012: http://www.chiefdelphi.com/media/papers/2676

marshall 14-10-2014 11:49

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by MrRoboSteve (Post 1404219)
Nvidia Jetson TK1 looks like a monster board -- 2GB of RAM, 192 GPU cores, Tegra K1. OpenCV 2.4 doesn't appear to support the GPU, though.

To be clear, OpenCV is supported on the GPU for the Jetson, just not with Python (to my knowledge). C++ definitely works on the GPU on the Jetson. We've had some awesome early success with it processing 640x480 images at like 60-80fps.

Tom Bottiglieri 14-10-2014 13:56

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by billbo911 (Post 1404191)
Tom,
I've heard you make these comments before, and although I don't agree 100%, I fully understand your reasoning and logic. Since I like to look to the Poofs as a team to learn from, I would like to know if there ever was a game where 254 needed to use vision to overcome an obstacle?

We used vision in 2012 to align with the basket from the key. You really had to be tight on the target that year and our long 8wd was not super easy to align by hand. We were able to pull out the angle of the robot relative to a line between the center of the robot and the center of the backboard, as well as the angle of said line relative to the field. This allowed us to aim a bit left or right of the center of the backboard based on whether we were on the right/left/center side of the key.

billbo911 15-10-2014 02:10

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Tom Bottiglieri (Post 1404246)
We used vision in 2012 to align with the basket from the key. You really had to be tight on the target that year and our long 8wd was not super easy to align by hand. We were able to pull out the angle of the robot relative to a line between the center of the robot and the center of the backboard, as well as the angle of said line relative to the field. This allowed us to aim a bit left or right of the center of the backboard based on whether we were on the right/left/center side of the key.

Yes, I remember seeing it in action at CVR that year.
We used a similar process, but only to align our turret, not the entire robot. We had fairly decent success that season using the cRIO to do the processing. Fortunately, we were able to relegate the cRIO to only processing the image and determining alignment and distance, without having to do anything else during the process.

controls weenie 20-10-2014 20:39

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by marshall (Post 1404230)
To be clear, OpenCV is supported on the GPU for the Jetson, just not with Python (to my knowledge). C++ definitely works on the GPU on the Jetson. We've had some awesome early success with it processing 640x480 images at like 60-80fps.

Marshall, how did you get the camera to output more than 30 Hz? What camera and OS did you use on the Jetson? The PCDuino/Linux driver would only read at 30 Hz. We would crop our image to get higher frame rates.

techhelpbb 20-10-2014 20:54

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by MrRoboSteve (Post 1404219)
...
Processing
...

You forgot a laptop or an Android device on the robot.

Also, do not assume that a CSA will not try to help you, but it could be asking quite a lot.
I know I've always offered to help as a CSA, but if you do something very complicated, it's hard to justify the time to fix that over, say, getting a totally immovable team running.
So I will say... help us... help you :).
The CSA can't, and probably the FTA can't, turn the field inside out for your team's robot vision.
Then again, if I notice it is not plugged in, I might suggest you fix that.

Quote:

Originally Posted by controls weenie (Post 1405147)
Marshal, how did you get the camera to output more than 30 Hz? What camera and OS did you use on the jetson. The PCDuino/Linux driver would only read at 30Hz. We would crop our image to get the higher frame rates.

I would guess they used a PS3 Eye camera, which can go up to 100 fps with the right setup.
Be aware that I am only guessing based on my experience with that camera on a Linux laptop.

marshall 20-10-2014 22:10

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by controls weenie (Post 1405147)
Marshal, how did you get the camera to output more than 30 Hz? What camera and OS did you use on the jetson. The PCDuino/Linux driver would only read at 30Hz. We would crop our image to get the higher frame rates.

We got our frame rate higher by dropping the resolution. Most webcams from Logitech/MS seem to support dropping the resolution to push the frame rate above 30 FPS. I think it's a Logitech C920 we are using, but I'm not certain. The OS is the stock Ubuntu that comes on the board. It has been updated to the latest versions, and we are using the latest stable OpenCV and the latest CUDA for Jetson from NVIDIA (6.0, not 6.5 yet :( ).

Honestly though, going beyond 30 FPS doesn't seem to help as much as higher resolution does. We're happy to trade one for the other, at least for what we are working on. We want more precise targeting and distance calculations.

Right now the team is working on just basic object tracking using a crappy video I shot with the webcam of a co-worker juggling some red balls. The lighting was complete crap so the students are having to do a lot of processing to figure out the lighting, which is good because that will hopefully make them better at figuring it out come competition.

I had no idea about the PS3 eye. I own one of those so maybe I'll bring it in for the students to experiment with but as I mentioned, FPS isn't a big deal.

JamesTerm 21-10-2014 09:58

Re: Vision: what's state of the art in FRC?
 
This goes without saying, but while this thread has a lot of great information ... there is still more out there (within CD) worth finding as some teams may be out of steam to respond now.

One of my goals is to benchmark the roboRIO for vision processing by using the IMAQ calls within NI Vision, and I'll post results in the FRC Beta Test Central forum. It will be at least two weeks from now, though.

I figure if I tell you this now... it will help motivate me to get it done. ;)

sparkytwd 27-10-2014 18:37

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by marshall (Post 1405170)
I had no idea about the PS3 eye. I own one of those so maybe I'll bring it in for the students to experiment with but as I mentioned, FPS isn't a big deal.

One of these days I should post a pic of the spent PS3 Eye casings in our supply cabinet. They are fantastic cameras, simply because they were designed with computer vision in mind. You also get a lot of flexibility in terms of lenses and filters.

I've been looking out for good USB3 cameras to offer better resolution at a high framerate, but so far nothing can beat the price of the PS3 eye.

Greg McKaskle 27-10-2014 20:48

Re: Vision: what's state of the art in FRC?
 
I understand there are a number of USB3 board cams on the horizon. What are the details and comparison points of the PS3 eye?

Greg McKaskle

techhelpbb 27-10-2014 23:19

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Greg McKaskle (Post 1406050)
What are the details and comparison points of the PS3 eye?

Vision libraries for Windows and the PS3 Eye:
http://codelaboratories.com/products/
http://codelaboratories.com/research...typical-webcam
http://codelaboratories.com/research...ye-disassembly
They also produce the DUO with 2 cameras.

Camera specifications:
http://en.wikipedia.org/wiki/PlayStation_Eye
http://www.psdevwiki.com/ps3/PlayStation_Eye

Removing the IR filter:
http://www.peauproductions.com/cameras.html

Some unusual fun:
http://peauproductions.com/store/ind...ndex&cPath=136
http://peauproductions.com/store/ind...x&cPath=136_16

yash101 30-10-2014 14:05

Re: Vision: what's state of the art in FRC?
 
I need to check out that eye driver. Do you know if it will run on Linux? I also do not know where to find a free version!

techhelpbb 30-10-2014 19:12

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by yash101 (Post 1406493)
I need to check out that eye driver. Do you know if it will run on Linux? I also do not know where to find a free version!

I would have to ask your Linux kernel version and distro but the gspca driver for the OmniVision CMOS cameras often finds itself in the newer kernels for free.
So there is a chance if you plug in a PS3 Eye you can just use it with Cheese or VLC.
Standard USB device location finding for Linux may apply.

I suggest you read up on udev and video4linux.
It is actually pretty easy once you nail down the setup to retrieve raw webcam video at the Linux prompt without a GUI.

On Mac OSX check out the current Macam project:
http://webcam-osx.sourceforge.net/

yash101 30-10-2014 22:44

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by techhelpbb (Post 1406517)
I would have to ask your Linux kernel version and distro but the gspca driver for the OmniVision CMOS cameras often finds itself in the newer kernels for free.
So there is a chance if you plug in a PS3 Eye you can just use it with Cheese or VLC.
Standard USB device location finding for Linux may apply.

I suggest you read up on udev and video4linux.
It is actually pretty easy once you nail down the setup to retrieve raw webcam video at the Linux prompt without a GUI.

On Mac OSX check out the current Macam project:
http://webcam-osx.sourceforge.net/

Really?! OpenCV already uses V4L for video input :D. Things should be much easier then!

techhelpbb 31-10-2014 05:57

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by yash101 (Post 1406550)
Really?! OpenCV already uses V4L for video input :D. Things should be much easier then!

I am aware of OpenCV using V4L. I mentioned it elsewhere in the forums.

The issue you need to be aware of is that in several Linux distros, when udev finds the webcam, it will not map a given webcam to the same device node consistently.

Without understanding the layers, it will be difficult to understand how to craft a udev rule or where to troubleshoot the issues you may encounter. So I can either suggest how to build the foundation, or I can merely pretend that looking at OpenCV is enough. The value OpenCV really adds is that you get lots of operations you do not have to write for handling raw video and setting camera parameters, once you have a clear path to communicate with the camera.

I've had students in the past simply use the raw video quite successfully.
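
To make the udev point concrete, here is one way I've seen the mapping problem worked around. This is only a sketch, and it assumes your distro's udev rules populate /dev/v4l/by-id (most recent ones do, but check with ls first).

Code:

import glob
import cv2

# udev creates stable, name-based symlinks under /dev/v4l/by-id, so the same
# physical camera resolves to the same path no matter which /dev/videoN it got.
paths = sorted(glob.glob("/dev/v4l/by-id/*index0"))
device = paths[0] if paths else "/dev/video0"   # fall back to the raw node

cap = cv2.VideoCapture(device)                  # OpenCV's V4L backend accepts a device path
print("Opened", device, "->", cap.isOpened())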

NotInControl 31-10-2014 08:13

Re: Vision: what's state of the art in FRC?
 
Typically a PS3 Eye can do 640 x 480 @ 60 fps, or 320 x 240 @ 120 fps, based on the manufacturer's spec.
The Axis camera we all have maxes out at 30fps at every resolution it supports.

Note this is the rate at which the camera can capture frames. Your application will need to download the image, and then process it. So your actual frame rate is dependent upon how fast you can download the image from the camera, and then process it.

For 2014 we limited the fps of our Axis camera to 20fps to ease the burden on our processing. Our calculation loop was 10ms (running on a BeagleBone), and it took about 60-90ms to download a 640x480 image over our local network and store it into a 2D array; the image was a compressed JPEG and our images were never more than 20kB. This was over Ethernet; direct USB should be much faster, but the time it takes to capture the image and store it in a 2D array, even over USB 2.0, is not negligible. Anyone have any good numbers for how long it currently takes them to capture a 640x480 frame and store it over USB?
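
If anyone wants to collect those numbers, here is a rough way to measure it with Python/OpenCV over USB. The device index and the 640x480 request are assumptions, and driver-side buffering means the first few reads are not representative.

Code:

import time
import cv2

cap = cv2.VideoCapture(0)                       # assumed USB camera index
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

times_ms = []
for i in range(110):
    start = time.time()
    ok, frame = cap.read()                      # capture + copy into a numpy array
    if ok and i >= 10:                          # skip the first reads while buffers settle
        times_ms.append((time.time() - start) * 1000.0)

print("average capture time: %.1f ms over %d frames" % (sum(times_ms) / len(times_ms), len(times_ms)))
cap.release()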

I am not sure what compression is available on the PS3 Eye, but if there isn't any, your 640x480 images will be much larger and thus take longer to download, which cuts into your overall per-frame processing time.

If you already have the camera on hand, then I understand the reason for using it, and I would say just go ahead and try it out now in an off-season test. However, if you are in the market for a new camera, one that will last for the next couple of years, I would suggest making sure that the camera you get has built-in compression like H.264 and can support at least 30fps at that resolution.

For us, I would rather have 15fps at a higher resolution than a higher frame rate at a lower resolution. Think about that.

The Axis IP cams we have can do this, and the Logitech C920 also has built-in H.264 encoding and is a very popular Linux webcam. However, another feature you might want to look into is the ability to adjust the camera settings. I do not know the process for adjusting the exposure and shutter settings on the PS3 Eye or Logitech C920, but one of the reasons I like the Axis cam is that not only can you adjust its settings from its built-in web server, regardless of what system you use to process its images, but it also includes API access through the camera URL which allows you to change camera settings on the fly.

This means you can turn your exposure all the way down for auto to do object detection, and then turn it back up in teleop so it can provide a full-color recording during the match. More info on the API can be found here; most people don't know about this functionality built into most Axis camera products.
http://www.ce.rit.edu/research/proje...TP_API_3_00.pd and http://www.axis.com/techsup/cam_serv...urce_i0_sensor (see section 2.2.16)

Camera settings are very important; if they are not adjusted correctly, you can waste unnecessary time processing your images because they contain a lot more than just the target objects. If your goal is to process objects that are not retro-reflective (like the balls in 2014), then higher resolution will help you, not higher frame rate.

Other very important specs to look at when shopping for a camera are shutter time and light sensitivity.

FIRST is targeting official support for the Microsoft LifeCam HD3000, which is a 720p cam, and states it can do 30fps at that resolution. FIRST mentioned they would be sending these cams to certain beta teams, so if we get one, I will be trying it out.

Regards,
Kevin

P.S. Post 200, I should retire from Chief now!

MrRoboSteve 31-10-2014 08:15

Re: Vision: what's state of the art in FRC?
 
Hadn't looked at the PS3 Eye before. Hard to resist at $9.25 with Prime shipping on Amazon.

techhelpbb 31-10-2014 10:10

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by NotInControl (Post 1406579)
Note this is the rate at which the camera can capture frames. Your application will need to download the image, and then process it. So your actual frame rate is dependent upon how fast you can download the image from the camera, and then process it.

This is the rate at which the PS3 Eye can capture frames and the rate at which it can send them to the computer. That is important to note. Per one of my links above, some webcams play a game where they send you more frames than the sensor can actually produce, effectively creating a high frame rate that is bogus.

The PS3 Eye was specifically designed to capture and send those high frame rates. However, you are entirely correct: if your USB port or the underlying processing is too slow, it does not matter whether you send 20fps or 120fps; most of the frames will be useless and will have to be ignored. In fact, the PS3 Eye can bury a USB 1.1 port because the data flow it produces is just too great.

The value is that the PS3 Eye can achieve those frame rates with minimal compression artifacts and right up to the port. What the user chooses to plug into it is up to them.

In the past we had a student pulling raw frames from V4L, talking to a PS3 Eye at over 50fps (throwing away the extras), in OpenJDK on a dual-core netbook legal for a FIRST robot, doing some very simple retroreflective tape tracking. That's pretty fast considering that it wasn't fully compiled code and had the JVM overhead. Obviously with a netbook, and potentially the Jetson TK1 (never tried this on the Jetson, but the Jetson supports this camera), you can compile code and potentially leverage CUDA, so you can raise the frame rate you can fully process very high.
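
For anyone wondering what "very simple" tracking looks like, the core of it fits in a few lines. This is a Python/OpenCV sketch rather than that student's OpenJDK code, and the HSV threshold numbers are assumptions you would tune for your own ring light and exposure.

Code:

import cv2
import numpy as np

def find_tape(frame_bgr):
    """Return bounding boxes of bright retroreflective blobs (threshold values are guesses)."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # With a green LED ring and a low exposure, the tape shows up as saturated bright green.
    mask = cv2.inRange(hsv, np.array([40, 80, 150]), np.array([90, 255, 255]))
    # findContours returns 2 or 3 values depending on the OpenCV version; [-2] is the contour list.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 100]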

Quote:

Originally Posted by NotInControl (Post 1406579)
Anyone have any good numbers for how long it currently takes them to capture a 640x480 frame and store it over USB?

From this link I also provided above:
http://codelaboratories.com/research...typical-webcam

Quote:

PS3Eye Data Requirements
Let's take a look at some simple data transfer numbers. The PS3Eye is strictly a high-speed USB 2.0 device, meaning that if you have a USB 1.0 port on your computer, the camera simply won't be able to transfer images to your machine. The camera provides the highest quality image sensor data (uncompressed raw data). As such, the amount of data to be transferred can easily saturate the USB 2.0 bus!
For example, let's take a look at a 640x480 24bit RGB image captured at 30fps:

Required Bandwidth = 640*480*30*3 = 27648000 bytes/s = 26.36 MB/s - Which is about 66% of total usable USB 2.0 bandwidth!!!

The 640x480 24bit RGB image captured at 60fps results in:

Required Bandwidth = 640*480*60*3 = 55296000 bytes/s = 52.73 MB/s - Which exceeds the total usable USB 2.0 bandwidth!!!
Read down to see how the PS3 Eye got around needing more bandwidth than USB2 offers. They packed and reduced the color data in a specific way.
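
To put numbers on "packed and reduced the color data": YUYV carries 2 bytes per pixel instead of 3, which is what pulls the 60fps mode back within reach of USB 2.0. A throwaway Python check of the arithmetic:

Code:

def raw_mb_per_s(width, height, fps, bytes_per_pixel):
    """Uncompressed bandwidth in MB/s for a given video mode."""
    return width * height * fps * bytes_per_pixel / (1024.0 * 1024.0)

print(raw_mb_per_s(640, 480, 30, 3))   # ~26.4 MB/s: 24-bit RGB at 30fps, ~66% of usable USB 2.0
print(raw_mb_per_s(640, 480, 60, 3))   # ~52.7 MB/s: 24-bit RGB at 60fps, more than USB 2.0 can carry
print(raw_mb_per_s(640, 480, 60, 2))   # ~35.2 MB/s: YUYV (4:2:2) at 60fps, back within reach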

Quote:

Originally Posted by NotInControl (Post 1406579)
I am not sure what compression is available on the PS3 Eye, but if there isn't any, your 640x480 images will be much larger and thus take longer to download, which cuts into your overall per-frame processing time.
...
The axis IP cams we have can do this, the Logitech c920 also has built in h.264 encoding..

Compression like that in H.264/MPEG-4 AVC produces artifacts as it removes image data. That is why the more you compress the video from, say, an Axis camera, the worse it looks.
http://www.cs.ucla.edu/classes/fall0...4_Tutorial.pdf

It also requires the device receiving the compressed stream to decode it, which increases the time required to get at the raw frames.

Both of these facts are worth considering if your webcam forces you to send compressed video to your image processing system. You want to get as close to the actual images as you can, even if you are setting up the camera to be more sensitive under a certain circumstance.

The PS3 Eye achieves those frame rates now; it doesn't need changes to make it work. Whether your PC or embedded device can handle that pile of inbound data is really up to your choices.

A much better way to retain the data and reduce the flow is to set up the camera to reduce the amount of color data per pixel that needs to be sent. For example, the PS3 Eye uses YUYV/YUY2, also called YUV 4:2:2.

http://linuxtv.org/downloads/v4l-dvb...-FMT-YUYV.html

This means it sends less than 24-bit color, but given the lighting conditions that is not such a big deal.

There is JPEG compression support in the PS3 Eye as well, though I suspect it is less advanced than the latest H.264 standard. It is a slightly older camera, but it was specifically designed with motion and object tracking in mind.

The higher resolution cameras need compression to make it possible to send the color data for an increasingly higher number of pixels down a pipe that is too small to support it. The price you pay for that high resolution is therefore the compression. A real upside of the compression can be that if you really tinker with the camera color balance, you can let the compression toss away the blacked-out portions of the image in the camera, which of course means you have already lost image data because of that camera setup.

Quote:

Originally Posted by NotInControl (Post 1406579)
FIRST is targeting official support for the Microsoft LifeCam HD3000, which is a 720p cam, and states it can do 30fps at that resolution. FIRST mentioned they would be sending these cams to certain beta teams, so if we get one, I will be trying it out.

I literally have a small crate of webcams, including 5 older Microsoft webcams we bought when we were doing the evaluation of the PS3 Eye. The Microsoft webcams were more expensive, the drivers for Linux were not nearly as well developed (we were in contact with the package maintainer), and the quality of the white balance was inconsistent (easily saturated but slow to recover, and dramatically sensitive to light conditions). Obviously with each version there is an opportunity for Microsoft to improve how they process their data, but the core of that camera is still an OmniVision CMOS sensor, and what configuration Microsoft will document and expose is variable. If you would like, I will send you one of my older Microsoft cameras for free, as they are collecting dust, and I will even pay the shipping.

The core point is this:

Sony made the PS3 Eye to track quick movements of game players for their typically stationary console.
Microsoft made their webcam to make videos from a desk.
Neither was designed for mounting on a mobile robot.

If the goal is to track the whole end of the field from half-field, you probably need resolution more than frame rate. So in that case the PS3 Eye might not be the best choice.

If the goal is to track the movements of an actuator, or say a ball thrown over the bridge, then you probably do not need high resolution; you need a high frame rate. That was a perfect use of the PS3 Eye, and it would only have been better if the balls had LED lights in them.

billbo911 31-10-2014 11:55

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by MrRoboSteve (Post 1406580)
Hadn't looked at the PS3 Eye before. Hard to resist at $9.25 with Prime shipping on Amazon.

Ordered one this morning. Believe it or not, it will be delivered Sunday. Amazon Prime is now able to hold to 2 day delivery even on Sunday!!

yash101 01-11-2014 19:20

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by techhelpbb (Post 1406570)
I am aware of OpenCV using V4L. I mentioned it elsewhere in the forums.

The issue you need to be aware of is that in several Linux distros, when udev finds the webcam, it will not map a given webcam to the same device node consistently.

Without understanding the layers, it will be difficult to understand how to craft a udev rule or where to troubleshoot the issues you may encounter. So I can either suggest how to build the foundation, or I can merely pretend that looking at OpenCV is enough. The value OpenCV really adds is that you get lots of operations you do not have to write for handling raw video and setting camera parameters, once you have a clear path to communicate with the camera.

I've had students in the past simply use the raw video quite successfully.

We are probably going to use only one camera. Even if that is not the case, I have a plan where I map the cameras using some extremely basic marker detection!

Cameras:
PHP Code:

      +---+--+
     /      /
    /      /   <== left camera, banner is red
   +---+--+

      +---+--+
     /      /
    /      /   <== right camera, banner is green
   +---+--+

Hope my ASCII art perspective was great.

Basically, using the color of a banner placed in front of each camera (or maybe a small red/green dot drawn on the camera itself), the cameras can be marked as left or right at program boot!
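
A quick Python/OpenCV sketch of that idea (the device indices and the crude "mean red vs. mean green" test are assumptions; a real version would want a sanity margin between the two channels):

Code:

import cv2

def banner_side(device):
    """Grab one frame and decide left/right from which color dominates the banner."""
    cap = cv2.VideoCapture(device)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return None
    b, g, r = cv2.mean(frame)[:3]           # average per channel over the whole frame
    return "left" if r > g else "right"     # red banner => left camera, green banner => right

print({dev: banner_side(dev) for dev in (0, 1)})   # e.g. {0: 'right', 1: 'left'}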

Ken Streeter 01-11-2014 23:38

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Joe Ross (Post 1404016)
Most of the time, there are one or more assumptions you can make which will greatly simplify the task.

+1

Simplify, simplify, simplify. We've had success using vision processing on the robot (2006, 2009, 2012-2014) largely through simplification of the processing.

In 2006 we did in-camera processing in the CMUcam. In 2009, we did on-board processing in the cRIO. In 2012-2014, we did vision processing in the driver station laptop and sent processing results (target information) back to the robot. All of these can be made to work effectively.

In all of these cases, we tried to simplify the problem as much as possible. Use camera exposure, orientation, mounting, etc., to constrain the problem as much as you can. Reduce image sizes as much as you can. (You'll be surprised how few pixels you need; we've never had to go above 320x240, even with a full-court shooter in 2013.) Look for the easiest-possible target.

Think about fault-tolerance in your processing. If the approach you are trying is susceptible to false alarms, or color shift, or ..., change/simplify your approach to avoid the issue; don't be tempted to try to solve the issue by additional processing; try a different approach that works around the problem in a simpler way.

Test your vision processing in as many different environments as possible. Natural light, artificial light, etc. Tweak your approach until it is highly tolerant to image variation.

In short, for the vision approach to be effective, it has to be very reliable or it won't get used. A great way to increase reliability is to keep it as simple as possible.

Lastly, I'll share one of our "secrets" that nobody ever believes because it isn't the "normal vision processing approach": instead of determining range-to-target by size of a target in an image (which typically degrades poorly with lighting changes, color changes, viewing angle, etc.), hard-mount the camera to the robot at a different height than the target and use "elevation of target in field of view" to get range. We came up with this approach in 2006 (when size of target wasn't really an option) and have found it to be just as accurate as size-of-target processing, but much more tolerant of image degradation. At 15' to 20' we can resolve robot "range to target" within 2 inches. In 2013, at a range of 50' (full-court shooter) we had accurate "range to target" within 4 inches.
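
For anyone who wants to try Ken's elevation trick, the trig is short. This is a generic Python sketch, not 1519's code: every number below is a placeholder for your own measured geometry, and it approximates the per-pixel angle as linear across a small vertical field of view.

Code:

import math

CAMERA_HEIGHT_FT = 1.5     # lens height above the carpet (placeholder)
TARGET_HEIGHT_FT = 8.0     # height of the target's center (placeholder)
CAMERA_PITCH_DEG = 20.0    # upward tilt of the camera (placeholder)
VFOV_DEG = 37.0            # vertical field of view of the lens (placeholder)
IMAGE_HEIGHT_PX = 240      # processing at 320x240, per the post above

def range_from_elevation(target_center_y_px):
    """Estimate floor distance to the target from its vertical position in the image."""
    # Pixels above the image center correspond to angles above the optical axis.
    offset_px = (IMAGE_HEIGHT_PX / 2.0) - target_center_y_px
    elevation_deg = CAMERA_PITCH_DEG + offset_px * (VFOV_DEG / IMAGE_HEIGHT_PX)
    # Right triangle: the opposite side is the camera-to-target height difference.
    return (TARGET_HEIGHT_FT - CAMERA_HEIGHT_FT) / math.tan(math.radians(elevation_deg))

print(range_from_elevation(120))   # target centered vertically -> range at exactly the camera pitch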

techhelpbb 02-11-2014 08:52

Re: Vision: what's state of the art in FRC?
 
I was a volunteer at RampRiot 2014 this weekend...

As predicted, a few people sent their Axis camera video to their driver's station, in most cases just so they could display it. At least 5 teams skyrocketed their missing packet count because the camera consumed the available bandwidth. Only one team had their software set up so that, despite missing packets (not quite up to the point of being disabled), the robot would generally keep moving in the last direction it was given. The rest of those 5 teams noticed performance issues as the result of missing inbound direction from the field to the robot. In some cases the lost packets were so frequent that the robot ended up disabled, because the robot software you do not control will disable a robot that does not receive an FMS packet before the timeout.

My personal advice, backed by several years of experience as a CSA and as an experienced network engineer: the available bandwidth you have on the field is at the mercy of the field operations at the competition and the field operations at FIRST. Be prepared, if you send video back to the driver's station, to potentially have to reduce the quality of that video or even turn it off (especially if you do not know how it is configured or can not reconfigure it). I dislike having to point this issue out to teams every year, over and over, and I do not think it is fair to the teams that stumble into it because they saw other teams do it. For example, if you watched the team that lost 3,500 packets on the field over the course of the match and kept moving, you would swear, if you did not understand the details, that there was no issue, even though 3,500 missed packets were 3,500 directions their robot did not get. Obviously if the team that can lose 3,500 packets is okay with that limitation, that is a design choice I have to honor. However, even in that case, all it would take is a subtle change in the field and they would lose 2x, 4x or more packets, and then they would likely end up disabled because of the timeout as well.

Just be clearly aware: if you send video back to the driver's station, you can find yourself at the field on a competition day and see your missing packet count skyrocketing (there's a display next to the field we use to do diagnostics; it has 6 x 4 round red/green circles on it), and the FTA can dig into the field software (for a short time after a match) and see graphs that will show your robot is smacking into the bandwidth limits. There are other things that can cause this, but I am starting to think I should just write up a sign with this information so I can save the talking ;)

Another way teams manage to pull off sending video to the driver's station (other than designing their software to better deal with missed packets) is to set up the camera so that the picture is no longer really the sort of video people expect (obviously these 5 teams at RampRiot wanted the sort of video people expect from a TV, so that is what they sent). In these cases (and Ken just suggested this again) they changed the camera settings so that large portions of the video are drowned out. This not only makes identifying a piece of retro-reflective tape easier, but it also causes the camera compression to crush down the drowned-out portion of the video because that information is all the same. This is also a compromise. It makes the video stream back to the driver's station even smaller, but again, it is possible you could still eventually find yourself dropping packets because of it. All it would take is a lot of illuminated retro-reflective surfaces to send, or a change in the field settings.

If it wasn't for the retro-reflective tape, the camera setting change would be much more difficult to accomplish. With a proper light source, the retro-reflective tape shines brightly back at the camera, allowing the camera to be set up in such a way that the rest of the image is very dark or even black while the tape is still visible. If the goal were to track something not lit by a light source, these camera setup tricks would not be nearly as effective, because you would likely end up sending full-color, TV-style video again.

If you really must send image data back to the driver's station, please consider sending a series of images. If you use TCP, you can break the socket between images, and that can restrain the bandwidth that will be used. You can also use UDP if you don't mind writing some protocol. Otherwise, if you really want to process color video, like the sort of video you would expect on a TV, then consider doing it on the robot and not sending that video back to the driver's station.
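
A bare-bones Python sketch of the "series of images over UDP" idea; the address, port, rate, and JPEG quality are all placeholders, and you should check the current game manual for what ports and bandwidth are actually allowed before copying anything like this.

Code:

import socket
import time
import cv2

DS_ADDRESS = ("10.0.0.5", 5800)    # placeholder driver-station IP and port
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
cap = cv2.VideoCapture(0)          # assumed camera index

while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.resize(frame, (320, 240))                                # keep the payload small
    ok, jpg = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 30])
    if ok and len(jpg) < 60000:                                          # one image per datagram
        sock.sendto(jpg.tobytes(), DS_ADDRESS)
    time.sleep(0.2)                                                      # ~5 images/s, not a video stream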

Every year someone argues this, teams end up having this issue, and too often if they just pull the plug on the camera the problem just disappears. In some cases they recode a portion of their software so they can turn off the camera without consequences. So I also recommend that you write your software to tolerate a disconnected camera; that is a fast way to troubleshoot this issue. Video to the field disappears, problems with missed packets go away... it is surely just a coincidence ;) nothing to see there (especially since you turned off the camera :)).

The deeper problem shows up when this is not the only problem. You end up fixing the camera issue only to find it was masking another problem. By the time a team is 1 match in and sees the camera problem, then fixes the camera problem in the 2nd match, or maybe the 3rd, then finds the next problem, they have lost significant opportunity. Plus they might be having wear and tear between matches, such that the camera problem has an even greater impact. So I think this is really something to be aware of. Doing this wrong could cost a team a whole competition.

Greg McKaskle 02-11-2014 12:52

Re: Vision: what's state of the art in FRC?
 
Just to sharpen a few points.

The control packets are sent at 50Hz from the DS. The FTA counter resets at match start, so you should see around 7000 packets sent during a match (50 packets per second over the roughly 2.5-minute match). 3500 lost packets is a lot, but depending on the distribution, it could mean effectively 25Hz control (if the loss is spread evenly) or half the match spent disabled (if it comes in bursts).

The Graphs tab of the DS shows the packet loss per second along with lag. This is later viewable using the DS Log File Viewer app. This tells you during the match how the lost packets were distributed and what the lag was. It will show any periods where the robot is disabled as well. You will find hundreds of logs to compare against at other events, practice fields, etc.

Also, the default dashboard shows the current camera settings, current bandwidth usage due to camera, and an LED is colored according to how this matches the field limits. There is also a control for enabling/disabling the camera and changing the three settings that most affect bandwidth.

I'd be happy to look at some team logs if they would post them.

Greg McKaskle

techhelpbb 02-11-2014 14:45

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Greg McKaskle (Post 1406872)
...and an LED is colored according to how this matches the field limits...

How does this indicator determine what the field limits are?
Is it set dynamically based on the field configuration or set based on a static expectation?
I've seen fields with the channel bonding on and with the channel bonding off.

In the case of a few of these teams that were disabled or had performance issues more than once, I asked to see their DS log viewer, and you could see the bursts of missed packets: more than enough during those peaks to disable their robot until it got more FMS packets.

In the case of the one team that effectively had 25Hz control, I know what they did because I asked the person that helped them work that out, mostly because when I see missed packet counts that high it is usually followed shortly by a complaint.

The problem with all these tools is that you probably will look after you had a problem on a competition field.
Then you might change the settings but by then you are already down a match or if you are lucky maybe a practice match.

Though it is nice to be able to collect evidence, I wonder how much of this evidence gets off the driver's station in the end.
At least 2 of these teams had no idea they had so many logs already.
Since these robots have been getting onto these fields for months, there should be dozens of match logs where this behavior is visible.

Is there a button or control somewhere I do not know about to upload logs to be examined?
I am not seeing one on these directions:
http://wpilib.screenstepslive.com/s/...og-file-viewer
This might be something I am missing, but by the time people are disappointed I think they are not interested in digging around a lot and then figuring out how to send this.
I've asked for this evidence myself in the past so I could forward it, but I've rarely gotten it.

If this feature does not yet exist could I ask for it to be implemented?

Also Greg, if you like, I can send you an iPhone video I have showing a total missing packet count of 5,000 packets from the team with the code that could tolerate this. They never got disabled. I originally made that video to track an issue for Team 11 with the RoboRIO. Might I add, I was happy to see the RoboRIO performed as well as it did, considering the robot it was on was often dragged down to 6V on the battery. As far as I am concerned that was a good improvement.

Greg McKaskle 03-11-2014 11:56

Re: Vision: what's state of the art in FRC?
 
My answers are marked with ***s.

How does this indicator determine what the field limits are?
Is it set dynamically based on the field configuration or set based on a static expectation?
I've seen fields with the channel bonding on and with the channel bonding off.

*** The LEDs are based on the static guidelines. They are in effect with and without a field so hopefully a team will identify a 15Mbps camera before they compete.

*** The field is typically using bonded channels, but for a portion of 2013 it didn't, and in some venues it may not be feasible. Even without bonding, the field "should" have enough bandwidth to run a match with all robots using the recommended bandwidth. But some venues are challenging.

...

The problem with all these tools is that you probably will look after you had a problem on a competition field.
Then you might change the settings but by then you are already down a match or if you are lucky maybe a practice match.

*** The LEDs and Mbps values are there even when not on the field, so hopefully most teams are prepared. The charts are live on the DS, but not in the front tab. Yes, these metrics are mostly for analysis following a match. My time machine is on the fritz.

...

Is there a button or control somewhere I do not know about to upload logs to be examined?

***
The ScreenSteps page indicates that it is available in the Start Menu or in Program Files. There is in fact a button for launching the viewer, but the docs were not updated. In 2015, the button is even more prominent. The button I'm commenting on opens the file in the Viewer on the DS laptop. There is no button to send this to FIRST, since they have a subset of this info already. I don't work for FIRST, so I'll sometimes ask teams if I can get their logs to help identify what happened. The log file viewer shows where they are saved and lets you look at other locations, in case you want to have archives of other years/events.


If this feature does not yet exist could I ask for it to be implemented?

*** Too many pronouns, what feature is being requested?


Greg McKaskle

techhelpbb 03-11-2014 13:22

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Greg McKaskle (Post 1407053)
The LEDs are based on the static guidelines. They are in effect with and without a field so hopefully a team will identify a 15Mbps camera before they compete.

...

The LEDs and Mbps values are there even when not on the field, so hopefully most teams are prepared. The charts are live on the DS, but not in the front tab. Yes, these metrics are mostly for analysis following a match. My time machine is on the fritz.

I think the issue is more that teams get close enough to the bandwidth limit that they think they should be fine, but they are not far enough below the limit. So they get some operation, but then get cut off here and there. Sometimes they say they were fine at the last competition, but then get to another and they are not fine (when in fact they were always skirting this issue). Compression makes this a little more complicated because the bandwidth of the video changes slightly based on what is being shown, and that changes all the time.

So, if a team does not look at the logs, they might think they dragged down the battery or had a software problem. This misleads them into thinking they are not having this issue. Also, some teams change out the robot operators, so they don't think to go back and look at these logs, which would still show the lost packets.

In fairness this is a tool teams should leverage more often.

(We could solve the whole real time video issue if we could just travel forward in time and record the match then play it back in the past. :))

Quote:

Originally Posted by Greg McKaskle (Post 1407053)
The field is typically using bonded channels, but for a portion of 2013 it didn't, and in some venues it may not be feasible. Even without bonding, the field "should" have enough bandwidth to run a match with all robots using the recommended bandwidth. But some venues are challenging.

True enough. Increasingly the 5GHz WiFi spectrum is being gobbled up by school networks and other mobile devices capable of using it. We can't always clear the air to use what we might like.

Quote:

Originally Posted by Greg McKaskle (Post 1407053)
If this feature does not yet exist could I ask for it to be implemented?

*** Too many pronouns, what feature is being requested?

At the time I wrote that, I was wondering if we could get a button that would send the log we have open to a central server somewhere, like perhaps something linked to NI Parkway. Though the more I thought about it, there might not be Internet access at their driver's station at the competition (usually there is... but not always).

So I think I will modify my feature request based on what I just wrote: can we get a button in the DS log viewer that will compress and transfer all the current match logs (or at least the currently displayed log) to a USB flash drive, so that I can just plug in and take all their logs with me by clicking that button? (If this is too much trouble, I can write a little bit of code to achieve this and put it on a USB flash drive; I usually have several with me when I CSA anyway for all the documents and updates.)
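
(For what it's worth, the "little bit of code" could be as small as this Python sketch. The log directory and the drive letter are assumptions based on a typical DS install; adjust them to what is actually on the machine.)

Code:

import glob
import os
import shutil

LOG_DIR = r"C:\Users\Public\Documents\FRC\Log Files"   # assumed default DS log location
USB_DIR = r"E:\ds_logs"                                 # assumed flash drive path

if not os.path.isdir(USB_DIR):
    os.makedirs(USB_DIR)

copied = 0
for pattern in ("*.dslog", "*.dsevents"):
    for path in glob.glob(os.path.join(LOG_DIR, pattern)):
        shutil.copy2(path, USB_DIR)                     # copy2 keeps timestamps so logs still sort by match
        copied += 1
print("Copied %d log files to %s" % (copied, USB_DIR))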

aryker 03-11-2014 14:25

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by MrRoboSteve (Post 1403753)
4. Are there alternatives to the Axis cameras that should be considered? What USB camera options are viable for 2015 control system use? Is the Kinect a viable vision sensor with the RoboRIO?

As a mentor for a beta team, I can tell you that support for the Kinect has been discontinued for the roboRIO (I think their reasoning was that it was too much of a hassle to keep up with Microsoft's updates). Keep this in mind as you plan for the coming year.

EDIT: As Alan pointed out below, this is not entirely correct. The Driver Station will not support the Kinect, but that doesn't really answer your question. Apologies for giving the wrong impression.

Alan Anderson 03-11-2014 14:36

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by aryker (Post 1407094)
As a mentor for a beta team, I can tell you that support for the Kinect has been discontinued for the roboRIO...

I don't think what you wrote is what you meant. Kinect was never "supported" for the roboRIO in the first place. Teams that used it as a robot sensor did so on their own, without using FIRST-provided software.

It's the Driver Station that will no longer support a Kinect next year. That isn't what Steve was asking about.

aryker 03-11-2014 14:46

Re: Vision: what's state of the art in FRC?
 
Quote:

Originally Posted by Alan Anderson (Post 1407096)
I don't think what you wrote is what you meant. Kinect was never "supported" for the roboRIO in the first place...

Corrected. Thanks.

