Jared Russell
11-14-2016, 11:49 AM
Re: 30fps Vision Tracking on the RoboRIO without Coprocessor

This is very cool, though I'm not (yet) convinced that you can get 30fps @ 640x480 with an RGB USB camera using a "conventional" FRC vision algorithm. But now you have me thinking...

Why I think we're still a ways off from RGB webcam-based 30fps @ 640x480: Your Kinect is doing several of the most expensive image processing steps for you in hardware.

With a USB webcam, you need to:

1. Possibly decode the image into a pixel array (many webcams deliver frames in compressed or packed formats, e.g. MJPEG or YUYV, that have to be decoded before you can process them).

2. Convert the pixel array into a color space that is favorable for background-lighting-agnostic thresholding (HSV/HSL). This runs once per pixel per channel (3 * 640 * 480 ops), and each op evaluates a floating point (or fixed point) linear function, usually plus a small decision tree (e.g. branching on which channel is the max) for numerical reasons.

3. Do inRange thresholding on each channel separately (3x as many operations as in your example) and then AND together the outputs into a binary image.

4. Run FindContours, filter, etc... These are usually really cheap, since the input is sparse. (A sketch of this whole pipeline follows below.)
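
For concreteness, here's roughly what that conventional pipeline looks like with OpenCV's C++ API. The camera index, resolution settings, and the green-ish HSV bounds are placeholder values I made up, not numbers from this thread:

[code]
// Sketch of the "conventional" FRC pipeline above (OpenCV C++ API).
// Camera index and HSV bounds are placeholders, not tuned values.
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::VideoCapture cap(0);                        // Step 1: capture + decode
    cap.set(cv::CAP_PROP_FRAME_WIDTH, 640);
    cap.set(cv::CAP_PROP_FRAME_HEIGHT, 480);

    cv::Mat bgr, hsv, mask;
    std::vector<std::vector<cv::Point>> contours;

    while (cap.read(bgr)) {
        // Step 2: per-pixel, per-channel color space conversion
        cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);

        // Step 3: per-channel thresholds, ANDed into one binary image
        cv::inRange(hsv, cv::Scalar(50, 100, 100),
                    cv::Scalar(90, 255, 255), mask);

        // Step 4: cheap, since the binary image is sparse
        cv::findContours(mask, contours, cv::RETR_EXTERNAL,
                         cv::CHAIN_APPROX_SIMPLE);
        // ... filter contours by size/shape here ...
    }
    return 0;
}
[/code]

Note that Steps 2 and 3 each make a full pass over the frame, which is exactly the redundant memory traffic the fused idea at the end of this post tries to eliminate.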

So in order to do this with an RGB webcam, we're talking at least 6x as many operations, assuming a color space conversion and per-channel thresholding, and likely more, because color space conversion is more expensive than thresholding. Plus possible decoding and USB overhead penalties. Even if we ignore those, we're at 7.1 * 6 = 42.6ms per frame, which works out to 15 frames per second at ~64% CPU utilization. Anecdotally, I'd expect another 30+ ms per frame of overhead on top of that.
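
Spelling that back-of-envelope math out (all of these numbers are assumptions layered on the ~7.1 ms single-channel baseline, so treat them as rough):

[code]
// Back-of-envelope budget for the estimate above (assumed numbers).
constexpr double kBaselineMs = 7.1;                      // single-channel inRange
constexpr double kScaledMs   = kBaselineMs * 6.0;        // = 42.6 ms/frame
constexpr double kCpuAt15fps = 15.0 * kScaledMs / 10.0;  // 639 ms busy per
                                                         // second = ~64% of a core
constexpr double kMaxFps     = 1000.0 / kScaledMs;       // ~23 fps at 100% CPU
[/code]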

The Kinect is doing all of the decoding for you, does not require a color space conversion, and gives you a single channel image that is already in a form that is appropriate for robust performance in FRC venues. No Step 1, No Step 2, and Step 3 is 1/3 as complex when compared to the above.
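
In code, the Kinect-side threshold stage collapses to something like one call on one channel (the 200 cutoff below is an illustrative guess on my part, not a tuned value):

[code]
#include <opencv2/opencv.hpp>

// ir: single-channel image as delivered by the sensor (8-bit here for
// simplicity). One channel means one compare per bound per pixel --
// no decode, no cvtColor, and 1/3 the thresholding work of the HSV case.
cv::Mat thresholdIr(const cv::Mat& ir) {
    cv::Mat mask;
    cv::inRange(ir, cv::Scalar(200), cv::Scalar(255), mask);
    return mask;
}
[/code]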

However...

Great idea hacking the ASM to use SIMD for inRange. I wonder if you could also write an ASM function to do color space conversion, thresholding, and ANDing in a single function that only touches registers (may require fixed point arithmetic; I'm not sure what the RoboRIO register set looks like). This would add several more ops to your program, and have 3x as many memory reads, but would have the same number of memory writes.
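
I can't write the assembly off the top of my head, but here's the scalar shape of what I mean in plain C++ with integer-only math (so no floats needed, per the fixed-point thought above). Every intermediate lives in registers, and each pixel is read once and written once; hand-vectorizing this loop is the exercise. The HSV formulas are the standard OpenCV-style integer ones, and the threshold parameters are my own illustrative choices:

[code]
// Hypothetical fused kernel: per pixel, convert BGR -> integer HSV,
// threshold all three channels, and write one binary byte, in a single
// pass over memory. Assumes hMin <= hMax (no hue wraparound handling).
#include <algorithm>
#include <cstdint>

void fusedHsvThreshold(const uint8_t* bgr, uint8_t* mask, int numPixels,
                       uint8_t hMin, uint8_t hMax,
                       uint8_t sMin, uint8_t vMin) {
    for (int i = 0; i < numPixels; ++i) {
        uint8_t b = bgr[3 * i + 0];
        uint8_t g = bgr[3 * i + 1];
        uint8_t r = bgr[3 * i + 2];

        uint8_t v = std::max({r, g, b});      // value
        uint8_t c = v - std::min({r, g, b});  // chroma
        // saturation = 255 * c / v (0 if v == 0), integer math only
        uint8_t s = (v == 0) ? 0 : static_cast<uint8_t>((255 * c) / v);

        // hue in [0, 180), OpenCV-style, integer math only
        int h = 0;
        if (c != 0) {
            if (v == r)      h = 30 * (g - b) / c;
            else if (v == g) h = 60 + 30 * (b - r) / c;
            else             h = 120 + 30 * (r - g) / c;
            if (h < 0) h += 180;
        }

        // threshold all three channels and AND the results in one step
        mask[i] = (h >= hMin && h <= hMax && s >= sMin && v >= vMin) ? 255 : 0;
    }
}
[/code]

All the intermediate h/s/v values die in registers, so the only memory traffic is the 3-byte read and 1-byte write per pixel, which matches the 3x-reads/same-writes accounting above.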