Quote:
Originally Posted by buchanan
2077 did this last weekend at WI.
We sent an 80x60 (later scaled up) image split into 5 interlaced fields at 8 bits per pixel with no compression in the usual sense. Is there an API for compression? We used the 8 bits as a 4-bit red channel and a 4-bit green channel, giving "sorta-color" sufficient to distinguish targets.
We got usable video at 10 fps with occasional dropped fields (jerkiness, smears). Latency seemed to be around half a second with our final setup, though some earlier tests did better at the price of less acceptable blurring. Not sure what the limiting factor in this was.
The key things we found were 1) controlling the rate at which we sent frames into the Dashboard class (the MDC->DS loop is asynchronous at .02 seconds) and 2) using the lower level NI imaq APIs to access pixel data in the Image object quickly.
I suspect considerably better performance is possible with some tweaking, and especially if real compression were used, but that's where we got in the time available, and it was certainly usable.
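For reference, the "sorta-color" packing buchanan describes (a 4-bit red channel and a 4-bit green channel in each byte) works out to something like this; the helper names are just illustrative, not their actual code:

```python
def pack_rg(red8, green8):
    """Pack 8-bit red and green samples into one byte: high nibble red, low nibble green."""
    return ((red8 >> 4) << 4) | (green8 >> 4)

def unpack_rg(packed):
    """Recover approximate 8-bit red and green values from a packed byte."""
    red = packed & 0xF0            # the high nibble is already in the top 4 bits
    green = (packed & 0x0F) << 4   # shift the low nibble back up
    return red, green
```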
The Axis 206 is capable of capturing, compressing, and transferring MJPEG images at a sustained 30 FPS with less than 33ms of delay. It can do this even for full VGA images. With a 160x120 image and a high compression setting, you can get reasonable images that average 1.7KB. Not all images will be the same size; with JPEG, the size depends on the complexity of the scene.
I find the JPEG encoder on the 206 to be pretty good. The attached example is 1.54KB. When you put an image of this quality in motion at 25 fps and expand it to 320x240, the eye integrates the pixelation and fills in the gaps.
When the camera is set up to stream, the images arrive over the TCP/IP link as a multipart HTTP response that essentially goes on forever, or until the TCP/IP connection is torn down.
The key is not to decode the images. The decoder in the library is pretty slow, there is no encoder, and transcoding would introduce additional delay. Instead, the compressed JPEG image is just passed through the user data stream to the DS, which forwards it on to the Dashboard. If you want, you can also decode the image on the side into an HSL image for tracking.
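To make the multipart format concrete, here is a rough Python sketch of pulling the compressed JPEG frames out of the stream without decoding them (on the MDC this is done in C++, but the logic is the same). The CGI path and the lack of authentication here are assumptions; check your camera's documentation rather than treating this as the exact interface:

```python
import socket

def mjpeg_frames(host, port=80, path="/axis-cgi/mjpg/video.cgi"):
    """Yield raw JPEG frames from an MJPEG multipart HTTP stream, without decoding them."""
    sock = socket.create_connection((host, port))
    sock.sendall(("GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)).encode())
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:                          # camera closed the connection
            return
        buf += chunk
        while True:
            header_end = buf.find(b"\r\n\r\n")
            if header_end < 0:
                break                          # need more data for a complete header block
            headers = buf[:header_end].decode("latin-1", "replace")
            length = None
            for line in headers.split("\r\n"):
                if line.lower().startswith("content-length:"):
                    length = int(line.split(":", 1)[1])
            if length is None:
                buf = buf[header_end + 4:]     # top-level HTTP headers; just skip them
                continue
            if len(buf) < header_end + 4 + length:
                break                          # the JPEG body has not fully arrived yet
            frame = buf[header_end + 4:header_end + 4 + length]
            buf = buf[header_end + 4 + length:]
            if frame.startswith(b"\xff\xd8"):  # JPEG SOI marker: this part is an image
                yield frame                    # hand the compressed bytes on, untouched
```

Each yielded frame can be forwarded as-is; decode a copy on the side only if you also want to run tracking on it.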
In order to control all the timing, we replaced the Dashboard, DriverStation, and RobotBase classes with our own variants. The main difficulty is that you cannot allow queues to form. You have to throw away stale images.
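The "no queues, throw away stale images" rule boils down to a single-slot buffer that always holds the newest image and its timestamp. Here is a minimal Python sketch of the idea; the staleness threshold is an arbitrary assumption, and this is not our actual replacement class:

```python
import threading
import time

class LatestImage(object):
    """Single-slot holder: a new image always overwrites the old one, so no queue can form."""

    def __init__(self, max_age=0.1):
        self.max_age = max_age        # assumed staleness threshold, in seconds
        self.lock = threading.Lock()
        self.jpeg = None
        self.stamp = 0.0

    def put(self, jpeg_bytes):
        with self.lock:
            self.jpeg = jpeg_bytes    # overwrite; older images are simply lost
            self.stamp = time.time()

    def get(self):
        """Return the newest image, or None if it is too old to be worth sending."""
        with self.lock:
            if self.jpeg is None or time.time() - self.stamp > self.max_age:
                return None
            return self.jpeg
```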
The system timing is all controlled by the DS. The DS sends a packet every 20ms and it passes the last received packet to the Dashboard every 20ms. This is more or less synchronous. The MDC responds to each DS packet with an MDC packet. The MDC never originates a packet, it only responds. The DS never responds, it just streams a packet toward the MDC every 20ms (and another to the Dashboard in a different task). There are better ways to do this - but that's a discussion for post season.
When packets are delayed and arrive bunched up at the MDC, it responds to each one with the latest buffer it has. When the DS receives these packets, it just places the data in the buffer, overwriting the previous content. Every 20ms, the Dashboard gets the latest data from the MDC.
The Dashboard will generally get a reliable 50 pps stream with close to zero jitter and no dropped packets because the connection is local. However, the CONTENT of these packets reflects upstream packet losses, delays, and jitter.
So the pipeline is reliable transport -> unreliable transport -> reliable transport.
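To make the respond-only pattern concrete, here is a toy Python version of both ends. The socket setup, the packet contents, and the get_* callables are placeholders, not the real FRC protocol:

```python
import socket
import time

def ds_stream(mdc_addr, get_control_packet):
    """DS side: originate a packet every 20ms; never wait for a reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        sock.sendto(get_control_packet(), mdc_addr)   # placeholder packet builder
        time.sleep(0.02)

def mdc_respond(listen_port, get_latest_buffer):
    """MDC side: never originate; answer every DS packet with the newest buffer.

    If delayed packets arrive bunched up, each reply still carries the latest data.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", listen_port))
    while True:
        packet, addr = sock.recvfrom(1500)            # blocks until the next DS packet
        sock.sendto(get_latest_buffer(), addr)        # reply immediately with current data
```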
To summarize the timings:
1) The image is captured by the CCD (based on exposure time), always less than 33ms.
2) The image pixels are dumped en masse into the CCD's shift registers and marched out of the CCD into the JPEG compressor. The image can be compressed on the fly, since JPEG works on a handful of scan lines at a time. This delay is also well under 33ms.
3) The JPEG image is sent over the TCP/IP connection. This has minimal delay.
4) The MDC TCP/IP stack receives the image, unblocking the waiting read task.
5) Once the complete image is read, it is passed on for formatting into the telemetry packets. This requires synchronization with the stream. If the image is too old, it is discarded here. Otherwise, it goes on to be packed into 2 or more telemetry frames, which introduces a typical delay of 30ms.
6) The actual packet transmission delays are nominally small. There is an additional 10ms average delay from DS to Dashboard.
7) The packet arrives in the Windows IP stack, where it is passed to a waiting thread in the Python code.
8) The image is decoded (to RGB) and expanded (from 160x120 to 320x240) by Python PIL code and then written to the display (see the sketch after this list). This has a delay of about 20ms.
9) The image will not be seen by anybody until the next video refresh comes around, an average delay of 8.3 ms (half of a 60Hz refresh interval).
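For step 8, the decode-and-expand stage is only a couple of lines of PIL; this is a sketch of the idea rather than the actual dashboard code:

```python
from io import BytesIO

from PIL import Image

def expand_frame(jpeg_bytes):
    """Decode one compressed JPEG frame and upscale it for display."""
    img = Image.open(BytesIO(jpeg_bytes))          # decode the 160x120 JPEG
    return img.convert("RGB").resize((320, 240))   # upscale; in motion the eye smooths the blockiness
```

Blitting the resulting image to the screen is toolkit-specific, so it is left out here.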
Summing these stages (roughly 33 + 33 + 30 + 10 + 20 + 8 ms, plus the smaller transmission and stack delays) gives an end to end delay of around 150ms for the entire process.
Longer delays will occur if you allow a queue to develop anywhere along the pipeline. You must drain and discard to avoid this. The most common places for this to occur are in the IP stacks in the MDC (for data coming from the camera) and in the Dashboard (for packets coming from the DS). If you don't keep up with all of the processing, queues will build and you will start to see large delays.
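One simple way to keep the Dashboard-side queue drained is to read the socket dry on every cycle and keep only the newest datagram; everything older is stale by definition. A minimal sketch, assuming a non-blocking UDP socket:

```python
import socket

def drain_latest(sock, bufsize=2048):
    """Read everything waiting in the socket buffer and return only the newest datagram.

    Returns None if nothing has arrived since the last call. The socket must
    already be non-blocking (sock.setblocking(False)).
    """
    latest = None
    while True:
        try:
            latest, _ = sock.recvfrom(bufsize)   # keep overwriting until the buffer is empty
        except socket.error:                     # would block: nothing left to read
            return latest
```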