I am trying to get camera tracking working on my BeagleBone Black. I want to use the Axis camera to do this, and I can get that part working. The problem is that the tracking is limited to about 15 frames per second, while the camera is sending at 30 fps. This causes images to pile up in the buffer, which then starts lagging badly. Does anybody know how to either shrink the buffer down to 1 image, read from the back of the buffer, or loop through the buffer to the end quickly? Any other suggestions would help too.
Also, this is unrelated, but does anybody know why no networking programs allow reading from the back of the buffer? I have wanted to do that MANY times, but nothing ever supports it, and I'm just curious if anybody knows why that is.
This is basically what I was going to write, but Zaphod said it better…
…have a separate capture thread, in addition to your [image processing] thread. The capture thread keeps reading in frames from the buffer whenever a new frame comes in and stores it in a “recent frame” image object. When the [image processing] thread needs the most recent frame, it locks a mutex for thread safety, copies the most recent frame into another object and frees the mutex so that the capture thread continues reading in new frames.
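In Python/OpenCV, that pattern looks roughly like the sketch below (just an illustration; the class name, camera URL, and track() call are made up, not from anyone's actual code):

import threading
import cv2

class LatestFrameGrabber(object):
    # capture thread keeps overwriting 'frame'; the processing thread copies it on demand

    def __init__(self, source):
        self.vc = cv2.VideoCapture(source)
        self.lock = threading.Lock()
        self.frame = None
        t = threading.Thread(target=self._capture_loop)
        t.daemon = True
        t.start()

    def _capture_loop(self):
        while True:
            ok, img = self.vc.read()
            if not ok:
                continue
            with self.lock:
                self.frame = img  # older frames are simply dropped

    def get_latest(self):
        with self.lock:
            return None if self.frame is None else self.frame.copy()

# slow (processing) thread:
# grabber = LatestFrameGrabber('http://<camera-ip>/mjpg/video.mjpg')
# while True:
#     frame = grabber.get_latest()
#     if frame is not None:
#         track(frame)  # whatever your tracking code does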
Is there a reason you are not just requesting one frame and then processing it before requesting the next?
In other words, don’t allow the camera to “free run”.
You might also try setting the Axis to run at 15 fps. As long as you can process at 15 fps, you should not be filling the buffer.
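If I remember the VAPIX docs right, the MJPEG request accepts an fps parameter, so the limit can be requested right in the stream URL. The exact path and parameter support vary by model and firmware, so treat this as something to verify:

# ask the camera itself to cap the stream at 15 fps (check against your camera's VAPIX docs)
url = 'http://%s/axis-cgi/mjpg/video.cgi?fps=15' % camera_host
# many firmwares also accept it on the shortcut path:
# url = 'http://%s/mjpg/video.mjpg?fps=15' % camera_host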
I tried just requesting a single frame at a time, but it would take 100 ms from requesting the image to actually receiving it. That was a little too much latency for me.
As for limiting the fps, it wasn't working for some reason.
The WPI libraries in LV include vision code. The code uses parallel loops – which result in parallel threads – in case the user has set the camera faster than they can process. This is effective for keeping the lag to a minimum.
You may also want to lower the camera frame rate, though. First, there is some overhead involved in processing the frames, though hopefully you aren’t decompressing the images or doing any of the expensive things. Second, the two thread approach will result in much less certainty about when the image was acquired. If you are trying to estimate time, it is likely better to run the camera at a slower, but known rate that doesn’t buffer images.
The other approach of requesting an image only when you want one was used in older versions of WPILib. It worked/works far better on the 206 model than on the newer Axis cameras. There is indeed quite a bit of setup overhead and the mjpeg approach is almost required to get above 10fps.
We are using a BeagleBone Black, which means we have to use OpenCV. If it could run LabVIEW, we would use it. For some reason, limiting the FPS was not working.
I ran the threading overnight last night, and it ran for 12 hours without missing any frames or starting to lag. The way I set it up, the fast-looping thread grabs each image and stores it. Then the slow thread grabs the stored frame as fast as the tracking can loop, which is about every 60 ms. Since 30 fps means a new frame every 33 ms, the image is at most 2 frames old, which is perfectly fine.
The reason we are on a BeagleBone instead of the dashboard is that I have heard stories of FTAs requiring dashboards to be shut off, and we don't want to be stuck if that happens. So we want it to be onboard.
Totally understand about the approach. I have a bone at my desk for precisely this sort of experimentation. I was relating it to how LV implemented it in order to validate your approach.
I’d be interested to hear if FTAs need to limit dashboards this year. Some venues are less Wi-Fi friendly, so it may still happen, but it is not the expected route.
We did our image processing in python using OpenCV in a single thread, and didn’t have any lag problems. Here are some thoughts that may help:
Memory allocation of large images can get expensive. Don't allocate new images each time you read something in. Instead, find the size of the image, allocate a numpy array to store it, and process it, reusing the same array each time you grab a new frame. Similarly, for processing steps that return a new image (splitting, colorspace transforms, and so on), allocate your buffers once and reuse them over and over by passing the 'dst' parameter (see the first sketch after these notes).
Be wary of accessing too many non-local variables in Python. In our robot code (which runs Python), we found that accessing a lot of things at module level instead of local level caused noticeable slowness, presumably because Python has to do multiple scope lookups each time to find them (see the second sketch after these notes).
Keep in mind that in Python 2.7, because of the GIL, multithreading doesn't buy you a lot unless you are I/O bound.
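As a rough illustration of the buffer-reuse point, something like the following (the camera source, frame size, HSV threshold step, and buffer names are made-up examples, not anyone's actual pipeline):

import cv2
import numpy as np

vc = cv2.VideoCapture(0)               # camera source is a placeholder
h, w = 240, 320                        # example frame size; query it from the capture instead

capture_buffer = np.empty((h, w, 3), np.uint8)   # allocated once
hsv = np.empty((h, w, 3), np.uint8)              # reused every frame
mask = np.empty((h, w), np.uint8)

lower = np.array([30, 100, 100], np.uint8)       # example threshold values
upper = np.array([90, 255, 255], np.uint8)

while True:
    ok, frame = vc.read(capture_buffer)          # decode into the preallocated buffer
    if not ok:
        continue
    cv2.cvtColor(frame, cv2.COLOR_BGR2HSV, dst=hsv)  # 'dst' avoids allocating a new image
    cv2.inRange(hsv, lower, upper, dst=mask)
    # ... morphology / contour finding on 'mask' ...

And a contrived illustration of the module-level vs. local lookup point; the per-call savings are tiny, but they add up inside per-frame loops (the constant and scoring functions are invented for the example):

import math

TARGET_ASPECT = 1.33          # module-level constant (made-up example)

def score_slow(contours):
    total = 0.0
    for c in contours:
        # every iteration resolves 'math', then 'sqrt', then the module-level constant
        total += math.sqrt(len(c)) * TARGET_ASPECT
    return total

def score_fast(contours):
    sqrt = math.sqrt          # bind to locals once, before the loop
    aspect = TARGET_ASPECT
    total = 0.0
    for c in contours:
        total += sqrt(len(c)) * aspect
    return total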
Were you reusing the same image buffer each time, or were you just using the returned image buffer from read()? If you are allocating a new image each time (the default behavior), that explains why you had so much lag receiving the image.
It works in a single thread on a computer because a powerful computer can do all the tracking in less than 33 ms. When I put the code on the BeagleBone, it takes about 60 ms just to run the threshold and morphology commands, so the camera buffer, which gets a new frame every 33 ms, starts to lag. If the BeagleBone had more power, a single thread would work, but because it does not, the two threads are needed.
As for grabbing the image directly, OpenCV has no way of just opening a JPEG over the network, so some other library had to be used. I was using urllib because that was the only one I could find that worked, and it would take 70 ms just to connect to the camera and download the image into an already allocated buffer.
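For reference, that single-frame path looked roughly like this (a sketch; the still-image CGI path is what I recall from the VAPIX docs, so double-check it for your camera, and this is Python 2-era urllib2):

import urllib2
import numpy as np
import cv2

def grab_single_frame(camera_host):
    # one full HTTP request per frame -- this is where the 70-100 ms per image goes
    url = 'http://%s/axis-cgi/jpg/image.cgi' % camera_host
    data = urllib2.urlopen(url, timeout=2).read()
    buf = np.frombuffer(data, dtype=np.uint8)
    return cv2.imdecode(buf, cv2.IMREAD_COLOR)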
I would definitely like to hear more about this approach.
I am no expert on the efficient use of OpenCV, and what you are proposing just might be the key to getting rid of the minor delay we are seeing in our images. We can process 15-20 fps, but there is a small but noticeable lag in our images.
We are using a USB camera and a PCDuino, but otherwise it is mostly the same.
Does the lag show up before processing or after processing? If it shows up before processing, that could be a solution. If it shows up after processing, then it's the actual processing time that is causing the lag. One way I like to test this is to put a timestamp at the beginning of the processing and another at key points during it. Then you can calculate the differences and see how much time each step is taking. I used this on ours to see that the morphology and thresholding took about 50 ms on a 320x240 image. That puts any processed image at least that far behind, at which point it is easily visible.
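Something like this is all it takes (a sketch; the step names are just examples of where you might drop the timestamps):

import time

t_start = time.time()
# ... grab the frame ...
t_grab = time.time()
# ... threshold (e.g. cv2.inRange) ...
t_thresh = time.time()
# ... morphology (e.g. cv2.erode / cv2.dilate) ...
t_morph = time.time()

print('grab:       %.0f ms' % ((t_grab - t_start) * 1000))
print('threshold:  %.0f ms' % ((t_thresh - t_grab) * 1000))
print('morphology: %.0f ms' % ((t_morph - t_thresh) * 1000))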
And if it helps, there’s an incomplete Node.js implementation of VAPIX that supports a video stream with the slowed down fps you need. https://github.com/gluxon/node-vapix
This answers your question, although I feel like the problem will still occur, just over a longer period of time with this change.
Edit: I would actually go with what virtuald said in his post for the solution. Sounds like a more complete way of handling things.
Not true. If you have FFmpeg support enabled (which it is by default in most OpenCV builds), you can retrieve the MJPEG stream directly from the camera via FFmpeg. Something similar to the following (error checking omitted):
import cv2
import numpy as np

vc = cv2.VideoCapture()
vc.open('http://%s/mjpg/video.mjpg' % camera_host)
w = int(vc.get(cv2.cv.CV_CAP_PROP_FRAME_WIDTH))
h = int(vc.get(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT))
capture_buffer = np.empty(shape=(h, w, 3), dtype=np.uint8)

while True:
    # passing the preallocated array avoids a new allocation on every frame
    retval, img = vc.read(capture_buffer)
In 2012, I had to tell a student that the vision software they worked on was not allowed because the FTA wouldn't let us use it. Then we were stupid enough to try vision again in 2013, thinking that the "bandwidth limit" (which doesn't work) would keep the FTA from banning the dashboard. But no, the FTA said we weren't allowed to use our vision stuff yet again, this time because a robot on the field had stopped working "because we sabotaged their network connection". What really happened? The other team's battery fell out. Not our problem, but we still couldn't use vision. Vision is not a guaranteed ability for robots. Nor are dashboards, or any amount of communication.
I think the crux of this thread points out the error in your statement.
Yes, it is really a shame when a feature that has been pushed by FIRST is taken away, for any reason, and that feature is part of the major design a team has undertaken. The point of this thread, on the other hand, is how to avoid this possibility entirely.
Here is why this thread addresses that: if the vision processing is done by a separate on-board system, a BeagleBone in this case and a PCDuino in ours, the wireless network is removed from the loop entirely. All acquisition of images, processing, and communication of target information stays on the robot. Honestly, for much less than $200, any team can do this! It just takes time to learn how, and the information is already readily available.