Announcing ocupus, a toolkit for working with onboard vision processors

I’m a mentor on FRC 3574. We started doing onboard computer vision in 2012 and haven’t stopped since. ocupus is built around solving the problems we’ve hit consistently year to year. The software runs on Ubuntu on ARM platforms with NEON support (sorry, Raspberry Pis).

The biggest feature of this toolkit is realtime streaming. Using WebRTC, it gives both humans and the vision processor realtime access to any camera on the system.

The super high tech GUI:

There’s a handful of other features planned, mostly around easing system management and in-browser coding, but I feel that’s the big one.

At this time, the only things of substance in the git repo are the ARM binaries for the WebRTC peer senders. Please reach out to me if you’re interested in playing with this; I’d like more teams’ input on it.

As extra incentive, if you send me a pull request with any helpful change (even just good documentation!), I’ll let you change the subject material of the screenshot.

What part of this is ARM-specific?

Only the precompiled binaries so far. You can run this on an x86 with no trouble. I’ll be adding those binaries soon.

Actually, x86 is a bit easier, since the kernel modules are a little saner: DKMS doesn’t work on most ARM platforms, so getting v4l2loopback built is tricky there.
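For anyone trying the ARM side anyway, this is roughly the manual route when DKMS is unavailable. It's a sketch, not a supported procedure: it assumes kernel headers for your running kernel are installed, and it clones the upstream v4l2loopback project (not part of ocupus).

```shell
# Sketch: out-of-tree v4l2loopback build when DKMS can't be used.
# Assumes matching kernel headers are already installed for `uname -r`.
git clone https://github.com/umlaeute/v4l2loopback.git
cd v4l2loopback
make
sudo make install
sudo depmod -a

# Create one loopback device that both the WebRTC sender and the
# vision code can read from simultaneously.
sudo modprobe v4l2loopback devices=1
```

The downside versus DKMS is that you have to redo the `make`/`make install` step every time the kernel is updated.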

Another thing that’s mentioned in the git repo, but that I meant to mention here: WebRTC’s compression is around 10x more efficient than the AXIS camera’s. Right now I use 256 kbps for my video, and that seems to be about the same quality as 3.7 Mbps MJPEG.
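Taking those two numbers at face value, the "10x" figure is actually conservative; a quick back-of-the-envelope check (match length of 2:30 is standard FRC):

```python
# Sanity-checking the bandwidth comparison using the numbers from the post.
mjpeg_kbps = 3700    # ~3.7 Mbps MJPEG from the AXIS camera
webrtc_kbps = 256    # VP8 over WebRTC at roughly equivalent quality

ratio = mjpeg_kbps / webrtc_kbps
print(f"~{ratio:.1f}x less bandwidth")   # ~14.5x

# Data transferred over one 2:30 (150 s) FRC match, in megabytes:
mjpeg_mb = mjpeg_kbps * 150 / 8 / 1000    # ~69 MB
webrtc_mb = webrtc_kbps * 150 / 8 / 1000  # ~4.8 MB
print(f"MJPEG: {mjpeg_mb:.1f} MB, WebRTC: {webrtc_mb:.1f} MB")
```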

The more modern Axis cameras support better compression standards, but the licensing of those codecs is potentially an issue for a FIRST team. Also, the benefit of the newer codecs shrinks as the robot starts to move and virtually no content stays the same from frame to frame.

For these reasons, we kept the WPILib libraries using MJPEG.

There were teams who used H.264 on the field last year, streaming to a browser. Of course, the one I dealt with measured their camera bandwidth with a stationary robot; they severely underestimated it and made the FTA very nervous during their matches.

The high variability of the encoding versus the predictability of MJPEG was also a consideration.

Greg McKaskle

ocupus uses VP8 as its codec, which is available under a BSD license and is therefore free for anyone to use.

As for frame-to-frame variance: at 30 fps there is usually enough interframe redundancy to still beat MJPEG pretty handily. You can see this by looking at the relative size of the P-frames in the encoded stream.
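To make the intuition concrete, here's a toy calculation with illustrative, made-up frame sizes (not measurements), assuming one keyframe every three seconds. The point is structural: MJPEG pays for a full JPEG every frame, while an interframe codec pays mostly for small deltas.

```python
# Illustrative-only numbers showing why interframe coding wins at 30 fps.
FPS = 30
KEYFRAME_BYTES = 25_000      # assumed: one full I-frame every 3 seconds
P_FRAME_BYTES = 900          # assumed: deltas carry only what changed
MJPEG_FRAME_BYTES = 15_000   # assumed: every MJPEG frame is a complete JPEG

# Bytes in one 3-second group of pictures: 1 keyframe + 89 P-frames
gop_bytes = KEYFRAME_BYTES + (3 * FPS - 1) * P_FRAME_BYTES

vp8_kbps = gop_bytes * 8 / 3 / 1000
mjpeg_kbps = MJPEG_FRAME_BYTES * FPS * 8 / 1000

print(f"interframe stream: ~{vp8_kbps:.0f} kbps")   # ~280 kbps
print(f"MJPEG stream:      ~{mjpeg_kbps:.0f} kbps") # 3600 kbps
```

Fast robot motion pushes the P-frame sizes up toward the keyframe size, which is exactly why the advantage shrinks on a moving robot.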

One of the most useful tools Chrome gives me while testing this is chrome://webrtc-internals, which provides bandwidth graphs. Here’s one that resulted from me quickly moving the camera between my … ahem … “visually noisy” desk and the view outside my window:

I can see that being an issue. Until ocupus gets more real-world testing and performance characterization, teams should stick to existing size suggestions and keep a very large safety margin. For 3574, we’ll be using no more than 2 Mbps.
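One rough way to budget under that cap, given the bitrate variability Greg mentions: plan for worst-case spikes rather than the nominal rate. The 2x spike factor below is my assumption for illustration, not a measured number.

```python
# Rough budgeting sketch: how many 256 kbps streams fit under a 2 Mbps cap
# when you assume the encoder can briefly spike to 2x its target bitrate.
cap_kbps = 2000          # self-imposed ceiling
nominal_kbps = 256       # per-camera target bitrate
spike_factor = 2.0       # assumed worst-case burst on a scene change

worst_case_kbps = nominal_kbps * spike_factor
cameras = int(cap_kbps // worst_case_kbps)
headroom_kbps = cap_kbps - cameras * worst_case_kbps

print(f"{cameras} cameras, {headroom_kbps:.0f} kbps headroom")  # 3 cameras, 464 kbps
```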