Camera Starving Watchdog

Last night we noticed that the lights on our Jaguars flicker intermittently (our electronics are still mounted on plywood, away from the robot, so it’s more convenient to test things on both platforms). A couple of details on our system first:

  • We’re controlling a string of Jaguars through CAN, bridged through a black Jag. We currently have 5 active.
  • We’ve added code to the default FRC 2010 project that comes up in LabVIEW. We have changed Begin.vi and Teleop.vi, and nothing else.
  • We’re using a modified version of the camera tracking code found in the case statement in Teleop.vi which moves the camera on a servo (for now, again, electronics separate from drive train).
  • Drive system is mecanum, driven off of the Holonomic Drive VI that’s been modified twice (first for scaling, second to convert to CAN from PPM).

Anyway, this morning I narrowed the blinking down to a watchdog error (the blinks correspond with a very brief DS “Watchdog Not Fed” message). A bit later, after disabling most of our custom code and trying the default CAN project (which worked fine), I found that unplugging the camera’s Ethernet cable stopped the watchdog problems, and plugging it back in started them again. I lowered the resolution from 320x240 to 160x120, and it’s been fine so far, but camera tracking is less reliable up close.

The difference between the working CAN project and the FRC project is that the CAN project just opens the camera and gives it a reference number, whereas the FRC project opens the camera and then sets the brightness, exposure, color threshold, compression, resolution, etc.

Does anyone know what, specifically, is going “wrong”, if anything? I would imagine this is nowhere near the limit of the cRIO’s processing power; I’ve seen them do far more in FRC demo videos (remember the OCR demo two years ago?). If at all possible, I would like to lower the color depth instead of the resolution, because honestly, 32-bit color depth doesn’t add much over 16-bit when you’re dealing with a webcam. First, an entire byte is being processed for an alpha channel that doesn’t exist; second, by flattening the colors from millions to thousands, I think the tracking software might have an easier time recognizing shaded surfaces as contiguous.

Opinions or suggestions?

What I suggest is to put things back into the error state, plug the camera in, etc. Then modify the code to time how long Teleop is taking. The easiest way to do this is to drop a Milliseconds block, a feedback node, and a Subtract. Wire this to an indicator or, better yet, a chart.

This will give you an indication of how long Teleop takes. It is not a good idea for it to take the full 20 ms period, or it will miss control packets. My assumption is that at least some of the time, Teleop takes too long. If that is the case, experiment with the parameters and see what makes it better or worse.
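The Milliseconds/feedback-node/Subtract pattern above is just “remember the previous timestamp and subtract it from the current one each pass.” A minimal sketch of the same idea in Python (the function name and structure are purely illustrative, not any FRC API):

```python
import time

# Equivalent of the LabVIEW "Milliseconds -> feedback node -> Subtract"
# pattern: keep the previous timestamp and report the delta every pass.
_last_ms = None

def loop_period_ms():
    """Return the elapsed time in ms since the previous call (None on the first call)."""
    global _last_ms
    now = time.monotonic() * 1000.0
    delta = None if _last_ms is None else now - _last_ms
    _last_ms = now
    return delta
```

Calling this once per loop iteration and graphing the result is the equivalent of wiring the Subtract output to a chart: any iteration where the value approaches or exceeds the 20 ms packet period is a candidate for the watchdog miss.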

I don’t know how much time it takes to send out the CAN commands, but since the serial link is 115 kbaud or something like that, the limit will not be that hard to reach.
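To put rough numbers on that, here is a back-of-the-envelope estimate in Python. The per-frame byte count is an assumption (the actual serial-bridge framing isn’t documented in this thread), so treat the result as an order-of-magnitude figure only:

```python
# Back-of-the-envelope: how much of the 115,200-baud serial bridge do
# periodic speed commands consume?  FRAME_BYTES is an assumption.

BAUD = 115_200
BITS_PER_BYTE = 10                      # 8 data bits + start + stop on the UART
BYTES_PER_SEC = BAUD / BITS_PER_BYTE    # ~11,520 bytes/s

FRAME_BYTES = 15    # assumed: header + 29-bit ID + up to 8 data bytes + checksum
JAGUARS = 5
RATE_HZ = 50        # one command per Jaguar per 20 ms control period

load = JAGUARS * RATE_HZ * FRAME_BYTES / BYTES_PER_SEC
print(f"bridge utilization: {load:.0%}")
```

Under those assumptions the commands alone eat roughly a third of the link, which is comfortable but not a huge margin once replies, status polling, or protocol overhead are added.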

As for the camera stuff, this is hopefully not in any way in your Tele loop. Ideally it runs in parallel, sort of as a background task and the results are harvested when available to be used in the next Tele iteration. I suspect your usage is fine, but I’ve seen many teams who naively put it in the Tele, so I thought it was worth mentioning.

Anyway, with the background camera task taking a certain amount of CPU away, it is easier to overstep the watchdog time in the Teleop loop, but the camera isn’t directly responsible for it.

The camera actually exports one of two JPEG formats, and by definition, color JPEG is 24 bits per pixel. IMAQ decodes this to a color format that uses 32 bits, primarily for architectural alignment: it is more expensive to access three bytes at arbitrary address alignments than to pick up one well-aligned four-byte long. This is the classic space-for-time tradeoff, and the 33% space waste is accepted to speed up pixel access. The camera does support a different form of JPEG without color, and at least last year, that didn’t decode properly. I’m not sure if it has been fixed. Because it didn’t work, I didn’t profile it, but perhaps that is a way to lighten the load and speed up the vision. Since you are so adept at this, maybe you can beat me to it :).

Post back once you have some measurements.

Greg McKaskle

I am in no way adept at vision processing, I just know random bits and pieces that I’ve picked up over the past few years.

We have the camera processing code in the right place, in a parallel loop inside Robot Main.vi, and the only camera code in Teleop.vi is the “Rotate To Target” VI that was already there, and is only run when button 3 of joystick 1 is pushed. I’ve added in some conversion code to turn the rotate into a servo controller (which all works), but the problem occurs even with that disabled.

About the CAN: the CAN transceiver chip in the beige Jaguar can operate at up to 1 Mbps, but limited to the serial port’s 115,200 baud and the bridge protocol’s encoding overhead, the useful data rate might be much lower. I can’t find any data on the black Jags so far, but I assume it would be either the same or faster. I do not believe that CAN is the issue, as the default CAN project has no trouble with the watchdog, and even with all of the new CAN VIs I’ve put into our code disabled, our project is still slower than the CAN default project. The only differences in processing load I can think of are the serial-to-CAN bridge encoding and the fact that CAN is handled in the processor instead of the FPGA like the servo signals (FPGA -> digital module).

I don’t have access to the cRIO until tomorrow, so I’ll check out the timing and let you know. Thanks for the suggestions, and I’ll keep you posted on how it goes.