It seems of late that whenever we deploy new code and reboot the cRIO from the Driver Station, the reboot appears to go well - all DS status lights return to green. However, upon enabling, the bot’s PWM and relay outputs are non-responsive. We have to power cycle the bot once (and infrequently, twice) to get the program to execute, after which we have no problems. This is obviously very annoying, as you have to wait much longer to enable the bot following a power cycle than a reboot. I am wondering if there is a hidden issue somewhere in the code or with the Driver Station or cRIO that is causing this unwanted behavior.
Our code compiles and deploys with no apparent errors. We use the Classmate wireless adapter and the robot radio in AP mode. I haven’t noticed any red dots on the DS or anything else that indicates wireless comms problems lately. We have the latest version of the DS installed, as well as the latest C++ update.
So does anyone have any initial thoughts on this matter? Debug steps to pursue? We’ll have access to the bot once again on Monday afternoon, so we can try out any suggested debug operations then. Thanks in advance for the advice and help!
Enable NetConsole with the cRIO Imaging Tool, then tether up, run it and look at the output after a deploy + warm reboot versus a cold boot. You definitely want to be tethered, as the cRIO could boot faster than you’d get a wireless connection. If everything is as it should be, then the output on a warm reboot vs. a cold boot should be identical. If it’s not, then we have information to work from.
Forgot to mention that you should probably run a practice match after booting. Or at least enable teleop mode. That starts several functions that could throw relevant exceptions. We had the silly problem of not testing for a null pointer and killing the robot if we enabled in teleop without running auto first. The exception came right up in NetConsole and made the problem pretty obvious.
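For anyone hitting the same thing, here is a minimal sketch of that null-pointer guard. It uses plain C++ stand-ins rather than WPILib classes: `AutoRoutine`, `SketchRobot`, and the bool returned by `TeleopInit` are made up for illustration. Only the pattern matters: set the pointer to NULL in the constructor and check it before use.

```cpp
#include <cstddef>  // NULL

// Hypothetical stand-in for an autonomous routine; not a WPILib class.
class AutoRoutine {
public:
    AutoRoutine() : running(false) {}
    void Start()  { running = true; }
    void Cancel() { running = false; }
    bool IsRunning() const { return running; }
private:
    bool running;
};

// Hypothetical robot class with init hooks similar to an iterative robot.
class SketchRobot {
public:
    SketchRobot() : autoRoutine(NULL) {}   // start from a known state
    ~SketchRobot() { delete autoRoutine; }

    void AutonomousInit() {
        if (autoRoutine == NULL) {
            autoRoutine = new AutoRoutine();
        }
        autoRoutine->Start();
    }

    // Returns true if there was a routine to cancel.
    bool TeleopInit() {
        if (autoRoutine == NULL) {
            return false;      // auto never ran, so there is nothing to cancel
        }
        autoRoutine->Cancel();
        return true;
    }

private:
    AutoRoutine* autoRoutine;
    // Copying is disabled because this class owns a raw pointer.
    SketchRobot(const SketchRobot&);
    SketchRobot& operator=(const SketchRobot&);
};
```

With the guard in place, enabling teleop straight after boot skips the cancel instead of dereferencing a pointer that was never set.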
I don’t believe wired vs. wireless is the problem. Warm reboots via both methods yield similar error messages in NetConsole:
“alignment
Exception current instruction address: 0x0148c8bc
Machine Status Register: 0x0000b012
Data Access Register: 0x005edc3d
Condition Register: 0x24000228
Data storage interrupt Register: 0x00002809
Task: 0xee0c60 “FRC_RobotTask”
0xee0c60 (FRC_RobotTask): task 0xee0c60 has had a failure and has been stopped.
0xee0c60 (FRC_RobotTask): fatal kernel task-level exception!”
This does not occur following a cold power cycle.
So now to find out what is causing that. I am quite unfamiliar with how to navigate debug mode. I believe I found directions to set up the mode and enter it, but after that, I’m not sure how to proceed. Any relatively straightforward way to trace the problem?
I’ve only rarely used debug, so I won’t be much help there, I’m afraid. What I can say is that your NetConsole capture isn’t matching up with a normal boot for our robot, which is a little puzzling. We’re using the command-based template and it throws up a few messages about overriding the default disabled functions. Are you using the command-based template, or something else?
Anyways, you can try sprinkling some printf statements through your code to catch where the exception is occurring. I’d put a printf at the beginning of the constructor of whatever classes you’re using so you can see if it’s a problem with a particular class. And if it makes it past constructing the classes, put some printfs in the initialization functions.
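A rough sketch of that printf approach, in case it helps. `Checkpoint`, `g_checkpoints`, and `DriveSketch` are hypothetical names, not anything from WPILib. The flush after each print is my own assumption: it is there so a buffered line is less likely to be lost if the task dies right afterwards.

```cpp
#include <cstdio>

// Hypothetical counter so each trace line gets a sequence number.
static int g_checkpoints = 0;

// Hypothetical helper: print a numbered checkpoint and flush immediately.
int Checkpoint(const char* where) {
    ++g_checkpoints;
    std::printf("[trace %d] %s\n", g_checkpoints, where);
    std::fflush(stdout);
    return g_checkpoints;
}

// Hypothetical subsystem class with tracing at the start and end of the
// constructor and inside its initialization function.
class DriveSketch {
public:
    DriveSketch() {
        Checkpoint("DriveSketch ctor: begin");
        // ... construct motors, sensors, controllers here ...
        Checkpoint("DriveSketch ctor: end");
    }

    void Init() {
        Checkpoint("DriveSketch::Init");
        // ... one-time setup here ...
    }
};
```

The last checkpoint that shows up in NetConsole before the exception dump narrows down which constructor or init function to look at.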
We actually had an interesting and slightly similar problem. When the cRIO was booting, NetConsole would show something wrong and have the same message. Our solution was to reimage the cRIO.
We found the problem - referenced a custom PID class in the .h and never instantiated it in the .cpp - insipid little thing. I actually got dangerous enough using the debugger to have it point toward the exception’s root cause. Everything works great now. Huzzah and such.
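For reference, a minimal sketch of the kind of fix this implies, assuming the class was held through a pointer member. `CustomPID`, `ShooterSketch`, and the gain value are hypothetical and not the actual team code. One plausible reading, which the thread does not confirm, is that an uninitialized pointer holds whatever was left in memory, and that this is why a warm reboot and a cold boot behaved differently.

```cpp
#include <cstddef>

// Hypothetical custom PID class (proportional term only, for brevity).
class CustomPID {
public:
    explicit CustomPID(double kp) : kp_(kp) {}
    double Calculate(double error) const { return kp_ * error; }
private:
    double kp_;
};

// Hypothetical subsystem. The pointer member is what would be declared in
// the .h; the initializer list is the part that must exist in the .cpp.
class ShooterSketch {
public:
    ShooterSketch() : pid(new CustomPID(0.5)) {}   // instantiate before use
    ~ShooterSketch() { delete pid; }

    double Update(double error) { return pid->Calculate(error); }

private:
    CustomPID* pid;
    // Copying is disabled because this class owns a raw pointer.
    ShooterSketch(const ShooterSketch&);
    ShooterSketch& operator=(const ShooterSketch&);
};
```

Giving every pointer member a value in the constructor's initializer list, even if that value is just NULL, makes this class of bug fail the same way on every boot.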
Thanks to both of you for the advice. We actually reimaged the cRIO. It didn’t fix the problem, but it never hurts to eliminate that particular uncertainty from time to time.
My current favorite problem is forgetting to initialize a subsystem to null in CommandBase.cpp. It compiles perfectly happily, and then the code doesn’t run, with NetConsole complaining of the task holding a reference to an uninitialized variable.
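A sketch of the pattern in question, with hypothetical names (`ChassisSketch`, `CommandBaseSketch`) standing in for the real subsystem and CommandBase classes. The line that is easy to forget is the one outside the class that defines the static pointer and sets it to NULL.

```cpp
#include <cstddef>

// Hypothetical subsystem stand-in; not a WPILib class.
class ChassisSketch {
public:
    int Speed() const { return 0; }
};

// What would live in the header: the static member is only declared here.
class CommandBaseSketch {
public:
    static void init();
    static ChassisSketch* chassis;
};

// What would live in the .cpp: the definition, initialized to NULL.
// This is the line that is easy to leave out.
ChassisSketch* CommandBaseSketch::chassis = NULL;

// Create the subsystem once, the first time init() is called.
void CommandBaseSketch::init() {
    if (chassis == NULL) {
        chassis = new ChassisSketch();
    }
}
```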