Serious crashing issues

Copied from an email I just sent to some mentors:

GitHub - highlandprogramming/WIP-2013-Code-FRC: Repository for 2013 code in development.

All changes can be seen there and I can also point them out tomorrow.

Biggest visible problems:

Higher variable ping rate. Anywhere from <1 ms to over 800 ms, it jumps around and each spike has a random action.
C-Rio’s processor usage is between 98% and 100% last Monday, it was only around the lower 80%s. The change between packing the competition robot and today had only one change, the reversible controls on the drive. However, when I initial added and test that. CPU usage was still in the lower 80% range.

The crashing upon start has existed since week 3, but hasn’t be as noticeable until I recently when it was happening instead of every few times, to every time. It became more noticeable when we were fixing the low hang code. It started before we even made changes and effected out progress the entire time when downloading and testing, as sometimes a full restart was required to work again.

I suspect the ping problem is directly related to the high CPU usage causing delays in being even able to respond.

The high CPU usage is there with and without the Camera code.

The following messages are displayed by the Drivers Station in the follow order, total time between the first and last message is slightly less than 1 second.

Teleoperated Disabled, Autonomous Disabled or Test Disabled
Teleoperated Enabled, Autonomous Enabled or Test Enabled
Watchdog Not Feed <------Code execution stops, hence not being feed, however, this all happens in less than 1 second, and watchdog is set to 1 second before requiring a feed line. The C-Rio then crashes
No Robot Code <-------C-Rio crashes and automatically restarts. ping communication severed for 5 seconds during initial C-Rio restart.

This happens 1 reboot, sometimes 2 or 3 and then functions normally (After wasting 15-20 seconds each time to get communication again.) Rarely, but has happened at-least 3 times, it gets stuck rebooting and never restores communication, switching from getting a ping, Request Time for 5 seconds then Destination Host Unreachable.

Another note, is we did not test the robot with another laptop. The ASUS Netbook was recently left on until the battery died for two nights in a row. Yesterday it failed to recovery from a resume and we had increased problems, and today it did a full out Windows Repair.

Any thoughts or suggestions on how to proceed would be greatly appreciated.

We have so far:

Tried physical roots of Driver station and robot. No dice.
Removing comments and extra lines (Minus reducing Watchdog feed lines)
Comment and remove recent changes. No dice, returned those changes.

Crashing existed on both the competition and practice robots. On both the C-Rio I and II

Using Windriver in debug mode, sometimes crashed Windriver as well. So we must have made a serious problem. I will try to get some kind of exception error out of it tomorrow, anything that can help find the solution, the better.

First thing tomorrow I am personally re-writing the entire code to see if that helps anything (This project has had random problems like entire lines not building with no errors. As well as 1 bad build with no errors.

As I just stated, any suggestions would be highly appreciated, especially since the first regions are only a little bit away, we want to get this issue resolved as soon as possible.

Are your running wireless, and if so do you encounter the same issues while tethered?

According to what the driver station says you are crashing when entering autonomous. I didn’t look through the entire code base but I did notice at least one problem with how your autonomous is setup.

The way the autonomous should be setup is with a state machine. The loop is intended to run fast. That is ever 20ms the loop runs when it receives data from the driver station. You should only need to feed the watch dog one time in the loop. Additionally you need to remove all your delays. All the waits you have are causing the robot to time out and use up all the CPU.

It would look something like this. This has not been compiled and may contain minor syntax errors and possible logic errors. The idea is to give you an idea of how a state machine is setup. If you need additional assistance let us know.


Timer autoTimer;
int autoState = 0;
int fireCount = 0;

autoTimer.start();

while (IsAutonomous() && IsEnabled()) {
	if (autoState == 0) {
		myShooter1.Set(-1);
		myShooter2.Set(-1);
		if (autoTimer.Get() > intPause) {
			state++;
			autoTimer.Reset();
		}	
	}

	if (autoState == 1) {
		s[2]->Set(true);
		if (autoTimer.Get() > WaitDash) {
			autoState++;
			autoTimer.Reset();
		}				
	}

	if (autoState == 2) {
		s[2]->Set(false);
		if (autoTimer.Get() > 3) {
			if (++fireCount == FireDash) {
				autoState++;
			} else {
				autoState=1;
			}
			autoTimer.Reset();
		}				
	}
	if (autoState == 3) {
		myShooter1.Set(0);
		myShooter2.Set(0);
	}
	GetWatchdog().Feed();
}    

For sure there is over feeding of the watchdog timer (WDT). You could even disable it completely to eliminate it as a possible source of a problem. I think you can safely comment out all the watchdog lines of code. In either case:

GetWatchdog().SetExpiration(1);

means you have one whole second before you need to Feed(). That’s a long, long time so once per loop call should be plenty. I don’t know how long the SmartDashboard calls take (I know they can mess up Command-based programs due to delays), so I still think it is safer to dump the Watchdog until you find the issue. And you can leave it out since there is a global WDT the OS provides for safety reasons. We only use it if we have sketch code that might hang so we don’t lock up the 'bot. To beat this to submission, these lines come right from the WPILib Getting Started Guide:

void Autonomous(void)
{
SetWatchdogEnabled(false);
Drive(0.5, 0.0);
Wait(2.0);
Drive(0.0, 0.0);
}

Some more observations on the code:

I’m not sure what you’re doing here:

for(intAutoCtrl = 0; intAutoCtrl < (1);intAutoCtrl++)
{

It looks to me like this line of code really does nothing. The first time though intAutoCrtl is 0 so the code in the loop runs, but the next time intAutoCtrl==1 so the loop is done. Why the loop? It won’t crash, just odd. Also, the parenthesis around the (1) shouldn’t be there.

GetWatchdog().Feed();
for(intDelay = 0; intDelay< (((intPause)*2)+1);intDelay++)
{
GetWatchdog().Feed();
Wait(0.5);
}
GetWatchdog().Feed();

Here it looks like you’re trying to sleep for certain period of time. I guess you’re not calling Wait() directly since you were afraid the WDT would kick? It’s OK, but another reason to nuke the WDT. You don’t need the Feed() before and after. Every 500ms is fine.

So, to wrap this all up, I think your Auton code should like like so:

void Autonomous(void)
{

            SetWatchdogEnabled(false);

	s[0]-&gt;Set(true);
	s[1]-&gt;Set(true);
	SmartDashboard::PutString("Gear","High");
	compressor-&gt;Start();

            // You can probably leave this, but in debugging I get rid of everything
	// myRobot.SetSafetyEnabled(true);

            
            // change these to doubles so they match the return value below
	double WaitDash = 0;
	double FireDash = 0;
	double intPause = 0;

	// Stuff deleted

            // get rid of casts
	WaitDash = SmartDashboard::GetNumber("fire_wait");
	FireDash = SmartDashboard::GetNumber("fire_amount");
	intPause =  SmartDashboard::GetNumber("fire_pause");
	
            // deleted all the camera stuff that was commented out
			
	
	while (IsAutonomous() && IsEnabled())
	{

		myShooter1.Set(-1);
		myShooter2.Set(-1);
		
                    // Couldn't tell what the intPause*2+1 x 500ms was all
                    // about so I just used the double value here directly. You
                    // might need to convert to a float.

                    // Wait before firing.
                    Wait(intPause);

                    // Fire frisbees
		for(intCount = 0; intCount &lt; (FireDash+1); intCount++)
			{
				
				s[2]-&gt;Set(true);
                                    // This is the same math as your loop (almost)
                                    // or you could just enter the wait value in the dash
                                    Wait(WaitDash*2.0*0.5);

				s[2]-&gt;Set(false);
				
                                    Wait(intSecondWait*3*0.5);

		}  // end of auton frisbee chuck

		myShooter1.Set(0);
		myShooter2.Set(0);
		
	}

}

That should be a lot easier to read in order to follow the flow.

You might also want to start your compressor before Auton mode is called. You might need some time to build up pressure.