#1
Re: 100% CPU usage and double timeout bug
I saw the issue with deploying with a few teams this year. I'll probably be disabling the safety config in the disabled code next year. We are also asking the RT team to make the deploy more aggressive. Currently, it is too easy for a busy cRIO to take a long time to do the deploy.
It isn't clear why this is called the double timeout, or how the deploy is related to the excessive CPU usage. Were there diagnostic errors due to timing or other issues? That seems like a possible reason for the CPU usage to be high.
Greg McKaskle
|
#2
Re: 100% CPU usage and double timeout bug
Quote:
Our robot returned from St. Louis, less one of our 10 CAN Jaguars, due to our bridge-tipper being removed for shipment after CMP. As we don't have a bridge in our lab, we felt it OK to leave the manipulator and controlling Jaguar off.
We fired it up several times this summer, and kept getting the watchdog error, accompanied by shuddering, when all motors would momentarily switch off, and then right back on. We immediately thought that the code was waiting for a reply from the non-existent Jaguar, so we drew disable blocks over the related code in begin.vi and our timed_task.vi. The behavior did not change.
Returning to the above quote, our understanding was that drawing the disable blocks removed the underlying code from the compilation step. Is there still code (like the safety system) that gets compiled in, even if it is "disabled"?
Not sure if it makes a difference, but we are using the original 8-port cRIO, and occasionally find it temperamental to deploy code, or re-image.
-- Len
Last edited by Levansic : 10-08-2012 at 19:06.
|
#3
Re: 100% CPU usage and double timeout bug
The disable structure does what you think. It disables the code as if it were deleted. The wires leaving the structure have the default value or whatever you wire up to the Enabled frame.
The comment about safety being disabled was referring to a simple modification to the default Disabled.vi to ensure that it doesn't cause safety errors.
As for the potential cause of your error: does the CAN topology make sense with that one disabled? Also make sure that the CAN connections, cables, and terminator are good. Check to see if there are errors in the Diagnostics Message box, and potentially add the Elapsed Times VI to the loop that you think may be running too slowly to update the RobotDrive often enough.
Greg McKaskle
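
For readers who want to see the loop-timing check in text form, here is a minimal Python sketch of the idea (not LabVIEW, and not the Elapsed Times VI itself). It records a timestamp per loop iteration and reports any iteration that took longer than a chosen deadline. The function names and the 100 ms deadline in the usage note are illustrative assumptions, not values taken from the thread.

```python
import time


def loop_periods(timestamps):
    """Return the elapsed time between consecutive loop iterations."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def missed_deadlines(timestamps, deadline):
    """Return (iteration_number, elapsed) for each iteration slower than deadline.

    Iteration 1 is the interval between the first and second timestamps.
    """
    return [
        (i + 1, elapsed)
        for i, elapsed in enumerate(loop_periods(timestamps))
        if elapsed > deadline
    ]


def run_timed_loop(body, iterations, clock=time.monotonic):
    """Run body() a fixed number of times, recording a timestamp after each pass.

    clock is injectable so the loop can be exercised with a fake time source.
    """
    stamps = [clock()]
    for _ in range(iterations):
        body()
        stamps.append(clock())
    return stamps
```

With timestamps in milliseconds, `missed_deadlines([0, 20, 40, 200, 220], 100)` flags only the third iteration, which took 160 ms. A loop that shows up in this list is a candidate for not updating the RobotDrive often enough.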
|
#4
Re: 100% CPU usage and double timeout bug
CAN is working great. We are using a star topology, and termination is working. 2CAN reports no errors. We'll be looking at inserting elapsed times next week when we get back into the lab, after school starts.
The weird thing is that everything was "working great" after CMP for a few local demos. At least that is what our students maintain. This summer, there were no on-board code changes. We did change our custom dashboard to disable target seeking for our turret. Now it looks like our robot has epileptic fits. The packet structure out of the dashboard is the same, and we can switch over to the prior seeking code or our original calibration mode. Everything works, just with the watchdog error.
This led us to chase possible intermittent connection causes. We swapped out Ethernet cables and checked every cable connection we could think of. We had some prior problems with cable retention on one port of the D-Link switch, but this was not the problem. Not finding problems with any cables, we're back to searching for potential software issues.
I didn't even think to check for code in the disable.vi. I know that we reference and close that missing Jaguar in the disable.vi. Do you think that could be triggering the problem?
-- Len
|
#5
Re: 100% CPU usage and double timeout bug
I reread the post with less bleary eyes and noticed that you said you received a watchdog error. The disable topic was about Safety Config, so I had them confused.
Watchdog is not on by default, and Safety is enabled only for the RobotDrive. Please determine which is on, try turning it off and verify that the symptoms change. If the jerking is being caused by WD or Safety, then it means you are missing deadlines. It may also mean you have a workaround.
Assuming you are missing deadlines, I'd verify you have no errors in the Diagnostics panel, as the current mechanism for catching errors and shuttling them to the window is quite heavy and can cause you to miss deadlines. If a missing jag is still being referenced, or a disable structure causes a wire to be bad and causes errors, ...
The disable issue is problematic to the original thread because most robots are in disable when they are being reimaged or reprogrammed. If disabled robots are throwing errors they take longer to respond and sometimes require a NoApp switch or similar. Making the disable code less CPU intensive due to errors seems like it will resolve many of these issues. I don't think disable has any impact in your robot's twitch.
Greg McKaskle
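
To make the link between missed deadlines and the momentary motor cut-outs concrete, here is a small conceptual model in Python. It is only a sketch of how a feed-before-expiration timer behaves, not the actual Watchdog or Safety Config implementation; the class name, method names, and millisecond values are hypothetical.

```python
class SafetyTimer:
    """Conceptual model: outputs stay enabled only while the timer keeps being fed."""

    def __init__(self, expiration_ms):
        self.expiration_ms = expiration_ms
        self.last_fed_ms = None  # never fed yet

    def feed(self, now_ms):
        """Record that the control loop updated its outputs at time now_ms."""
        self.last_fed_ms = now_ms

    def outputs_enabled(self, now_ms):
        """Return True if the last feed happened within the expiration window."""
        if self.last_fed_ms is None:
            return False
        return (now_ms - self.last_fed_ms) <= self.expiration_ms
```

In this model, a loop that feeds at 0 ms and then stalls until 160 ms with a 100 ms expiration sees the outputs drop at some point after 100 ms and come back on the next feed, which matches the pattern of motors switching off and right back on.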
|
#6
Re: 100% CPU usage and double timeout bug
The closest thing I've seen to what you describe is when we deploy code from a given computer while tethered and then, while the code is running, disconnect the tether. If we reconnect the tether cable, we are unable to redeploy code until we reboot the cRIO or power-cycle the robot.
I cannot say that I've ever seen a situation exactly like what you describe. Once the cRIO times out, it won't do anything for us until we reboot it.
Regarding the disabled mode and not deploying, we ran into the same problems with problematic deploys ALL the time last year (2011). What we found was that because we had every sensor, data accumulator, etc. enabled in disabled mode (including vision) so that we could debug, it pushed the CPU too high to do a successful deploy.
We now encase all of our disabled code in a single If/Then case connected to a Button on the front panel. The default position of that button is OFF, so when we deploy to the robot permanently none of the extraneous sensor stuff runs in disabled mode. When we temporarily deploy from the programming computer, we turn the button on when we need the data to tune things, then turn it back off to deploy so there is no code running in disabled mode.
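
The button-gated approach described above can be sketched in Python as follows. This is an illustration of the pattern only, not the team's LabVIEW code; the class, attribute, and task names are hypothetical.

```python
class DisabledMode:
    """Runs debug-only tasks in disabled mode, but only when a flag is switched on."""

    def __init__(self, debug_tasks, debug_enabled=False):
        self.debug_tasks = list(debug_tasks)
        # Defaults to OFF, like the front-panel button, so a permanent deploy
        # runs no extra sensor or vision work while the robot is disabled.
        self.debug_enabled = debug_enabled

    def periodic(self):
        """Call once per disabled-mode loop; returns the debug readings, if any."""
        if not self.debug_enabled:
            return []
        return [task() for task in self.debug_tasks]
```

Usage would look like `mode = DisabledMode([read_gyro, read_vision])`, with `mode.debug_enabled = True` set only while tuning and switched back off before the next deploy. Here `read_gyro` and `read_vision` stand in for whatever debug reads a team actually has.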
|
#7
Re: 100% CPU usage and double timeout bug
Quote:
-- Len
|
#8
Re: 100% CPU usage and double timeout bug
I'm a little late in posting about this, but here's a follow-up on our issue.
Our main programmer deleted all of the code related to the no-longer-present CAN Jaguar. I verified that this code was covered by Disable blocks, but in some cases it still had wires coming or going. All of the problems and all of the watchdog errors we were having went away!
Now, our diagnostic log on the DS is completely empty, except for a new message that references the watchdog. Again, we are not in any way calling any watchdog functions in our team code. The new solitary error code is as follows:
Watchdog Expiration: System 1, User 0
Because the robot is now working great, I'm of the opinion that this error doesn't matter. At the same time, I am a little concerned because we shouldn't have any error for a function we aren't calling.
-- Len
|
#9
Re: 100% CPU usage and double timeout bug
Quote:
|
|
#10
Re: 100% CPU usage and double timeout bug
That single system watchdog error is expected. It is an artifact of the timing when the robot transitions from disabled to enabled. Don't worry about it.
|
|
#11
Re: 100% CPU usage and double timeout bug
Quote:
Quote:
-- Len