Still having some watchdog issues
We just updated to LV2.1 and dashboard1.1; the documentation from NI said that this should solve the watchdog glitches. For us, it has helped but not completely eliminated the problem: occasionally a solenoid will, for a brief instant, return to its default position. On the dashboard we can see several watchdog timeouts.
This isn't a huge issue; it's fairly uncommon and doesn't affect the bot much. But it's a bit worrisome. Does anyone have any idea what else could be causing watchdog errors like this? Thanks |
Re: Still having some watchdog issues
This is a tough problem which my team had when it came to autonomous. Many of our autonomous programs were not working because of the watchdog. Mine was the only one that worked, and it was the simplest one. Try making the code simpler.
My team's chief programmer managed to fix one other program, but it was unclear how that was done, even to him. |
Re: Still having some watchdog issues
Quote:
Ours is a problem in both autonomous and teleop, and it doesn't really cripple the code at all; it just occasionally startles us with an unexpected solenoid twitch. The only reason I'm at all worried about it is that it could mess with the timing that prevents us from breaking the rules about movement outside of the bumper perimeter (rule G30a). |
Re: Still having some watchdog issues
The update fixed the issues we knew of with the watchdog, but this is one of those where until all contributing factors are accounted for, it isn't completely fixed for everyone. I'm working with a team who still sees the watchdog glitches. If you see any patterns, please let us know. My current theory has to do with the virus scanning, but I need to show the data to some other people, then figure out various ways of addressing it, etc.
Greg McKaskle |
Re: Still having some watchdog issues
Our team is having the same problem. We simply turn the bot on in TeleOp and it will randomly cause the compressor to turn off for a split second and simultaneously cause all of our solenoids and pistons to twitch. We thought it might be a watchdog error because when we put the robot up on blocks and drove it forward for a while, we didn't notice any "ghost fires". I'm guessing that perhaps when no one is touching the joysticks, the watchdog goes off. I can't say for sure, though.
|
Re: Still having some watchdog issues
I want to point out to teams that are experiencing occasional watchdog timeouts, and think it is not a big deal because they only see a short glitch for a fraction of a second, that at a competition it becomes a HUGE deal. The field control system will shut down communications between your operator controls and the robot on just one watchdog error, and you will have a dead robot for the rest of the match. We saw this in multiple matches at the Suffield Scrimmage (Suffield HS, CT), where an official FIRST field control was used. I have a suspicion that control can be regained by restarting your dashboard program, but this would take a while and is not a very good option during a match.
We were able to resolve our watchdog errors by removing all excess LabVIEW code that we were not using (such as camera, gyro, and solenoids that were programmed but whose attachments were not yet made on our robot), and we ran for a few matches without errors. Greg, thanks for the lead on the virus scanning; I think we had it turned off, but I will look into this. For us it was usually System Watchdog errors. I've monitored Classmate CPU usage at 50-80% when errors occurred, and have seen it at 100% (while shutting down LabVIEW when the dashboard was running) without errors occurring then. Merle Yoder GRUNTS Team #3146 |
Re: Still having some watchdog issues
For the record, we too are struggling with what appear to be watchdog issues. They manifest themselves as lags both with control assertion and de-assertion. We have the latest updates applied, but it did not address the problems. We have stripped our code down and sprinkled watchdog triggers in the code. After reading the previous post, we are concerned that this will become a fatal problem when we get to see a real Field Control system in competition.
Any thoughts would be appreciated |
Re: Still having some watchdog issues
Our situation:
We have 2 cRIOs: one on the practice bot, one on the comp bot. We have been using the practice bot with no problems (no disables when we get the watchdog messages). Today we wired up the comp board and put it on the comp bot. Before we connected the Victors to the drive motors, we powered everything up and downloaded the code to confirm function. Every time we see a new "watchdog expired" message, all the Victors "blink" for a second. The odd part is, they don't blink to disabled. Sometimes they blink to red, sometimes to green... the blink is 100% irregular. We tried a new battery, rewired all the power wires to confirm no clamping on insulation, wiggled every circuit on the board to attempt to isolate the problem, and unplugged every PWM except one, and still had the problem on that one. We just reimaged to the newest image last night, plus the newest DS, and we're going to try again tomorrow before ship. However, I found it very odd that with the same code and the same DS the cRIOs are acting differently, one blinking the Victors during the watchdog expirations and one not. |
Re: Still having some watchdog issues
Quote:
After Greg's post, we just decided we'd have to live with it. Tomorrow is ship day, and we have no testbed. I guess we'll have to pray for a miracle on unpack/practice day! :/ Maybe it's the cRIO version? We just imaged the only version that showed up on the list of options; are there more? EDIT: As for dumping bits of the code, we are using everything except the periodic tasks loop... Do you have any idea what part of the code it is? Maybe the camera feed is causing a delay or something? |
Re: Still having some watchdog issues
I do not believe that a watchdog occurrence will cause a robot to be disabled for remainder of a match. The FMS really doesn't even know whether the robot is watchdogged or not. Can anyone confirm this? Were there other circumstances such as an FTA disabling the robots?
Greg McKaskle |
Re: Still having some watchdog issues
Still having the same issues, even with code that has no timing in it. This error is causing our robot to disable pretty rapidly sometimes and slowly at others. We believe that this error will affect our on-field piloting.
|
Re: Still having some watchdog issues
If you are still having issues with watchdogs, please unplug the I/O module and determine what effect this has. We are still investigating the cause, but it appears that in some setups, the I/O board can cause excessive driver activity which can delay the loop execution and lead to a system watchdog.
Please post one way or the other as to whether this has an impact. Also, it may be useful to test with a different I/O module, different USB cable, etc. Greg McKaskle |
Re: Still having some watchdog issues
We are also having watchdog issues. We have two Fisher-Price motors on 2 separate relays, both 20 amp. When trying to run them with a heavy load, we either blow the Spike fuse or we get a watchdog error. The compressor is also stuttering, also because of the watchdog.
Under diagnostics, the DS shows: "Watchdog Expiration: System XX, User YY", where XX and YY are unrelated numbers that only go up or stay the same. |
Re: Still having some watchdog issues
Greg,
Quote:
Quote:
Where can we learn more about the FMS? (Is that the name for the central control?) |
Re: Still having some watchdog issues
Quote:
What FMS responds to is that Stop button. On the real field the E-stop button is quite a bit more robust and hitting that WILL disable your robot for the remainder of a match. |
Re: Still having some watchdog issues
Quote:
Those motors draw much more than the 20 amps a Spike is good for. |
Re: Still having some watchdog issues
Quote:
As I understand it, there are two watchdogs: the User watchdog in the code and the System watchdog on the cRIO. I have no idea which watchdog is causing this error. You say the FMS doesn't care about the User watchdog, which is good news for my team. Is the same true for the System watchdog? |
Re: Still having some watchdog issues
Since watchdogs are timing issues, you might try this approach to turn off auto error logging. It may help.
|
Re: Still having some watchdog issues
Since watchdogs are timing issues, you might try turning off auto error logging. It may help. Look at the post titled:
Turning off automatic error logging in LabVIEW may improve loop timing |
Re: Still having some watchdog issues
User and System watchdogs look the same unless you are the safety code that knows which timer timed out.
Greg McKaskle |
Re: Still having some watchdog issues
Quote:
FMS only kills permanently on an E-stop. After an E-stop the robot MUST be reset to run again. The E-stop can be engaged at the Driver Station by the drive team or a referee. It can also be engaged at the FMS station at the scorers table at the discretion of the FTA or Head Ref. |
Re: Still having some watchdog issues
That definitely explains why the fuses are blowing, thanks, but what about the timing out on the compressor? I can't figure out why it would be stuttering, because it should be continuous. And the relays should be operating just fine even when not under load, but they still time out.
Is there a limit to how much code we can put in teleop to prevent it from running over time? Or is it all electrical? |
Re: Still having some watchdog issues
Ok. We've fully updated. We're back to seeing the "expired" watchdog errors. We let the bot run in teleop (non-moving) enabled. It does not happen in disabled. It sat there for 26 minutes, and we encountered 96 watchdog expired messages. Every time one came through, it disabled the vics/spikes on the board for a split second.
We'll try the above mentioned change to improve loop timing, but I sure wish there was a way to diagnose this back to the root cause. Correct me if I'm wrong, but if it's my code shouldn't it be happening a wee bit more often than once every 15-20 seconds? I have NO loops inside my code at all, other than the vision code (default, no mods). The vision code is currently turned off. I'm going to try loading a default code tomorrow and see what happens. |
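The "once every 15-20 seconds" estimate can be checked with a little arithmetic. The sketch below is only an illustration: the function names are made up for this example, and the 20 ms loop period is an assumption taken from the Driver Station packet rate mentioned later in the thread.

```cpp
#include <cassert>

// Average spacing between events, in milliseconds (integer division).
long mean_interval_ms(long total_ms, long events) {
    assert(events > 0);
    return total_ms / events;
}

// Approximate number of loop iterations that run between two events,
// for an assumed loop period in milliseconds.
long iterations_between(long interval_ms, long period_ms) {
    assert(period_ms > 0);
    return interval_ms / period_ms;
}
```

With the figures from the post (26 minutes, 96 messages), `mean_interval_ms(26L * 60L * 1000L, 96)` gives 16250 ms, a little over 16 seconds. At an assumed 20 ms period that is roughly 800 clean iterations between expirations, which fits the poster's point that a plain code problem would probably show up more often.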
Re: Still having some watchdog issues
I realize this is the LabView forum, but we are having a very similar problem in C++. In fact, it can be duplicated with a very simple (<50 line) C++ program (with no Watchdog object at all). The symptoms are: every 30 seconds, or so, all the motors, servos and solenoids on our robot turn on and off for about 100-500 ms. This happens repeatedly over a 2-3 second interval, after which everything is fine for another 30 seconds - 1 minute (and then the whole sequence repeats).
When this "chatter" occurs, we get a Watchdog timeout message on the driver station. We get this same problem when using the out-of-the-box SimpleRobot sample code in WindRiver. It happens with a much simpler sample as well (no Watchdog object instantiated at all). It happens on multiple cRIOs. It happens even if we raise our Task Priority to 1. We are running out of ideas on debugging this. If anyone has any ideas on this they would be GREATLY APPRECIATED! |
Re: Still having some watchdog issues
LyraS,
You could try running it from a different computer. Some people had some luck with that on the NI forum. I can't test that myself though :( |
Re: Still having some watchdog issues
In three matches at the Suffield Scrimmage our robot became disabled during the match. After the second time I had the field software guy come over (sorry, I don't remember his name); he looked at our system and told me that since we lost communication with the robot due to a watchdog error, the FMS would lock us out until it could regain communication. We would get an "FMS Locked" display instead of "FMS Connected". The only way to get rid of the Locked mode was to reboot our dashboard. We never tried to do this during a match.
After the second time we were disabled, I removed all of the LabVIEW default code that we were not using (camera, gyro, autonomous), with the exception of the T/F code for gyro control in the Teleop block (I just didn't have time to deal with that code). We then ran fine for 2 rounds, but in our next round we became disabled again. This time I found many, many errors listed for the gyro having no refnum (I had removed it from the Begin block). My theory: one of the students had accidentally pressed button 3, which activated the gyro code block that I didn't remove, and it appears this error kept coming up until finally a watchdog error was tripped and our robot was disabled. I did verify after that round that the field people did not disable us (a part had fallen off our bot, so the students thought we had been disabled). So, from my perspective, I believe the watchdog error was shutting us down. Is it possible to have this tested? Is someone at FIRST able to create a System watchdog error to see if the FMS stops operator communication by going to FMS Locked mode? Greg, BTW, we did have the virus software enabled, wifi was turned off on the Classmate, we use LabVIEW, and we do not use the PSoC I/O board. We have kept our Classmate and cRIO if there are any tests you would like us to try (I feel we have resolved our watchdog timer issue, though, by eliminating unused code). Merle Yoder GRUNTS Team #3146 |
Re: Still having some watchdog issues
Just to better explain the states the DS goes into for a field, because you don't see them that often.
The DS is always accepting a field connection. If the field shows up, some UI elements hide and the field is in control of the robot mode, the alliance color, etc. Once the DS has attached to the field, you are bound to it. If the field disappears, the robot goes to FMS Locked state until either the field comes back or the DS application is restarted. This is to prevent anyone driving the robot without field supervision. If your DS said FMS locked, that means that the ethernet cable connection to the field was no longer connected. It may be worthwhile hard connecting to the robot, going to the Diagnostics tab and slowly wiggling the connector and cable to see if you have a flaky laptop connector. Greg McKaskle |
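The states described above can be sketched as a small state machine. This is only a reading of the description in the post, not the Driver Station's actual implementation; the type and function names (`FmsState`, `FmsEvent`, `next_state`, `robot_may_run`) are hypothetical.

```cpp
#include <cassert>

// Hypothetical state names; "Locked" corresponds to the "FMS Locked" display.
enum class FmsState { NotAttached, Connected, Locked };

// Events the sketch reacts to.
enum class FmsEvent { FieldAppeared, FieldDisappeared, DsRestarted };

// Returns the next state for a given event, following the description above.
FmsState next_state(FmsState current, FmsEvent event) {
    switch (event) {
    case FmsEvent::FieldAppeared:
        // The DS always accepts a field connection, including from Locked.
        return FmsState::Connected;
    case FmsEvent::FieldDisappeared:
        // Losing the field only matters once the DS has attached to it.
        return current == FmsState::Connected ? FmsState::Locked : current;
    case FmsEvent::DsRestarted:
        // Restarting the DS application clears the binding to the field.
        return FmsState::NotAttached;
    }
    assert(false && "unknown event");
    return current;
}

// Assumption: driving is blocked only while locked; when connected,
// the field decides whether the robot is enabled.
bool robot_may_run(FmsState state) {
    return state != FmsState::Locked;
}
```

In this reading, a watchdog timeout does not appear as an event at all: the only route into the locked state is losing the field connection after having attached to it.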
Re: Still having some watchdog issues
Hi, we have a somewhat similar issue in C++.
We wrote a nice, long autonomous code that works perfectly. Everything works as coded. But occasionally when we switch from autonomous to teleop to reset the robot after testing the autonomous, the robot keeps going as if it were still in autonomous mode, even though the Classmate clearly states that it is in teleoperated mode. Other functions of the robot go crazy, such as our kicker and such. No watchdog errors, nothing. Just the robot running in a straight line kicking at thin air. |
Re: Still having some watchdog issues
That sounds like a missed user check for the current robot mode.
Code:
while (IsAutonomous() )
That means you must constantly check to see what mode the robot is supposed to be in and act on that information; otherwise, your autonomous code looks no different from your Teleop code and the system doesn't go out of its way to stop it. If you want to run autonomously during Teleop, no one will stop you. Likewise, if you really want to run your Teleop code during Autonomous, it can be run. For instance, if you mistype a command in Autonomous and tell the robot to Wait for 3 minutes, then you'll never get to drive. If you followed the default framework and Autonomous happens to end early, then you won't notice a problem, but if autonomous runs longer than 15 seconds, then it'll encroach on Teleop time and keep running. |
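A minimal, library-free sketch of that pattern follows. It assumes the autonomous routine can be broken into short, non-blocking steps; `run_autonomous`, `step`, and `max_steps` are hypothetical names for this example, and `is_autonomous` stands in for the `IsAutonomous()` call shown above.

```cpp
#include <cassert>
#include <functional>

// Runs autonomous steps only while the mode check still reports autonomous.
// `step` performs one short slice of the routine and returns false when the
// routine has finished. Returns the number of steps executed.
int run_autonomous(const std::function<bool()>& is_autonomous,
                   const std::function<bool()>& step,
                   int max_steps) {
    assert(max_steps >= 0);
    int executed = 0;
    // Re-check the mode before every step so a switch to teleop is noticed.
    while (executed < max_steps && is_autonomous()) {
        ++executed;
        if (!step()) {
            break;  // the routine finished on its own
        }
    }
    return executed;
}
```

The point of the design is that no single step blocks for long: a long `Wait` inside one step would delay the next mode check by the same amount, which is the failure described in the post.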
Re: Still having some watchdog issues
Fact wanted:
Can I delete the watchdog already in the default code? Will it help with some of the problems mentioned? Is it legal? Is it permissible to feed it in the periodic tasks VI loops, to start feeding it in the Begin VI, and to feed it in both independent and iterative autonomous? Other random question: when unbundling the joystick, it says there are 12 buttons, whereas I can only find 11 on the joystick itself. I may have answered my own question: in teleop, the default code says I can "0" my gyro with button #2. Is it possible and permissible to change that? Which joystick does it normally read? Could it be a problem if I have actuators on that same button? |
Re: Still having some watchdog issues
You can change whatever you desire in your code. It's legal.
The framework is just a starting point, and some teams don't use it at all. Teams that are not using the gyro or camera to track should rip that code out entirely. It's just a possibly useful example. The user watchdog is not required. It is intended to be a benefit in helping locate user code that takes too long. Used correctly, it can help protect you from unintended, uncontrolled rampant behavior by a robot. Used incorrectly, it can facilitate unintended, uncontrolled rampant behavior by a robot. If it is a hindrance then you certainly can remove it. I'm not a particular fan of it myself, because if you can correctly implement it, then you probably don't need it. If you cannot correctly implement it, then it'll be the cause of more heartache than saving grace. Alternate joysticks have more buttons, just as they have up to 4 more axes than the KOP joystick. |
Re: Still having some watchdog issues
Thanks a whole bunch, my life just became a lot easier!
Thank you, thank you, THANK YOU!!! |
Re: Still having some watchdog issues
Ok - another update.
1. We are not using the Cypress board.
2. We have now disabled the boolean that allows error reporting.
3. We have gone through our code and cleaned out every unused global, .vi, etc.
4. The diagnostic screen shows absolutely NO errors (other than the old camera error on startup every once in a while: one HTTP error and never seen again).
5. We do not have a single loop in our code, other than what comes in the default camera code.
6. We have removed the image-save function and all the graphs from the dashboard to minimize Classmate CPU time. We have only the camera image and the driver station portion running.
Upon doing this, enabling the robot, and letting it run, we are seeing approximately 1 watchdog expiration every 6 minutes, where the robot will have a temporary "twitch". We've gone to the effort of replumbing and wiring the bot so that when everything dumps to its disabled state it does NOT fire the shooter latches or pneumatics.
Tomorrow we're going to start timing our code with the timer VI. However, I suspect more and more that this is not a code issue but rather the Classmate CPU deciding to do something. For instance, I noticed yesterday that when the driver station blanked the screen for sitting too long, it generated 3 watchdog expirations in very rapid succession (all in a second or two). If we have no further success tomorrow, I'm going to make sure that System Restore, the virus scanner, fast file indexing, automatic updating, and the BITS service are all turned off permanently on the Classmate to see what that does. I'm also going to change the battery management setting on the Classmate to maximum performance. |
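For teams on C++ who want to time their code in the same spirit as the timer VI, a rough sketch is below. It is not taken from any poster's code: `measure_ms` and `find_overruns` are hypothetical helpers, and the budget you compare against (for example 20 ms) is your own assumption about how long one loop iteration may take.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Measures how long one call to `work` takes, in milliseconds.
double measure_ms(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Returns the indices of iterations whose duration exceeded `budget_ms`.
// `durations_ms` holds one measured duration per loop iteration.
std::vector<std::size_t> find_overruns(const std::vector<double>& durations_ms,
                                       double budget_ms) {
    assert(budget_ms > 0.0);
    std::vector<std::size_t> overruns;
    for (std::size_t i = 0; i < durations_ms.size(); ++i) {
        if (durations_ms[i] > budget_ms) {
            overruns.push_back(i);
        }
    }
    return overruns;
}
```

If the logged durations never exceed the budget while watchdog expirations still appear, that would support the suspicion that the delay comes from outside the robot code.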
Re: Still having some watchdog issues
This is from http://decibel.ni.com/content/docs/DOC-2957 :
Quote:
One can easily imagine that various system (or application) processes within Windows could occasionally interfere with TCP packet transmission. Windows is not a hard-real-time OS. ~ |
Re: Still having some watchdog issues
Has anyone called NI on their hotline or posted a discussion on their forum about this exact reason? I think this thread is excellent, as there certainly is a repeating problem here that they need to know about.
We too have had this problem (our robot is shipped now), but don't really have anything new to report. Sometimes we would have to cycle power to the robot to get it to cooperate. If this helps, we had a strong lead that it was a power issue because the connection from our battery was bad. But it still happened after that was fixed. I have just read through the whole thread and it does not seem to be a code issue. Teams have removed all offending items (loops, unused code, etc.) and this does not seem to fix the problem. What interests me is the lead about the Classmate slowing things down. People have mentioned turning off virus protection, and that going into screen saver promptly fires off some watchdog errors. Tom Line also posted other information shortly before me that supports this. I think this looks like it's worth researching. Thanks, Tom, for saying that you and your team will. One thing I'd like to mention is that the user watchdog gets tripped/sends an error if it's not fed for half a second (by default). To me, that sounded like a long time in terms of programming. The code loops a couple hundred times per second (correct me if I'm wrong on that, I might be), and before a watchdog trips the robot is still operable (as seen in past occurrences). One would notice if, say, a race condition was holding up the code for a whole half of a second. I don't know what to think of that, other than that it's interesting and does point to the Classmate. |
Re: Still having some watchdog issues
Quote:
The 20ms is established by the Driver Station, which sends teleop (and autonomous iterative) data packet to cRIO every 20ms. The teleop (and autonomous iterative) task is triggered by receipt of data packet. ~ |
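To put the two numbers from this part of the thread side by side (the half-second user watchdog default mentioned earlier and the 20 ms packet period above), here is a toy model. `ToyWatchdog` is a hypothetical class written for this example and is not the WPILib `Watchdog`; timestamps are passed in by the caller so the behavior is easy to follow.

```cpp
#include <cassert>

// A toy watchdog driven by caller-supplied timestamps in milliseconds.
class ToyWatchdog {
public:
    explicit ToyWatchdog(long timeout_ms) : timeout_ms_(timeout_ms) {
        assert(timeout_ms > 0);
    }
    // Record that the user code checked in at time `now_ms`.
    void feed(long now_ms) { last_fed_ms_ = now_ms; }
    // True when more than the timeout has passed since the last feed.
    bool expired(long now_ms) const { return now_ms - last_fed_ms_ > timeout_ms_; }

private:
    long timeout_ms_;
    long last_fed_ms_ = 0;
};

// How many whole loop periods fit inside the timeout.
long periods_within_timeout(long timeout_ms, long period_ms) {
    assert(period_ms > 0);
    return timeout_ms / period_ms;
}
```

Under these assumed values, `periods_within_timeout(500, 20)` is 25, so about 25 consecutive 20 ms iterations would have to be delayed or missed before a half-second watchdog expires. That is consistent with the earlier observation that half a second is a long stall for code that normally runs every packet.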
Re: Still having some watchdog issues
Also - Greg and the other folks at NI have been hard at work on this issue. The last update put up on the web for the cRIO image etc. greatly helped the problem for us (from hundreds of expirations to only a handful in a half-hour).
Unfortunately, our team is in total crunch time, and any non-essential testing that doesn't involve actual function of some component of the robot is falling by the wayside while we hammer out our mechanical issues on the practice bot and prepare for the all-mighty 6 hour fix-it window we have. |
Re: Still having some watchdog issues
Quote:
Ok, thanks Tom for what you are doing!!! |
Re: Still having some watchdog issues
Just an update:
It turns out that we hadn't installed the new cRIO image from the LabVIEW 2.1 update, but the watchdog errors still appeared even after fixing this. Regardless, the FMS at the Oregon Regional did not disable our robot, and we had no problem competing. |
Re: Still having some watchdog issues
Have any of you teams gotten any lead on this? This thread has been pretty silent. We would if we could, but we can't do much problem solving here since we don't have a second control system. Just wondering because I've been following this issue.
|
Re: Still having some watchdog issues
I can report that during the 2 competitions that we just finished, there was not a single incident of our Classmate computer or our robot having comms problems on the field.
We did have ONE incident where the joysticks decided not to work. Pressing F1 on the driver station to refresh them solved the issue (they were also unplugged and replugged quickly). We continue to have expirations. Sometimes they get VERY bad (dozens a minute). When that happens, a reboot of the Classmate drops them back down to a handful. They do not, however, affect our field operation. Greg, if you'd PM me your email I'll zip up our code and send it to you. |
Re: Still having some watchdog issues
Thanks Tom for the update!
Quote:
|
Re: Still having some watchdog issues
Our team is having similar problems, though we're not sure if it's the watchdog. Browsing through this thread, we think it might be, but I didn't see a final solution. We're getting a stutter approximately every .5 seconds (with some variation) when we call Compressor.setRelayValue(Relay.Value.kOn) each iteration. If we only call the function once, we get a short burst of compressor noise and then it shuts off. Any idea as to what might be messing with the relay?
Thanks so much! We're in a crunch at crunch time! |
Re: Still having some watchdog issues
We are still having this problem at the Los Angeles regional. After combing through most of our code, we simply decided to start commenting out parts. The camera (only sending an image directly to the dashboard) was one of the main issues. We are currently trying to fix this but we may simply run day 3 without the camera because the hiccups give us a lot of penalties for leaving our kicker out. We are using C++ and have a fairly extensive system architecture, so that interaction with the camera may be part of the problem.
|
Re: Still having some watchdog issues
For anyone experiencing periodic system watchdog errors please see my posting here:
http://www.chiefdelphi.com/forums/sh...661#post943661
I have found that our problem with multiple system watchdog errors per minute was being caused by some issue with Intel's SpeedStep power saving scheme (on the Atom processor of the Classmate). By going into the BIOS and disabling the SpeedStep and C-State options I was able to eliminate our errors. We do still have a system watchdog error when transitioning from Autonomous Enabled to Disabled and from TeleOp Enabled to Disabled... more investigating to do. Merle Yoder The GRUNTS Team #3146 |
Re: Still having some watchdog issues
Thank you so much Merle!!!!!!
|
Copyright © Chief Delphi