Chief Delphi

Chief Delphi (http://www.chiefdelphi.com/forums/index.php)
-   C/C++ (http://www.chiefdelphi.com/forums/forumdisplay.php?f=183)
-   -   Serious bug identified in SmartDashboard/NetworkTables -- robot hangs (http://www.chiefdelphi.com/forums/showthread.php?t=126102)

Aren Siekmeier 10-03-2014 23:22

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Quote:

Originally Posted by NotInControl (Post 1356714)
... we use Java ...

We switched from C++ to Java after seeing our issues, and while it was hard to reproduce with C++, everything we have tried so far with Java has shown no sign of the problem.

NotInControl 11-03-2014 18:12

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Quote:

Originally Posted by compwiztobe (Post 1357067)
We switched from C++ to Java after seeing our issues, and while it was hard to reproduce with C++, everything we have tried so far with Java has shown no sign of the problem.

That is interesting. I have to admit I have not yet confirmed this to be the cause of the symptoms I saw on our robot in the pits this past weekend.

The robot in question is now bagged, however I will be trying to recreate these problems on our practice bot over the next few days.

The symptoms expressed in this thread were very similar to the symptoms we saw which is why I think this bug may be a suspect.

However, we have always had all of our SmartDashboard calls in a separate thread that gets started on robot init. The reason for this is to reduce the number of writes per second.
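
A minimal sketch of the pattern described above: dashboard writes are moved onto their own thread and issued at a fixed period, so the main robot loop never performs them directly. The class name ThrottledPublisher and the publish callback are hypothetical, and std::thread is an assumption made for illustration; the 2014 cRIO toolchain would need its own task API instead. The real SmartDashboard calls would go inside the callback.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper (not part of WPILib): runs `publish` on a background
// thread once per `period`, which caps the number of writes per second.
class ThrottledPublisher {
public:
    ThrottledPublisher(std::function<void()> publish,
                       std::chrono::milliseconds period)
        : publish_(std::move(publish)), period_(period) {
        assert(publish_ && period_.count() > 0);
    }

    ~ThrottledPublisher() { Stop(); }

    // Starts the background thread; a second call while running is a no-op.
    void Start() {
        if (running_.exchange(true)) return;
        worker_ = std::thread([this] {
            while (running_.load()) {
                publish_();                             // all dashboard writes happen here
                std::this_thread::sleep_for(period_);   // always yield between writes
            }
        });
    }

    // Signals the thread to exit and waits for it to finish.
    void Stop() {
        running_.store(false);
        if (worker_.joinable()) worker_.join();
    }

private:
    std::function<void()> publish_;
    std::chrono::milliseconds period_;
    std::atomic<bool> running_{false};
    std::thread worker_;
};
```

With a 100 ms period the callback runs at most about ten times per second, regardless of how fast the main loop iterates. The unconditional sleep also keeps the publishing thread from spinning.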

The only SmartDashboard call I have that runs in the same thread as the main robot loop is our autonomous SendableChooser, which runs in the disabledPeriodic() block.

We are going to do testing with and without this function call to see if we can get the robot to hang again. During our quick diagnostics in the pits, the only way we could re-establish full comms was by restarting the robot, and the driverstation/dashboard. Doing just one or the other was not enough to correct the problem.

I am more concerned with preventing the robot from hanging than with having my dashboard work.

We have never seen this problem on the field, as our standard practice is to shut the robot off and exit all dashboard/driver station windows prior to every match.

JamesTerm 11-03-2014 22:30

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Quote:

Originally Posted by JamesTerm (Post 1356495)
I believe the last issue remaining deals with the time it takes to connect to the time it takes to make the initial first write. I'm thinking of putting a sleep in there as well as taking a closer look at ConnectionMonitorThread::run()... I suspect this thread may not be sleeping in some cases... but I could be wrong.

No, no, no... I was wrong. I found this bug! grrrrr, ok it's now fixed... same place to get the zip and patch. Here's the full patch including Dustin's fixes. https://www.dropbox.com/s/f4mcx9x0hj...tinPatch.patch

I'll explain... this latest fix is mostly for the client-side code. What was happening is that while the server (robot) had lost its connection, e.g. while the cRIO was rebooting, the client code was still trying to issue reads and throwing exceptions. The fix knows when the connection has been closed and when the reconnect has been issued, so during that time it stops issuing reads. On some platforms (e.g. win32) the read would return bogus data, which is another issue, but the most important thing is that it should only call the read when it knows it should succeed.
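
The idea can be illustrated with a small sketch. This is not the patch itself: GuardedReader, LinkState, and the callback names are hypothetical, chosen only to show a reader that tracks the connection state and skips the read entirely while the link is closed or a reconnect is pending.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical connection states, for illustration only.
enum class LinkState { Connected, Closed, Reconnecting };

// Hypothetical wrapper: forwards to the raw read only while connected.
class GuardedReader {
public:
    explicit GuardedReader(std::function<std::string()> rawRead)
        : rawRead_(std::move(rawRead)) {
        assert(rawRead_);
    }

    // Called by the connection-handling code as the link changes state.
    void OnClosed()          { state_.store(LinkState::Closed); }
    void OnReconnectIssued() { state_.store(LinkState::Reconnecting); }
    void OnConnected()       { state_.store(LinkState::Connected); }

    // Returns true and fills `out` only when the link is up. While the link
    // is closed or reconnecting, the raw read is never attempted, so it can
    // neither throw nor return bogus data.
    bool TryRead(std::string& out) {
        if (state_.load() != LinkState::Connected) return false;
        out = rawRead_();
        return true;
    }

private:
    std::function<std::string()> rawRead_;
    std::atomic<LinkState> state_{LinkState::Closed};  // closed until told otherwise
};
```

A caller that gets false back should sleep briefly before retrying rather than loop immediately, so that waiting for the reconnect does not turn into a busy loop.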

This has been a week of hair pulling for me... but now I think it is good to go. Of course the key to the success of this (like anything else) is a lot of testing. All of the other fixes are just as important as this one... they are all needed to resolve the issue. I'm looking forward to hearing back from anyone who wants to test it before the official release. Thanks.

Now I'm signing off of this task... and going back to other code. :)

JamesTerm 16-03-2014 10:56

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
For anyone who has been following this thread, I just wanted to say that we tested SmartDashboard and NetworkTables (with the James/Dustin patches) at the Dallas Regional with no issues. I gotta say I felt a little bit of anxiety the first 3-4 matches, but felt more confident as the days progressed... we left the driver station running full time with the SmartDashboard and Driver Station windows always open, which stress-tests cRIO reconnects on existing connections. We also use GetNumber() for the autonomous ball count. It always maintained the correct ball count throughout the day. I am hoping more teams will use this again once these patches are officially released. I'll post back here when they are.

virtuald 19-03-2014 18:17

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Update: FIRST has released an official stable release that should address the problem. It will not be a required update for teams, but if you use NetworkTables I'd highly recommend it. It can be downloaded here: http://first.wpi.edu/FRC/c/update/Stable/

JamesTerm 19-03-2014 18:41

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Thanks Dustin for the posting... I'll keep an ear out here for any issues that may arise.

Joe Ross 19-03-2014 19:42

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Looking through the source, it looks like artf1712 was also fixed, as well as http://forums.usfirst.org/showthread...ive-data-rates

virtuald 22-03-2014 23:53

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
At the Virginia Regional this weekend, I helped out a team using Java that would inexplicably go to 100% CPU and all control would drop out. While there were some definite problems with their code, it turned out that when they commented out all the SmartDashboard code, the problems stopped happening. Very odd.

kylelanman 23-03-2014 02:36

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
We were aware of this bug and attempted to avoid the situation by always power cycling before placing the robot on the field. I am nearly certain we reproduced this bug on the practice field. We had a faulty ethernet cable. During robot power on the connection was intermittent. The dashboard never came to life. We ended up having to hard power cycle the robot and restart the dashboard. Restarting the dashboard may not have been necessary. But we did them in tandem and the dashboard came back to life. Minutes after this we applied the patch and had no NT problems the rest of the regional.

JamesTerm 23-03-2014 09:40

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Quote:

Originally Posted by virtuald (Post 1362815)
At the Virginia Regional this weekend, I helped out a team using Java that would inexplicably go to 100% CPU and all control would drop out. While there were some definite problems with their code, it turned out that when they commented out all the SmartDashboard code, the problems stopped happening. Very odd.

Was this team 2481, and was the 3886 patch already applied before these problems occurred?

To kylelanman: What programming language are you using?

virtuald 23-03-2014 10:23

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Quote:

Originally Posted by JamesTerm (Post 1362903)
Was this team 2481, and was the 3886 patch already applied before these problems occurred?

It was not team 2481, and the team was using Java, so the 3886 patch would not apply.

kylelanman 23-03-2014 23:38

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Quote:

Originally Posted by JamesTerm (Post 1362903)
To kylelanman: What programming language are you using?

C++

JamesTerm 24-03-2014 17:49

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
Quote:

Originally Posted by virtuald (Post 1362919)
It was not team 2481, and the team was using Java, so the 3886 patch would not apply.

Ah ok... I looked over the Java source and its synchronization/locking management is different, and from what I've heard there have been no known issues with it like there were for C++. I just wanted to make sure that the patch does not have any more outstanding issues... and so now I know this patch does not apply to Java teams. If this is the only known issue it could be a red herring. I will however keep an ear out for Java issues too... in case we need to code review it for next season.

Joe Ross 25-03-2014 19:47

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
The update is now available in the release folder as announced in today's team update. http://first.wpi.edu/FRC/c/update/Re...325rev3887.exe

MamaSpoldi 26-03-2014 14:25

Re: Serious bug identified in SmartDashboard/NetworkTables -- robot hangs
 
At the risk of sounding ignorant in this excellent technical discussion...

Am I correct in thinking that even if we do not explicitly perform any NetworkTables operations, we could still be affected by the bug in question and therefore need the update? We use the SmartDashboard only for simple display operations, e.g. calling PutXXX to display values on the dashboard screen. It sounds like these operations use NetworkTables behind the scenes and are therefore subject to this issue. So I wanted to verify whether we need to install the update.

FYI, we are using C++ on the robot and the SmartDashboard on the driverstation.

Also, does this update include a change to the .jar file for the dashboard implementation which runs on the driverstation laptop or just a change to the library code built into the application that runs on the cRIO? I ask so that we know if it needs to be installed on the driverstation as well as the programming laptop.

Thanks.


All times are GMT -5.