#1
Re: Bizarre cRIO Server Errors (FDIO)
Could you be running into this problem? http://www.chiefdelphi.com/forums/sh...d.php?t=126102
#2
Re: Bizarre cRIO Server Errors (FDIO)
Quote:
Also, we'll be switching to an 8-slot soon. I'm always somewhat optimistic that the 8-slot instantly solves all these mysterious problems.
#3
Re: Bizarre cRIO Server Errors (FDIO)
Quote:
Reproducing the error is easy if you replicate the exact circumstances in which the bug occurs. Two things have to happen:
I've found that some wireless cards induce this behavior more easily than others. For example, our driver station laptop triggers it very easily, whereas I find it difficult to get it to happen on my work laptop. I theorize that wireless interference can cause the connection to drop randomly, but VxWorks doesn't always pick up on it and it hangs. The freeze is caused by the send buffer for the network connection filling up on VxWorks. So, if you send data less often, you will be less likely to run into this problem, but the potential for a freeze is still there.
Last edited by virtuald : 23-02-2014 at 16:28.
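A minimal sketch of the buffer-filling effect described above. It uses a local POSIX socket pair on a desktop machine rather than VxWorks or the cRIO network stack, which is an assumption made purely so the example can be run anywhere; the function name `bytesUntilSendWouldBlock` is hypothetical. One end never reads, so the other end's sends are accepted only until the kernel buffer is full.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: counts how many bytes a socket accepts when the
// peer never reads. Returns -1 on any unexpected failure.
long bytesUntilSendWouldBlock() {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
        return -1;
    }

    // Non-blocking mode makes a full buffer visible as EAGAIN/EWOULDBLOCK.
    // A blocking socket would simply stall inside send() at this point.
    int flags = fcntl(fds[0], F_GETFL, 0);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    const long kSafetyCap = 64L * 1024 * 1024;  // stop eventually no matter what
    char chunk[1024] = {0};
    long total = 0;
    while (total < kSafetyCap) {
        ssize_t n = send(fds[0], chunk, sizeof chunk, 0);
        if (n > 0) {
            total += n;  // accepted into the buffer; nobody ever drains it
        } else if (n < 0 && errno == EINTR) {
            continue;    // interrupted by a signal, retry
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            break;       // buffer is full
        } else {
            total = -1;  // unexpected error
            break;
        }
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

The exact byte count depends on the operating system's buffer sizes. The point is only that a sender whose peer has silently stopped reading runs out of buffer after a finite amount of data, and sending less often just delays that moment.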
#4
Re: Bizarre cRIO Server Errors (FDIO)
Thanks for the tips. Oddly, I was never able to reproduce the problem when I wanted to. I tried disconnecting the Ethernet cable from our driver station for varying amounts of time, but it seemed to have no effect. Then it would just be sitting there, perfectly tethered in, no one touching the robot or PC, and it would freeze.
I could try to make sure that a PutNumber is constantly trying to write, but I think our robot already does that: we have sensors actively writing to the dashboard whenever teleop is running.
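Since the sensors write to the dashboard on every teleop loop, one way to act on the earlier suggestion to send data less often is a small rate limiter in front of each write. This is only a sketch: `PublishThrottle` and its method names are hypothetical, and the time source is passed in by the caller so the logic stays independent of any robot API.

```cpp
#include <cassert>

// Hypothetical helper: allows one publish per minimum interval.
class PublishThrottle {
public:
    explicit PublishThrottle(double minIntervalSec)
        : minInterval_(minIntervalSec) {}

    // Returns true if enough time has passed since the last allowed publish.
    // The first call always returns true.
    bool shouldPublish(double nowSec) {
        if (hasPublished_ && nowSec - lastPublish_ < minInterval_) {
            return false;
        }
        lastPublish_ = nowSec;
        hasPublished_ = true;
        return true;
    }

private:
    double minInterval_;
    double lastPublish_ = 0.0;
    bool hasPublished_ = false;
};
```

In a teleop loop the call would guard the dashboard write, for example `if (throttle.shouldPublish(now)) { /* PutNumber(...) here */ }`. As noted earlier in the thread, this only makes the freeze less likely; it does not remove it.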
#5
Re: Bizarre cRIO Server Errors (FDIO)
Over the last few days we have been seeing problems again, this time RobotDrive timeouts. They typically (and fairly consistently) begin about a minute after the robot is enabled in teleop (regardless of whether auto ran), and we have to reboot the cRIO because it is completely unresponsive. The SmartDashboard doesn't get any updates from switch states and the robot does not respond to any inputs. NetConsole shows a new RobotDrive safety timeout (the same as the one posted in the first post) roughly every 80ms. This makes sense for an output safety timeout, and suggests that some threads are still running while others, particularly the ones that update outputs and the SmartDashboard, are hanging. We have not seen any more of the FDIO errors, only RobotDrive timeouts.
We have had other things to test and can still do so by rebooting the cRIO, but we don't have a good idea of where to start debugging this problem. We are in the process of rewriting our most basic functionality in other frameworks (both LabVIEW and a C++ IterativeRobot project) to get away from the Command Based model. It seems likely that our set of commands (all of which seem pretty standard) is hitting some edge case that causes the hang. Does anyone have any more insight into this? If it is NetworkTables writes, as has been suggested, why do these time out? Do we really need to guarantee packet delivery for our SmartDashboard updates? It seems something like UDP would be fine in this case, and would avoid timeout issues.
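For anyone following along, the reason a hung output loop shows up as repeated safety timeouts can be sketched with a simple watchdog. This is not WPILib's implementation: `SafetyWatchdog` is a hypothetical class, and the 0.1 s expiration used in the usage note is an illustrative value only.

```cpp
#include <cassert>

// Hypothetical watchdog: the control loop must call feed() regularly,
// and a separate checker reports a timeout when it stops doing so.
class SafetyWatchdog {
public:
    explicit SafetyWatchdog(double expirationSec)
        : expiration_(expirationSec) {}

    // Called by the loop that updates the motor outputs.
    void feed(double nowSec) { lastFeed_ = nowSec; }

    // Called by an independent checker; true means the outputs are stale.
    bool expired(double nowSec) const {
        return nowSec - lastFeed_ > expiration_;
    }

private:
    double expiration_;
    double lastFeed_ = 0.0;
};
```

With `SafetyWatchdog w(0.1)`, a loop that stops calling `feed()` causes `expired()` to return true on every later check. That matches the symptom above: the checker keeps running and keeps reporting, while the thread that should be feeding it is stuck.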
#6
Re: Bizarre cRIO Server Errors (FDIO)
So I actually read all of that other thread now... it sounds like we are probably seeing the same bug with Network Tables. It looks like the bug is being worked through and a patch is in the works. In the meantime, I think we will also try establishing a separate thread dedicated to the SmartDashboard updates, so that the output threads can still keep up.
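A rough sketch of what a dedicated dashboard thread could look like, using standard C++11 threading rather than the cRIO toolchain's task API (an assumption made so the example runs on a desktop). `DashboardPublisher` and `Sink` are hypothetical names; on the robot the sink would wrap the actual dashboard write. The control loop only stores the latest value per key and returns immediately, and the worker thread is the only one that can block on the network.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

class DashboardPublisher {
public:
    using Sink = std::function<void(const std::string&, double)>;

    explicit DashboardPublisher(Sink sink)
        : sink_(std::move(sink)), worker_(&DashboardPublisher::run, this) {}

    ~DashboardPublisher() { stop(); }

    // Called from the control loop: records the newest value and returns.
    void set(const std::string& key, double value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_[key] = value;  // newer values overwrite unsent ones
        }
        wake_.notify_one();
    }

    // Flushes whatever is pending, then joins the worker thread.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        if (worker_.joinable()) {
            worker_.join();
        }
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
            std::map<std::string, double> batch;
            batch.swap(pending_);
            const bool done = stopping_;
            lock.unlock();
            // The possibly slow or blocking write happens without the lock,
            // so set() never waits on the network.
            for (const auto& kv : batch) {
                sink_(kv.first, kv.second);
            }
            if (done) {
                return;
            }
            lock.lock();
        }
    }

    Sink sink_;
    std::mutex mutex_;
    std::condition_variable wake_;
    std::map<std::string, double> pending_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};
```

This does not fix the underlying bug. If the sink blocks forever, the worker thread is stuck and dashboard updates stop, but the loop that drives the outputs keeps running.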
#7
Re: Bizarre cRIO Server Errors (FDIO)
Quote:
If you want, I can compile the WPILib binary tomorrow morning and send it your way.
#8
Re: Bizarre cRIO Server Errors (FDIO)
I'm fine installing the patch as it is now, but I'm not sure what the process for that is. Would I have to download the source and compile it myself for that to work? If so, I would appreciate it if you compiled the binary for us...
#9
Re: Bizarre cRIO Server Errors (FDIO)
Steps:
I'd also recommend changing the constant at RobotBase.cpp line 22 to indicate that you've changed the code. It currently reads "C++ 2014 Update 0". You can view this string in your driver station diagnostics, so if you change it, you can confirm that you're running your own version of WPILib. I'd give better instructions, but I don't have my Windows computer at home with me.
#10
Re: Bizarre cRIO Server Errors (FDIO)
Note that the full patch is attached to the issue, and the patch in the gist link has been deleted since it's incomplete.
#11
Re: Bizarre cRIO Server Errors (FDIO)
Sorry for the delay, today was busier than expected. I've posted a compiled binary to http://firstforge.wpi.edu/sf/go/artf1719 . Unfortunately, I haven't been able to verify it on a cRIO as I don't have access to one at the moment. However, the original WPILib binary is 13.0 MB, and this one is 13.1 MB, so I expect that it should work.
Let me know if this helps your issue!
#12
Re: Bizarre cRIO Server Errors (FDIO)
Quote:
We have ported all of our code to Java and LabVIEW, and neither of these implementations seems to show the same problem, though we hope to run the bot into the ground a little more over the next few days to tease out any more errors. Some C++ code that doesn't use Commands or the SmartDashboard/NetworkTables is in the works, so testing that may also give some insight. We will probably stick with the Java implementation for the rest of the season.