Bizarre cRIO Server Errors (FDIO)
Short version: We just got these messages in our cRIO NetConsole after our robot mysteriously stopped accepting inputs. This is a recurring problem.
Code:
write error: : read error: : S_errno_EPIPE
---------------------------------------------
Long version: So, at our week 0 event yesterday, our robot started bricking in the middle of the field. It still had communications and code, but it stopped accepting inputs and stopped driving. The only clue we had as to why this was happening was this message, every 100ms:
Code:
A timeout has been exceeded: RobotDrive... Output not updated often enough. ...in Check() in C:/WindRiver/workspace/WPILib/MotorSafetyHelper.cpp at line 117
Code:
write error: : read error: : S_errno_EPIPE
Code:
write error: : read error: : S_errno_EPIPE |
Re: Bizarre cRIO Server Errors (FDIO)
FDIO is File Descriptor Input/Output. An fd stream is a bunch of data being written to or read from a file, a network socket, or some other I/O channel.
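For anyone wondering what EPIPE means at the file descriptor level, here is a minimal sketch on a desktop POSIX system (not VxWorks, and not WPILib code). It assumes that S_errno_EPIPE in the NetConsole output corresponds to the ordinary POSIX EPIPE errno, which I have not verified on a cRIO. The function name writeToClosedPipe is made up for the illustration. It writes to a pipe after the reading side has gone away, which is roughly what happens when the other end of a connection disappears.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Hypothetical helper: returns the errno produced by writing to a pipe
// whose read end has already been closed (0 if the write succeeded).
int writeToClosedPipe() {
    // Ignore SIGPIPE so write() reports EPIPE instead of killing the process.
    std::signal(SIGPIPE, SIG_IGN);

    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0 && "pipe() failed");
    (void)rc;

    close(fds[0]);  // the "reader" goes away, like a client that disconnected

    const char msg[] = "hello";
    ssize_t n = write(fds[1], msg, sizeof(msg));
    int err = (n < 0) ? errno : 0;

    close(fds[1]);
    return err;
}
```

On Linux, calling writeToClosedPipe() should return EPIPE, since nobody is left to read what was written.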
Are you using camera data in your robot program? The last time I heard of something like this, several years ago, one team said that it happened whenever their camera came unplugged. |
Re: Bizarre cRIO Server Errors (FDIO)
We had the same problem at our scrimmage. Sad to say, we didn't figure out the problem while we were there.
|
Re: Bizarre cRIO Server Errors (FDIO)
Try reimaging. We've seen some really weird errors pop up that were fixed after a reimage.
The fact that the error message came out with every other letter belonging to a different error is really weird. |
Re: Bizarre cRIO Server Errors (FDIO)
Could you be running into this problem? http://www.chiefdelphi.com/forums/sh...d.php?t=126102
|
Re: Bizarre cRIO Server Errors (FDIO)
Quote:
Also, we'll be switching to an 8-slot soon. I'm always somewhat optimistic that the 8-slot instantly solves all these mysterious problems. ;) |
Re: Bizarre cRIO Server Errors (FDIO)
Quote:
Reproducing the error is easy if you replicate the exact circumstances in which the bug occurs. Two things have to happen:
I've found that some wireless cards can induce this behavior more easily than others. For example, our driver station laptop causes this to happen very easily, whereas I find it difficult to randomly get this to happen on my work laptop. I theorize that wireless interference can cause the connection to drop randomly, but VxWorks doesn't always pick up on it, and it hangs. The freeze is caused by the send buffer for the network connection filling up on VxWorks. So, if you send data less often, you will be less likely to run into this problem -- but the potential for a freeze is still there. |
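To make the send buffer point concrete, here is a rough sketch on a desktop POSIX system, using a pipe that nobody reads as a stand-in for a connection whose peer has silently gone away. This is not the Network Tables code and not VxWorks; the function name fillUnreadPipe is made up for the illustration. The descriptor is set non-blocking so the sketch can report the moment the buffer is full. With a blocking descriptor, the same write() would simply never return, which matches the freeze described above.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: writes to a pipe that is never read until the kernel
// buffer is full. Returns the number of bytes accepted before write()
// reported EAGAIN/EWOULDBLOCK, or -1 on an unexpected error.
long fillUnreadPipe() {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    // Make the write end non-blocking so a full buffer is reported
    // as an error instead of suspending the caller.
    int flags = fcntl(fds[1], F_GETFL, 0);
    assert(flags != -1);
    fcntl(fds[1], F_SETFL, flags | O_NONBLOCK);

    char chunk[1024] = {0};
    long accepted = 0;
    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof(chunk));
        if (n > 0) {
            accepted += n;  // data went into the buffer; nobody drains it
            continue;
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            break;          // buffer full: a blocking write would hang here
        }
        accepted = -1;      // some other failure
        break;
    }

    close(fds[0]);
    close(fds[1]);
    return accepted;
}
```

The exact number returned depends on the platform's buffer size. Sending less often only delays the point where the buffer fills; it does not remove it.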
Re: Bizarre cRIO Server Errors (FDIO)
Thanks for the tips. Oddly, I was never able to reproduce the problem when I wanted to. I tried disconnecting the Ethernet cable from our driver station for varying amounts of time, but it seemed to have no effect. Then it would just be sitting there, perfectly tethered in, no one touching the robot or PC, and it would freeze.
I could try to make sure that a PutNumber is constantly trying to write, but I think our robot already does that - we have sensors actively writing to the dashboard whenever teleop is running. |
Re: Bizarre cRIO Server Errors (FDIO)
The last few days we have been seeing problems again, specifically the RobotDrive timeouts. They typically (and somewhat consistently) begin about a minute into the robot being enabled in teleop (regardless of whether auto ran), and we are required to reboot the cRIO because it is completely unresponsive. The SmartDashboard doesn't get any updates from switch states and the robot does not respond to any inputs. NetConsole shows a new RobotDrive safety timeout (the same as the one posted in the first post) roughly every 80ms. This makes sense for an output safety timeout, and suggests that some threads are still in operation while others, particularly the ones that update outputs and the SmartDashboard, are hanging. We have not been seeing any more of the FDIO errors, only RobotDrive timeouts.
We have had other things to test and can still do so by rebooting the cRIO, but we haven't a great idea where to start debugging this problem. We are in the process of rewriting our most basic functionality in other frameworks (both LabVIEW and a C++ IterativeRobot Project) to get away from the Command Based model. It seems likely that our set of commands (all of which seem pretty standard) might be finding some edge case that causes the hang. Does anyone have any more insight into this? If it is network table writes, as has been suggested, why do these time out? Do we really need to be guaranteeing packet delivery on our SmartDashboard updates? It seems something like UDP would be fine in this case, and would avoid timeout issues. |
Re: Bizarre cRIO Server Errors (FDIO)
So I actually read all of that other thread now... it sounds like we are probably seeing the same bug with Network Tables. It looks like the bug is being worked through and a patch is in the works. In the meantime, I think we will also try establishing a separate thread dedicated to the SmartDashboard updates, so that the output threads can still keep up.
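For what it's worth, here is a rough sketch of the separate-thread idea. It is written with C++11 std::thread so it can be tried on a desktop; the cRIO toolchain would presumably need a VxWorks task instead, so treat this as an illustration of the structure only. DashboardPublisher and its publish callback are made-up names. In real robot code the callback would wrap whatever dashboard call is in use, and nothing here is taken from WPILib itself.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper: keeps only the latest value per key and publishes
// from its own thread, so a stalled publish cannot block the control loop.
class DashboardPublisher {
public:
    typedef std::function<void(const std::string&, double)> PublishFn;

    explicit DashboardPublisher(PublishFn publish)
        : publish_(std::move(publish)), running_(true),
          worker_(&DashboardPublisher::run, this) {
        assert(static_cast<bool>(publish_));
    }

    ~DashboardPublisher() {
        running_ = false;
        worker_.join();  // note: waits forever if publish_ never returns
    }

    // Called from the control loop: touches memory only, never the network.
    void put(const std::string& key, double value) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_[key] = value;
    }

private:
    void run() {
        while (running_) {
            flush();
            std::this_thread::sleep_for(std::chrono::milliseconds(20));
        }
        flush();  // publish anything queued just before shutdown
    }

    void flush() {
        std::map<std::string, double> snapshot;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            snapshot.swap(pending_);  // take the batch, release the lock quickly
        }
        for (const auto& kv : snapshot) {
            publish_(kv.first, kv.second);  // may block; only this thread waits
        }
    }

    PublishFn publish_;
    std::mutex mutex_;
    std::map<std::string, double> pending_;
    std::atomic<bool> running_;
    std::thread worker_;  // declared last so it starts after the other members
};
```

The control loop would call put("gyro", angle) each iteration and carry on driving. If the publish call hangs, dashboard values go stale, but the motor outputs keep getting updated. This only hides the hang from the drive code; it does not fix the underlying bug.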
|
Re: Bizarre cRIO Server Errors (FDIO)
Quote:
If you want, I can compile the WPILib binary tomorrow morning and send it your way. |
Re: Bizarre cRIO Server Errors (FDIO)
I'm fine installing the patch as it is now, but I'm not sure what the process for that is. Would I have to download the source and compile it myself for that to work? If so, I would appreciate it if you compiled the binary for us...
|
Re: Bizarre cRIO Server Errors (FDIO)
Steps:
I'd also recommend changing the constant at RobotBase.cpp line 22 to indicate that you've changed the code. It currently reads "C++ 2014 Update 0". You can view this string in your driver station diagnostics, so if you change it then you can know you're running your version of WPILib. I'd give better instructions, but I don't have my Windows computer home with me. |
Re: Bizarre cRIO Server Errors (FDIO)
Note that the full patch is attached to the issue, and the patch in the gist link has been deleted since it's incomplete.
|
Re: Bizarre cRIO Server Errors (FDIO)
Sorry for the delay, today was busier than expected. I've posted a compiled binary to http://firstforge.wpi.edu/sf/go/artf1719 . Unfortunately, I haven't been able to verify it on a cRIO, as I don't have access to one at the moment. However, the original WPILib binary is 13.0 MB and this one is 13.1 MB, so I expect that it should work.
Let me know if this helps your issue! |