27-02-2014, 23:11
Aren Siekmeier
on walkabout
FRC #2175 (The Fighting Calculators)
Team Role: Mentor
 
Join Date: Apr 2008
Rookie Year: 2008
Location: 대한민국
Posts: 735
Re: Bizarre cRIO Server Errors (FDIO)

Over the last few days the problem has come back, this time as RobotDrive timeouts. They typically (and fairly consistently) begin about a minute after the robot is enabled in teleop, regardless of whether auto ran, and the cRIO becomes completely unresponsive, so we have to reboot it. The SmartDashboard stops getting updates for switch states and the robot does not respond to any inputs. NetConsole shows a new RobotDrive safety timeout (the same one posted in the first post) roughly every 80 ms. That is what you would expect from an output safety timeout, and it suggests that some threads are still running while others, particularly the ones that update outputs and the SmartDashboard, are hanging. We have not seen any more of the FDIO errors, only RobotDrive timeouts.
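To make the "some threads alive, some hung" reasoning concrete, here is a minimal sketch of how an output safety watchdog behaves when the loop that feeds it stops. This is not WPILib's actual code: `SafetyWatchdog`, `SimulateHang`, the log text, and the 20 ms loop, 80 ms check and 100 ms expiration values are all assumptions chosen for illustration, and time is simulated in whole milliseconds rather than taken from real threads.

```cpp
#include <cstdio>

// Hypothetical stand-in for an output safety watchdog. This is NOT WPILib's
// implementation; names and numbers are illustrative assumptions.
class SafetyWatchdog {
public:
    explicit SafetyWatchdog(long expirationMs) : expirationMs_(expirationMs) {}

    // Called by the control loop whenever it writes a motor output.
    void Feed(long nowMs) { lastFeedMs_ = nowMs; }

    // Called periodically by an independent checker. Returns true and logs a
    // message if the output was not fed within the expiration window.
    bool Check(long nowMs) {
        if (nowMs - lastFeedMs_ <= expirationMs_) return false;
        ++timeouts_;
        std::printf("[%ld ms] output not updated often enough\n", nowMs);
        return true;
    }

    int timeouts() const { return timeouts_; }

private:
    long expirationMs_;
    long lastFeedMs_ = 0;
    int timeouts_ = 0;
};

// Simulated timeline in whole milliseconds: the control loop feeds the
// watchdog every loopMs until it hangs at hangAtMs, while the checker keeps
// running every checkMs until endMs. Returns the number of timeouts reported.
int SimulateHang(long hangAtMs, long endMs, long loopMs, long checkMs,
                 long expirationMs) {
    SafetyWatchdog dog(expirationMs);
    for (long t = 0; t <= endMs; ++t) {
        if (t < hangAtMs && t % loopMs == 0) dog.Feed(t);
        if (t % checkMs == 0) dog.Check(t);
    }
    return dog.timeouts();
}
```

Calling `SimulateHang(60000, 61000, 20, 80, 100)` models a loop that hangs one minute in: the checker is silent while the loop runs, then reports a timeout at every check from 60160 ms onward. A steady stream of messages at the checker's period is therefore consistent with the checker still being scheduled while the code that writes outputs is blocked.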

We have had other things to test and can keep working by rebooting the cRIO, but we don't have a good idea of where to start debugging this problem. We are in the process of rewriting our most basic functionality in other frameworks (both LabVIEW and a C++ IterativeRobot project) to get away from the Command Based model. It seems likely that our set of commands (all of which seem pretty standard) is hitting some edge case that causes the hang. Does anyone have any more insight into this?

If it is NetworkTables writes, as has been suggested, why do these time out? Do we really need guaranteed packet delivery for our SmartDashboard updates? It seems like something like UDP would be fine in this case and would avoid the timeout issues.
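For anyone wondering what "something like UDP" would look like, here is a rough sketch of a fire-and-forget telemetry send. It assumes Linux-style POSIX sockets (the socket API on the cRIO's VxWorks may differ), and `OpenTelemetrySocket`, `SendTelemetry`, the port number and the `"switch=1"` payload are hypothetical names made up for this example. It is not how NetworkTables or the SmartDashboard work.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <string>

// Opens an unconnected UDP socket. Returns the descriptor, or -1 on failure.
int OpenTelemetrySocket() {
    return socket(AF_INET, SOCK_DGRAM, 0);
}

// Hands one datagram to the kernel and returns immediately. There is no
// acknowledgement and no retry, so a missing or slow receiver cannot stall
// the caller. Returns the number of bytes queued, or -1 on error.
ssize_t SendTelemetry(int sock, const char* ip, std::uint16_t port,
                      const std::string& payload) {
    sockaddr_in dest{};
    dest.sin_family = AF_INET;
    dest.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dest.sin_addr) != 1) return -1;  // bad address

    // MSG_DONTWAIT (a Linux/BSD flag) makes the call fail instead of block if
    // the send buffer is full; the caller just drops that update.
    return sendto(sock, payload.data(), payload.size(), MSG_DONTWAIT,
                  reinterpret_cast<const sockaddr*>(&dest), sizeof(dest));
}
```

A control loop could call `SendTelemetry(sock, "127.0.0.1", 50123, "switch=1")` every iteration and ignore the result. The trade-off is that any single update can be lost, which is usually acceptable for values that are re-sent every loop but not for one-shot settings.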