Robot Randomly Disconnects

Our robot randomly disconnects when connected via Ethernet or Wi-Fi. We have tried two radios and have also tried reimaging them, but nothing seems to help. The errors given are:
Warning 4408 <radioLostEvents> 174.585 <radioSeenEvents> 173.585
Warning 4404 FRC: The Driver Station has lost communication with the robot. Driver Station
Warning 44002 Ping Results: link-bad, DS radio(.4)-bad, robot radio(.1)-bad, cRIO(.2)-bad, FMS-bad FRC: Driver Station ping status has changes. Driver Station
We have also tried using another computer as the driver station and have the same errors.
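If it helps anyone reading along, the ping-status warning quoted above can be split into its individual hops with a few lines of Python. This is only a sketch that assumes the message looks like the line pasted in this post; the function names `parse_ping_results` and `first_bad_hop` are made up for illustration and are not part of any FRC tool.

```python
def parse_ping_results(message):
    """Split a 'Ping Results:' warning into a {hop name: status} dict.

    Assumes the layout of the warning quoted in this thread:
    'Ping Results: name-status, name-status, ... FRC: description'.
    """
    marker = "Ping Results:"
    start = message.index(marker) + len(marker)
    # The hop list ends where the trailing "FRC: ..." description begins.
    end = message.find(" FRC:", start)
    if end == -1:
        end = len(message)
    results = {}
    for item in message[start:end].split(","):
        # Split on the last hyphen so names such as "DS radio(.4)" stay whole.
        name, _, status = item.strip().rpartition("-")
        results[name] = status
    return results


def first_bad_hop(results):
    """Return the first hop reported as 'bad', or None if there is none."""
    for name, status in results.items():
        if status.lower() == "bad":
            return name
    return None
```

For the warning above, every hop comes back as bad, starting with `link`, so the message on its own does not say which device dropped first.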

A few questions to help diagnose the problem:

• How long does the disconnect last?
• While it’s disconnected, can you ping the roboRIO or connect to the webdash?
• What’s the CPU and memory usage on the computer(s) when they disconnect? Are any other programs running on the computer at the time?
• Does this only happen when the robot is enabled, or also when disabled?
• Does it also happen when connecting via ethernet straight to the rio (bypassing the radio)? What about connecting to the rio via USB B?

P.S. - you posted this thread twice, you might want to delete the other copy

There was a point at which sending tons of print statements to the driver station was a cause of strange behavior, including disconnects. Looks like it was fixed in RIO image 2018V17. Are you on this version or later? And if not, do you have lines of code that print to the console frequently?

We had this on our 2017 robot when we tried to get it running again; it was doing the same thing. After some testing, we assumed it was a short somewhere on the robot. Was the robot connecting and then disconnecting about a minute later? Also, was the robot fully powering off or just disconnecting?

Have you been able to determine the situation where this is occurring? Even simple things like “It only happens when we are driving” or “It only happens when a system is running” can be super helpful for debugging.

Other questions I have, what language are you using? It most likely doesn’t make a difference, but is useful for pinpointing some errors.
Along the same lines as whether your RIO is updated to the most recent version: have you tried formatting it and redeploying everything? This is a bit of work, so I wouldn’t suggest it unless nothing else works.
Are there any other characteristics that appear when the robot disconnects, like a motor turns on or off right as the robot disconnects? A motor suddenly turning on may cause power issues if it is not properly connected.

If a new laptop didn’t solve the problem, it is more than likely on your robot. If possible, analyze and upload your DriverStation logfiles. They can provide a ridiculous amount of insight and be an incredible resource for resolving all types of robot problems. Check your ethernet cables. Those frequently short out and die. Also, check your radio power. Are you running PoE? You should be. Try reflashing your radio. Inside of my terrible HS network, we often run into channel interference and have to flash a few times to get onto a channel that isn’t terrible. You can use any of the wifi analyzing phone apps for this.

Finally, if these steps don’t resolve your problem, please give us as much specific information about your problem as possible, so we can best resolve it. Good luck!

I would look at your radio’s power connection and at your Ethernet cables. If it keeps happening, please check that there is a green light for Firewall and that your DS has a good IP (anything other than 169.xxx.yyy.zzz). If that checks out, attempt connecting via direct Ethernet. If that fails, attempt to connect using the USB-B connector. If this also fails, it indicates there is a problem with either your Windows install or your Driver Station install.
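As a quick way to check the "good IP" part, here is a minimal Python sketch using the standard `ipaddress` module. It assumes that the 169.xxx.yyy.zzz addresses mentioned above are the self-assigned link-local range 169.254.0.0/16, which a PC typically gives itself when it gets no DHCP answer; the helper name `is_self_assigned` is hypothetical.

```python
import ipaddress


def is_self_assigned(address):
    """True if the address is in the link-local range (169.254.0.0/16 for IPv4)."""
    return ipaddress.ip_address(address).is_link_local
```

Calling `is_self_assigned("169.254.10.5")` returns `True`, while a team-style address such as `"10.12.34.5"` returns `False`.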

There are more stringent requirements for what the IP should be, depending on which connection method is used, and the competition year. For non-USB networking in 2018, the subnet must be 10.TE.AM.XX and the trailing octet is different depending on device.
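To illustrate the 10.TE.AM.XX pattern, a rough Python sketch follows. The helper names are made up, the /24 mask is an assumption for a home or shop network, and the trailing octets (.1 robot radio, .2 robot controller, .4 DS radio) are taken only from the ping warning quoted at the top of this thread, so check them against the documentation for your season.

```python
import ipaddress


def team_subnet(team):
    """Return the '10.TE.AM' prefix for a team number from 1 to 9999."""
    if not 1 <= team <= 9999:
        raise ValueError("team number must be between 1 and 9999")
    # TE is the team number without its last two digits, AM is the last two.
    return f"10.{team // 100}.{team % 100}"


def expected_addresses(team):
    """Device addresses, using the trailing octets seen in the quoted warning."""
    prefix = team_subnet(team)
    return {
        "robot radio": f"{prefix}.1",
        "robot controller": f"{prefix}.2",
        "DS radio": f"{prefix}.4",
    }


def on_team_subnet(team, address):
    """True if the address falls inside 10.TE.AM.0/24 (assumed mask)."""
    network = ipaddress.ip_network(f"{team_subnet(team)}.0/24")
    return ipaddress.ip_address(address) in network
```

For a hypothetical team 1234, `team_subnet(1234)` gives `10.12.34`, and a driver station at `10.12.34.5` passes `on_team_subnet`, while one at `10.12.35.5` does not.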

Unless the DS IP address is getting changed during operation, I do not believe that a bad network config should cause a disconnect (rather, no connection should ever occur in the first place).

Are these the only other two sources of problems?

We have reflashed our radio multiple times and tried using another radio. It disconnects for about 5 seconds and then we can re-enable. It takes about 5 seconds to enable. We are using Java as our programming language. I will try bypassing the radio. This issue only happens when the robot is enabled. Our PC’s CPU did not get above 50% usage and we had more than 8 GB of RAM not being utilized.

First, a 5-second outage is too short for anything in this system to reboot if it had lost power (short of a dumb switch). I would start with new cables between everything on the robot.

Then, for simplicity’s sake, I would use an RJ45 cable directly to the roboRIO and confirm the symptoms still exist (eliminating the radio completely).

I would also run a constant ping against the Rio from the driver station (ping -t 10.TE.AM.Num). Better still, use a second PC/laptop with a switch and ping both the laptop and the Rio; seeing where a drop occurs would pinpoint the problem.

If the ping is dropped at the Rio, I would factory reset the Rio and retest; if the ping is dropped at the laptop, I would suspect the network stack (uninstall the networking and reinstall).

Good luck
Mike
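The two-target ping test described above could be scripted roughly like this in Python. It is a sketch under assumptions, not a tested tool: the function names are invented, the ping flags assume Windows (`-n`, `-w` in milliseconds) or Linux (`-c`, `-W` in seconds), and the exit code of `ping` is not a perfect signal on every platform, so treat the result as a hint.

```python
import platform
import subprocess
import time


def ping_once(host, timeout_s=1):
    """Send one ping and return True if the ping command reports success."""
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    done = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return done.returncode == 0


def monitor(rio_host, laptop_host, count=60, interval_s=1.0):
    """Ping both targets `count` times; return a list of (rio_ok, laptop_ok)."""
    samples = []
    for _ in range(count):
        samples.append((ping_once(rio_host), ping_once(laptop_host)))
        time.sleep(interval_s)
    return samples


def classify(samples):
    """Apply the rule of thumb from the post above to the collected samples."""
    rio_drops = sum(1 for rio_ok, _ in samples if not rio_ok)
    laptop_drops = sum(1 for _, laptop_ok in samples if not laptop_ok)
    if rio_drops == 0 and laptop_drops == 0:
        return "no drops seen"
    if laptop_drops == 0:
        return "only the Rio dropped: factory reset the Rio and retest"
    return "the second laptop dropped too: suspect the driver station network stack"
```

Something like `classify(monitor("10.TE.AM.2", "10.TE.AM.20"))` would run it for about a minute, with the placeholders replaced by your own addresses (the `.20` for the second laptop is only an example).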

The disconnect-then-reconnect behavior (versus simply losing all comms and not reconnecting) is important: it can differentiate your problem from others.

Another theory - is the code experiencing a runtime exception?

Does the driver station display “No Robot Code” during this period?

If so, are there any errors or messages in the console window about java errors or “Robots don’t quit, but your’s did!” ?

This can be as simple as a 3rd party background task running on the PC that takes the network connection briefly.
It’s actually pretty common if you have 3rd party software installed that periodically checks the web for updates. It attempts to “phone home”, but can only reach the robot, not an Internet connection and releases for a short period of time before retrying.

That’s possible, but he did say the problem happened on two different laptops. That being said, it’s very possible that a problematic piece of software caused this to happen. Autodesk’s Atlassian Helper has been known to do exactly that.

The most egregious incident of this happening at one of my events was a rookie team going through 4 different laptops that they brought before finally going with a loaner Classmate just to get through Practice Day. Talking with the head mentor after practice matches were over for the day, I learned that all of their laptops were identical hardware with identical school IT sanctioned images running the required security software, which happened to be trying to call home and doing strange things with the NIC (cycling between DHCP, static on 10-dot 172-dot 192-dot networks, and back again). He made a quick call to IT and, after he explained what was happening, they gave him the password to disable the software and agreed to work with the team on a resolution that didn’t hamstring them. At their next event there were no more issues.