Robot Communication Problems

We have been fighting comm issues for 2 weeks now. We feel alone with these problems and would really like to hear from other teams to see if this is more widespread. We have some very experienced mentors and CSAs working on these problems as well, but we are still stuck.

The issue we are seeing:

Communication between the DS and RoboRio is failing according to the Comm and Robot Code status on the DS.

Steps taken to attempt to resolve this:

Flashed RoboRio with current image.
Deployed basic command-based C++ code built with RobotBuilder to RoboRio.
Flashed Radio with current image.
Used Radio Configuration Utility to configure it for team 4607; enabling the firewall and/or bandwidth limit options does not change the status.
Wiped DS hard drive and built from generic Win 10 ISO.
Installed NI / FRC utilities / DS software, all current versions.

Interesting things to note:

DS and RoboRio will communicate for a while after a change in some system, flash of radio, flash of RoboRio, sometimes a reboot, etc… Once comms are lost, they stay lost according to the DS.

Comms between DS and RoboRio will succeed initially and then fail without any changes to the system. Once this happens, the Comms light on the RoboRio will either stay green or turn red.

Most of the time once comms are lost between DS and RoboRio, ping and ssh to RoboRio from the DS will succeed.
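To capture this observation repeatedly while the DS shows no comms, a small probe can be run from the DS laptop. This is only a sketch: `tcp_port_open` is a hypothetical helper name, and checking port 22 assumes the ssh service mentioned above is what you want to test. It says nothing about the DS protocol itself, only whether the roboRIO is still reachable at the network level.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `tcp_port_open("10.46.7.2", 22)` returning True while the DS still reports no comms would match the behavior described above.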

All items in the chain have been flashed and reconfigured by multiple team laptops and personnel. Same failure result eventually.

Once the Radio and RoboRio are shut down via the power switch on the robot, there’s a good chance they will not communicate with the DS after powering up again.

I’ve never experienced this issue before and I can’t say I’m that experienced, but this is what usually works for us.

On our robot when the DS can’t communicate with the roboRIO but pinging works, it usually means the DS is looking for the wrong mDNS name. When you fill in the team number box in the DS, it looks for roboRIO-####-FRC.local, which for some reason doesn’t ever work on my laptop.

So try pinging the roboRIO with an mDNS name (not sure if the static IP will work). Then where it says “Reply from…”, copy that address into the team number field in the DS (it is case-sensitive).
If you’re pinging the static IP right now, and if that doesn’t work in the DS team number field, you can try pinging roborio-####-frc.frc-robot.local and roborio-####-frc.lan.
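If it helps, the names above can be tried in one pass with a short script. This is a rough sketch, not anything official: `candidate_hostnames` and `resolve` are made-up helper names, the name patterns are just the ones listed in this post, and whether `.local` names resolve at all depends on mDNS support on the laptop.

```python
import socket
from typing import List, Optional


def candidate_hostnames(team: int) -> List[str]:
    """Build the roboRIO host names mentioned in this post for a team number."""
    return [
        f"roboRIO-{team}-FRC.local",
        f"roborio-{team}-frc.frc-robot.local",
        f"roborio-{team}-frc.lan",
    ]


def resolve(name: str) -> Optional[str]:
    """Return the IPv4 address the OS resolves for name, or None on failure."""
    try:
        return socket.gethostbyname(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def report(team: int) -> None:
    """Print which candidate names resolve from this machine."""
    for name in candidate_hostnames(team):
        print(name, "->", resolve(name) or "no answer")
```

Calling `report(4607)` prints one line per name; any name that resolves is a candidate to type into the DS team number field.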

As I said, I don’t really know about this issue and I’m sure your mentors are far more experienced than I am, but this is what usually works for us.

We had a lot of IP camera issues last summer. Lost packets, high loop times, just bad experiences. Turns out the USB-Ethernet adapter on the Kangaroo Mini PC was too close to the radio. We moved it to the complete other side and comms cleared up.

Another thing to try when it fails is to Disable/Enable the network adapter.
Some PC adapters get stuck on an old address after a disconnect and that makes it ask for a new assignment.

If it’s any consolation, you’re not alone. We’ve been having comm issues this year, too. In past years, we’ve always configured our robot radio to be a “bridge” and have used a separate wireless router as the “1519” access point. This year, we’re trying out the “recommended” configuration of having the robot radio be configured as an access point in order to leverage the OpenMesh radio configuration features for “firewall” and “bandwidth limit” to approximate those characteristics on the competition field. So far, we’ve had more problems than ever before, but we haven’t given up yet…

When you’re having trouble with the comms in the DriverStation, it sounds like you can still ping your robot from the DriverStation via pings to 10.46.7.2?

We were having that problem a couple days ago as well, intermittently, from one of our laptops. After trying many different things that didn’t help (like you have been doing), it turned out that the problem was that the Windows Firewall on that laptop was considering the wireless network for “1519” (our team number) as a “public network” and was thus blocking the traffic to the Driver Station. In that situation, however, the command “ping 10.15.19.2” (which sends an ICMP packet) would work just fine. I’d suggest turning off the Windows Firewall on the Driver Station laptop to see if that helps.

You mention having all items in the chain flashed and reconfigured by multiple team laptops. Do multiple team laptops have this issue, or just the one that you’re referring to as “the DS?”

Have you tried using the DS to control the robot with the robot tethered (connected directly via Ethernet cable) to the DS, and the DS wireless turned off / disabled? Does that work correctly?

I would also suggest setting a static IP address on your DS laptop to 10.46.7.5 (10.TE.AM.5) to avoid any issues with your DS laptop having trouble getting a DHCP address from the robot radio access point.
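For anyone unsure how the 10.TE.AM.x pattern maps to a team number, here is a minimal sketch. `team_ip` is a hypothetical helper, and the 1 to 9999 range check is my assumption about valid team numbers. The octets are written without leading zeros because some tools parse a leading-zero octet as octal.

```python
def team_ip(team: int, host: int) -> str:
    """Return the 10.TE.AM.host address for a team number of up to 4 digits."""
    if not 1 <= team <= 9999:
        raise ValueError("team number must be between 1 and 9999")
    if not 0 <= host <= 255:
        raise ValueError("host octet must be between 0 and 255")
    # TE is the team number without its last two digits; AM is the last two.
    return f"10.{team // 100}.{team % 100}.{host}"
```

With this, `team_ip(4607, 5)` gives the DS address suggested above and `team_ip(4607, 2)` gives the roboRIO address.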

We had a problem with our school network, where they had this system that would automatically “block” all wifi hotspots that weren’t a part of the school system. This was to discourage students from setting up hotspots on their phones. This resulted in us constantly dropping connection to the radio etc. Finally realized what the school was doing, took all of our radios to them, and they registered their MAC addresses as “safe” and they don’t get “blocked” anymore.

I highly recommend giving every device you can a static IP address.

Does it all work when the DS is tethered into the Radio? If so, then it might be a firewall setting on your DS.

If you have the same problems tethered and wifi, then the static IP address should solve your problem (unless you have a bad radio).

You may want to swap OpenMesh ports on your radio. If you are using the one next to the power connector, use the other, and vice versa. The gateway name when the radio is an AP handing out DHCP addresses seems to be a bit funny this year, sometimes adding the .frc-robot suffix.

When it is in this broken state, does USB also fail? If that is the case, I generally restart the DS. It seems that sometimes the mDNS calls never leave the laptop, returning cached data. Unloading and reloading the DLL generally fixes it.

Every now and then, it seems that rebooting the roboRIO is necessary to get it to respond to mDNS.

If using ethernet, be sure to observe the link light on either the roboRIO or the DS. Due to power conservation features, the interface sometimes shuts down and doesn’t come back up again. An unplug and replug will sometimes correct this, but other times you should disable/enable the interface in the control panel.

Let me also mention a few DS tricks. On the setup tab, if you click on the team number dropdown, it will display robot(s) that respond and match the pattern with your team number. If you shift click it, it will show all roboRIOs that respond, regardless of team number. If you want to type a DNS name or IP into the field, you can do that as well - it is pretty tolerant.

As for always using static IPs: the system is certainly compatible with this approach, but it is by no means a nirvana. Now you, the mentors and students, are responsible for changing IPs each and every time you want to go online to check in source, run code, etc. I do not highly recommend this approach.

Greg McKaskle

PROBLEM SOLVED

So after a lot of troubleshooting and messing around, we came to a very interesting conclusion. We are currently up and running and know what was causing the issues. It’s the NavX code. We use a NavX board on our robot and have it in our code, but didn’t have the NavX physically connected to the roboRIO. Every time we booted the roboRIO without the NavX installed, the comms would get knocked out along with the robot code. Whenever we boot the roboRIO with the NavX actually installed, everything works normally and we can connect just fine.

This is really weird, because the deployed robot code should never be able to knock out the comms. (Maybe someone from NI could comment on this?) When the comms and code get knocked out, the issue looks like it’s a communication problem, but it’s the NavX code killing it.

So if you are using a NavX in your code, make sure you have it plugged in or you will have the issues we had.

I’d like to try to reproduce this. Are you using navX-MXP or navX-Micro? What language are you using? And what communication interface are you using?

We are using the navX-MXP and C++ over wifi on the OpenMesh AC radio. The NavX is plugged directly into the MXP port on top of the roboRIO.

Thanks; I’m assuming you’re using SPI to communicate w/navX-MXP (please let me know if that’s incorrect).

Yes, we’re using SPI.