During competitions, we were having comm issues with our robot in matches. We still had some trouble after fixing our other electrical issues, but we couldn’t replicate the comm issue in our pit. Since we are done with competitions now, we have devoted our time to systematically troubleshooting our robot.
An interesting thing came up when we were testing our bridge: we noticed that the OM5P-AC bridge from this year had intermittent packet loss. It would lose a single packet at random intervals. We tried using two Windows laptops and:
- Pinging an ethernet-connected laptop from a wifi-connected laptop
- Pinging an ethernet-connected laptop (to another port on the bridge) from a wifi-connected laptop
- Pinging a wifi-connected laptop from a wifi-connected laptop
- Pinging a laptop from a laptop over an ethernet Linksys switch
- Running the first two scenarios with last year’s bridge (OM5P-AN)
- Reconfiguring and reflashing the bridge and doing the first 2 steps again
From these tests, we narrowed the issue down to the bridge rather than the laptops or the ethernet cable. We tried last year’s bridge (OM5P-AN), too, but it had an even stranger issue. Since we don’t have another OM5P-AC, we are wondering whether this kind of behavior is normal, and whether momentarily dropping comms on the field would be a significant issue.
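To compare bridges on more than a gut feeling, it may help to log each ping as received or lost and summarize the pattern. Below is a minimal sketch of that idea; the function name `summarize_loss` and the sample data are made up for illustration and are not measurements from our tests.

```python
from itertools import groupby


def summarize_loss(results):
    """Summarize a ping log.

    results: iterable of booleans, True = reply received, False = lost.
    """
    results = list(results)
    total = len(results)
    lost = results.count(False)
    # Length of every run of consecutive lost packets.
    bursts = [sum(1 for _ in run) for ok, run in groupby(results) if not ok]
    return {
        "sent": total,
        "lost": lost,
        "loss_pct": 100.0 * lost / total if total else 0.0,
        "bursts": len(bursts),
        "longest_burst": max(bursts, default=0),
    }


if __name__ == "__main__":
    # Hypothetical log: isolated single drops plus one two-packet drop.
    sample = [True] * 20 + [False] + [True] * 30 + [False, False] + [True] * 47
    print(summarize_loss(sample))
```

A log with only isolated single drops (longest burst of 1) describes a different problem than one with long runs of consecutive losses, so the burst length is worth reporting alongside the percentage.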
Another issue we often see with the bridge at demos is that the roboRIO mDNS name sometimes cannot be resolved. Even using the same laptop without making any changes to its network configuration, we had trouble pinging the roboRIO over wifi one day, whereas it would work perfectly fine when tethering the laptop to the bridge. We tried the same laptop on a later day at our shop without changing anything, and it connected fine over wifi.
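One way to tell a name-resolution failure apart from a real link failure is to try the mDNS name first and fall back to a static address. The sketch below assumes the usual `roboRIO-TEAM-FRC.local` hostname and that the roboRIO has been given the conventional static address `10.TE.AM.2`; if your roboRIO uses DHCP, the fallback address may not apply. The helper names are hypothetical.

```python
import socket


def team_ip(team, last_octet=2):
    """Address in the 10.TE.AM.x scheme (assumed: roboRIO at .2)."""
    if not 1 <= team <= 9999:
        raise ValueError("team number must be between 1 and 9999")
    return f"10.{team // 100}.{team % 100}.{last_octet}"


def rio_hostname(team):
    """mDNS name the roboRIO is expected to advertise."""
    return f"roboRIO-{team}-FRC.local"


def find_rio(team, resolver=socket.gethostbyname):
    """Return (address, how_it_was_found).

    The resolver is a parameter so the fallback path can be tested
    without a robot on the network.
    """
    try:
        return resolver(rio_hostname(team)), "mdns"
    except OSError:  # socket.gaierror is a subclass of OSError
        return team_ip(team), "static-fallback"


if __name__ == "__main__":
    print(team_ip(1234))       # 10.12.34.2
    print(rio_hostname(1234))  # roboRIO-1234-FRC.local
```

If the static address answers while the `.local` name does not, the wireless link itself is probably fine and the problem is in mDNS.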
Our team seems to have been plagued with comms issues since our rookie year, and we are puzzled every year trying to find the cause.
We’ve started to look at other systems that could cause this comm issue, for example by testing all of our components for their voltage and current draw.
What do teams do to set up an environment to test all of the various functionality of their robots individually? We would like to replicate the issue we were having in match, so achieving a test environment similar to the FMS setup would be ideal. I’ve also heard some about using unit testing for Java, but I don’t know much about the effectiveness of it.
There is continuing discussion on CD about this. Search is your friend. Anyway, issues come from many causes:
- If you lose comms for about 35 s, it is likely a bridge reboot. Most of those are wiring/plug issues.
- Packet loss and short drops can come from many places. Your DS computer is the first possibility. Software, including Windows, will look for updates. Make sure auto updates and firewalls are turned off. Keep non-essential software installs to a minimum, which means you shouldn’t use the DS computer to connect to the internet. Make sure you are not sending too much data to/from the DS. The 7 megabits per second is an absolute max, not a target. On some fields it might be a good bit less.
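A rough budget check for the bandwidth point can be sketched like this. The 7 Mbit/s cap is the figure mentioned above; the frame sizes, rates, and the 50% headroom factor are made-up illustration values, so measure your own streams before trusting any of it.

```python
FIELD_CAP_MBPS = 7.0  # absolute max mentioned above, not a target


def stream_mbps(bytes_per_message, messages_per_second):
    """Average bandwidth of one stream in megabits per second."""
    return bytes_per_message * 8 * messages_per_second / 1_000_000


def check_budget(streams, cap_mbps=FIELD_CAP_MBPS, headroom=0.5):
    """streams maps a name to (bytes_per_message, messages_per_second).

    Returns (total_mbps, fits), where fits means the total stays under
    cap_mbps * headroom. The headroom factor is an assumption.
    """
    total = sum(stream_mbps(size, rate) for size, rate in streams.values())
    return total, total <= cap_mbps * headroom


if __name__ == "__main__":
    # Hypothetical numbers: a 15 kB JPEG at 30 fps plus dashboard data.
    streams = {
        "camera": (15_000, 30),
        "dashboard": (2_000, 50),
    }
    total, fits = check_budget(streams)
    print(f"{total:.1f} Mbit/s, fits with headroom: {fits}")
```

With these example numbers the camera alone is about 3.6 Mbit/s, so lowering resolution or frame rate is usually the first thing to try.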
Remember that PING is a very low priority protocol in the stack. If there is something else that needs to be processed first, it will be. If something in the path is too busy to process PINGs, they may get dropped. What is the Rio’s CPU like when it is dropping those single PINGs?
You can program your radio to behave as it does on the competition field. Set it to use the firewall rules and the 7 Mbit/s throttle.
We have it currently set to those configurations, but we still notice a significant difference in behavior when it is on the field versus when it is in our shop.
I’ll check the logs tomorrow to see what it was, but, off the top of my head, I think it was around 30-40%. However, we just tested the connection between two computers that were both connected to the bridge, and there was still packet loss, even though there was no packet loss when we connected the same computers to another switch.
Can you tell me where you are getting this information from?
On your driver station computer have you tried disabling the NICs (network interface) you are not using?
Yes, we have done that on all the computers we tested/drove with.
Going to be a little rambly from lack of sleep from the newborn that thankfully waited for 1.5 days after my getting back from tearing down the Houston Einstein field, but this should give the gist.
You won’t find this in an RFC; it comes from how ICMP traffic has been treated by equipment vendors over time. A very specific example from the Juniper Knowledge Base (but also found in Cisco docs):
ICMP messages are considered low priority within Junos OS, so the routing platform will respond to and process other higher priority messages, such as routing updates, before processing the ICMP messages. The microkernel may introduce tens of milliseconds of processing delay to ICMP message handling. The delay is not uniform, meaning that some ICMP messages might be delayed while others may not be delayed.
In most cases, the issue is with the endpoints of the path and not the in-between transit nodes. It is up to the vendor how they want to prioritize traffic, and many choose to deprioritize ICMP in favor of “real” packets. If you are pinging a Juniper router, it is best to ping an interface IP on it rather than the management IP, as the latter has to be forwarded to the routing engine for the reply instead of being handled by the interface itself. The RTT delay is very measurable in that case.
You can observe this by doing a traceroute and a PING to the same destination (particularly if the route is through many hops). The traceroute will most likely show some hops with a higher RTT than the PING showed for the full destination, because the packet needs to come out of whatever fast forwarding that particular router architecture uses and be processed to generate the return packet. Most big iron routers will prefer to process routing table updates and such rather than some random ICMP packet.
I do not know how the roboRIO prioritizes packets, but with the real-time nature of the communications in FRC, I have to imagine that there is some prioritization going on.
To sum up, PING is just meant to verify that connectivity is there, not how good it is. If you want a real RTT, send real data.
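As a rough illustration of "send real data", here is a minimal sketch that times UDP round trips against a small echo server instead of using ICMP. It runs entirely on loopback so it is self-contained. To test a real path you would run the echo part on the far end; the function names and the choice of port are assumptions, and on a field-configured radio the port would have to be one the firewall allows.

```python
import socket
import statistics
import threading
import time


def start_echo_server(host="127.0.0.1"):
    """Start a UDP echo server on a free port; returns its socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))  # port 0 lets the OS pick a free port

    def serve():
        while True:
            try:
                data, addr = sock.recvfrom(2048)
            except OSError:
                return  # socket was closed
            sock.sendto(data, addr)

    threading.Thread(target=serve, daemon=True).start()
    return sock


def measure_rtt(addr, count=20, payload_size=64, timeout=0.5):
    """Send numbered UDP packets and time each echo.

    Returns (list_of_rtts_in_ms, number_lost).
    """
    rtts, lost = [], 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for seq in range(count):
            # The first 4 bytes carry a sequence number so that a late
            # reply to an earlier packet is not counted for this one.
            payload = seq.to_bytes(4, "big") + bytes(payload_size - 4)
            start = time.perf_counter()
            s.sendto(payload, addr)
            try:
                while True:
                    data, _ = s.recvfrom(2048)
                    if data[:4] == payload[:4]:
                        break
                rtts.append((time.perf_counter() - start) * 1000.0)
            except socket.timeout:
                lost += 1
    return rtts, lost


if __name__ == "__main__":
    server = start_echo_server()
    rtts, lost = measure_rtt(server.getsockname())
    if rtts:
        print(f"median {statistics.median(rtts):.3f} ms, "
              f"max {max(rtts):.3f} ms, lost {lost}")
```

Because these packets travel the same path as ordinary application traffic, the loss and RTT numbers should be closer to what robot traffic actually sees than what PING reports.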