We’ve been facing a myriad of obnoxious robot gremlins, most notably severe packet loss between the robot and the DS.
Our “standard” networking setup involves the following devices:
- Radio
- RIO (v1)
- Limelight v3
- Ethernet panel mount
- TP Link FIRST Choice network switch
- Monoprice Micro SlimRun Cat6 cables
The radio, RIO, and network switch are the exact same devices we used on our 2022 competition robot, which did not experience any of these issues. Our switch and radio are powered by the CTRE VRM connected to the REV PDH, with the radio going through a REV POE injector.
We typically connect wirelessly; the panel mount is mostly used at competitions.
We began to notice during testing that our operator would press a joystick button to set the elevator setpoint, then nothing would happen for ~3 seconds, and then the elevator’s setpoint would change. But this doesn’t happen every time. We went into the DS logs and were greeted with this beautiful graph:
I have approximately 50 more logs of the exact same nonsense if anyone would like the raw files.
As you can see, we seem to have gotten 120% packet loss. I am not sure how that is even possible, unless there are retries going on that are also getting dropped.
We are also experiencing what we believe to be unrelated swerve module issues, where offsets are not being initialized correctly. We suspect CAN frames are being dropped (we are at 70%ish utilization), so the message that sets the Spark’s initial encoder value is not being received in time. I am mentioning this in case it is somehow the same issue propagating elsewhere, but we think it’s totally separate and can be fixed by lowering CAN utilization (which we will get to tomorrow, likely).
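For anyone wanting to sanity-check a CAN utilization figure like the one above, here is a rough back-of-the-envelope sketch. It assumes a 1 Mbit/s bus, 29-bit extended frames carrying 8 data bytes, and a made-up set of devices and status frame periods (the device count and periods below are hypothetical, not our actual configuration). It ignores bit stuffing and any frames the RIO sends, so it will underestimate the real number.

```python
# Rough CAN utilization estimate from periodic status frames.
# Assumptions (not measured): 1 Mbit/s bus, extended (29-bit ID) frames,
# no bit stuffing, and only device-to-RIO periodic traffic is counted.

CAN_BITRATE_BPS = 1_000_000
# Overhead of an extended frame excluding data: SOF, arbitration, control,
# CRC, ACK, EOF, plus the 3-bit interframe space.
EXTENDED_FRAME_OVERHEAD_BITS = 67


def frame_bits(data_bytes=8):
    """Bits on the wire for one extended frame, before bit stuffing."""
    return EXTENDED_FRAME_OVERHEAD_BITS + 8 * data_bytes


def estimate_utilization(devices, periods_ms, data_bytes=8):
    """Fraction of bus capacity used by `devices` identical devices,
    each sending one frame per entry in `periods_ms` at that period."""
    frames_per_second = devices * sum(1000.0 / p for p in periods_ms)
    return frames_per_second * frame_bits(data_bytes) / CAN_BITRATE_BPS


# Hypothetical example: 8 motor controllers, each with four status frames.
before = estimate_utilization(8, [10, 20, 20, 50])
after = estimate_utilization(8, [20, 40, 40, 100])
print(f"before: {before:.1%}, after doubling periods: {after:.1%}")
```

The point of the sketch is only that utilization scales inversely with frame period, so slowing down status frames that the code never reads is the cheapest way to bring the number down.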
We use NT quite heavily due to AdvantageKit.
We have taken a few steps to debug this.
- We connected directly to the switch via the panel mount. This lowered packet loss to 0% most of the time, with occasional spikes of 5-10%. I do not know if this is normal.
- We took the network switch, Limelight, and panel mount out of the equation by connecting the RIO directly to the radio’s POE injector. Packet loss still occurred, even while idling and disabled.
- We deployed an empty robot project (command based template advanced), still packet loss.
- We connected the RIO to the radio’s second port, so that it does not use the POE injector for comms. Still packet loss.
- We swapped out to a different radio. Still packet loss.
- We swapped to a different driver station. Still packet loss.
- We moved to a new building. And then did it again. Still packet loss.
- We swapped to different Ethernet cables. Still packet loss.
- We connected the DS straight to the RIO, without the radio involved at all; no packet loss.
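To cross-check the DS graph independently at each of the steps above, a small UDP probe can be run from the laptop. This is a minimal sketch, not something we have run on the robot: it sends numbered datagrams at roughly a 20 ms interval and counts how many are echoed back. It assumes something on the far side echoes UDP datagrams (the `start_udp_echo` helper below is a stand-in for that, and both function names are made up for this example).

```python
import socket
import threading
import time


def start_udp_echo(host="127.0.0.1", port=0):
    """Start a tiny UDP echo server in a daemon thread.

    Returns (socket, port). Port 0 lets the OS pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))

    def serve():
        while True:
            try:
                data, addr = sock.recvfrom(2048)
            except OSError:
                return  # socket was closed
            sock.sendto(data, addr)

    threading.Thread(target=serve, daemon=True).start()
    return sock, sock.getsockname()[1]


def measure_loss(host, port, count=100, interval=0.02, timeout=0.2):
    """Send `count` numbered datagrams; return the percentage never echoed."""
    if count <= 0:
        raise ValueError("count must be positive")
    received = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        for seq in range(count):
            payload = seq.to_bytes(4, "big")
            client.sendto(payload, (host, port))
            try:
                while True:
                    data, _ = client.recvfrom(2048)
                    if data == payload:
                        received += 1
                        break
                    # Late echo of an earlier probe: ignore, keep waiting.
            except socket.timeout:
                pass  # this probe counts as lost
            time.sleep(interval)
    return 100.0 * (count - received) / count
```

Running `measure_loss` against the same target over wireless, over the panel mount, and over a direct cable would show whether the loss follows the radio link or the DS itself. Note that a reply arriving after `timeout` is counted as lost, so the result depends on that setting.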
We are running the latest firmware and versions of everything: latest WPILib, latest GradleRIO, latest AdvantageKit, latest DS, etc. We do not think this is a laptop issue, as we are using the same DS laptop we used last year without any problems at all, and we’ve disabled the firewall and the other usual suspects.
We are getting packet loss in the same room as another team’s robot that is not experiencing any of this.
We are completely out of ideas and are getting rather nervous about connectivity problems on a real field in week 1. Help would be appreciated.
Code, in case it’s some dumb code issue: GitHub - FRC2713/Robot2023