Custom SysId causes flaky connection?

Since our drive motors are on a CANivore, my team isn’t able to use the normal SysId tool to do characterization. Instead I’m using WRueter’s version, which basically just changes a couple of lines to initialize some CAN devices on a CANivore instead of the default bus. I use ./gradlew run to run the custom SysId program as detailed in the README.

I’m relatively sure this SysId program is causing connection issues with the roboRIO. The issue usually appears a couple of minutes after I first run a SysId test. I’m able to ping the roboRIO address just fine, but if I try to establish an SSH connection the command just hangs. Restarting the robot is the only way to fix it. Here’s a picture of what the Driver Station looks like.


Any ideas of what the problem could be?

I can’t speak to what might cause this from SysId, but I’ve seen three things cause similar problems.

  1. Using the roboRIO’s onboard I2C connection.
  2. Plugging something into the radio’s second port.
  3. Windows Firewall.

P.S. I didn’t realize SysId doesn’t support CANivore; that’s a bummer. We have a CANivore on order and are planning to use it for our 2023 drivetrain.

There’s an issue open for CANivore support, but it’s not been done yet. Contributions welcome! https://github.com/wpilibsuite/sysid/issues/406

I’m hoping there will be routines one can call to pass data between the robot and the SysId utility, so that a custom robot project (perhaps running in test mode) can be used. This would make the tool slightly harder to use for teams that chose to go this route, but would be very flexible. :)

The “latest” docs do not mention this though…

I don’t believe this is being planned, or even being looked at. The fundamental problem is that a number of measurements are very dependent on having good timing on the robot side. So you can’t just toss some calls into your Java robot code and hope to get good results out the other end of analysis.

If someone wrote a Java version of sysid-library (in the wpilibsuite/sysid repository on GitHub), we would accept it. However, as @Peter_Johnson mentioned, we can’t guarantee any temporal noise bounds on the recorded data in Java. The ordinary least-squares formulation we use for data analysis assumes a constant timestep.
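
To make the constant-timestep assumption concrete, here is a minimal C++ sketch that checks how uniform the spacing of a recorded timestamp series is. It is not code from SysId; the names TimestepStats, ComputeTimestepStats, and HasRoughlyConstantTimestep, as well as the tolerance parameter, are hypothetical and chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Summary of how uniform the sample spacing in a recording is.
// (Hypothetical type, not part of SysId or WPILib.)
struct TimestepStats {
  double meanDt = 0.0;        // average spacing between samples, in seconds
  double maxDeviation = 0.0;  // largest |dt - meanDt|, in seconds
};

// Computes spacing statistics for non-decreasing timestamps given in seconds.
// Returns all zeros if there are fewer than two samples.
TimestepStats ComputeTimestepStats(const std::vector<double>& timestamps) {
  TimestepStats stats;
  if (timestamps.size() < 2) {
    return stats;
  }
  const double intervals = static_cast<double>(timestamps.size() - 1);
  stats.meanDt = (timestamps.back() - timestamps.front()) / intervals;
  for (std::size_t i = 1; i < timestamps.size(); ++i) {
    const double dt = timestamps[i] - timestamps[i - 1];
    assert(dt >= 0.0 && "timestamps must be non-decreasing");
    stats.maxDeviation =
        std::max(stats.maxDeviation, std::abs(dt - stats.meanDt));
  }
  return stats;
}

// Returns true if every interval is within `tolerance` (a fraction of the
// mean timestep) of the mean, i.e. the data is close to uniformly sampled.
bool HasRoughlyConstantTimestep(const std::vector<double>& timestamps,
                                double tolerance) {
  const TimestepStats stats = ComputeTimestepStats(timestamps);
  return stats.meanDt > 0.0 &&
         stats.maxDeviation <= tolerance * stats.meanDt;
}
```

For example, timestamps logged every 5 ms would pass with a 5% tolerance, while a recording in which one sample arrives 3 ms late would not.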

We’re planning to use https://github.com/SleipnirGroup/Sleipnir/ to implement a nonlinear fit without that restriction, but Sleipnir still needs some robustness improvements before I’m willing to put it to work (the descent directions can become unproductive on poorly behaved problems, causing the solver to get stuck until it times out).

As for the OP’s issue, SysId sets the HAL notifier thread RT priority to 40 and the main robot thread RT priority to 15 to improve sample time determinism (larger numbers have higher priority). Overruns could be starving FRC_NetComm because FRC_NetComm doesn’t run with RT priority (arguably it should, but we don’t control netcomm, so we don’t know if the service is designed for RT and would respond well). I don’t know for sure, because the connection issues mean we don’t get to see the console output indicating the problem.
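
For readers unfamiliar with RT priorities on Linux, the sketch below shows the underlying mechanism using plain POSIX calls rather than the WPILib or HAL API. It assumes that “RT priority” corresponds to the SCHED_FIFO policy; the function and type names are hypothetical, and this is not what SysId itself calls.

```cpp
#include <sched.h>

#include <cassert>
#include <cerrno>

// Scheduling policy and priority of the caller.
// (Hypothetical type for illustration.)
struct ThreadScheduling {
  int policy;    // e.g. SCHED_OTHER (default time-sharing) or SCHED_FIFO (RT)
  int priority;  // 0 for SCHED_OTHER; 1..99 for SCHED_FIFO on Linux
};

// Reads the caller's current policy and priority. On Linux, passing 0 as the
// id refers to the calling thread.
ThreadScheduling GetCallerScheduling() {
  sched_param param{};
  const int policy = sched_getscheduler(0);
  const int rc = sched_getparam(0, &param);
  assert(policy >= 0 && rc == 0);  // querying the caller should not fail
  (void)rc;
  return {policy, param.sched_priority};
}

// Tries to switch the caller to SCHED_FIFO at the given priority.
// Returns 0 on success, or the errno value on failure (commonly EPERM when
// the process lacks real-time privileges).
int TrySetCallerRealtime(int priority) {
  sched_param param{};
  param.sched_priority = priority;
  if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
    return errno;
  }
  return 0;
}
```

Calling TrySetCallerRealtime(15) typically returns EPERM unless the process has real-time privileges. A thread left on SCHED_OTHER, which is the situation described for FRC_NetComm above, generally only gets CPU time when no RT thread on that core is runnable, so an RT loop that overruns can starve it.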

Thanks very much for the info and insight!

FWIW, we use C++ and would be fine with having to pull in an entire Notifier or thread of code. But I take the point – and the number of teams that would use this is small, making it hard to justify any extra work. I’ll take a look at the code at some point, mostly out of curiosity. I wonder if it is capturing fine-grained timestamps with the data, how much CAN delays might introduce jitter, etc. This is a very neat addition to the S/W side of things – thanks again!
