VMX-Pi: [CAN SPARK MAX] IDs: 1, timed out while waiting for Periodic Status 0

Hello Everyone,

I am on a collegiate robotics team, and we are using a VMX-Pi with SPARK MAX Motor controllers. Our code runs fine, but every second or so, an error is printed to the Driver Station console:

[CAN SPARK MAX] IDs: 1, timed out while waiting for Periodic Status 0
[CAN SPARK MAX] IDs: 1, 3, timed out while waiting for Periodic Status 0

We have four SPARK MAX controllers connected, IDs 1-4, and the IDs listed in the error will change. ID 1 is almost always listed (only seen the error once without ID 1), but ID 3 also appears frequently. Occasionally, when the robot code starts, this error will be printed:

[CAN SPARK MAX] IDs: 1, 2, 3, 4, WPILib or External HAL Error: CAN Blackboard Entry Not Present

Now, the code runs as we expect it to, but we are worried about these errors being constantly printed to the driver station console. It appears to cause a spike in latency when the errors are refreshed that quickly. Anybody have any ideas? We checked wiring and everything is connected properly.

After some additional testing today with the addition of 6 more SPARK MAX controllers, we have uncovered more issues. When directly controlling the original four motors, we did not notice any issues. Of the six additional controllers, three are designated as motor followers. The follower motors, when the masters were commanded, would “jerk” back and forth, as if they were being commanded to move and then immediately stopped. We have a 1 second ramp time configured on these motors, and the jerking was instant, which leads me to believe this is some kind of safety feature being triggered. We double and triple checked the wiring; everything is Green to Green and Yellow to Yellow. We have terminating resistors at each end (VMX-Pi and CANable USB adapter). We have also tried a 120 ohm resistor in place of the CANable in case that was the issue, with no change.

To eliminate the possibility of broken “master” controllers, we swapped the CAN IDs around, and then ALL of the 6 new motors started jerking, regardless of being the driven controller or the follower. The errors that showed up in the Driver Station also started to include the IDs of the new motor controllers. We are relying on the CAN bus connectivity of these controllers in order to read the encoder data. Does anyone have any ideas of what could be causing this?

Below are a couple of suggestions.

But first, consider how many different computing devices are all interacting in real time in your system (driver station computer, Raspberry Pi, VMX-pi microcontroller, 10 Spark Maxes each with their own microcontroller). It is a distributed, real-time system, and with the addition of “follower mode” there are also multiple controllers in the system. Once issues are discovered, finding ways to simplify the system and isolate potential trouble areas is important. With that in mind, here are some areas to consider:

  1. SparkMax Firmware versions. VMX-pi currently uses the 2020 WPI Library, and is compatible with the Spark Max Firmware versions documented here. Perhaps the SparkMaxes being used have a version of firmware which isn’t compatible (since 4 SparkMaxes can be controlled without much issue, your description doesn’t really suggest that, but it’s worth checking). Similarly, are all the SparkMaxes using the same firmware version?
  2. VMX-pi firmware version. It’s recommended to use the latest VMX-pi firmware; directions for updating it are here. Perhaps there’s a CAN performance update here that would help.
  3. CAN Bus loading. If more issues are seen when adding more CAN devices, one area to investigate would be: can every message needing transmission actually be sent? Especially once there are 10 SparkMaxes on the CAN bus, all sending encoder updates as well as receiving motor speed set requests (not to mention the “safety” messages), the bus could be getting busy enough that certain messages are delayed.
  4. CAN Bus Prioritization. On the CAN bus, the lower message IDs have higher priority. Since the device ID is part of the message ID (in the FRC CAN layout it occupies the lowest bits), among messages of the same type the device with the lowest ID would have the best chance to send its message soonest. So especially if using “follower mode”, the “master” SparkMax should likely have the lowest CAN Device ID.
  5. Simplifying the system. One way to avoid the CAN bus contention would be to take some of the devices off of the CAN bus; another would be to reduce the rate at which encoder updates are being sent, and there are likely other opportunities. Since the VMX-pi also has many PWM outputs, perhaps these could be used instead of CAN to control some of the SparkMAXes. While PWM is not as sophisticated as CAN, the simplicity of the approach means it should be more isolated, and easier to debug what’s going on.
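
As a rough way to put numbers on the bus loading question in item 3, here is a back-of-the-envelope sketch. It is only an estimate under stated assumptions: a 1 Mbit/s bus, 131 bits per frame (a 29-bit-ID frame with 8 data bytes plus interframe space, ignoring stuff bits), and three periodic status frames per controller. The 10 ms period for Status 0 comes from this thread; the 20 ms periods for the other two frames are assumed here, so check them against the REV documentation for your firmware.

```python
# Rough CAN bus load estimate. All constants are assumptions for
# illustration, not values measured on this robot.

BUS_BITRATE = 1_000_000   # bits/s, assumed FRC CAN bit rate
BITS_PER_FRAME = 131      # extended frame, 8 data bytes, plus interframe
                          # space; stuff bits ignored (assumption)


def frames_per_second(periods_ms):
    """Frames per second sent by one device, given its frame periods in ms."""
    return sum(1000.0 / p for p in periods_ms)


def bus_load(num_devices, periods_ms):
    """Fraction of bus capacity used by periodic status frames alone."""
    bits_per_second = num_devices * frames_per_second(periods_ms) * BITS_PER_FRAME
    return bits_per_second / BUS_BITRATE


# Status 0 at 10 ms is from the thread; 20 ms for the other two is assumed.
ASSUMED_DEFAULT_PERIODS_MS = [10, 20, 20]

for n in (4, 10):
    load = bus_load(n, ASSUMED_DEFAULT_PERIODS_MS)
    print(f"{n} controllers: {load:.1%} of the bus")
```

This counts only the status frames coming from the controllers. Command frames, follower traffic, and the “safety” messages add to it, so treat the result as a lower bound.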

Cheers,

- scott

Scott,

I didn’t have much time today to check things out, but I went through your recommendations:

  1. SparkMax Firmware: The listed version, 1.5.2, is the latest available and all of our controllers are on this version.
  2. VMX-pi firmware version: I updated the VMX-Pi firmware to the latest, but I ran into issues using the software downloaded from Studica. I ended up using CubeProgrammer to directly upload the firmware file. Checking the firmware version displays the same version that I uploaded, 3.0.435.
  3. CAN Bus loading: We aren’t sure if “more issues” cropped up with the addition of more motor controllers, as we weren’t using followers with the original four motor controllers. When I have more time to test, I will try to isolate just four of the controllers, with 2 groups of followers. How can I check which messages are being sent/received and which aren’t? Or is there a way for me to check overall bus utilization?
  4. CAN Bus Prioritization: I will reorganize our CAN IDs to put the master motors at lower IDs than the follower motors; hopefully this will help.
  5. Simplifying the system: Taking the motor controllers off of the CAN Bus isn’t an option for us, as we need the encoder feedback. Otherwise we would have just used an Arduino + RPi instead of a VMX-Pi to drive the motor controllers over PWM. There are certain mechanisms where we do not need to read the encoder data every 20ms, so I will try lengthening the interval of the periodic frames and report back. Would you recommend increasing the interval for Periodic Frame 0, which contains applied output, faults, and follower status, and by default sends every 10ms? I would think that frame is the most important and should remain at the default value. Or should it be left alone on masters and slowed down on followers?
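
To illustrate why reorganizing the IDs can matter, here is a small sketch of CAN arbitration using the FRC-style 29-bit ID layout as I understand it (device type, manufacturer, API class, API index, device number, from high bits to low). The field values used below (device type 2, manufacturer 5, and the API class numbers) are placeholders for illustration, not values confirmed in this thread.

```python
def frc_can_id(device_type, manufacturer, api_class, api_index, device_number):
    """Pack the fields into a 29-bit FRC-style CAN arbitration ID."""
    assert 0 <= device_type < 32 and 0 <= manufacturer < 256
    assert 0 <= api_class < 64 and 0 <= api_index < 16
    assert 0 <= device_number < 64
    return ((device_type << 24) | (manufacturer << 16) | (api_class << 10)
            | (api_index << 6) | device_number)


def device_number(can_id):
    """Extract the device number (lowest 6 bits) from a packed ID."""
    return can_id & 0x3F


def arbitration_winner(ids):
    """A dominant 0 bit beats a recessive 1 bit, so the lowest ID wins."""
    return min(ids)


# Placeholder field values, assumed for illustration only.
DEVICE_TYPE, MANUFACTURER = 2, 5

# Same message type from two devices: the lower device number wins.
status_dev1 = frc_can_id(DEVICE_TYPE, MANUFACTURER, 6, 0, 1)
status_dev3 = frc_can_id(DEVICE_TYPE, MANUFACTURER, 6, 0, 3)
print(device_number(arbitration_winner([status_dev1, status_dev3])))

# Different message types: the API bits sit above the device number, so a
# lower API class from device 9 still beats device 1.
other_dev9 = frc_can_id(DEVICE_TYPE, MANUFACTURER, 1, 0, 9)
print(device_number(arbitration_winner([status_dev1, other_dev9])))
```

In other words, the device number only breaks ties between frames of the same type. Arbitration decides who transmits first when two frames collide; it does not make any frame arrive more often.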

I appreciate the assistance, and will report back again once I complete more testing.

Thanks!
David

Isolating the unexpected behavior from the rest of the components makes sense to me. While the logging will indicate when an expected periodic message isn’t received, a CAN bus analyzer is the tool that would let you monitor the bus directly. Some analyzers can also be programmed to decode the packets, so that their meaning and sequencing can be understood.
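
If an analyzer (or a USB CAN adapter with logging software) can export timestamped frames, a short script can answer both questions: overall bus utilization and which IDs went quiet. The sketch below assumes the capture has already been loaded as a list of (timestamp in seconds, CAN ID, data length) tuples; how you obtain that list depends on your tool and is not shown. The capture here is synthetic, and the per-frame bit counts ignore stuff bits.

```python
from collections import defaultdict


def frame_bits(dlc, extended=True):
    """Nominal bits on the wire for one data frame, ignoring stuff bits."""
    overhead = 67 if extended else 47   # includes 3-bit interframe space
    return overhead + 8 * dlc


def utilization(frames, bitrate=1_000_000):
    """Approximate fraction of bus time used over the capture window."""
    if len(frames) < 2:
        return 0.0
    span = frames[-1][0] - frames[0][0]
    if span <= 0:
        return 0.0
    bits = sum(frame_bits(dlc) for _, _, dlc in frames)
    return bits / (span * bitrate)


def max_gap_by_id(frames):
    """Longest silence (seconds) between consecutive frames of each ID."""
    last_seen = {}
    gaps = defaultdict(float)
    for t, can_id, _ in frames:
        if can_id in last_seen:
            gaps[can_id] = max(gaps[can_id], t - last_seen[can_id])
        last_seen[can_id] = t
    return dict(gaps)


# Synthetic one-second capture: two IDs at 10 ms, with ID 0x01 silent
# between 300 ms and 400 ms to mimic a dropout.
capture = []
for t_ms in range(0, 1000, 10):
    if not 300 <= t_ms < 400:
        capture.append((t_ms / 1000.0, 0x01, 8))
    capture.append((t_ms / 1000.0, 0x02, 8))
capture.sort(key=lambda f: f[0])

print(f"utilization: {utilization(capture):.2%}")
for can_id, gap in sorted(max_gap_by_id(capture).items()):
    print(f"ID 0x{can_id:02X}: longest gap {gap * 1000:.0f} ms")
```

A gap much longer than the configured period for a given ID suggests that frame was delayed or never sent, which is the situation the "timed out while waiting for Periodic Status" message describes from the receiving side.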

As a note, VMX-pi has 5 quadrature decoders, so these continue to be an option; some discussion of this is here.

A WPILib application is designed to run its periodic task every 20 ms, so it’s arguable you wouldn’t need an update rate more frequent than that. Also, my interpretation of follower mode is that the encoder of the master is used as feedback that drives the others. If you agree with that, then yes, it seems the encoder data from the followers is less critical, and perhaps not needed. So decreasing the rate of the periodic frames seems a reasonable approach, and it should definitely decrease bus traffic.

I recommend building the system up slowly from a single controller, then adding controllers and enabling features one at a time to help identify where troubles are occurring. Also, I have no idea if there can be more than one SparkMax master on the CAN bus, so if you are attempting that, I recommend checking with REV on that.

Despite updating the VMX-pi firmware to the latest I could find, the Driver Station reports that the firmware is inconsistent.
I just tried running the motors, and I am still getting the errors in the console, but the jerking action is gone. This is the same state we were in with the original four motor controllers, where they acted fine but spit out errors every second or so. I am going to try more testing, as I don’t like issues that show up and then disappear.

I think I have eliminated the issue. I don’t think the problem was bus utilization, but rather the multiple layers that the CAN messages must go through to reach the robot program (CAN transceiver, VMX microcontroller, pigpio library, VMX HAL, WPI HAL?, robot code). I used this page to figure out which frames we needed more often than others, and ended up with the following settings:

Controller Role   Periodic Frame   Update Period
Leader            Frame 0          25 ms
Leader            Frame 1          50 ms
Leader            Frame 2          50 ms
Follower          Frame 0          100 ms
Follower          Frame 1          250 ms
Follower          Frame 2          250 ms

Since changing these settings, I have not seen any jerking on the motors, and the errors to the console have mostly disappeared. There is still the initial “WPILib or External HAL Error: CAN Blackboard Entry Not Present” error, but that only displays once. In a test of running multiple motors at the same time for 5 minutes, only a single error about “timed out while waiting for Periodic Status” appeared. I will consider this solved.
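
For a sense of scale, here is a quick sketch of how many status frames per second the VMX has to process before and after the change. It assumes 7 leaders and 3 followers (my reading of the setup described in this thread), and it assumes the previous periods were 10 ms for Frame 0 and 20 ms for Frames 1 and 2; only the 10 ms figure is confirmed above.

```python
def frames_per_second(periods_ms):
    """Status frames per second from one controller."""
    return sum(1000 / p for p in periods_ms)


# Periods in ms for Frames 0, 1, 2.
ASSUMED_DEFAULT = (10, 20, 20)   # Frame 0 from the thread; others assumed
LEADER = (25, 50, 50)            # from the settings table
FOLLOWER = (100, 250, 250)       # from the settings table

NUM_LEADERS, NUM_FOLLOWERS = 7, 3   # assumed split of the 10 controllers

before = (NUM_LEADERS + NUM_FOLLOWERS) * frames_per_second(ASSUMED_DEFAULT)
after = (NUM_LEADERS * frames_per_second(LEADER)
         + NUM_FOLLOWERS * frames_per_second(FOLLOWER))

print(f"before: {before:.0f} frames/s")
print(f"after:  {after:.0f} frames/s")
print(f"reduction: {1 - after / before:.0%}")
```

Under those assumptions the incoming frame rate drops to less than a third of what it was, which fits the observation that the timeouts mostly went away.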

For anyone else using a VMX-Pi with CAN-based motor controllers (at least the SPARK MAX; I have not tested with CTRE stuff), it seems there is a limit on how fast the VMX can process incoming CAN messages, so slowing down the refresh rate is required.

