PSA: New crash bug in REVLib 2024.2.2

TLDR: REVLib 2024.2.2 makes every call to (the zero-argument version of) getEncoder() on a SPARK Flex liable to crash your program!

Workaround:
0. If you don’t use velocity feedback on SPARK Flex, I would advise that you do not update to REVLib 2024.2.2. For everyone else:

  1. If you trust yourselves to use the correct motor controller class for your hardware, then replace your usage of CANSparkFlex.getEncoder() with .getEncoder(SparkRelativeEncoder.Type.kQuadrature, 7168).
  2. As a safeguard against other unknown issues caused by the same bug, replace CANSparkMax.getEncoder() with .getEncoder(SparkRelativeEncoder.Type.kHallSensor, 42).
  3. Even with the above replacements, call getEncoder() only once per controller and retain the returned encoder object (a sketch follows right after this list). This is REVLib’s nearly undocumented usage assumption.
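
Put together, a minimal sketch of the workaround (the class name, field names, and CAN IDs are placeholders, and I’m assuming the standard 2024 REVLib imports):

  import com.revrobotics.CANSparkFlex;
  import com.revrobotics.CANSparkLowLevel.MotorType;
  import com.revrobotics.CANSparkMax;
  import com.revrobotics.RelativeEncoder;
  import com.revrobotics.SparkRelativeEncoder;

  public class Shooter {
    // Placeholder CAN IDs.
    private final CANSparkFlex flexMotor = new CANSparkFlex(10, MotorType.kBrushless);
    private final CANSparkMax maxMotor = new CANSparkMax(11, MotorType.kBrushless);

    // Call getEncoder() exactly once per controller, with the encoder type and
    // counts-per-rev spelled out, and keep the returned objects for the life of
    // the program so the zero-argument overload (and its model check) never runs.
    private final RelativeEncoder flexEncoder =
        flexMotor.getEncoder(SparkRelativeEncoder.Type.kQuadrature, 7168);
    private final RelativeEncoder maxEncoder =
        maxMotor.getEncoder(SparkRelativeEncoder.Type.kHallSensor, 42);

    public double getFlexVelocity() {
      // Reuse the cached encoder instead of calling getEncoder() again.
      return flexEncoder.getVelocity();
    }

    public double getMaxVelocity() {
      return maxEncoder.getVelocity();
    }
  }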

Below is what I emailed to REV’s support line last night:

Tonight we updated to REVLib 2024.2.2 for the SPARK Flex velocity filter changes, and identified an issue with the library that causes fatal crashes if the CAN message that checks the SPARK model fails. In short (as far as I can tell):

CANSparkBase.getEncoder() now calls getSparkModel() to determine whether to set up the main encoder as a hall sensor or as the SPARK Flex high-resolution quadrature encoder. getSparkModel() is a JNI method that sends a CAN message to query the device for its model type on every call.

If getSparkModel() fails due to CAN bus issues or other causes, it seems to default to Spark MAX while printing an error message. In this failure case the encoder for a Flex would be set up in code as a hall sensor.

If a user then calls getEncoder() again and the model check succeeds, the encoder will be set up as a quadrature encoder, causing a fatal exception.

The opposite ordering can also cause a crash, where the model check succeeds at first and fails on a later call, since the check is now run on every getEncoder() call. (That per-call check is also a separate performance/CAN-utilization concern, though less critical.)

I understand that calling getEncoder() every time it’s needed is not the intended usage, but until this update it was “safe” to do so, and I have seen many teams, including my own, doing it. The issue in 2024.2.2 can crash the program of any team using Flexes and calling getEncoder() multiple times; a failed model check on any of those calls will trigger it.
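
To make that concrete for anyone reading along: assuming a CANSparkFlex field named flexMotor (a hypothetical name, imports as in the workaround sketch above), two plain getEncoder() calls are all it takes once the model check fails on one of them and succeeds on the other:

  // First call: the model check fails over CAN, so the controller is treated as
  // a non-Flex and the main encoder is configured as a hall sensor (42 counts).
  RelativeEncoder first = flexMotor.getEncoder();

  // Second call: the model check succeeds, so REVLib tries to reconfigure the
  // same encoder as a quadrature encoder (7168 counts) and throws a fatal
  // "already been configured" exception.
  RelativeEncoder second = flexMotor.getEncoder();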

7 Likes

This is baffling. I can’t really understand why the RIO code should have to query over CAN to figure out what sort of device it’s controlling, much less why this is done dynamically at runtime. That’s a pretty serious code smell.

7 Likes

My guess would be that it’s trying to find out what motor is connected, since the SPARK Flex will be able to control an original NEO (when that hardware is released).

Why does the RIO-side code care about this in the first place, though? And why is the check done dynamically every time the encoder is queried?

Okay, slight amendment:
getSparkModel() doesn’t default to SparkModel.SparkFlex; it defaults to SparkModel.Unknown, which doesn’t matter much given the relevant code:

  public RelativeEncoder getEncoder() {
    throwIfClosed();
    if (getSparkModel() == SparkModel.SparkFlex) {
      return getEncoder(SparkRelativeEncoder.Type.kQuadrature, 7168);
    } else {
      return getEncoder(SparkRelativeEncoder.Type.kHallSensor, 42);
    }
  }

getSparkModel() does not appear to have anything to do with the motor being attached, and this code will be even more broken when the Flex can control a NEO.

Same issue here. We’re accessing the encoder, using getEncoder(), to know when our Vortexes (powered by SPARK Flexes) are up to speed.

Reverted back to 2024.2.1.

Question: since the Vortex is brushless, wouldn’t its default encoder be Hall effect?

It has an integrated high-resolution quadrature encoder, and I imagine REV want users to get the benefits of that by default.

Why they overloaded the API to do it is another question…

My team experienced this issue today as well, causing our code to crash on init!

3 Likes

I’ve reworked PurpleLib to get around this issue.

1 Like

Do you happen to have a stack trace or other indicators of this crash? …asking as a CSA, so if I see this at events this week I can help identify it more quickly.

We (8513) have been seeing this issue, but were unable to diagnose it. It always coincided with other CAN issues, and when they went away the code stopped crashing. Before we roll back to 2024.2.1 we will induce a CAN error to force a crash and get a stack trace for you.

1 Like

Any idea if REV is aware of this issue?

Yes, I sent it to their support line before making this post, and my email has been passed along to the REVLib people.

The “root” error message will be “This encoder has already been configured as a Quadrature with countsPerRev set to 7168!” or “as a Hall sensor!”. A few layers down, the stack trace will show which part of your robot code is calling getEncoder().

1 Like

Off topic, but it would be nice to have a list from REV of which API calls generate CAN traffic. I would never have dreamed a call to .getEncoder() would generate CAN traffic.

I had a periodic loop that made that call several times in a single iteration of the loop.
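
For anyone with the same pattern, hoisting that call out of the loop into a field (and using the explicit overload from the workaround above) removes the per-loop getEncoder() traffic entirely. A sketch with hypothetical names, assuming a WPILib command-based subsystem:

  import com.revrobotics.CANSparkFlex;
  import com.revrobotics.CANSparkLowLevel.MotorType;
  import com.revrobotics.RelativeEncoder;
  import com.revrobotics.SparkRelativeEncoder;
  import edu.wpi.first.wpilibj2.command.SubsystemBase;

  public class FlywheelSubsystem extends SubsystemBase {
    // Placeholder CAN ID.
    private final CANSparkFlex flexMotor = new CANSparkFlex(30, MotorType.kBrushless);

    // Fetched once here; periodic() never calls getEncoder() again, so there is
    // no repeated model-check CAN query per loop iteration.
    private final RelativeEncoder flexEncoder =
        flexMotor.getEncoder(SparkRelativeEncoder.Type.kQuadrature, 7168);

    @Override
    public void periodic() {
      // Reuse the cached encoder object on every iteration.
      double velocity = flexEncoder.getVelocity();
      double position = flexEncoder.getPosition();
      // ... use velocity/position as before ...
    }
  }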

4 Likes

REVLib 2024.2.3 appears to have a fix for this.

4 Likes

2024.2.3 doesn’t drive Vortexes correctly for me, so I reverted to 2024.2.2.

What kind of issue are you seeing in 2.3?

Vortex motors not running when commanded. The API appears to put the motor into a weird state that makes it unable to run. I can do a “complete factory reset” on a Flex/Vortex and then it runs correctly, but starting the code breaks it again. At first glance it appears to not commutate correctly.

We didn’t change anything in our code, only change is the REV vendor library.

2 Likes
