Spark MAX controllers stuttering

Hey Everyone,

I’m from Team 219, and recently we’ve been having problems with our Spark MAX motor controllers. While we’re driving, the motors jitter/stutter, which makes the robot shake. We had this issue when we first started using the NEOs and Sparks, and fixed it by updating the Spark MAX. Another programmer updated the Sparks and redownloaded the Spark Client (we had an old version). Does anyone have any ideas on how to continue? I did just redownload the drivers for the Spark MAX, but is it a software issue? The Sparks say that they are on the latest update. I have not changed anything from last year, and I’m not sure why we would suddenly start having this issue again.

Thanks!

219

The most common reason for this sort of jitter/stutter is code that is trying to set the motor to two (or more) speeds simultaneously. This is often due to having multiple “if” statements that sometimes allow multiple settings to take place. Use of “else if” and switch/case* is cleaner and can help avoid this.

If you’re sure your software isn’t doing it, the next place I would look is the encoder cable, both along the length and at the connector.

And as always, if you can post [a link] to your code, there are plenty of people here who’ll give it a good looking over.

* And don’t forget to use break; between the branches unless you really want one to run into another!
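As a minimal sketch of that failure mode (the RecordingMotor class and the button logic are hypothetical stand-ins, not code from anyone’s robot), compare independent “if” statements with an “else if” chain:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a motor controller: it only records each commanded output.
class RecordingMotor {
    final List<Double> commands = new ArrayList<>();

    void set(double output) {
        commands.add(output);
    }
}

class IntakeLogic {
    // Buggy: the second "if" is independent of the first, so with forward held
    // and reverse released the motor is told 0.5 and then 0.0 in the same loop.
    static void buggyIteration(RecordingMotor motor, boolean forward, boolean reverse) {
        if (forward) {
            motor.set(0.5);
        }
        if (reverse) {
            motor.set(-0.5);
        } else {
            motor.set(0.0);
        }
    }

    // Fixed: exactly one branch runs, so the motor gets one command per loop.
    static void fixedIteration(RecordingMotor motor, boolean forward, boolean reverse) {
        if (forward) {
            motor.set(0.5);
        } else if (reverse) {
            motor.set(-0.5);
        } else {
            motor.set(0.0);
        }
    }
}
```

In the buggy version the motor alternates between 0.5 and 0.0 every loop while forward is held, which on a real robot would look a lot like a stutter.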

Yep – we had exactly this issue at our DCMP and we were sure it was an electrical or mechanical issue for the longest time. We were looking at weird spikes in current draw and thought those were causing the current limits to activate and kill the motors. It turned out it was the opposite – software causing the motors to stop and start was causing the current spikes.

It’s very important to have a process to trace failures back to every possible cause, even if you’re certain that, for example, it’s not a code issue. For us, that means start with checking electrical connections all the way from the PDP to the motor, then checking mechanical connections from the motor to the physical output, then looking at every possible thing in our code that could cause those motors to run.

We now go through this entire process (sometimes rearranged depending on what we think the root cause is) every time we have a failure. I can’t count how many times we’ve been positive that there have been no code changes in the time an error cropped up, but in fact there was a minor change somewhere along the line that broke things.

We’ve been running NEOs all season and haven’t had this specific issue while driving, but last week we attempted to use the frc-characterization tool and got lots of stuttering while running its tests. This tool generates a new robot project, so this should rule out anything we’re doing at the team code level.

We reflashed the firmware on the Sparks, which seemed to make the stutters less prevalent but did not make them disappear completely. At tonight’s meeting we’re going to try adding a call to restoreFactoryDefaults(), per this GitHub issue, and see if that solves it.

@plusparth It seems likely that we’re having the same issue. Looking at the data out of frc-characterization it does look like the motor controllers are getting commanded a value of 0 for a very short duration of time, though nothing in the simple generated code is telling them to do so. Did you find a solution to that issue? Seems plausible that we’re dropping CAN packets which causes the motor to get a value of 0, or something along those lines.

For us, it was a very simple mistake – in attempting to fix something about the ramp rate of the drivetrain, we just simply commanded the motors to go to two different speeds in one loop iteration.

More specifically, when we switched from Mini CIMs to NEOs, our driver noticed significantly increased acceleration (and therefore tippiness), and since we had characterized our drivetrain, our programmers decided to attempt to fix this by mapping our joystick to velocity instead of motor power. We just used the kV and kA with no P or D values for this, so it was essentially open loop but scaled with a real feedforward. I think this would have worked, but we accidentally left in our old call to our Drive subsystem to also set the motor powers according to the joystick values (pure open loop control). Part of the error here was a quirk in how we set a target velocity in our drivetrain – when we switched from open loop to velocity control, we set motor power to 0 in the middle.
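For anyone curious, the joystick-to-velocity mapping can be sketched roughly like this. This is not our actual code: every constant is a made-up placeholder, the acceleration estimate from successive joystick targets is just one way to use kA, and a 12 V nominal battery is assumed.

```java
// Sketch of open-loop velocity control with a characterized feedforward.
// All constants are made-up placeholders, not values from any real drivetrain.
class DriveFeedforward {
    static final double KV = 2.5;            // volts per (m/s), assumed
    static final double KA = 0.4;            // volts per (m/s^2), assumed
    static final double MAX_SPEED = 4.0;     // m/s at full stick, assumed
    static final double NOMINAL_VOLTS = 12.0; // assumed battery voltage
    static final double LOOP_SECONDS = 0.02;  // assumed loop period

    private double lastTargetVelocity = 0.0;

    // Maps a joystick value in [-1, 1] to a motor output in [-1, 1].
    double outputFor(double joystick) {
        double targetVelocity = joystick * MAX_SPEED;
        // Acceleration implied by the change in target since the last loop.
        double targetAccel = (targetVelocity - lastTargetVelocity) / LOOP_SECONDS;
        lastTargetVelocity = targetVelocity;

        double volts = KV * targetVelocity + KA * targetAccel;
        double output = volts / NOMINAL_VOLTS;
        return Math.max(-1.0, Math.min(1.0, output)); // clamp to the valid range
    }
}
```

The important part is that the value returned here is the only thing sent to the motors in a given loop iteration; our bug was sending this and the raw joystick value.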

We did have a separate problem where we had a lot of loop overrun errors, which we weren’t able to trace, other than knowing that our CPU utilization was very high (like above 90%). That was our second suspicion for the cause of the errors, but we haven’t looked further into this since fixing the simple error seems to have solved our issues.

Anyway, I hope this helps you debug your issues, but I don’t think your issues were the same as ours – ours was entirely programmer error on our part.
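If you want to rule out a double command quickly, one option is to wrap the motor and count writes per loop. The CountingMotor class below is a hypothetical sketch, not part of any vendor library; a real version would forward set() to the actual controller.

```java
// Hypothetical wrapper: counts how many times set() is called between calls
// to endOfLoop(), so more than one command per iteration becomes visible.
class CountingMotor {
    private int writesThisLoop = 0;
    private int worstLoop = 0;
    private double lastOutput = 0.0;

    void set(double output) {
        writesThisLoop++;
        lastOutput = output; // a real wrapper would forward this to the controller
    }

    // Call once at the end of each periodic iteration.
    // Returns the number of set() calls seen during that iteration.
    int endOfLoop() {
        int writes = writesThisLoop;
        worstLoop = Math.max(worstLoop, writes);
        writesThisLoop = 0;
        return writes;
    }

    // Largest number of writes seen in any single iteration so far.
    int worstLoop() {
        return worstLoop;
    }

    double lastOutput() {
        return lastOutput;
    }
}
```

Anything above 1 from worstLoop() means two pieces of code are fighting over the same motor.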

Here’s the source code that I uploaded a couple of months ago (nothing has been changed except that I added some Limelight values to put on the SmartDashboard). I really don’t know what would cause any double speed settings, and I don’t know enough about our electronics to tell if something went wrong there.

Thank you for all your help tho!

After looking through your code, I’m inclined to agree that there isn’t really anything that would cause double speed settings. I’m not super familiar with the command-based structure, though, so I may be wrong in some of my assumptions about how default commands work, or about something else that may actually cause this issue.

You said that flashing firmware to the Spark MAXes seemed to help the issue earlier – this makes me think whatever the issue is has something to do with the settings for these motor controllers. For example: I think current limiting is one potential reason this could be happening, if your motors are pulling too much current and the controllers are shutting them down for safety purposes.

Can you post screenshots of your driver station logs from a time that this happened? See https://firstmncsa.org/2019/02/15/the-driver-station-log-file-viewer/ for how to do this – just show the plots for current from the PDP slots that your drive motor controllers are plugged into.

Also, does this issue occur only on the ground when you’re actually driving around, or does it also happen with the robot propped up on blocks and the wheels in the air? If it’s only on the ground, it’s more likely to be a result of current limiting: friction from the carpet puts more load on your motors, so they draw more current.

On the surface, it seems like PrecisionDrive is using getYSpeed() and its sibling accessors, which are mapped to a different joystick (mainDriver) than the xbox joystick whose button runs the PrecisionDrive command.

PrecisionDrive should call Robot.m_oi.getxboxYSpeed() (etc) instead.

OI.java

public OI()
{
  mainDriver = new Joystick(0);
  driver2 = new Joystick(1);
  xbox = new Joystick(4);
  ...
  // lstick is a button on the xbox joystick
  lstick = new JoystickButton(xbox, 9);
  ...
  lstick.whileHeld(new PrecisionDrive(3, 2, 2));
}

public double getYSpeed()
{
  // reads the mainDriver joystick, not xbox
  if (Math.abs(mainDriver.getRawAxis(1)) >= .2)
  ...
}

PrecisionDrive.java

protected void execute() {
  // getYSpeed() reads mainDriver (see OI.java), even though an xbox button started this command
  double ys = Robot.m_oi.getYSpeed(), xs = Robot.m_oi.getXSpeed(), zs = Robot.m_oi.getZTurn();
}
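
The getxboxYSpeed() accessor suggested above might look something like the sketch below. It is only an illustration: a DoubleSupplier stands in for the xbox.getRawAxis(...) call so it compiles without WPILib, which axis to read is left to you, and returning 0 inside the deadband is an assumption since the original method body is elided.

```java
import java.util.function.DoubleSupplier;

// Self-contained sketch: DoubleSupplier stands in for xbox.getRawAxis(...).
// Deadband handling is an assumption; the original getYSpeed() body is not shown.
class XboxAxes {
    static final double DEADBAND = 0.2; // same threshold that getYSpeed() checks

    private final DoubleSupplier xboxYAxis;

    XboxAxes(DoubleSupplier xboxYAxis) {
        this.xboxYAxis = xboxYAxis;
    }

    // Hypothetical body for the getxboxYSpeed() accessor.
    double getxboxYSpeed() {
        double raw = xboxYAxis.getAsDouble();
        return Math.abs(raw) >= DEADBAND ? raw : 0.0;
    }
}
```

PrecisionDrive.execute() would then read this accessor (and matching X and Z versions) instead of the mainDriver ones.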

We’ve been looking at the same problem in our own codebase. We were able to eliminate the problem today by turning off some code that we had in our periodic methods that was querying the encoder positions and/or velocities. Looking at your codebase, assuming that this is the command you’re using, it looks like you’re doing the same kind of thing here: https://github.com/lizzigan/krakencode219/blob/master/src/main/java/frc/robot/commands/Drive.java#L31

On our robot, with these queries in the periodic methods, the problem was visible on the hardware: the Talons that we use for other purposes would briefly flash red and then go back to normal, and the Spark MAX lights would stutter just as the drivebase did (we use Spark MAXes and NEOs on the drivebase). Once we eliminated the code that was querying the encoders, those problems went away. We also reproduced the problem using the frc-characterization code, which also queries the encoders in robotPeriodic(). Again, removing that code made the stuttering go away.

Not using the encoders isn’t a solution, but if you could try it in your codebase and verify that it also makes the problem go away, that’s another data point. I plan to contact REV about this.
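
If removing the encoder reads outright isn’t practical, a gentler experiment is to poll them less often and reuse the last value in between. The sketch below is untested against the stutter itself: ThrottledReading is a hypothetical helper, the DoubleSupplier stands in for a call such as getPosition(), and the period is arbitrary.

```java
import java.util.function.DoubleSupplier;

// Hypothetical helper: only queries the underlying source every `period` calls
// and returns the cached value otherwise.
class ThrottledReading {
    private final DoubleSupplier source;
    private final int period;
    private int loopsSinceRead;
    private double cached;

    ThrottledReading(DoubleSupplier source, int period) {
        this.source = source;
        this.period = period;
        this.loopsSinceRead = period; // forces a real read on the first call
    }

    // Call once per periodic loop.
    double get() {
        if (loopsSinceRead >= period) {
            cached = source.getAsDouble();
            loopsSinceRead = 0;
        }
        loopsSinceRead++;
        return cached;
    }
}
```

With a period of 5 and a 20 ms loop, the encoder would be queried every 100 ms instead of every loop. If the stutter scales with how often the read happens, that would be another useful data point.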

Hope that helps.

Edit: We’ve been testing with the latest firmware (1.4.0) and software (1.4.1 Java).

Have you contacted REV? They should be able to help you troubleshoot.

@Greg_Needel

I haven’t. We wanted to get something reproducible first. I plan to contact them.

I’ll forward this to our programmers and see if we can replicate, since we exhibit the same symptoms under the same scenario.

REV prefers you contact them directly versus posting on chief.

EDIT: I don’t understand why this post was flagged… It’s just paraphrasing what the owner of REV recently posted.

I don’t understand why Adam’s post was flagged. It’s clearly just factual.

At this point, I have sent REV a report with everything I included here and more details, including a small project that reproduces the problem for us. That’s clearly the only way this will get resolved. That said, I don’t think it’s inappropriate to communicate what we found with others who are looking at what seems to be similar problems.

@adias - is this an issue only from calling getPosition(), or does it also happen with getVelocity()?

I don’t know. Both programs that we tried called both of those and, in testing, we either called both getPosition() and getVelocity() or neither of them. We never tried just one of them. We won’t be able to do anything more before Monday.

For when you check on Monday: did your CAN bus utilization drop significantly when you made this change? What about your CPU utilization? I think we regularly queried encoder values from the Sparks and noticed high CPU utilization at least, though I have no idea if these are even correlated (we never actually properly traced our CPU usage issues).

CPU utilization dropped dramatically. We didn’t notice a change in CAN bus utilization.

Hm, that’s really interesting. If that’s the case, your issues could be caused by loop overruns: if the high CPU utilization comes from the call to getPosition() making the thread hang, then you could be hitting the motor safeties because set() is not being called frequently enough. When we get a chance, we’ll take a look at this on our end and see if taking out encoder requests reduces CPU utilization.
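
To make the theory concrete, here is a simplified model of a motor-safety watchdog. It is not the WPILib implementation, SafetyWatchdog is a hypothetical class, and the timeout is whatever you pass in; time is supplied explicitly so the behavior is easy to follow.

```java
// Simplified model of a motor-safety watchdog; not the WPILib implementation.
class SafetyWatchdog {
    private final double timeoutSeconds;
    private double lastFeedTime = Double.NEGATIVE_INFINITY; // never fed yet
    private double output = 0.0;

    SafetyWatchdog(double timeoutSeconds) {
        this.timeoutSeconds = timeoutSeconds;
    }

    // Commanding the motor also feeds the watchdog.
    void set(double output, double nowSeconds) {
        this.output = output;
        this.lastFeedTime = nowSeconds;
    }

    // What the motor would actually be driven at, at time nowSeconds:
    // the last commanded output, or 0 if set() has not been called recently enough.
    double effectiveOutput(double nowSeconds) {
        boolean expired = nowSeconds - lastFeedTime > timeoutSeconds;
        return expired ? 0.0 : output;
    }
}
```

If a blocking call stretches one loop past the timeout, the output drops to 0 until the next set() arrives, which would look like a brief cut in motor power even though the code never commanded 0.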

Right. We’ve been speculating about that since our meeting ended. The student who was working on this says he looked for motor safety messages and didn’t find any, but it’s a logical enough explanation that we really need to check again.
