Review of 1918 control system failures during Traverse City playoffs

Our robot functioned properly at Traverse City until our second semi-finals match, when things went south.

We did not have time for a proper failure analysis between matches, and we didn’t get a chance to debrief the drive team and watch match videos until Monday. It was amazing just how inaccurate all our memories were about what happened and when.

We think we may have figured out the root cause (roboRIO failure), but intermittent electronic/electrical issues are hard to pin down and not everything is clear. We generated the following reconstruction as part of our investigation. It describes what happened, what we did, and what we still need to do. We request and welcome comments and questions from anyone in the FIRST community who might see something we missed.

Everything was performing well up through SF1-1. https://www.youtube.com/watch?v=fS5asGjTXsk

Problems started in SF1-2: https://www.youtube.com/watch?v=-yZWA52HlkA

  • Good thru autonomous
  • Moved briefly at beginning of teleop (rotated chassis)
  • Lost comms (lost code??)
    o RSL (Robot Signal Light) blinking until 108 sec left in teleop, then goes to solid and stays solid for remainder of match.
  • Driver attempted to reset the roboRIO (clock time??)
    o roboRIO did not reboot
    o Drive team called FTA over
  • Robot never moved again that match
  • Post-match, the FTA log shows a roboRIO reboot attempt approximately every second
  • Post-match, drivers did a systems check
    o Driver’s Xbox controller – all OK
    o Operator’s Xbox controller – nothing worked
      ▪ Replaced USB hub – nothing changed
      ▪ Replaced operator controller – everything worked
  • Replaced battery and returned to field for Finals 1

F1: https://www.youtube.com/watch?v=t1vwcZj073Y&t=159s

  • No control system issues were observed throughout the match
    o Ball jam in shooter feed wheel at ~45 sec remaining
    o Climber slide damage due to impact with switch bar
    o Neither issue believed to be related to controls

F2: https://www.youtube.com/watch?v=PpUu20Dc3sY&t=117s

  • Auton: Lowered the intake OK, missed all 3 shots low, then moved off the line OK.
  • Teleop started OK.
    o Collected 5 balls in Rendezvous Point
      ▪ Driver noted that the gyro was acting “wonky” (see the gyro-logging sketch after this list)
      ▪ Exited the RP, interacted with defense OK, oriented the chassis, reset the gyro (111 sec on match clock)
  • Drove toward the tip of the target zone with some difficulty (111 - 103 sec)
  • Stopped moving at 103 sec.
    o RSL still blinking
    o Driver notes lost code. Resets code.
    o Driver notes lost comms. Resets comms.
    o Collector retracts at 97 sec.
    o Driver notes lost code. Resets code.
    o RSL continues blinking
  • Robot comes back up at 72 sec
    o Driver rotates chassis, jogs it back and forth
    o RSL goes dark and robot stops at 70 sec.
    o RSL comes on solid at 43 sec.
    o RSL starts blinking at 17 sec.
    o RSL goes solid at 0 sec.
    o No chassis movement during final 70 sec.
  • Robot removed from the field. 6-minute field time-out + 6-minute alliance time-out.
    o Decided to replace the roboRIO
      ▪ That took pretty much all the available time. We brought the robot to the field for F3 without calibrating the swerve pods (our mistake + no time). It shot 3 balls in auton and moved irregularly off the initiation line, but wasn’t controllable after that (due to the uncalibrated swerve pods; not believed to be related to the control system failure).
  • After the match, we took it to the pit, calibrated the swerve modules, and everything worked properly.
  • Back in our shop on Monday, we did extensive drive testing (without making any changes or repairs) and didn’t have any problems.
  • We plan to install the “failed” roboRIO on our practice bot and see if we get failures.
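
Since the driver’s note about the gyro acting “wonky” is the only record we have of that symptom, we also want gyro health in the logs next time. Here is a minimal sketch of the idea in WPILib Java (illustrative only, not our actual code; the AHRS class and methods are from the Kauai Labs navX library, and you’d adapt this to your own language/framework):

```java
// Periodically publish gyro health so a "wonky" gyro shows up in the
// dashboard/DS logs instead of only in the driver's memory.
// Sketch only -- assumes the Kauai Labs navX vendor library.
import com.kauailabs.navx.frc.AHRS;
import edu.wpi.first.wpilibj.SPI;
import edu.wpi.first.wpilibj.TimedRobot;
import edu.wpi.first.wpilibj.smartdashboard.SmartDashboard;

public class Robot extends TimedRobot {
  private AHRS gyro;

  @Override
  public void robotInit() {
    gyro = new AHRS(SPI.Port.kMXP); // navX plugged into the MXP port
  }

  @Override
  public void robotPeriodic() {
    SmartDashboard.putBoolean("Gyro connected", gyro.isConnected());
    SmartDashboard.putBoolean("Gyro calibrating", gyro.isCalibrating());
    SmartDashboard.putNumber("Gyro angle", gyro.getAngle());
  }

  // What a driver's "reset gyro" button would call.
  public void resetGyro() {
    gyro.zeroYaw();
  }
}
```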

Thoughts? Does this sound consistent with RoboRio failure? Driver Console issue? Bad electrical connection somewhere? Something totally different?

Thanks

Congratulations to 6087, 7160, and 5504 on your win and to 3618 for Chairman’s. You all deserve it! Thanks to the people at Traverse City for a well run event with great technical support.

Could you do me a favor and post the driver station logs? I would like to see the voltage variation on the battery.

Loose power connection to the RIO?

You say “reboot attempted every second” - is that the RIO actually power-cycling on the robot every second, or the station sending a reboot request that goes unanswered?

Did any of your DS logs survive (showing connection status and 12 V bus voltage, in particular)?
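
If the logs didn’t survive, an on-robot breadcrumb can capture both of those signals for next time. A rough sketch, assuming 2020 WPILib Java (adapt to whatever language the team runs):

```java
// Print DS attach status and battery voltage about once per second.
// Console output lands in the riolog, so it can survive even if the
// DS log files are lost. Sketch only -- assumes 2020-era WPILib APIs.
import edu.wpi.first.wpilibj.DriverStation;
import edu.wpi.first.wpilibj.RobotController;
import edu.wpi.first.wpilibj.TimedRobot;
import edu.wpi.first.wpilibj.Timer;

public class Robot extends TimedRobot {
  private int loops = 0;

  @Override
  public void robotPeriodic() {
    if (++loops % 50 == 0) { // ~1 s at the default 20 ms loop
      System.out.printf("t=%.2f dsAttached=%b vBat=%.2fV brownout=%b%n",
          Timer.getFPGATimestamp(),
          DriverStation.getInstance().isDSAttached(),
          RobotController.getBatteryVoltage(),
          RobotController.isBrownedOut());
    }
  }
}
```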

(Surprising that you could have a trouble-free match between the two bad ones.)

What sensors do you have attached directly to the RIO?

(The Xbox controller seems unrelated, just bad luck.)


Wouldn’t the RSL be flickering on and off if the roboRIO were losing power? They said it went solid and stayed solid, so the RSL (and the roboRIO that drives it) still had power.

I don’t know if it’s related to this thread, but swapping the roboRIO fixed our trouble too. We swapped ours out two days before our week 1 competition, and made it through unscathed.

This is a common problem. We also swapped our roboRIO.

What gyro do you use?

When we’ve seen wonky controller issues, we have typically restarted the driver station laptop completely. What specific controller was it? I know @tim-tim’s team had issues with a non-responsive Xbox controller at one point this weekend. Also, as noted in the other thread about RIO replacements, I believe 836 still uses LabVIEW (they did last year, at least).

@Nyxyxylyth

We also had this issue; a reimage of the RIO worked for us.

Seems very strange that others are having the exact same problem. This was also around the same time that the Limelight Caught On Fire (and other Electrical Woes) thread was happening.

We have seen the exact same issues (disconnect after less than a match length, never reconnecting without a power cycle, while the DS could always see the RIO properly). Reimaging/updating the firmware somehow fixed it, and we haven’t had issues with it since.

This issue doesn’t seem to be the same as 1918’s however.

@Wayne_TenBrink,

Could you post the driver station logs if you have them? This disconnect sounds like an issue that NI is tracking, and we’re just trying to get more information at the moment.

If you see the issue again at home, you can also call 866-511-6285 for NI support. The help line isn’t open on weekends unfortunately, but at an event a CSA can play largely the same role.


Sorry, but I have been away from a computer since I posted this. Our programmers will attempt to post data from the DS logs; we think they are still available. We are using a navX sensor board for the gyro. We program in LabVIEW.

At least going by Wayne’s description (I haven’t watched the videos yet), this could be the case for SF1-2 but probably not in F2 - my understanding is that when the issue mentioned in that thread occurs, the RoboRIO locks up entirely, including the RSL, until it’s power-cycled.

Yes, we still use LabVIEW. We believe our controller problem has been resolved: there was a portion of code that would move the controller input from slot 1 to slot 2, which is why our controllers would sometimes go unresponsive. That sounds pretty unrelated to this issue, but I wanted to close the loop.
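
For anyone curious, the failure mode looked roughly like this. Our actual code is LabVIEW, so this WPILib Java sketch with hypothetical names only illustrates the pattern:

```java
// Illustration of the bug pattern, not our real code: if anything can
// re-bind the operator controller to a different DS slot at runtime,
// the physical gamepad (still in slot 1) appears dead.
import edu.wpi.first.wpilibj.XboxController;

public class OperatorInput {
  private XboxController operator = new XboxController(1);

  // Buggy pattern: some state change silently "moves" the input source.
  public void onSomeModeChange() {
    operator = new XboxController(2); // now reads an empty slot
  }

  // Fix: bind each controller to a fixed slot once and never re-bind.
  private final XboxController operatorFixed = new XboxController(1);
}
```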

Back on topic: we had occurrences similar to what the OP described last year. We would randomly die in matches (1 or 2 per competition). This plagued us on both practice and competition robots and across several RIOs. I don’t have all the details, but it was later found during/after DCMP that we were experiencing kernel panics. We were encouraged to slow down loop times, among a few other things; these changes seemed to help and remove the problem.
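
For reference, “slow down loop times” in WPILib Java terms would look something like the sketch below; in LabVIEW the equivalent is increasing the wait in the timed loops (e.g. Periodic Tasks). This illustrates the advice, not the exact change we made:

```java
// Run the main robot loop at 40 ms instead of the default 20 ms,
// roughly halving the per-loop CPU pressure on the RIO.
// Sketch only -- TimedRobot accepts a custom period via its constructor.
import edu.wpi.first.wpilibj.TimedRobot;

public class Robot extends TimedRobot {
  public Robot() {
    super(0.04); // loop period in seconds
  }
}
```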

Other Info

We had many lengthy conversations with NI leading up to and during DCMP on this issue. They had us add in some breadcrumbs to help them better understand the problem, and we went back and forth between hardware and software folks. The whole time they were very helpful and even responded late at night and on weekends.

We’ve definitely built a reputation for breaking the RIO over the past few years. We’ve had NI in our pits at champs for the past 3 years trying to understand what is going on and what we are doing differently that causes these issues. It is very frustrating when no one knows the answer, yet everyone says that nothing looks like it shouldn’t work.

In 2017 we couldn’t load code to the robot (error code 63195). We were literally rewriting code in our pit during DCMP while simultaneously trying to get it to load. That became the error that followed us through the 2018 season, although usually not as badly. In 2019 it was kernel panic. 2020 is still TBD.

I don’t know if any of this will help, but wanted to provide some backstory. I can get you in touch with the appropriate programming folks if you want further info or assistance.
