cRIO Constantly Rebooting

Thanks for the link. We are using the power cable included with the DLink, but I will make sure that the connection is secure and robust.

Hopefully the issue is that simple. The fact that the field could tell that our battery voltage had dropped precipitously immediately before losing comms, however, makes that unlikely in my opinion.

We had several robot resets in Pittsburgh that were very similar to what 341 experienced.

We have no RS-775’s on our robot (thankfully). 4-CIM drivetrain, 2 RS-540’s in the roller claw, 1 RS-550 for the arm.

The FTA reported the same low voltage reading to us at one point that was reported to 341.

Our first reset was after a hard impact with a tower. Most others occurred during normal operations, with no contact or other anomalous activity occurring at the time of reset. There were a few impacts later on in the competition that did not interrupt robot function at all.

We use C++. Others report using Java.

cRIO is isolated from the frame.

It appears that reports of this occurrence are getting more and more widespread, and the robots experiencing them more and more diverse in design and function. What are the common hardware and software ties that bind these robots and reboots together?

If any other teams have experienced similar issues, please share them.

Travis,
A small number of main breakers have a manufacturing defect that makes them intermittent. If you tap the red button and the robot lights flicker, this is likely the cause. Replacing the breaker is the only fix. Other sources of low voltage can be loose connections on the battery terminals. I recommend that a star washer be placed between the terminals before the hardware is inserted. This not only breaks through any surface crud, it also locks the terminals together so that they can’t loosen. Any loose connection or crimp in the robot primary wiring can also cause this brown out condition that takes out the Crio and radio power. Although it doesn’t occur often, some teams add #6 wire to the provided Anderson plug to extend the distance between battery and PD. This is the one place on the robot that wiring should be kept to minimum length as all robot current flows through this wire.

Al:

Thanks - we will check these items out first thing in Tennessee - after the rush of the elimination rounds, I actually thought to check how secure the robot’s primary power wiring was…after we bagged the robot. :rolleyes: Usually, that issue is never a problem for us.

I do know all the battery leads were securely bolted to the terminals.

I will likely check and replace the main breaker too, just in case. We’ve got plenty of those handy. I’ll grab a few star washers for good measure.

I also know our radio power adapter is secured and has good strain relief at both ends, but I’ll likely add a dab of hot glue to the barrel connector just to be safe.

I am willing to bet it’s the 6 motor drive, combined with the additional current draw of a drivetrain being pushed against and pushing. We did some magic math at the beginning of the season and found that heavily loading all of those motors would lead to pretty quick battery drains (1:30 or so into a match with some other mechanisms running).

Talking to some friends with similar issues, changing the 6 motor drive’s gearing to a lower speed (10 FPS is apparently a good sweet spot) made the issue go away entirely. From what I can recall, you guys are running at 13 FPS, which is much heavier on the motors.

It’s not out of the question, but a week+ of practicing (often for longer and/or harder than during a real match) and then perfect operation on practice day suggests, to me, that the issue is elsewhere. If it’s not electrical, or somehow code/firmware related, then the only other possibility is that our bolted frame is losing its stiffness and causing the motors to work harder during turns (though the robot doesn’t jump or otherwise seem to struggle at all to turn).

Besides, I don’t believe that Travis is running a 6 motor drive, nor are either of our teams’ issues happening as an obvious result of pushing matches (it happened to us less than a minute into a match where we hadn’t even touched another robot). We also ran one match with just the 4 CIMs (still geared at 13 fps, though) and the issue didn’t go away.

4 motor drive: 2 CIM’s and a SuperShifter per side.

4" wheels - 4 traction, 2 omni, no drop center

Interesting - I was watching the Waterloo Regional feed, watching 2056 score twice in autonomous (very impressive!) before dying shortly thereafter for about 40 seconds. I wonder if this was the same issue?

As an update:

  • We received our tool crate from Florida, and began doing diagnostics on our electrical board. We have been unable to find any debris, loose connections, damage, etc., to which we could attribute the problem.

  • We did notice that with our practice robot up on blocks (wheels off the ground), slamming the stick from full forward to full reverse did cause a (predictable) brief but severe drop in battery voltage. Other than simply asking too much from our battery, though, we’re almost out of theories.

  • This still begs the question: why didn’t we see the issue in practice? Or on practice day or the first half of Friday of our first Regional?

  • Lastly, we looked at video of each and every time we died on the field. I see no real pattern - sometimes we were lightly bumped, sometimes we were turning in open field, sometimes we were picking up or carrying a tube. I was able to tell that our RSL DID go out completely each time, meaning that it was the cRIO (and NOT the radio) that was doing the rebooting.

Have you tried swapping your PDB?

Haven’t yet - might be worth a shot.

I’m sure I’ll be asking Tyler Holtzman to find out.

3 robots. 3 events. 3 fields (well, 2 - I believe the Pittsburgh field was sent to Waterloo…). Exact same symptoms.

Hmmm…

Let me preface this by saying I don’t see what I am about to suggest as a long term solution, more a quick fix/diagnostic. I suggest you put in a voltage ramp (a la Lunacy) in your code for the drive train. See how that impacts your voltage drop and potential for reboots. If it eliminates your reboots, you will know that you are asking a bit much of the battery. Then you can make the tough decisions related to reducing your usage.
Also, does your roller claw have some sort of clutch (or torque limiter) mechanism to prevent you from stalling out your RS775?
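
For anyone who hasn't written one, the voltage ramp suggested above can be sketched roughly like this. It is a minimal, language-agnostic illustration in Python rather than actual robot code; `ramp` and `MAX_STEP` are made-up names, and the step size is a placeholder you would have to tune on your own robot.

```python
def ramp(previous, target, max_step):
    """Move the motor command toward target by at most max_step per loop."""
    delta = target - previous
    if delta > max_step:
        delta = max_step
    elif delta < -max_step:
        delta = -max_step
    return previous + delta


# Example: stick slammed from full forward (+1.0) to full reverse (-1.0).
# MAX_STEP is a hypothetical tuning value, not a recommended setting.
MAX_STEP = 0.25
output = 1.0
history = []
for _ in range(10):
    output = ramp(output, -1.0, MAX_STEP)
    history.append(output)
```

Instead of jumping straight from +1.0 to -1.0, the command walks down one step per control loop, so the motors are never asked to reverse at full power in a single cycle. A smaller step is gentler on the battery but makes the robot feel more sluggish.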

I also agree that it looks like a cRIO reboot was plaguing 2056 today. It would be interesting to hear their story, since they didn’t seem to have this issue at FLR.

This game has the potential for more frequent high current bursts than any other we have seen with this control system. Lunacy tended toward a moderate to low constant draw. Breakaway had bursts for climbing, kicking and pushing matches, but for the most part was low.

A couple observations:

A voltage sag that reboots the cRIO should look almost identical to pushing the reset button on the cRIO. Forty seconds seems way too long for the cRIO to reboot, load user code and start driving again. Sometimes more than one device will reboot.

The default voltage monitoring of the battery is sorta slow and doesn’t have any logging or history. If you are using LV, you can open up the Start Communications VI, scroll to the right and find the code that reads the analog. You can either change that to run faster and publish the value, or you can publish the refnum and read it in a periodic task or teleop and monitor the battery more closely. I’m not certain where this is done in the other languages, but I suspect a similar monitor is pretty easy to add.
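
As a rough idea of what a closer battery monitor might keep track of, here is a minimal sketch in Python. It is not the LabVIEW, C++ or Java API; `BatteryMonitor` is a hypothetical helper, the 8.0 V warning threshold is an arbitrary placeholder, and the readings are made up for illustration. On a robot, `sample` would be called from a fast periodic task with the real analog reading.

```python
class BatteryMonitor:
    """Tracks the lowest battery reading and every reading below a threshold."""

    def __init__(self, warn_below_volts):
        self.warn_below_volts = warn_below_volts
        self.minimum = None      # (timestamp, volts) of the lowest sample seen
        self.low_events = []     # every (timestamp, volts) under the threshold

    def sample(self, timestamp_s, volts):
        if self.minimum is None or volts < self.minimum[1]:
            self.minimum = (timestamp_s, volts)
        if volts < self.warn_below_volts:
            self.low_events.append((timestamp_s, volts))


# Made-up readings for illustration only, not measured data.
monitor = BatteryMonitor(warn_below_volts=8.0)
for t, v in [(0.00, 12.4), (0.02, 11.9), (0.04, 7.1), (0.06, 6.3), (0.08, 10.8)]:
    monitor.sample(t, v)
```

Keeping the minimum and a timestamped list of low readings gives you the history the default monitoring lacks, so a dip that lasts only a few loops still shows up after the match.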

At events, it is pretty common to see reboots attributable to each type of fault. Some are due to exposed wiring at the cRIO or PD that shorts when a bump shifts wires on the robot. Some are due to opened circuits due to a loose crimp or connector and shifting wires. Some are due to sagging voltage often made worse by wiring layout or gearing and tire choices. Some are due to code issues.

Robots that stop moving, but don’t reboot are another matter and are typically either code or a radio issue with a cable or a switch/button being changed due to movement on the robot.

Obviously there are lots of other failure types, but these are the ones that I’ve seen most often with robots that move for a while, but then stop entirely.

Pure speculation leads me to believe that motor shorts may contribute to some of these this year.

Greg McKaskle

Greg,
The Crio reboot seems to be affected by the amount of code and type of development platform a team uses. While 30 seconds seems typical, I have seen longer times.
Although it doesn’t happen often, we did have a team that kept having Crio reboots this past weekend. After removing the Crio and replacing they had no further problems, so they opened the bad one and found it contained significant shavings and metallic dust. I had forgotten how critical this is. When I asked if the gasket was in place, the team said they had always planned to install it but hadn’t gotten around to it yet.

An internal short on the cRIO due to debris should definitely be added to the list of potential causes. From what I’ve seen, this is far less common than the other issues and belongs towards the bottom of the solutions list.

It sounds like we agree that forty seconds is not the typical boot time. The approach was to use the time between movements to help determine whether it was a power loss at only the radio, only the cRIO, or both, which is useful to know when diagnosing and fixing the issue.

The cRIO-only reboot due to power sag can be approximated using a reset button press or duplicated by pulling and reattaching a cable. A team or CSA can measure this for a given robot and use this number to help diagnose what is happening on the robot. Typical numbers aren’t nearly as accurate as a robot-specific boot time, but since most teams don’t state that information in their problem statement, we fall back to typical values.
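
The timing comparison described above could be sketched like this. It is only an illustration under stated assumptions: `classify_outage` is a hypothetical helper, it assumes the radio takes longer to boot than the cRIO, and the boot times in the example are placeholders that should be replaced with numbers measured on your own robot.

```python
def classify_outage(downtime_s, crio_boot_s, radio_boot_s):
    """Compare observed downtime with measured boot times (radio assumed slower)."""
    if downtime_s >= radio_boot_s:
        return "radio rebooted (cRIO may have rebooted too)"
    if downtime_s >= crio_boot_s:
        return "consistent with a cRIO-only reboot"
    return "shorter than a full reboot of either device"


# Placeholder boot times; measure them on the actual robot.
CRIO_BOOT_S = 30.0
RADIO_BOOT_S = 52.0
result = classify_outage(40.0, CRIO_BOOT_S, RADIO_BOOT_S)
```

This only narrows things down. A downtime shorter than either boot time points away from a power loss and toward the code, cable or radio-link issues mentioned earlier.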

Personally, I’d be interested in seeing poll results on cRIO reset times. Quicker is obviously better, and this could help identify what impacts it. But this is an independent topic and far less of a priority than avoiding the immobility in the first place.

Greg McKaskle

It is my belief this is a power sag problem due to having only one power source (18AH SLA)

It is due to the high peak simultaneous startup current drawn by the Drive motors (or joystick(s) full fwd to full reverse = even worse as all motors are generating power so multiplying the instantaneous current draw at the instant the reverse occurs) then added to any other motors that may be starting at this same instant… Banebots and Compressor…

The zero-speed motor represents a near short. The CIMs draw 125A at stall (all motors start at stall). As the motor rotates, the current decreases because rotation creates AC impedance that adds to the DC R=~.1 ohm (ImotorStall = 12.5v/.1ohm = 125A).

6 CIMs = 750A; throw in a couple of simultaneous Banebots and compressor startup: ~1000A!!

The Battery has an internal R of ~.01 ohm fully charged and new…

(For now let’s ignore the battery connections, connectors, 125A breaker, and 20A breaker to the controller; if you are lucky and make good connections, these add another .01 ohm.)

So battery voltage sags due to Ohm’s law, instantaneously and for up to a few hundred milliseconds until the motors spin up.

Vsag = .01ohms * 1000A = 10V

Voltage available to the controller and all electronics sags to:
12.6v - 10v = 2.6v !!

So much potential for something to reset!!

so even if we halve the start up current to 500A… the likely battery internal+wiring resistance is .02 ohm… result is same 10v drop!!

IFI has oscope capture of this motor drive startup current: 350A with 2 Bosch drill motors… 6 CIMs represent 2-3X this

RESET depends on design “holdup margin” through the use of capacitor energy storage and isolation diodes etc… to isolate such sags.

Note: the quicker the drive motors come up to speed, the shorter the high-current surge period.

More gearing = better, i.e. the slower 7fps/high-torque range yields a shorter “stalled rotor time”.

Less gearing (fast 14fps gear mode) = longer stalled-rotor, high-current draw time.

Best solution as in some past years:
Allow use of a separate battery to independently power all electronics
(currently NOT an option)

i.e. isolate motor startup current sag & subsequent reboot problem scenario

Solution for this year:

  1. on the fly shifting

  2. implement a timed startup algorithm…to smoothly increase motor current draw. This will reduce torque!! and increase time to get up to speed, so it solves the problem at a performance cost

but it is less likely to reset a controller or radio!! which is VERY costly in a match

Dale and Chris,

Given our inability to isolate any other failure modes on our practice machine, we are arriving at the same conclusion: at times, we are simply asking too much from the battery and browning out the cRIO and/or the radio. Other teams have successfully used 6-motor drive setups geared for similar speeds in the past, but as you pointed out, this was in the IFI era when we often had backup batteries (even before backup batteries, does anyone have data on how low the main battery has to go before the RC resets?).

Brian,

A voltage ramp to the motors is one fix we did talk about. However, as you pointed out, this is a band-aid. For Philly we will swap out our ~13FPS drive modules for ~10FPS drive modules and run without the RS-775s (at least until we verify that we have corrected the issue). With our howitzer of a human player, we seldom ended up making dashes of more than 15 feet at a time anyway, so we feel this fix has a high probability of resolving the issue while not impacting our scoring ability.

Greg and Al,

Check your PMs.

Thank you all for your help!

Dale,
So we all get on the same page here…
Only four CIMs are allowed on the robot. CIM stall is 133 amps. Battery can supply 500 amps for a few seconds, although I would guess closer to 600 on newer batteries. (Also, the spec on MK batteries is 750 amps.) Battery internal impedance is 11 mohms (.011 ohms), #6 is .0005 ohms per foot. Figure the same loss for the Anderson connector, main breaker, and terminals on the battery and PD and you are likely pretty accurate on the total losses. I would guess that .015-.02 ohms would be pretty close.
The battery connection on the analog breakout in Crio slot 1 monitors battery voltage and inhibits Crio output when the supply voltage falls below 5.5 volts. The internal power supplies on the PD (+24, +12, and +5) drop out around 4.5 volts. If the voltage droop is sufficiently low and sustained, the Crio and radio are likely to reboot. The Crio will take about 30 seconds or more (typical) while the radio takes 52 seconds minimum. The power supplies are independent. The Crio is speced to work down to 19 volts and the +24 volt load will affect this power. This includes +24 volt solenoid valves and sensors. Please remember that battery temperature also affects the amount of available current as well as internal impedance.
While a second battery would help, most teams do not need it and play fine without brownouts.
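
To put the figures above together, here is a rough back-of-the-envelope sketch in Python. It rests on some loud assumptions: each stalled CIM is treated as a fixed resistor sized from the 133 A stall figure at 12 V, the open-circuit battery voltage is taken as 12.6 V (the figure used earlier in the thread), and speed controllers, motor-side wiring, other loads and battery temperature are all ignored. `stall_sag` is a made-up helper name.

```python
def stall_sag(v_open, r_source, n_motors, stall_amps, v_rated=12.0):
    """Estimate current and PD-side voltage with n identical motors stalled.

    Each motor is modeled as a resistor of v_rated / stall_amps ohms, in
    series with r_source (battery internal resistance plus wiring losses).
    """
    r_motor = v_rated / stall_amps       # one stalled motor, in ohms
    r_load = r_motor / n_motors          # identical motors in parallel
    current = v_open / (r_source + r_load)
    v_terminal = v_open - current * r_source
    return current, v_terminal


# Four CIMs at stall, using the .015-.02 ohm total-loss estimate above.
amps_low_r, volts_low_r = stall_sag(12.6, 0.015, 4, 133.0)
amps_high_r, volts_high_r = stall_sag(12.6, 0.020, 4, 133.0)
```

Under these assumptions the four stalled CIMs alone pull the supply down to roughly 7.6 V at .015 ohms and roughly 6.7 V at .02 ohms. The current comes out well below 4 x 133 A because the sagging voltage itself limits what the motors can draw. Any extra load, a weaker battery or a poor connection pushes the result lower, toward the 5.5 V and 4.5 V thresholds mentioned above.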

Jared, after watching the videos I believe the Crio and in one case the radio as well are rebooting. I commented in my reply but if you would like to tell here what drive train you are using, I can reply here as well.
AL

I have anecdotal data about the pre backup battery from 2002. MOEhawk our 2002 bot was geared for in my estimation well over 20fps (seriously it was the fastest top speed FIRST Bot I’ve seen including '08 lap bots) with 2 drill motors and 2 CIM (traction wheels in back, homemade omnis in front in the days before Andymark :wink: ). It employed a 3 goal grabber strategy similar to 71’s legendary bot, so the fast drive train was only intended to be used in the first couple seconds. After that we would anchor down (didn’t have shuffle drive like 71) for the rest of the match.

At VCU, we noticed the 4 motors would drop the battery voltage to <5 V and the IFI controller would reboot into some sort of Safe Mode where the latency between joystick commands and actions was very long (again, this is what I remember about a weird problem from many years ago, based partially on driver observations, so take it with a grain of salt). It also blew fuses on the IFI controller. At VCU, we re-geared the drive a bit slower to fix this problem (and also because our 14 ft wingspan had not fully deployed in the time we could hit the goals, so we had to slow down). On ship day, we just finished the massive bot (including deburring hundreds of lightening holes), drove it with the motto “Don’t break it” and stuck it in the crate. Let this be a lesson to drive your bot hard before shipping to find problems (or as we call it Break Robot. Re-engineer. Repeat! :smiley: ).

That is the only part that doesn’t make sense to me about the “simply too much power” theory.

We drove this thing HARD for 10 days prior to ship. For far longer than 2 minutes at a time. With brand-new, not run-in gearboxes. With brand new treads, and with an even faster gearing on our arm (I remember the FP getting HOT to the touch during practice - we increased the reduction in our competition bot so that it barely gets warm to the touch now).

All the math, intuition, and experience says that the problem is drawing too much current. But I can’t figure out why, for the life of me, we didn’t have any problems until midway through the regional.