Robot Keeps Losing Com and Code

We have been experiencing a strange issue since the beginning of the 2020 season. Every now and then when using our robot, we will suddenly lose both Com and Robot Code.


However, the driver station reports that it’s still connected to our radio.

When this occurs, we are unable to ping our roboRio. The only way we’ve found to reliably resolve this situation is to power cycle the robot. Once we do so, communications and robot code are restored.

This problem occurs with no discernible pattern. We can sometimes go hours without having an issue. Other times the issue will resurface after only 10 minutes. The driver station logs don’t reveal any errors other than “roboRio disconnect” shortly before we lose communication, and this has happened repeatedly with two completely different control systems (i.e. two different roboRios, PDPs, Radios, Driver Stations, etc.). So while this could still be a wiring issue, at this point I find it very unlikely. I don’t have any pictures of the roboRio’s indicator lights on hand, but if I recall correctly, they don’t display any unusual behavior or errors when this occurs.
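Since the dropouts follow no obvious pattern, one way to pin down exactly when they start and stop is to leave a small reachability logger running on the driver station laptop. This is only a rough sketch: the address `10.0.0.2` is a placeholder for the usual `10.TE.AM.2` roboRIO address, and probing TCP port 22 assumes the SSH server on the roboRIO is listening. The function names `is_reachable` and `watch` are made up for this example.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port=22, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host, port=22, interval=1.0, checks=None):
    """Poll the target and print a timestamped line whenever its state flips.

    checks=None polls forever; a number limits how many polls are made.
    Returns the list of (timestamp, reachable) transitions that were seen.
    """
    transitions = []
    last = None
    count = 0
    while checks is None or count < checks:
        now = is_reachable(host, port)
        if now != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host}:{port} {'UP' if now else 'DOWN'}")
            transitions.append((stamp, now))
            last = now
        count += 1
        time.sleep(interval)
    return transitions


if __name__ == "__main__":
    # Placeholder address (assumption): replace with your own 10.TE.AM.2,
    # or 172.22.11.2 when tethered over USB.
    watch("10.0.0.2")
```

The timestamps it prints can then be lined up against the driver station log to see what happened just before each DOWN line.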

There are several other threads that describe problems very similar to those we are experiencing. However, none of them have definitive solutions. I’ve also noticed that 3/4 of the threads I found on this issue were either started by teams using LabVIEW (like us), or were necroed this year by teams using LabVIEW.

Any help is appreciated.

Can you look at your driver station logs for when it happens? I’m curious if your Rio CPU usage is too high. My understanding is that there is QoS set up to prevent robot code from completely hosing the Rio, but it’s still possible (tight/unbounded loops and unbounded recursion are the most likely causes). If that is what’s happening, your code essentially takes up all the available resources on the Rio, making it impossible for the other processes (e.g. netcomm) to do anything. Without anything else able to run, the DS will not receive responses from the Rio, and the Comm light will go red. If you share your code, I’ll take a look through to see if anything jumps out at me as weird.
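To illustrate what I mean by a tight loop versus a paced one, here is a minimal Python sketch (not LabVIEW, and not anyone’s actual robot code). The helper name `run_periodic` and the 20 ms period are assumptions for the example. The point is only that each cycle sleeps for whatever time is left in its period, which hands the CPU back to other processes; in LabVIEW the rough equivalent is putting a Wait (ms) inside every While loop.

```python
import time


def run_periodic(step, period_s=0.02, iterations=50):
    """Call step() once per period, sleeping for the rest of each period.

    Returns the fraction of wall time spent inside step(); the remainder
    was handed back to the OS for other processes to use.
    """
    busy = 0.0
    start = time.monotonic()
    next_deadline = start
    for _ in range(iterations):
        t0 = time.monotonic()
        step()
        busy += time.monotonic() - t0
        # Sleep until the next deadline instead of spinning.
        next_deadline += period_s
        remaining = next_deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
    total = time.monotonic() - start
    return busy / total


if __name__ == "__main__":
    fraction = run_periodic(lambda: sum(range(1000)))
    print(f"time spent in step(): {fraction:.1%}")
```

A loop without the sleep would report close to 100% and leave nothing for anything else running on the same core.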

I’ll post a picture when I get a chance this evening. Our Rio CPU usage has been pretty high the past couple of years (90-95%). But we never experienced this problem last year, and our code this year is fairly comparable in terms of CPU load.

That being said, we added path following code to our Autonomous.vi this year. I’ve often considered whether this code could be the culprit, but I find it unlikely, as we have had the problem I described while running in TeleOp.vi as well, which doesn’t have any path following code to speak of.

Ah, it’s LabVIEW. I won’t be as much help with that.

We have been having similar issues, and I am wondering if the nearby (as in, very near) VRM could have something to do with this. The VRM also contains an ethernet hub power supply. Could that have something to do with this fluctuation in radio performance?

Are you referring to this POE cable…? http://www.revrobotics.com/rev-11-1210/

As another sample point: we had very similar symptoms until about week 5, at which point we updated all SPARK MAX firmware and REV’s libraries to the latest versions, and we haven’t seen the issue recur.

This is pure correlation, not any proof of causation.

We use Java.

No, I am referring to the power input of an ethernet hub that runs off of 5 V at 0.6 A or something like that. I am positive it is plugged into a 0.5 A output, and that is what I find questionable. The hub supports a Limelight 2 and that is all, but it is almost always active. I use Python lol.

I’m not sure that radio performance has anything to do with the problem we are facing, as the driver station reports that it’s still connected to our radio.

You may be experiencing a different problem than we are.

That’s the thing though: we also are still connected to the radio. I think it has something to do with the radio -> RIO connection. I am wondering if our cable’s performance is less than desirable. I am not sure if that ethernet hub has anything to do with these issues, i.e. cable speed based off of radio power supply. Sorry to cause any confusion.


Here is the log file for when we lose connection. Ignoring the scary voltage drops (I’ll have to look at fixing that later), you can see that the robot disconnects with only 50% CPU usage, and no other obvious errors or CAN bus overloads.

Does this happen when tethered via USB? (with the same symptoms, e.g. can’t ping)

  • Are there any USB/I2C/MXP devices hooked up to the Rio?
  • Have you tried reimaging the Rio?
  • How old is the Rio? Could there be metal shavings etc inside the case?

If you have the means, it might be good to hook up to the serial port and see if anything gets printed there. My sense is that it’s a kernel crash, but without a trace/dump it’s hard to demonstrate that’s the case.

Yes. We cannot ping over USB when this occurs.

We are using a Navx plugged into the MXP port on our Rio. We are also communicating with a Jevois camera using serial over USB.

Not yet. I’ll make sure to try that. But as I said before, we’ve had this same problem with two different rios, so I doubt this is the issue.

I’m not sure how old our rios are, but they are in good condition. As for metal shavings, again, something that we plan to check, but that is unlikely to be the culprit due to two different Rios exhibiting the same behavior.

I think we’ll do this next. I’ll post what we find.

I’ve heard of at least one other team who has seen this with similar peripherals, especially the Jevois camera, but other teams have also reported it with no devices attached, so we don’t really have a common thread here yet.

Great, thanks. The port should be configured to 115200-8-N-1. You can tell the serial port is working if you hit enter and see a login prompt.
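In case the shorthand is unfamiliar, 115200-8-N-1 means 115200 baud, 8 data bits, no parity, 1 stop bit. Here is a rough Python sketch of how that maps onto serial port settings. The helper names `parse_serial_settings` and `probe_console` are made up for this example, and `probe_console` assumes the third-party pyserial package is installed; treat it as a starting point rather than a tested tool.

```python
def parse_serial_settings(spec):
    """Split a 'baud-databits-parity-stopbits' string into named settings."""
    baud, data_bits, parity, stop_bits = spec.split("-")
    if parity.upper() not in ("N", "E", "O"):
        raise ValueError(f"unsupported parity: {parity!r}")
    return {
        "baudrate": int(baud),
        "bytesize": int(data_bits),
        "parity": parity.upper(),
        "stopbits": int(stop_bits),
    }


def probe_console(port_name, spec="115200-8-N-1"):
    """Send Enter and return whatever the port answers (needs pyserial)."""
    import serial  # third-party package: pip install pyserial

    settings = parse_serial_settings(spec)
    with serial.Serial(port_name, timeout=2, **settings) as port:
        port.write(b"\r\n")
        # Read up to 256 bytes, or whatever arrives before the timeout.
        return port.read(256).decode(errors="replace")


if __name__ == "__main__":
    print(parse_serial_settings("115200-8-N-1"))
```

You would call something like `probe_console("COM3")` on Windows or `probe_console("/dev/ttyUSB0")` on Linux; both port names are placeholders for whatever your adapter shows up as.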

Is there any documentation on how to access the serial output of the roboRIO?

Not really. It’s just a standard RS-232 serial port on a 3-pin 0.1" header. The signal levels are RS-232, not TTL. On modern machines you’ll almost certainly need a USB to RS-232 adapter. Sometimes you can find them with individual wires, but more often they come with a 9-pin D-sub (“DE-9”) connector that you’ll have to wire up to. You just need to connect GND, TXD, and RXD (and potentially swap TXD and RXD).

We just set up a connection to the roboRIO over its RS-232 port using the settings you described, along with some others we found online for RS-232 communication with PuTTY. We have a connection, but when we send a carriage return, instead of receiving a login prompt as you described, we receive a single character: “C”. Do you have any idea what we could be doing wrong?

Ok, so we might have resolved our issue. We had a Limelight going to an ethernet hub, then to the radio; however, tonight we switched things up. We instead connected the Limelight directly to the radio, taking the hub out. Not sure if this was only a momentary fix or if this is related to the issue.

Most likely either the baud rate is wrong (should be 115200), you have swapped TX and RX, or your port is TTL instead of true RS-232. Serial is kind of hard to debug because it’s such a simple protocol it doesn’t give any positive indication of an actual connection.

We removed our NAVX and our loss of comms stopped.