How to determine root cause of robot dropping from Teleop to Disabled during match.
1 Attachment(s)
At West Michigan.
Match: Semifinal 2. Video of the match: https://www.youtube.com/watch?v=ib3zz6QXHbc The robot in question is 2137 TORC, in the bottom right-hand corner.
We experienced two "field disconnects" in this match. The first lasted about a second, and the second about 3/10ths of a second. This data comes from our logging on the roboRIO on the bot. Our on-robot logging only runs during autonomous and teleop, so when we dropped to the disabled state we didn't log data, but I can tell this from the gaps in the data timestamps. These events are highlighted in yellow in the attached robot log file.
I am the lead electrical and programming mentor, but I was not at the event. I am doing a postmortem to determine the root cause. The team was told by the FTA that the issue was not with the field and that there was no basis for a replay. TORC bounced back from this anomaly, and the alliance was able to capture the blue banner.
Based on our log data, I know the roboRIO did not lose power or reset. Since the glitch lasted only about a second, the robot radio was probably not the cause, as a radio reboot typically takes about 30 seconds. Also, the robot worked fine after comms were re-established, but our elevator carriage had been sprung from the frame, rendering it useless. In the video, you can see the drivers and FTA moving the driver station around, and the robot springing back to life. You can also see the stack light go from solid red, to flashing red, back to solid red.
We should have the driver station logs available on Tuesday, when we unpack, so one of my questions is: what should I be looking for in the driver station logs for this match? I have not done much with the driver station logs, as we typically try to log what is important to us on the bot itself. Can you point me to a doc that describes the driver station log files and how to read the data?
A couple of other questions for my general knowledge: what determines who is at fault on an FRC field when a glitch occurs? What data determines this, and do we have access to it?
Just a general comment about the field at West Michigan: there seemed to be many delays, many more than on any of the other fields in Michigan, along with more replays than on the other two. Just an observation, as I watched the other two webcasts play finals while we were trying to finish quarterfinals. Thanks for any help. |
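For anyone doing a similar postmortem, the timestamp-gap check described earlier in this post (inferring disabled periods from missing log rows) can be sketched roughly as below. This is only an illustration: the function name `find_gaps`, the 20 ms expected period, the tolerance factor, and the sample timestamps are all assumptions, not values taken from the attached log.

```python
def find_gaps(timestamps, expected_period=0.02, tolerance=5.0):
    """Return (start, end, duration) for each gap between consecutive samples.

    A gap is any step longer than tolerance * expected_period seconds.
    expected_period and tolerance are assumed values; tune them to the
    actual logging rate of the robot.
    """
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        delta = curr - prev
        if delta > expected_period * tolerance:
            gaps.append((prev, curr, delta))
    return gaps


if __name__ == "__main__":
    # Made-up timestamps in seconds, for illustration only.
    samples = [0.00, 0.02, 0.04, 1.04, 1.06, 1.08, 1.38, 1.40]
    for start, end, duration in find_gaps(samples):
        print(f"gap from {start:.2f}s to {end:.2f}s ({duration:.2f}s)")
```

Because the robot only logs while enabled, each reported gap is a candidate disabled period; it still has to be cross-checked against the driver station log to see what actually dropped.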
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
You are correct - connectivity drops that short are not indicative of a robot power issue. Here are the things I would look for if I was FTAAing and you came to me with this.
- Error logs stored on the DS (there's a tab in the log viewer to show these). It's possible that some of your code is taking too long to execute and tripping a watchdog, disabling you briefly. This should be logged for you to go back and look at. This is more likely to explain the second, shorter disconnect.
- It's hard to tell from the video where/how your bridge is mounted. The antennas run the long way down the bridge, and that should not be parallel to metal, and the bridge shouldn't be near any sources of RF interference (power wires, motors, etc.). This could cause lots of dropped packets depending on your robot location/proximity to the field AP/current wifi situation. The DS logs should show packet loss, and if this is the problem, you'll probably see it throughout the match, just maybe not as bad.
- How's the ethernet port on your DS laptop? The ports tend to not be very resilient to the constant plugging/unplugging of field connections, especially if it's an older laptop. Ethernet pigtails are a good investment. If something is stuck in the port, or if it's worn out, it's possible you could momentarily lose connection.
- What other software is running on the DS? Is there a firewall? If so, that's most likely the cause - they like to block DS traffic and cause disconnects. My recommendation is to totally disable it (that way you know it's not interfering with anything). If you don't want to do that, make sure that the Driver Station program and the NI mDNS Responder are whitelisted.
- Is your DS laptop used for anything else? I've seen instances where other programs that require an internet connection (Dropbox, the Autodesk updater thing, other update services) interfere with DS communication in just the way you described - momentary losses while they ping for updates. Killing their processes in task manager is the quick fix, uninstalling the thorough fix.
If you can, have a laptop totally dedicated to being a DS, with as few other things running as possible, so there are minimal causes for niggling issues like these. That's about all I can suggest without seeing more info. Post the logs when you get a chance, and I'll be happy to take a look. Quote:
http://wpilib.screenstepslive.com/s/...og-file-viewer It has some examples of logs indicating common failures, which would be good to compare yours to. Quote:
The main data points we use on the field are:
- Watching the Field Monitor and area lights for indications of robot issues. Read the section entitled "What Information Does FMS Log" in the White Paper. It's a little dated now, but the list of items logged is basically the same. The Field Monitor looks like a table, with one row for each of the 6 robots, and has the following columns: ethernet link (is something plugged into the player station), DS connected (is the DS communicating with FMS), radio connected, roboRIO connected, battery, bandwidth usage, and packet trip time. You've probably seen this up on a monitor at the scoring table. If you ask one of the field staff to take a look with something specific to check, they'll usually show you what's logged.
- Status lights on your robot, both radio and roboRIO. Make them visible from across the field, it's unbelievably helpful. If we can't see the status lights, it's much harder to diagnose issues, which means you'll be less likely to be granted a replay.
- DS logs. Data there is logged more frequently than on FMS, so often those are the most helpful. Check for logged errors, battery drops, packet loss, things I mentioned above.
- Looking at your physical robot. Is everything plugged in properly? Are any wires loose? Are the PDP fuses in all the way? Are you sure they are? |
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
http://wpilib.screenstepslive.com/s/...og-file-viewer
The log has a great plot that should help visualize your problem. I would be looking closely at radio placement, and would consider having a fresh radio at your next regional to substitute in. Review logs from your other matches and see if there are signs of connectivity issues. In general terms, connectivity issues in one robot won't cause a replay. If the FTA isn't busy during practice day at your next regional, I suspect they'd be happy to show you the field monitor and discuss how replay decisions get made. |
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
Thanks for the help so far.
Just a couple of quick answers: we are using a 2014 Classmate as the driver station, the power options have been disabled, and the firewall has been turned off. This Classmate's sole purpose is to be this year's driver station, but I had found some instances in the past where Spotify was loaded... I had a discussion about the importance of this box remaining pristine, but like I said, I was not there this weekend. :) As to code issues bogging down the roboRIO, we log the "loop time" of the 20ms timed task loop where our control loops run, and you can see from the data that the loop time is consistent. This has been a great indicator of code issues, and we have learned that if that data is not right, we need to figure out the problem and fix it. I have worked with the FTA at the event; over the last two years I have been FTAA at a couple of FTC events with him. I just don't know the FRC FMS. |
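As a rough illustration of the loop-time check described in the post above, the sketch below flags samples that stray from the nominal 20 ms period. The function name `loop_time_outliers` and the 2 ms tolerance are assumptions made for this example; only the 20 ms nominal period comes from the post.

```python
def loop_time_outliers(loop_times_ms, nominal_ms=20.0, tolerance_ms=2.0):
    """Return (index, value) pairs where the loop time strays from nominal.

    tolerance_ms is an assumed threshold; pick one that matches the jitter
    normally seen in the robot's own logs.
    """
    return [
        (i, t)
        for i, t in enumerate(loop_times_ms)
        if abs(t - nominal_ms) > tolerance_ms
    ]


if __name__ == "__main__":
    # Made-up loop times in milliseconds, for illustration only.
    print(loop_time_outliers([20.0, 20.1, 19.9, 35.0, 20.0]))
```

An empty result over the period around a dropout suggests the control loop itself was not stalling, which points away from a code overrun as the cause.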
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
Watching the RSL pattern in the video, and looking at column B of your logs, it looks like the first period where your outputs were disabled was 11 seconds, and the second was 2.3 seconds. The log file on the DS will have some messages indicating whether the DS lost comms with the field, with the radio and robot, or just the robot. It also contains messages every few seconds showing the laptop CPU usage, roboRIO CPU usage, and lots of other stuff. Please post the file or contact me and I'll give you email instructions and I'll be happy to help you identify the issue.
Also, it looks like they started working about the time the FTA made it to their station. Does the drive team have an idea of what was done at the DS? Did they reconnect the ethernet cable? Did they reboot their code? Greg McKaskle |
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
Quote:
Looking at the log, there are two periods, ~23s-34s and ~35s-39s into the match, where your LoopTime (Column E) stays flat at 20ms. They correspond to the times in your video when the bot was staying still. This and all other log data during these same periods indicate that the bot was running fine but was idle. So, it must be the communication or the DS not sending commands over. |
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
Hey Scott, can you zip up all the logs from that day (yes, all of them)? If you can get them up before Wed, I can take a quick look (Wed is travel day for the next event).
Default folder location: C:\Users\Public\Documents\FRC\Log Files |
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
1 Attachment(s)
Attached are all of the log files from last Saturday.
I believe the match in question is 2015-03_21 16_24_57. Thanks for all the help, guys! P.S. It appears that no additional software made it onto this Classmate. |
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
1 Attachment(s)
Picture of the electronics board. The radio is mounted on 1/4" Baltic birch plywood, and the removable cover is also 1/4" Baltic birch. They are separated on the sides by 1" x 2" aluminum tube.
|
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
The logs have info for the following matches, and I put some comments beside those worth commenting on.
All of the matches have an error near the beginning of auto having to do with WAIT_FOR_PID. I don't know what impact this would have on auto. Many of them also have joystick errors that seem to occur in Begin.
Qual 3, 13, 20, 26, 29, 37, 52, 55, 66 -- clean
Qual 47 -- two second glitch during auto after your robot stopped moving, another that was one second long.
Qual 71 -- lots of stuff happening before the match started, but clean for the match.
Qual 79 -- pretty clean, but some packet loss a third through tele, no disable.
Elim 4, 8, 12, 15, 16
Elim 10 -- Comms are very clean, but robot drops at 2:24:49.250. It came back for just a few tele packets about 11 seconds later, dropped again a few times over the next 5.5 seconds, and then continued with very clean comms. At 2:24:59, there is a message that says no code is running, and at 2:25:50, there are joystick output errors that occur in most of your log files when your code starts up. Possibly in Begin. So it looks kinda like the code restarted, but I can't say why it happened, and I can't say for sure that this even happens in Begin. I also see the FMS times out and disappears about the same time as the robot. So I tend to think the ethernet cable came loose from the laptop and that was the main issue. It would be good to know if the FTA or students think that was what happened. As for the error regarding joystick outputs, it would be good to know when the code does this to know if this is really Begin or somewhere else.
Elim 14 -- clean for the match, but noisy after the match
I wouldn't worry about the radio or radio placement. I'd check the laptop ethernet retention and speak with the drive team to see whether they think the cable could have been the issue. I'd look at the code a bit for the joystick output error, and perhaps even pull the cable to see if the error shows up in the log when the DS and robot reconnect. Hope this helps. Greg McKaskle |
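The reasoning in the post above (FMS and robot disappearing at about the same time points at the DS-side cable) can be sketched as a simple interval comparison. This is a heuristic illustration only: the function names, the `"ds_side"` / `"robot_side"` labels, the 0.5 s slack, and the idea of hand-extracting drop intervals from the DS log are all assumptions, not features of the DS log viewer.

```python
def overlaps(a, b, slack=0.5):
    """True if intervals a=(start, end) and b=(start, end) overlap within slack seconds."""
    return a[0] <= b[1] + slack and b[0] <= a[1] + slack


def classify_robot_drops(robot_drops, fms_drops, slack=0.5):
    """Label each robot comms drop by whether FMS comms dropped at about the same time.

    "ds_side": FMS and robot dropped together, so suspect the DS laptop link
    (for example its ethernet cable). "robot_side": only the robot dropped, so
    look at the robot, radio, or code. The labels are a rule of thumb only.
    """
    results = []
    for drop in robot_drops:
        shared = any(overlaps(drop, fms, slack) for fms in fms_drops)
        results.append((drop, "ds_side" if shared else "robot_side"))
    return results


if __name__ == "__main__":
    # Made-up intervals in seconds since match start, for illustration only.
    robot = [(10.0, 21.0), (40.0, 41.0)]
    fms = [(10.2, 20.5)]
    for interval, label in classify_robot_drops(robot, fms):
        print(interval, label)
```

A "ds_side" label is a prompt to inspect the laptop port and cable retention, not proof of the cause.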
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
Greg hit all the important things in there, and I agree the next step is getting some input.
Next time you get some hands-on time with your robot, go check the PDP fuses. Nothing I've seen indicates it's an issue, but if not pushed in all the way they slowly wiggle out over time and can cause some strange issues (not always full reboots, either). Here's a post showing what a completely pushed in fuse looks like. |
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
1 Attachment(s)
WAIT_FOR_PID is a command we use in our beescript autonomous. When auto starts, we do an elevator reset, which homes the elevator and resets the encoder, then puts the elevator in PID control. This command just waits until the elevator is under PID control before sending it a target height. Evidently there is a typo in the script or in the command name, so the two don't match up. Not an issue for this event, as we were just sitting and not moving in auto, but something on the list to fix.
The joystick error has been happening all season, but has not been an issue apart from throwing the error. The code runs in periodic tasks and sets the joystick rumble for the driver and operator. The operator's joystick will rumble if he hits an overtravel on the elevator, and the driver's joystick will rumble when there is a tote in position to stack. Greg, if you want to send me your GitHub username, I will add you to the repository so you can check out the joystick error. I will try to get additional information from the drivers at tonight's meeting. We also checked all of the fuses after this match, and nothing was found to be loose. |
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
Quote:
If you can get in touch with your alliance partners and get their logs, you may find a correlation in communication events. There seemed to be a number of gremlins, including the "you must close and reopen the driver station between each match or FMS can't see you" one. |
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
Tom,
Many gremlins and anomalies with that field that weekend. On the webcast they said it was a complete "power loss" on your side of the field, and that was why the replay was granted. One of the things that concerns me is what the determining factors were between the decision to replay your match and not replay ours, as that decision could have been a deciding factor in who moved on. Another glitch in one of our other matches would have knocked us out for sure. As it happened, our elevator self-destructed due to being disabled and re-enabled, and that almost knocked us out with a single event. From our logs, I understand what our robot did, and there are code fixes we can make so that we don't react the way we did when being disabled and re-enabled. It was kind of a perfect-storm scenario: the elevator was being jogged manually when we lost comms, and when comms were re-established, the PID was reset and sent the elevator to its lowest position. Those events made us try to rip the elevator apart when we jammed the container in our arm. This failure mode was not something I have seen in the past, but I now know to test for this scenario. |
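One possible shape for the kind of code fix mentioned in the post above is to seed the position target with the measured height whenever the robot goes from disabled to enabled, so the controller holds in place instead of driving to a stale or reset target. The sketch below is only an illustration of that idea in Python; the class name `ElevatorSetpoint`, its method, and its arguments are hypothetical and are not taken from the team's code, and the post does not say which fix the team actually chose.

```python
class ElevatorSetpoint:
    """Tracks the elevator target across disable/enable transitions (illustrative only)."""

    def __init__(self):
        self.target = 0.0
        self.was_enabled = False

    def update(self, enabled, measured_height, requested_target=None):
        """Return the target to feed the position controller this cycle, or None if disabled."""
        if not enabled:
            self.was_enabled = False
            return None
        if not self.was_enabled:
            # Rising edge of enable: hold wherever the carriage actually is,
            # rather than jumping to a default or previously stored target.
            self.target = measured_height
            self.was_enabled = True
        elif requested_target is not None:
            # Normal operation: follow the operator's or autonomous request.
            self.target = requested_target
        return self.target


if __name__ == "__main__":
    elevator = ElevatorSetpoint()
    print(elevator.update(True, 12.0))                         # first enable: hold position
    print(elevator.update(True, 12.1, requested_target=30.0))  # follow the request
    print(elevator.update(False, 25.0))                        # disabled: no target
    print(elevator.update(True, 25.0))                         # re-enable: hold position
```

Testing this scenario on the real robot would mean disabling and re-enabling while the elevator is mid-travel and loaded, which is the case described in the post.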
Re: How to determine root cause of robot dropping from Teleop to Disabled during matc
Quote:
Even if not fully seated, they usually won't feel loose. But not feeling loose doesn't mean they are making proper contact. If you can pull them out by hand, they weren't in all the way. It takes some serious pushing to get them in correctly, and once that's done, they definitely won't come out on their own. |
|