Errors at Colorado Regional
Team 1977 was wondering about the errors at the 2014 Colorado Regional. By our count, somewhere between 10 and 20 teams had issues during matches. Some of that is likely coincidence, but there was far too much interference to write it all off as team-side problems.

Our robot worked perfectly in the practice matches and in the pit while tethered, but for the first day of qualifiers we could do nothing: no autonomous, no teleop, nothing. As soon as we put it on the field we went dead, with no robots hitting us and no apparent interference. When we brought it back to the pit, it worked perfectly every time. We had several officials look it over and find no explanation, and we had several people check over the code with the same result. We finally made it work, albeit with lag, by disabling the autonomous and swapping out the cRIO. However, other teams' communication issues cleared up on their own, so it's possible that making these changes was a coincidence.

Our robot was the worst off, losing a full day of competition, but another team could only run for half of every match before losing connection. Several times a team would be dead at the start of a match and regain connection about 10 seconds into teleop. Rebooting the cRIO helped for a time, but it eats a lot of time within each match and only seems to be a temporary fix. This seems like too big a deal, involving too many teams, to be written off as an error on every team's part. (It may have happened to a team in the finals as well: they weren't moving for the last 10 seconds, but that could have been driver error.)
It's possible that all of this is coincidence; it just doesn't seem likely. If anyone has had a similar problem where a robot works tethered in the pit but not on the field, please reply. |
Re: Errors at Colorado Regional
When you were testing in the pit, did you put the Driver Station into 'Practice' mode?
- Nick |
Re: Errors at Colorado Regional
Team 1305 also had a similar cRIO issue at Waterloo. It worked perfectly fine the first day, but on the final day it suddenly stopped working when connected to FMS. It worked perfectly in our pit in all of the modes; none of the volunteers or other teams could replicate the problem off the field. We solved it by switching out the cRIO as well, and were able to compete in our next two regionals without issue.
To any teams who have this issue and need a quick fix: reboot the cRIO from the driver station before the match starts, while connected to FMS. From our driver station logs we determined that the cRIO never entered disabled mode, somehow dying in the middle of starting up, no matter what the code was. Rebooting the cRIO somehow let the robot enter disabled mode and therefore function normally. We have no clue what triggered the failure, though; we only left it overnight.
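If it helps anyone check for the same symptom, here is a minimal sketch along these lines (purely illustrative Java using the IterativeRobot framework -- not our actual code, and you'd adapt it to whatever framework you use). It just prints a line when each mode is entered, so the driver station console shows whether the cRIO ever actually reached disabled mode:

    import edu.wpi.first.wpilibj.IterativeRobot;

    public class ModeLogger extends IterativeRobot {
        // Print a line each time the field/DS puts the robot into a new mode,
        // so the console shows how far startup actually got.
        public void robotInit()      { System.out.println("robotInit finished"); }
        public void disabledInit()   { System.out.println("entered disabled"); }
        public void autonomousInit() { System.out.println("entered autonomous"); }
        public void teleopInit()     { System.out.println("entered teleop"); }

        public void disabledPeriodic()   { }
        public void autonomousPeriodic() { }
        public void teleopPeriodic()     { }
    }

It isn't a fix, just a quick way to see how far startup actually gets. |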
Re: Errors at Colorado Regional
Team 4293 experienced problems at the Colorado Regional. Our cRio would reboot during matches with no clear explanation.
Our problem was traced down to debris in the cRio and a flaky power connector. We owe a huge thanks to the FTAs and the CSAs for all of their help trying to troubleshoot everything! |
Re: Errors at Colorado Regional
I noticed this happened a lot at the Dallas Regional, as well as at Hub City. We had code that would run perfectly while tethered, and then when we put the robot on the field, we would run into all sorts of issues. I don't know what happened or why, but I think that FIRST should be a little more transparent with the FMS. Yes, it must be secure, but I know for a fact that there was one driver station position where more robots went down than at the others. I can pull scouting data and show that the robots that occupied this station had problems, and even some scouts came up to me and asked what was going on because they had noticed it too. It was tough to play qualification matches on that alliance color, because often we were running two robots instead of three.
I don't want to blame FIRST, but things like this seem a little suspicious to me, and often all I got from asking an FTA was "It must be a problem with your robot." Granted, I did sit down and find my errors at our first regional, but this still happened several times, even after scrutinizing my code between regionals. |
Re: Errors at Colorado Regional
Our robot was working tethered in the pit but not on the field. Towards the end of the match we would lose connection to our robot. Originally we thought it was the field. However, after many hours of looking it over, we figured out that our PD board was outputting voltage incorrectly and causing all sorts of issues. So the moral of the story is: in all likelihood the teams appearing to have difficulty with the field just had electrical or programming errors that were giving them trouble. The FMS itself was probably not the problem. Because you were able to fix it by disabling your autonomous, I'm guessing that somewhere in there you were calling something that was throwing a bunch of errors and stopping you from connecting to the field.
Thanks to the FTAs and CSAs for helping us figure this out! |
Re: Errors at Colorado Regional
If you can provide the following information, perhaps we can provide a differential diagnosis (without the House sarcasm/snarking):
- Programming language
- Which cRIO
- Pneumatics? If so, how much?
- Vision processing? If so, onboard or offboard via a Raspberry Pi or the like?
- Drive train/wheels
- Electronics correctly wired, including the dedicated wiring?
- What did the CSA/FTA/FTAA advise?
- When you swapped the cRIO, were the modules swapped too?
- Digital Sidecar tested good?
- No tripped breakers?

I had a team with similar issues. They were using Java with offboard Raspberry Pi vision processing, mecanum drive, and LARGE 24 solenoids wired off of the cRIO. In the end we corrected the wiring, disabled the Pi, went to basic code, and swapped all the modules. By then she was purring like a kitten, and they rebuilt the code. We didn't open the cRIO or swap it, but I think if I had opened it up there would have been some debris in it.

I know it is a lot to ask post mortem, but it helps to put issues to bed before St. Louis and the "Off Season". |
Re: Errors at Colorado Regional
In general, scouters and anyone sitting in the stands for that matter are really bad sources of random data on robot/field troubles.
I used to be one of them, and I remember how wrong it all looks from the distance of the stands. All they ever take notice of is whether a robot is moving (yes/no). Folks in the stands can't know the difference between:
- a drive team choosing not to move (strategic, threw a chain, because the robot will break),
- a ref disabling a robot for transgressions (dragging bumpers, outside starting configuration),
- a Driver Station issue (poor default power management settings, low battery, bad USB/Ethernet ports, unplugged USB/Ethernet), or
- a robot issue (wiring loosening up after being hit all day, code errors, DSC shorts or CAN wiring glitches that prevent driving but leave all other functions running, low-battery dips rebooting everything).

Because the robot side of FMS is so simple, network problems are easy for the field crew to spot and deal with. If you want to be scientific and gather actual data rather than anecdotal evidence, then scouters have to:
|
Re: Errors at Colorado Regional
I agree most of the issues robots had were probably due to the teams rather than FMS error. I would also like to mention, though, that even though most of the errors happened to robots, there were plenty of errors on the part of the referees at Colorado. From what I could tell, and from the general consensus I have found, the referees waited until the end of the match to score or log fouls. This caused some major issues where refs signaled fouls during the match and then did not log them at the end of it.

The most obvious case happened in SF 2-1 (nothing against these teams, just an observation): 2996 had team 4153 pinned against the one-point goal the entire match. 2996 was halfway into 4153's robot and could not back out. The ref on the field closest to the goal counted off the foul, waved her flag, and signaled for the foul. Then the head ref came over and counted off an additional foul for a continuous pin. Neither foul was counted at the end of the match, and that cost 1987, 1619, and 4153 the match. Again, no offense to 2996, 1410, and 662, but that was the clear-cut definition of a pin. The refs called it on the field during the match but failed to score it. To me that is a huge error, and I think fouls should have to be logged the instant they are called.
|
Re: Errors at Colorado Regional
Quote:
Alternatively, the ref who flagged the foul might radio another ref at the less-busy end of the field to enter it into the system, but at a noisy event, radio communications can be difficult. |
Re: Errors at Colorado Regional
Quote:
http://youtu.be/HQdVzEqjz7E?t=47s |
Re: Errors at Colorado Regional
Thanks to both RallyJeff and animenerdjohn for the clarification; both posts were very helpful. Particularly animenerdjohn's, because even though I watched the match live and several times on video, I hadn't considered that 1619 forced them into the foul. I will check on that again, and if that is the reason, then I at least understand the referee's decision better. Not that I agree, but I will understand. I still believe that the system of refereeing could have been better. At several other regionals the fouls were logged on the spot and then also explained after the match; even if the foul was rescinded after the match, they would still give the reason it was called. Just an observation. Thanks for the feedback.
|
Re: Errors at Colorado Regional
I just watched the match in question again, SF1-1, and I see that 1619 did indeed push 2996. However, I still believe that a foul could have been called for a pin, because 2996 put themselves in a position to be pushed into 4153. Again, thanks to animenerdjohn for your help with this. I do find it interesting that no announcements were made about fouls; that would have been helpful for those of us watching the eliminations.
|
Re: Errors at Colorado Regional
Can anyone think of a reason that ISN'T the FMS that would cause a robot to always work tethered in the pit, but never on the field? Switching routers and flashing both several times didn't help.
|
Re: Errors at Colorado Regional
Quote:
I can think of at least two wiring-related things and two mechanical ones that could keep wireless communication from working properly, and two others involving software that wouldn't necessarily show up unless you enabled practice mode. |
Re: Errors at Colorado Regional
Quote:
And by the way, it was great playing against you guys at Utah and Colorado. You guys are a fantastic team with a fantastic robot. You do deserve to be going to championships. Maybe next year. |
Re: Errors at Colorado Regional
5 Attachment(s)
Whew, I have a lot to address here. Apologies in advance for a very lengthy post.
I'm the head programmer for 1977, and consequently probably the person most closely involved with the issue that walruspunch described. I will give a brief overview of the symptoms we experienced and some of the attempts we made to fix them. At this point I am at a bit of a loss as to what to check next, so any ideas are welcome.

Before I start, a few specifications are notable. Our code was written in Java via NetBeans from the Command-Based Robot framework and run off of our 4-slot cRIO from 2012. The robot had no pneumatics system this year and drove on four mecanums directly driven by CIMs. Image processing was performed on the SmartDashboard using an Axis 206 mounted on the robot, but this was disabled as soon as the communications issues arose.

The issue: Starting with our first qualification round, every match we entered was made nigh unplayable by loss of control and crippling latency. These issues were not reproducible in the pit, even using practice mode on the driver station. None of them had occurred over wireless communication during build season, and none had been present during practice matches (a point on which every team I talked to with communications issues like ours seemed to agree). As walruspunch stated, disabling the autonomous commands allowed us full, if laggy, control of our robot for our last few matches. However, while that seems to be the obvious conclusion, I have several reasons to doubt that the autonomous is the (only) culprit. For one thing, autonomous did not create any issues when practice mode was run in the pit. The final autonomous code was a CommandGroup composed entirely of commands also used during teleop (the only exception being WPILib's built-in WaitCommand), which makes me doubt any loops were stalling. Furthermore, I heard from our coach that another team (3807) had faced similar issues that they also solved by disabling their autonomous (if anybody from 3807 is reading this, would you mind hooking me up with your programmers so that we can compare notes?).
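To illustrate what I mean, here is a rough sketch of the shape of that CommandGroup. The command below is a made-up stand-in, not our actual code; the real thing is in the Sourceforge link at the end of this post.

    import edu.wpi.first.wpilibj.command.Command;
    import edu.wpi.first.wpilibj.command.CommandGroup;
    import edu.wpi.first.wpilibj.command.WaitCommand;

    // Made-up stand-in for one of our teleop commands; the real ones run the
    // mecanum drive and the catapult, but the structure is the same.
    class ExampleTeleopCommand extends Command {
        public ExampleTeleopCommand() {
            setTimeout(3.0);                       // hard cap so it can never run forever
        }
        protected void initialize() { }
        protected void execute()     { /* set motor outputs here */ }
        protected boolean isFinished() { return isTimedOut(); }
        protected void end()         { /* stop motors */ }
        protected void interrupted() { end(); }
    }

    // Autonomous is just those same commands run in sequence, with WPILib's
    // built-in WaitCommand between them.
    public class ExampleAutonomous extends CommandGroup {
        public ExampleAutonomous() {
            addSequential(new ExampleTeleopCommand());
            addSequential(new WaitCommand(1.0));
            addSequential(new ExampleTeleopCommand());
        }
    }

Since every command in the group either finishes on its own or hits its timeout, I don't see where a loop could hang, which is part of why I'm skeptical that autonomous alone explains the field behavior.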
Things our team tried:

- Code alterations: On Thursday, during practice rounds, our robot did have some legitimate lag due to a recent increase in image processing frames. That was solved by disabling the processing during teleop, and we did not experience any other communication problems that day. The next day, after crippling lag in our first qualifier, I removed our image processing widget from the SmartDashboard, and later completely unplugged the camera. Afterwards, all calls to the SmartDashboard and network tables were removed (aside from their initialization), and later most outputs to the console. All periodically run blocks (teleopPeriodic, our UserDrive command) were thoroughly examined and pared down to the absolute minimum necessary for functionality. Finally, all autonomous commands (which should have been self-terminating and were functional in off-field practice) were removed.

- Electrical alterations: Our pit electricians, mechanics, and I, along with CSA helpers, looked over everything on our electronics board in detail. Connections were checked, anything suspicious was repaired, and several major changes were made. Notable attempts included multiple re-flashings of two different routers (DAP-1522 Rev B1, one from this year's KOP and one from last year's), both of which were tried during matches; the aforementioned removal of power from the camera; diverting wires near the router to avoid possible interference; fastening and taping the power and Ethernet-to-cRIO connections on the router to prevent them from being shaken loose; disassembling and cleaning out our four-slot cRIO from 2012; replacing our cRIO entirely with a loaned four-slot from the spare parts counter; reimaging said loaner cRIO; replacing the CAN 6-pin RJ12 wire to our black Jaguar; and replacing said Jaguar with a brand new black Jaguar, which was reformatted as well (note that the last two steps were taken to solve an issue with our catapult-tensioning CIM not running, which I do not believe was directly related to the connection issues).

- cRIO resets: These were carried out mid-match many times and before each of our matches once we realized their effectiveness. By resetting the cRIO during teleop, we were able to regain temporary control of our robot if it had been incapacitated; however, lag was still ever-present during these bouts.

- Bandwidth testing: After the issue began, I was called in to be coach on our drive team so that I could examine the issue more closely on the operator console. During several of the matches I ran a bandwidth monitor in the background; according to it, we never used more than three megabits of the available seven.

- Operator console variation: We ran our driver station on two different laptops across the matches during which the problem occurred: the first our team's, an HP ProBook 6550b running Windows 7, and the second my own, an Asus G75VW running Windows 8. Both had their firewalls disabled during matches, and neither fared any better or worse than the other.

- Discussion: Over the course of the two days during which the issue occurred, the rest of my team and I spoke extensively with each other, our alumni, our mentors and parents, the regional CSA, the field management team, and pretty much every other team present at the competition. Many of our changes and attempted fixes were made at the recommendation of one or another of these sources. Members of the programming, electrical, and mechanical teams all worked very closely together in an attempt to diagnose the problem. I spoke to all of the team alumni present at the regional, including some of our best programmers and electricians, who were just as clueless as I was. Our mentors provided a diverse range of potential solutions but were unable to pin down the reason for the problem. The CSA examined both our robot and our code with us multiple times and gave electrical suggestions (cleaning the cRIO, replacing the cRIO, various wiring refinements), but these did not solve the issue. Finally, our scouts talked to pretty much all of the other teams present, some of whom were suffering from the same problems at least intermittently. These teams helped us look over our code and robot and gave us loads of useful advice and tips, whether they had the same problems or not, and for this I am infinitely grateful. Yet the problem remains unsolved.

To directly address some of the questions posed here:
- The digital and analog modules were never replaced, as we did not bring extras with us to competition. However, as far as I am aware, any issue with them would have led to a lack of function from the digital sidecar outputs or a lack of battery voltage readout from the breakout board (both of which we had), and would have been reproducible within the pit.

- The digital sidecar was functioning fine as far as we could tell; we had to move a limit switch input due to what seemed to be a bad port, but all other digital sensors, relays, and Victors worked correctly.

- The router power cable was fastened with electrical tape as soon as issues began and did not disconnect. Also, the cRIO never reset during matches except when we did so manually (though after returning to our shop the power cable for the cRIO did seem loose; could that cause functionality issues without a full shut-off? I rather doubt it, and I don't think it was loose at competition, but it's worth asking). As for the positioning of the router, it was mounted high up on our robot's catapult structure, away from the chassis and electronics board. I rather doubt it was a Faraday-cage-related issue, as I saw other functional robots with their routers positioned next to their batteries and other electronics in the center of their frames.

- Mark McLeod, your post does make a valid point. However, while I was in the pit the entire time and thus do not know the exact number, I do know that the other teams we cited as having similar issues were ones who had approached us directly or whom we had approached. Friday morning I was part of a conversation between CSA representatives and representatives from some five or six other teams regarding the problem.

- On the issue of driver station logs, my examination of them showed me little more than spikes in dropped packets and latency. Disconnects occurred during our earlier matches but quickly trailed off. The only error events recorded frequently were controller disconnects at the driver station, which we were aware of. However, I am hardly experienced in the matter. I have attached the images from two matches in which we had issues, in case someone can shed some light (match 7 was our first on the morning of the second day, match 77 was our first on the morning of the third day; these were all I could include due to attachment limits, though I may upload more later). Also, if it is of any use, here is a link to the latest revision of our robot code on Sourceforge: <http://sourceforge.net/p/lhrobotics/code/HEAD/tree/trunk/y2014/>. I apologize in advance for its somewhat messy state; debugging in the pit was rather frantic. |
Re: Errors at Colorado Regional
A couple observations:
1. Your network connectivity in match 7 is pretty rough compared to what a typical team on the field would see, at least at the three regionals I was at this year.
2. There's a cRIO reboot in match 77 -- was that commanded by you or did it happen spontaneously?
3. Match 7 shows no voltage on the chart.
4. Match 7 has higher (not bad, just higher) CPU utilization, during both auto and teleop. Probably the result of removing your camera code.
5. You had something going on during match 77 in teleop that was using lots of power - was the robot driveable during the time the cRIO was up? |
Re: Errors at Colorado Regional
1 Attachment(s)
I'm going off of memory here, so I'm not entirely certain on all aspects. All of the cRIO reboots were manual. I did notice the voltage thing as well; I'm certain we had voltage readout on the DS when we were inspected (else we wouldn't have passed), so I'm not sure what happened there, but it did return during later matches while the issue was still unsolved. Regarding the voltage drain, the robot would have been driveable after the reboot, and I can see the CIM drive causing notable drain. However, I can't remember off the top of my head any motors running before the reboot (perhaps autonomous got far enough to start the lifter intake, which would mean it also would have driven), so that is a little perplexing. Regardless, would that kind of drain be significant enough to mar cRIO or networking functionality?
For reference, here is the first match during which we had mostly full control (autonomous disabled). The voltage usage is pretty similar.

Edit to address Alan: Apologies for not providing the log files; I went with the file extensions that were acceptable as attachments. I did have a few dashboard variables running early on, but as with most network processing I disabled them when issues cropped up. It is worth noting, though, that while the problem persisted, connection status did improve over time (as Steve pointed out), so that may have been related. |
Re: Errors at Colorado Regional
Quote:
Did you ever swap the 12v-to-5v converter? It's a potential source of problems for wireless operation. |
Re: Errors at Colorado Regional
Quote:
Quote:
Thanks for all the feedback. I'm going to get off to do some homework now, but I'll see what more I can think of or look into. |
Re: Errors at Colorado Regional
Alan's point about the voltage is a good one. Regular incursions below 8v can indicate an issue. At the same time, I saw robots with much more frequent incursions below 8v work just fine on the field this year. I don't have any experience with the CAN brownouts that Alan discusses, but that might be a line of investigation -- wouldn't be too hard to swap in PWM and see whether that makes a difference.
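If you want to try that, the code side of the swap is small. Here's a rough sketch, assuming (as an example) the black Jaguar mentioned earlier is the only CAN device and you drive it over PWM instead; the channel number is a placeholder, and the physical wiring has to move from the RJ12 chain to a PWM cable on the Digital Sidecar:

    import edu.wpi.first.wpilibj.Jaguar;

    public class PwmSwapTest {
        // Same physical black Jaguar, but driven from a Digital Sidecar PWM
        // output instead of over the CAN bus. This declaration replaces the
        // CANJaguar one (along with its CANTimeoutException handling).
        private final Jaguar tensioner = new Jaguar(2);   // PWM channel 2 is a placeholder

        public void set(double speed) {
            tensioner.set(speed);   // -1.0 to 1.0, same call as the CAN version
        }
    }

If the brownout or connection behavior changes with nothing on the CAN bus, that would be a useful data point.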
The wifi environment on the field is quite different from most teams' practice/work spaces. Differences include:
- more radios active
- other robots using bandwidth
- a different space means different propagation/reflection of signals -- different in both good and bad ways
- a tendency to operate the robot differently in the practice area than on the field
- the distance from robot to driver station when practicing is typically smaller than the distance from robot to the antenna array on the field

Given that you haven't tried a digital sidecar replacement, I'd also try that. Swap the cable too. If you have a history of swarf on your robots, consider opening each digital sidecar and cleaning them out. Be sure to do this away from other electronics, so that you're not moving the problem elsewhere. Our team uses compressed air to blow them out, followed by a vacuuming. |
Re: Errors at Colorado Regional
1 Attachment(s)
Looking a little more closely at the match 77 chart, the gaps in the top blue line are curious. I've marked them up in the attached image to indicate what I'm talking about. Not quite sure what to make of it.
Here's what the log viewer documentation has to say about that: "The top set of the display shows the mode commanded by the Driver Station. The bottom set shows the mode reported by the robot code. In this example the robot is not reporting it's mode during the disabled and autonomous modes, but is reported during Teleop." Compare those to the lines in your match 77 image. |
Re: Errors at Colorado Regional
I'm told that documentation is reversed -- the top set of lines is reported by the robot code, and the bottom is reported by the DS.
Makes things align a bit more with Kuragama's observations, I think. |
Re: Errors at Colorado Regional
One note about your Match 7 graph: it looks like the radio rebooted. A reboot typically takes about 10 seconds, and it appears to have started at around 10:09:33.
Your "clean" match graph shows the battery voltage dropping below 6V a couple of times and below 7V and 8V very regularly. Do you have a way of testing your batteries to make sure they are not underperforming? Among all of the in-event testing it sounds like you did, did anyone make sure the 6 AWG cables were all very tight and crimped well (the Panduit LCA6 lugs in the KOP require a special tool to crimp correctly)? In the Match 77 log, the battery voltage almost looks like a power connection was jarring loose and then reattaching as soon as the robot moved. Assuming all cabling is firmly tight (and the power terminals are the correct type and crimped correctly, Panduit CT-1700), can you write test code that doesn't allow all of the motors to power at the same time, to see what the voltage does then? Aspiring FTA doing some straw grasping here...
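Something like the sketch below would do it, I think -- the motor type (I'm assuming PWM Victors), channel numbers, and timings are all placeholders. It runs one motor at a time at half power and prints the battery voltage every loop, so you can watch which output drags the battery down:

    import edu.wpi.first.wpilibj.DriverStation;
    import edu.wpi.first.wpilibj.IterativeRobot;
    import edu.wpi.first.wpilibj.Timer;
    import edu.wpi.first.wpilibj.Victor;

    public class StaggeredMotorTest extends IterativeRobot {
        // Placeholder PWM channels -- substitute the real ones.
        private final Victor[] motors = {
            new Victor(1), new Victor(2), new Victor(3), new Victor(4)
        };
        private final Timer timer = new Timer();

        public void teleopInit() {
            timer.reset();
            timer.start();
        }

        public void teleopPeriodic() {
            // Run each motor alone for 2 seconds, cycling through all of them.
            int active = ((int) (timer.get() / 2.0)) % motors.length;
            for (int i = 0; i < motors.length; i++) {
                motors[i].set(i == active ? 0.5 : 0.0);
            }
            // Prints every ~20 ms loop; spammy, but fine for a quick test.
            System.out.println("motor " + active + " on, battery = "
                    + DriverStation.getInstance().getBatteryVoltage() + " V");
        }

        public void disabledInit() {
            for (int i = 0; i < motors.length; i++) {
                motors[i].set(0.0);
            }
        }
    }

If the voltage only sags while a particular output is active, that points at that branch of the wiring rather than the radio or the cRIO.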
Quote:
That is correct. |