Re: Errors at Colorado Regional
Can anyone think of a reason that ISN'T the FMS that would cause a robot to always work tethered in the pit, but never on the field? Switching routers and flashing both several times didn't help.
Re: Errors at Colorado Regional
I can think of at least two wiring-related things and two mechanical ones that could keep wireless communication from working properly, and two others involving software that wouldn't necessarily show up unless you enabled practice mode.
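To make that last category concrete with one purely hypothetical example (this is an illustration, not a claim about 1977's code; Robot.drivetrain and its methods are made-up names): in the Java command-based framework, an autonomous command whose isFinished() never returns true keeps ownership of the drivetrain when the field switches to teleop unless teleopInit() cancels it, so the default drive command never starts -- and that only shows up if you run a full practice-mode sequence rather than testing teleop by itself.

import edu.wpi.first.wpilibj.command.Command;

// Hypothetical example; Robot.drivetrain and its methods are made-up names.
public class DriveForwardForever extends Command {
    public DriveForwardForever() {
        requires(Robot.drivetrain);
    }

    protected void initialize() {
    }

    protected void execute() {
        Robot.drivetrain.drive(0.5, 0.0); // keeps driving forward
    }

    protected boolean isFinished() {
        return false; // the bug: the command never ends on its own
    }

    protected void end() {
        Robot.drivetrain.stop();
    }

    protected void interrupted() {
        end();
    }
}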
Re: Errors at Colorado Regional
And by the way, it was great playing against you guys at Utah and Colorado. You guys are a fantastic team with a fantastic robot. You do deserve to be going to championships. Maybe next year.
Re: Errors at Colorado Regional
5 Attachment(s)
Whew, I have a lot to address here. Apologies in advance for a very lengthy post.
I'm the head programmer for 1977, and consequently probably the person most closely involved with the issue that walruspunch described. I'll give a brief overview of the symptoms we experienced and some of the attempts we made to fix them. At this point I'm a little bit at a loss as to what to check next, so any ideas are welcome.

Before I start, a few specifications are notable. Our code was written in Java (via NetBeans, using the Command-Based Robot framework) and runs on our 4-slot cRIO from 2012. The robot featured no pneumatics this year and drove on four mecanum wheels directly driven by CIMs. Image processing was performed on the SmartDashboard using an Axis 206 mounted on the robot, but this was disabled as soon as the communication issues arose.

The issue: Starting with our first qualification round, every match we entered was made nigh unplayable by loss of control and crippling latency. These issues were not reproducible in the pit, even when running practice mode on the driver station. None of them had occurred over wireless communication during build season, and none had been present during practice matches (a point that every team I talked to with communication issues similar to ours seemed to agree on). As walruspunch stated, disabling our autonomous commands allowed us full, if laggy, control of the robot for our last few matches. However, while that seems like the obvious conclusion, I have several reasons to doubt that autonomous is the (only) culprit. For one thing, autonomous did not create any issues when practice mode was run in the pit. The final autonomous code was a CommandGroup composed entirely of commands also used during teleop (plus WPILib's built-in WaitCommand), which makes me doubt that any loops were stalling. Furthermore, I heard from our coach that another team (3807) faced similar issues that they also solved by disabling their autonomous (if anybody from 3807 is reading this, would you mind hooking me up with your programmers so that we can compare notes?).
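To illustrate the structure (a minimal sketch only; DriveStraight and FireCatapult are placeholder names, not our actual commands), the autonomous was essentially a CommandGroup chaining the same self-terminating commands we use in teleop, plus WPILib's WaitCommand:

import edu.wpi.first.wpilibj.command.CommandGroup;
import edu.wpi.first.wpilibj.command.WaitCommand;

// Rough shape of the autonomous routine; command names are made up.
public class AutonomousRoutine extends CommandGroup {
    public AutonomousRoutine() {
        // Each command here is the same class used during teleop and ends
        // itself via its own isFinished() condition.
        addSequential(new DriveStraight(2.0)); // hypothetical teleop command
        addSequential(new WaitCommand(1.0));   // WPILib's built-in delay
        addSequential(new FireCatapult());     // hypothetical teleop command
    }
}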
Things our team tried:

- Code alterations: On Thursday, during practice rounds, our robot did have some legitimate lag caused by a recent increase in image-processing frames. That was solved by disabling the processing during teleop, and we did not experience any other communication problems that day. The next day, after crippling lag in our first qualifier, I removed our image-processing widget from the SmartDashboard, and I later completely unplugged the camera. Afterwards, all calls to the SmartDashboard and NetworkTables were removed (aside from their initialization), and later most output to the console as well. All periodically run blocks (teleopPeriodic, our Userdrive command) were thoroughly examined and pared down to the absolute minimum necessary for functionality (see the sketch after this list). Finally, all autonomous commands (which should have been self-terminating and were functional in off-field practice) were removed.

- Electrical alterations: Our pit electricians, mechanics, and I, along with CSA helpers, looked over everything on our electronics board in detail. Connections were checked, anything suspicious was repaired, and several major changes were made. Notable attempts included multiple re-flashings of two different routers (DAP-1522 Rev B1, one from this year's KOP and one from last year's), both of which were tried during matches; the aforementioned removal of power from the camera; rerouting wires near the router to avoid possible interference; fastening and taping the router's power and Ethernet-to-cRIO connections to prevent them from being shaken loose; disassembling and cleaning out our four-slot cRIO from 2012; replacing that cRIO entirely with a four-slot loaner from the spare-parts counter; reimaging the loaner cRIO; replacing the 6-pin RJ12 CAN cable to our black Jaguar; and replacing that Jaguar with a brand-new black Jaguar, which was reformatted as well (the last two steps were taken to solve an issue with our catapult-tensioning CIM not running, which I do not believe was directly related to the connection issues).

- cRIO resets: These were carried out mid-match many times, and before each of our matches once we realized their effectiveness. By resetting the cRIO during teleop we were able to regain temporary control of the robot after it had been incapacitated; however, lag was still ever-present during those stretches.

- Bandwidth testing: After the issue began, I was called in to be coach on our drive team so that I could examine the issue more closely on the operator console. During several of the matches I ran a bandwidth monitor in the background; according to it, we never used more than three megabits of the available seven.

- Operator console variation: We ran our driver station on two different laptops across the matches in which the problem occurred: our team's HP ProBook 6550b running Windows 7, and my own Asus G75VW running Windows 8. Both had their firewalls disabled during matches, and neither fared any better or worse than the other.

- Discussion: Over the two days during which the issue occurred, the rest of my team and I spoke extensively with each other, our alumni, our mentors and parents, the regional CSA, the field management team, and pretty much every other team present at the competition. Many of our changes and attempted fixes were made at the recommendation of these sources. Members of the programming, electrical, and mechanical teams all worked closely together in an attempt to diagnose the problem. I spoke to all of the team alumni present at the regional, including some of our best programmers and electricians, who were just as clueless as I was. Our mentors offered a diverse range of potential solutions but were unable to pin down the cause. The CSA examined both our robot and our code with us multiple times and gave electrical suggestions (cleaning the cRIO, replacing the cRIO, various wiring refinements), but these did not solve the issue. Finally, our scouts talked to pretty much all of the other teams present, some of whom were suffering from the same problems at least intermittently. These teams helped us look over our code and robot and gave us loads of useful advice and tips, whether they had the same problems or not, and for that I am infinitely grateful. Yet the problem remains unsolved.
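For reference, here is a rough sketch of what "pared down to the minimum" means, based on the stock 2014 IterativeRobot/command-based template rather than our exact file: the periodic methods do nothing but run the scheduler, with all dashboard, camera, and console traffic stripped out.

import edu.wpi.first.wpilibj.IterativeRobot;
import edu.wpi.first.wpilibj.command.Scheduler;

public class RobotTemplate extends IterativeRobot {
    public void teleopInit() {
        // The standard template cancels the autonomous command here so it
        // cannot carry over into teleop (ours should have ended on its own).
        // if (autonomousCommand != null) autonomousCommand.cancel();
    }

    public void teleopPeriodic() {
        // Only the scheduler runs each 20 ms loop: no SmartDashboard puts,
        // no NetworkTables traffic, no image processing, no console prints.
        Scheduler.getInstance().run();
    }
}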
To directly address some of the questions posed here:

- The digital and analog modules were never replaced, as we did not bring spares with us to the competition. However, as far as I am aware, any issue with them would have led to a lack of function from the digital sidecar outputs or a lack of battery-voltage readout from the breakout board (both of which we had) and would have been reproducible in the pit.

- The digital sidecar was functioning fine as far as we could tell; we had to move a limit-switch input due to what seemed to be a bad port, but all other digital sensors, relays, and Victors worked correctly.

- The router power cable was fastened with electrical tape as soon as the issues began and did not disconnect. Also, the cRIO never reset during matches except when we did so manually (though after returning to our shop the cRIO's power cable did seem loose; could that cause functionality issues without a full shut-off? I rather doubt it, and I don't think it was loose at competition, but it's worth asking). As for the positioning of the router, it was mounted high up on our robot's catapult structure, away from the chassis and electronics board. I rather doubt it was a Faraday-cage issue, as I saw other functional robots with their routers positioned next to their batteries and other electronics in the center of their frames.

- Mark McLeod, your post does make a valid point. While I was in the pit the entire time and thus do not know the exact number, I am aware that the other teams we cited as having similar issues were ones who had approached us directly or whom we had approached. Friday morning I was part of a conversation between CSA representatives and representatives from some five or six other teams regarding the problem.

- On the driver station logs: my examination of them showed little more than spikes in dropped packets and latency. Disconnects occurred during our earlier matches but quickly trailed off. The only error events recorded frequently were controller disconnects at the driver station, which we were aware of. However, I am hardly experienced in the matter. I have attached the images from two matches in which we had issues in case someone can shed some light (match 7 was our first on the morning of the second day, match 77 was our first on the morning of the third day; these were all I could include due to attachment limits, though I may upload more later).

Also, if it is of any use, here is a link to the latest revision of our robot code on SourceForge: http://sourceforge.net/p/lhrobotics/code/HEAD/tree/trunk/y2014/. I apologize in advance for its somewhat messy state; debugging in the pit was rather frantic.
Re: Errors at Colorado Regional
A couple observations:
1. Your network connectivity in match 7 is pretty rough compared to what a typical team on the field would see, at least at the three regionals I was at this year.
2. There's a cRIO reboot in match 77 -- was that commanded by you, or did it happen spontaneously?
3. Match 7 shows no voltage on the chart.
4. Match 7 has higher (not bad, just higher) CPU utilization, during both auto and teleop. Probably the result of removing your camera code.
5. You had something going on during match 77 in teleop that was using lots of power -- was the robot driveable during the time the cRIO was up?
Re: Errors at Colorado Regional
1 Attachment(s)
I'm going off of memory here, so I'm not entirely certain about all the details. All of the cRIO reboots were manual. I did notice the voltage thing as well; I'm certain we had a voltage readout on the DS when we were inspected (else we wouldn't have passed), so I'm not sure what happened there, but it did return during later matches while the issue was still unsolved. Regarding the voltage drain, the robot would have been driveable after the reboot, and I can see the CIM drive causing notable drain. However, I can't remember off the top of my head any motors running before the reboot (perhaps autonomous got far enough to start the lifter intake, which would mean it also would have driven), so that is a little perplexing. Regardless, would that kind of drain be significant enough to mar cRIO or networking functionality?

For reference, here is the first match during which we had mostly full control (autonomous disabled). The voltage usage is pretty similar.

Edit to address Alan: Apologies for not providing the log files; I went with the file extensions that were acceptable as attachments. I did have a few dashboard variables running early on, but as with most network processing I disabled them when the issues cropped up. It is worth noting, though, that while the problem persisted, connection status did improve over time (as Steve pointed out), so that may have been related.
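In case it is useful to anyone debugging something similar, this is roughly the pattern I mean by disabling the dashboard variables (a sketch only, with made-up value names and an assumed update rate, not our exact code): guard the SmartDashboard puts behind a flag and throttle them so they do not run every 20 ms loop.

import edu.wpi.first.wpilibj.smartdashboard.SmartDashboard;

public class DashboardLogger {
    // Flipped to false at competition once the communication issues started.
    private static final boolean ENABLE_DASHBOARD = false;
    private static int loopCount = 0;

    public static void update(double catapultAngle, double driveCurrent) {
        if (!ENABLE_DASHBOARD) {
            return; // dashboard traffic fully disabled
        }
        if (++loopCount % 10 != 0) {
            return; // send roughly every 200 ms instead of every loop
        }
        SmartDashboard.putNumber("Catapult Angle", catapultAngle);
        SmartDashboard.putNumber("Drive Current", driveCurrent);
    }
}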
Re: Errors at Colorado Regional
Did you ever swap the 12V-to-5V converter? It's a potential source of problems for wireless operation.
Re: Errors at Colorado Regional
Thanks for all the feedback. I'm going to get off to do some homework now, but I'll see what more I can think of or look into.