Intermittent connection on field only

We’re competing at the LA regional this weekend, and are having nothing but trouble on the field. We lose communication at completely random times (moving, not moving, before match, during match, etc.), and for random durations (anywhere from a few seconds to almost the entire match). This only occurs when we are on the competition field with FMS, but it has happened in every single match. When we run tethered in the pit, tethered on the practice field, and wireless on the practice field (with the practice field radio), it works perfectly, no hiccups.

We’ve replaced the following components, with no effect: cRIO, radio, ethernet cable, driver station computer, and radio 12-5v power converter. It occurs with the default LabVIEW code as well as our own code. We pulled every single circuit breaker, so that the only items powered were the cRIO and the radio, and it still occurred. We’ve had multiple people from our team, other teams, and event staff check our wiring, and wiggle the wires looking for shorts or loose connections, but haven’t found anything. Our chassis is open to ground. cRIO is mounted to wood, and the camera is mounted to plastic. Our radio is mounted pretty high, and not overly surrounded by metal. The closest motor is around 12 inches away. We are not using any of the banebots 775 motors.

About the only control system component that we haven’t replaced is the PD board. I’ve used a DMM to look at the 24v and 12v power supplies, and both seemed fine. The 24v supply was at 23.9v and the 12v was at the battery voltage (12.5v). I didn’t see any change when communications dropped. While everything seems good with the PD board, since it’s the only thing we haven’t changed, I’m worried it’s our only choice tomorrow. The PD board is brand new from the kit this year.

Any ideas or additional troubleshooting steps?

Joe,
Sorry to hear of your issues, as I can only imagine the frustrations.
I didn’t get to watch you folks today, but members of our team clearly saw what was happening to both you and 1717 today.

The only hiccups our team had during the end part of build season were our laptop/driver station blanking out or blue screening at times. Randy found the issue to be our Bluetooth mouse interfering when we ran LabVIEW. We bought brand new Core i7 laptops this season due to the Kinect.

I hope you find the culprit.
You are usually the source for finding solutions vs. vice versa.

A few more data points.

This does happen while setting up prior to a match also, while the robot is disabled. We set the robot out on the field, it links up, and then it will randomly lose connection while other robots are being set up.

We are using the camera connected directly to radio. All the code to access the camera on the robot is disabled.

We have customized the dashboard, and are using our own laptop. However, the issue occurs on two different laptops. Driver Station CPU utilization is low. We are not using the Kinect or the Cypress board. cRIO CPU utilization is around 50%, but this happens with the default LabVIEW code also.

We’ve had two CSAs, the Field Supervisor, and the FTA all look at both the robot and the field logs, without any obvious answers. The robot just loses communication.

There is no correlation with battery voltage as seen in the FMS logs or the DS charts tab (unless something is happening too fast to see). It happens with multiple batteries.

In addition to wiggling the wires, have you tightened down the connections (particularly on the PD board)? That bit us one year through an entire regional; one of the wires from the battery to the board was not tightened down really hard. If we wiggled the wire it seemed good, but it wasn’t until we tried replacing the wires that we found they weren’t tightened properly.

It looks like this problem happens to the best of em.
We had this same problem in Breakaway. We would lose connection randomly and for a random period of time.

I wish I knew the solution. Have you considered changing both the cRIO and the Driver Station computer?
Edit: Never mind, I see that you have.

It seems that 364 is having similar issues at the Bayou regional. In that thread, Greg asked for cRIO and radio info. Adding it here for completeness.

We’ve tried 3 radios, all rev A. Two with firmware version 1.21 (one was ours and one from spare parts) and one with firmware version 1.4 (purchased last month). We’ve tried 2 cRIOs, both 4-slot cRIOs (no room for an 8-slot).

If you unplug the camera from the Wireless Gaming adapter, does the problem go away? Can the FTA tell if your robot is more bandwidth hungry than the others at the event?

We had a situation with the symptoms you describe happen at the Chestnut Hill District Event, and the FTA ultimately traced the problem down to a pair of particularly bandwidth hungry robots. After they changed their camera settings (lowered resolution, lowered framerate, increased compression), I believe the problems went away.
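
To get a feel for how much those three camera settings matter, here is a rough back-of-the-envelope sketch. The function name and every number in it are assumptions for illustration only (in particular the bits-per-pixel figures standing in for "compression"); they are not measurements from any camera or field.

```python
def camera_mbps(width, height, fps, bits_per_pixel):
    """Rough video bitrate in megabits per second.

    bits_per_pixel is an assumed average after compression;
    a higher compression setting means a lower value here.
    """
    return width * height * fps * bits_per_pixel / 1_000_000


# Hypothetical "before" and "after" settings, for comparison only.
before = camera_mbps(640, 480, 30, 1.0)
after = camera_mbps(320, 240, 15, 0.5)

print(f"before: {before:.2f} Mbit/s")
print(f"after:  {after:.2f} Mbit/s")
print(f"reduction: {before / after:.0f}x")
```

Halving the resolution in each direction, halving the framerate, and halving the bits per pixel multiply together, so under these assumed numbers the stream drops by a factor of 16.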

I know that there is no worse feeling in FRC than watching your robot die in the middle of the field with no discernible cause.

3193 was having very similar issues to what you were describing in Pittsburgh. For some reason, the staff was reluctant to let them program a new radio at the kiosk - they brought a brand new one with them from home (not sure of the firmware/rev level on that). Ultimately, the staff relented - they programmed the new radio at the kiosk and then had no issues with on-field robot comms the rest of the competition. The team was also making other repairs to the bot inside the base at the time that might have affected wiring connections, etc., so this might have been a coincidence.

It sounds like you’ve swapped radios many times already, though.

We’re still at a loss. We’re going back to the classmate, eliminating CAN, eliminating the camera. That’s the last of the niceties of our robot. We’re back to basic framework.

At Alamo during week 1, the Robonauts had similar problems in a couple matches on Friday. In one match we never started autonomous, and in another we lost connection with 30 seconds remaining.

Changing out the classmate stopped the symptoms. Although I’m not fully convinced the classmate was the root cause, we didn’t have any problems afterward. Prior to changing it out, we were seeing delays off the chart (> 100 msec) and approaching 50 lost packets per unit time, even when tethered. This went away after we swapped the classmate.
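
For anyone who wants to compare a suspect laptop against a known-good one, a minimal sketch of summarizing a batch of trip-time samples might look like this. The sample list, the function name, and the use of None for a lost packet are all assumptions for illustration; only the 100 msec figure comes from the post above.

```python
def summarize_trip_times(samples_ms, slow_ms=100.0):
    """Summarize trip times in milliseconds; None marks a lost packet."""
    received = [s for s in samples_ms if s is not None]
    lost = len(samples_ms) - len(received)
    slow = sum(1 for s in received if s > slow_ms)
    return {
        "loss_pct": 100.0 * lost / len(samples_ms) if samples_ms else 0.0,
        "slow_pct": 100.0 * slow / len(received) if received else 0.0,
        "mean_ms": sum(received) / len(received) if received else None,
    }


# Hypothetical samples: four replies and one lost packet.
stats = summarize_trip_times([10, 12, None, 150, 11])
print(stats)
```

Running the same summary tethered on both laptops would show whether the delay and loss follow the laptop or the robot.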

No go still. This is sickening. We’re at the point where the only variable left is the FMS, and they’re not budging. It’s not code, it’s not electrical, it’s not us.

We also moved the router to an unpopulated part of the robot because of interference worries. I don’t know what’s left.

Frankly, I’m pissed off because of this. FTA has given little support so far.

Our problem still occurred even with the camera unplugged and unpowered. At one point we pulled all breakers so only the radio and cRIO had power.

This morning, we swapped the PD board and it ran perfectly in our first match. 2 more qualification matches to go.

Murphy says it’s always the part that can’t be causing the problem, that is the problem…

Hope it stays fixed!

Joe,
The 12 volt output for the radio should be 12 volts +/- 0.1 volts. It should not read the same as the battery. Since you replaced the PD already, I am going to guess that the 12 volt supply had a problem. However, none of this explains the action on the field vs all other trials. If the radio resets due to a power dump, it takes about 50 seconds to come back. The new FMS dashboard also reports lost packets. Were you told that there were clusters of lost packets on your data stream? They also have the ability to check emissions on your channel; have they looked at that? Normally they only check prior to the event and not during the event. It really sounds like something is interfering with your radio. I know it’s not supposed to happen but it obviously is. What is the possibility that your WPA key is corrupt in the kiosk? Have you always configured at the same kiosk or did you try the second one?
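
One way to use the 50 second figure is to sort the dropouts by length: anything much shorter than a radio reboot cannot be a full power reset of the radio. A minimal sketch, where the function name, the 10 second tolerance, and the outage list are assumptions for illustration and not values from any FMS log:

```python
RADIO_REBOOT_S = 50.0  # "about 50 seconds", per the figure above


def classify_outage(duration_s, reboot_s=RADIO_REBOOT_S, tolerance_s=10.0):
    """Label a comms outage by comparing it to the radio reboot time.

    tolerance_s is an assumed margin, since the reboot time is approximate.
    """
    if duration_s < reboot_s - tolerance_s:
        return "shorter than a radio reboot"
    return "long enough to be a radio reboot"


# Hypothetical outage durations in seconds, e.g. read off the field log.
for duration in [4.0, 12.5, 55.0, 110.0]:
    print(f"{duration:6.1f} s: {classify_outage(duration)}")
```

Short outages point away from the radio losing power and toward something else in the link; long ones do not rule a power reset out.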

My understanding is that it tracks battery voltage at >12 volts. That may be wrong.

Joe,

This sounds very similar to what was happening to us at Chesapeake. We, too, thought it had to be the FMS, since we were only having the issues when we connected to the field. The FTA said that, in his experience, the type of connection issues we were having were usually due to hardware issues. We replaced several wires, swapped out our cameras, removed the cameras, etc… All approaches I see above. We re-checked every connection, taped down everything that might move, did a very thorough set of shock and vibe tests on the bot in the pits, and then, at last, everything seemed to work in our first quarterfinal, and we thought we had found the issue. The next match, we were playing blind again: our cameras had dropped frames, and the lag times between the camera and the dashboard were killing us. We asked the FTA if the logs showed any connection issues, and he said no. He then said, “You do have, however, a lot of latency showing up. A lot more than I would expect.”

We decided to bring the cameras home and run tests with our practice bot to see if we could re-create the issues, but were generally unable to do so. A thread here on CD regarding the 3-13 update had a post from a member of the Killer Bees (33) talking about issues he had found in some of the LabVIEW code, and he posted sample code he had written that had helped with some of the latency issues they were having.

It sounded familiar to our problems, so I forwarded the info to our programmers. Other than some of the basic code, our team had written a lot of the code ourselves. When they reviewed the info in this post, they found some of the same issues mentioned. They have been optimizing the code as described, and finding that the code is now a lot more efficient. We are continuing to run tests to see if there are other issues, but are hopeful that when we get to DC, we will not see a repeat of the problems we had in Chesapeake.
In previous years we had built rather “simple” machines, with very little reliance on sensors and complex code. This year, we have two cameras, complex image processing, a complex set of code that keeps track of the state of each ball in the system, etc… I’m wondering if this year’s general increase in the complexity of the bots teams are building is bringing some of these code issues to light? If you all find anything else that is causing these problems, I would love to hear about it. Thanks!

Steve

One team at Boilermaker today had the same symptoms described here: inconsistent loss of wireless communication during a match. There didn’t seem to be any pattern other than noting that it never happened on the practice field. When I was called in to help troubleshoot, I discovered the 12v-to-5v converter essentially stuck to the bottom of the D-Link. After I suggested that they move it to a less potentially interfering location, they had a string of successful matches all the way through the elimination rounds.

I remember the same problem with the same fix with another team last year.

I’m going to engage in some speculation here. The 12v radio power from the PDB comes from a boosted supply. The supply should have filtering on the output, but if something is faulty it’s going to have some ripple in it from the switching. That could be causing electrical noise and RFI from the 5v converter. It might be that an insufficiently clean 5v input makes the D-Link act a bit flaky. Without taking a few hours with an oscilloscope and a collection of parts to do a careful analysis, I think that’s the best theory I’m going to come up with.

Just to document other occurrences of the problem, we had this identical trouble at Greater Kansas City in week 1. We tried all the replacements and fixes mentioned in this thread, including filtering the power from the DC adapter. To this day we do not know the cause. Just to make sure it is not the router itself, we have replaced it with a new one. One possibility is physical damage to the router PCB/components caused by shocks/jarring/vibration. This router certainly was not designed for such abuse. Our older, 2011 router, which has seen more mileage and abuse, exhibits the same problem but more frequently. We have mounted the new router in a location with less shock and more cushioning to try to keep it pristine. Time and another competition will tell if this solves the problem.

We had a dead robot one match in San Diego, because of a problem that we were aware of but hadn’t fixed yet.

The ethernet connectors on our bridge seem to have a bad design or have been damaged or something. When the cable is plugged in and latched, it can be pulled out against the latch and lose connection. We have tried several different cables, and they all act the same: no connection if not pushed in all the way. We fixed it with some hot glue.

I doubt this is your problem, but it’s worth a look.

Max,
The 12 volt radio output is exactly 12 volts from a boost/buck regulator inside the PD. It continues to work until the battery falls to about 4.5 volts at which time it quits altogether. It does not track the battery.