I thought I’d seek input here on this rather intriguing problem:
When programming a prototype system today, I discovered an issue with the wireless connection. When the electronics board was powered down and restarted, the wireless connection would only come back up about 50% of the time. On further investigation, this proved to be due to a malfunction in the link between the bridge and the cRIO: ICMP pings sent to the cRIO while communication was down would never get there, even though we could successfully ping the bridge. Changing out the cable did not affect the problem, and the robot runs fine tethered, which suggests to me that the problem lies with the bridge itself. After killing and reinstating power several times, the problem is resolved until the board is powered down again, after which it may or may not work.
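For anyone who wants to repeat the ping test described above, here is a rough sketch in Python. It is only an illustration: the helper names (`ping`, `diagnose`) are made up for this post, and the system `ping` flags differ between platforms, so treat the command lines as assumptions to adjust for your machine.

```python
import platform
import subprocess


def ping(host, timeout_s=1):
    """Send one ICMP echo request using the system ping tool.

    Returns True when ping exits with code 0. The exit code is only a
    rough indicator, and the flags below assume Windows or Linux style
    ping; other platforms may need different options.
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(int(timeout_s * 1000)), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(int(timeout_s)), host]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def diagnose(bridge_ok, crio_ok):
    """Classify the failure from two ping results (labels are hypothetical)."""
    if bridge_ok and crio_ok:
        return "ok"
    if bridge_ok:
        # Bridge answers but the cRIO does not: the symptom in this post.
        return "bridge_to_crio"
    return "wireless"
```

Usage would be something like `diagnose(ping(bridge_ip), ping(crio_ip))`, with your own bridge and cRIO addresses filled in. A result of `"bridge_to_crio"` matches the case where the bridge answers pings but the cRIO does not.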
Has anyone else had this issue or know of any solutions/workarounds? Our router and bridge settings are the same as we’ve been using to control our 2009 robot, and all our software is up-to-date.
I think I was having this problem today as well. The robot signal light was blinking as if it had no connection (when in fact it did), but when enabled, the robot would pause for about 50 ms every 30 seconds. It was fine when programming, but I should start looking into fixing it sometime soon.
We have been having communications glitches that temporarily shut down the motors (as if disabled), but then they resume after a fraction of a second.
Occasionally, we have had it shut down entirely, but this may be unrelated. Our light wouldn’t blink, and the driver station would report that there was no communication. Once, I also noticed the lights for the power regulators were out (on the power distribution board), so I think this may be an entirely unrelated problem.
I’m wondering if it is a network configuration issue.
When we are developing, and the 'bot is close to the driverstation, we can sometimes switch from tethered to wireless, and in the process, lose communications for a while. Then, without changing anything, it starts working again.
My theory is that it is network related … but I need to know something about the cRIO’s gateway settings before I can conclude anything.
Here’s my theory:
When wireless, the cRIO (with IP addy: 10.xx.yy.2) uses the bridge (with IP addy: 10.xx.yy.1) as its gateway
When tethered, the cRIO goes straight to the Linksys wireless router via one of the 4 hard-wired ethernet jacks
If you leave the bridge powered on, even when you are tethered and not using it, then it could be confusing the network connectivity between the devices. (that is, when the bridge is on and unused, it could be confusing the route between the driverstation and the cRIO)
I need to know if the cRIO gateway setting is hard-set to 10.xx.yy.1. If so, then going from tethered to un-tethered could cause some headaches, especially when the bridge is left on.
I plan on doing some more investigation this afternoon at our team meeting. Stay tuned.
I have also been having trouble with the motors turning off for about a second: the robot code and communication indicators go red, then go green again after a little bit. Also, the communication only works about 50% of the time, and sometimes it takes a little more than 2 minutes to load everything! I will stay tuned for updates or solutions.
I’m still not positive about the cRIO gateway setting, or whether it is related to the problem. However, some of the guys on my team are theorizing that it could be a startup issue.
Here’s a timeline, a few educated guesses, and maybe a fix: (when operating in a wireless mode)
apply power to the bot
cRIO comes up first, wireless bridge still booting
cRIO network comes up, doesn’t see an “active” bridge, and times out
wireless bridge comes up, but too late for cRIO
no ethernet connection
reset the cRIO (using a paper clip on the reset button)
wireless ethernet connects almost immediately
I didn’t get much time with the 'bot yesterday, so it was hard to troubleshoot, but when we followed the timeline above, we had no disconnection issues at all.
I’d like to see if the cRIO will re-attempt after 30 or 60 seconds … stay tuned.
If this theory is correct, and we want to avoid delays at competitions, then we might expect a cRIO firmware patch to slow down its bootup process and wait a little while for the bridge to come up.
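To make the "wait a little while for the bridge" idea concrete, here is a minimal sketch of a retry loop in Python. This is not cRIO firmware and not anything NI has published; `wait_until_reachable` and its parameters are hypothetical names, and `probe` stands in for whatever check tells you the bridge is up.

```python
import time


def wait_until_reachable(probe, timeout_s=60.0, interval_s=2.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Call probe() repeatedly until it returns True or the timeout passes.

    probe      -- zero-argument callable returning True once the link is up
    timeout_s  -- how long to keep trying (60 s is just an example value)
    interval_s -- pause between attempts
    clock/sleep are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The point is just that the network bring-up retries for a while instead of giving up after the first attempt, so a slow-booting bridge would still be picked up.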
Like I said above, this is all just a theory. I hope that it either gets patched, or I’m wrong, because I don’t want to have to paper-clip-reset the cRIO every time we turn it on …
edit: I just posted a question to the NI forums. Perhaps their engineers could shed some light on this for us.
edit2: here’s a link to the NI forum question (no activity yet) http://decibel.ni.com/content/thread/5712?tstart=0
I think we had a bad game bridge last year or something because we were having similar issues. We also lost comms multiple times and changing the game bridge fixed our issues. We lost a couple matches because we just died in the middle…:ahh:
So needless to say, we haven’t been very impressed with the game bridges so far…
We have had similar experiences. The process we now have in place to always ensure that everything talks is to turn off the robot and unplug our router. Turn on the robot and wait about 10 seconds and then plug the router back in. Then wait for the connection to be made and everything works from that point.
We’re using the same router that the rookies were given this year, the 160. We bought a whole new control system since we wanted to keep last year’s awesome robot demo- and fundraiser-worthy. We are using a factory reconditioned/recertified game bridge because we couldn’t find a new one.
We were going nuts trying to figure out why sometimes it would connect to the wireless bridge and sometimes it wouldn’t. That procedure above is what has worked consistently for us. I just hope we don’t have similar issues at the regionals.
I am pretty sure this is right and I have had to press the reset button on our cRIO too because it won’t catch the bridge. I hope there is a fix. Thanks for pointing this out.
Thursday we had no issues. It came right up and connected pretty quickly. We’ll keep our eyes on it and report any more details when/if we see it happen again.
I’m glad to see that NI was able to reproduce it … but sporadic errors are difficult to troubleshoot.
Have you tried and seen this issue with the default robotic example?
We saw a very similar issue when we were overtaxing the cRIO with vision code and the processor couldn’t keep up. Shortly after that (when we added more code) it started watchdogging.
Double check that the default code exhibits this same behavior. If it doesn’t, I’d suggest reviewing your code and trying to cut out un-needed bits.
Agreed, errant code is something that can adversely affect comms … but I think that this thread is more about coming up from a cold/hard/OFF situation.
… not speaking for the other posters here, but my team’s issues are only when establishing initial comms … once established, and everything stays powered on, we’ve been A-OK.
We’ve identified what’s going on with the wireless bridge (gaming adapter). It seems that it maintains an arp table to direct traffic to its wired interface. It makes sense that it would want to filter what traffic it puts onto its wired interface to limit the wasted bandwidth from traffic destined for devices that are not on that segment of the network. However it only snoops arp traffic, so if the classmate has already cached the MAC address of the cRIO (i.e. you haven’t stopped the driver station app on the classmate) then there is no arp request for the wireless bridge to snoop.
The reason it works sometimes is that there is a race condition between the cRIO network interface coming up and the bridge connecting. When the cRIO boots and brings up the network interface, it sends a gratuitous arp reply to announce itself. If the bridge is already booted, this gratuitous arp reply will cause the bridge to add the cRIO to its list-of-devices-connected-to-its-wired-interface and begin to put traffic destined for the cRIO on its wired interface (where the cRIO is), so the connection will work. If the bridge is not yet booted when this gratuitous arp reply is sent, then once it has booted there is no arp traffic for it to snoop, so it won’t forward traffic destined for the cRIO to its wired interface, making the cRIO unreachable. This is why a warm boot of the cRIO fixes the problem (since there will be a new gratuitous arp reply from the cRIO).
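The race described above can be illustrated with a toy model in Python. This is only a sketch of the behavior as described in this post, not the bridge's real firmware: the class name, the time values, and the address `10.12.34.2` (a made-up team number) are all assumptions for illustration.

```python
class SnoopingBridge:
    """Toy model of a bridge that forwards only to hosts it has seen in ARP traffic."""

    def __init__(self, ready_at):
        self.ready_at = ready_at   # time (seconds) at which the bridge finishes booting
        self.wired_hosts = set()   # IPs learned on the wired interface

    def snoop_arp(self, now, sender_ip):
        # ARP traffic sent before the bridge is up is simply never seen.
        if now >= self.ready_at:
            self.wired_hosts.add(sender_ip)

    def forwards_to(self, ip):
        return ip in self.wired_hosts


def crio_reachable(bridge_ready_at, announce_times, crio_ip="10.12.34.2"):
    """Return True if any gratuitous ARP from the cRIO arrives after the bridge is up.

    announce_times lists when the cRIO sends a gratuitous ARP: once per
    cold boot, plus once more for each warm reboot.
    """
    bridge = SnoopingBridge(bridge_ready_at)
    for t in announce_times:
        bridge.snoop_arp(t, crio_ip)
    return bridge.forwards_to(crio_ip)
```

With hypothetical timings, `crio_reachable(20, [10])` models the cRIO announcing before the bridge is ready (unreachable), `crio_reachable(5, [10])` models the bridge winning the race, and `crio_reachable(20, [10, 40])` models a warm reboot of the cRIO at 40 seconds that produces a second announcement.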
The other way to address the issue is to force the classmate to resend its arp request for the cRIO so that the bridge can snoop it. After this happens, the bridge knows about the cRIO on its wired interface and starts forwarding traffic and everything works. This can be done by running “arp -d” on the classmate to clear the arp cache.
There will be an update to the driver station application that will clear the arp cache if it has trouble connecting to the cRIO.
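Until that update is available, the “arp -d” workaround could be scripted along these lines. This is a sketch, not the driver station's implementation: the function names are made up, the example address in the comment is a placeholder, and deleting ARP entries normally needs administrator rights on the classmate. It passes a specific address to `arp -d`, which removes just that one cached entry.

```python
import subprocess


def arp_delete_command(ip):
    """Build the command line that removes one cached ARP entry."""
    return ["arp", "-d", ip]


def clear_arp_entry(ip):
    """Run 'arp -d <ip>'; returns True if the command reported success.

    Needs elevated privileges on most systems. The next packet sent to
    this address triggers a fresh ARP request, which the bridge can snoop.
    """
    result = subprocess.run(
        arp_delete_command(ip),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


# Example (placeholder address, substitute your own cRIO's IP):
#   clear_arp_entry("10.12.34.2")
```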