View Full Version : Consistent robot flicker every 64 seconds


William Kunkel
20-02-2014, 18:25
We noticed a really odd problem with the robot when we started doing full testing. Every 64 seconds, the robot "flickers" for a fraction of a second. Relays (and possibly PWMs) shift into neutral, as well as solenoid valves. There is an audible clicking from the solenoid valves and almost every light on the robot goes out. However, we don't lose communications, and everything is back to normal a split second later. Still, the behavior is very disconcerting, and we can't see any obvious explanation for it. Before I get into trying to flowchart the code to see if anything is illuminated, is anyone aware of any sort of loops in WPILib that run on a 64-second timer?

Mark McLeod
20-02-2014, 18:35
Does anything show up in the DS log when this happens?

Have you substituted a different computer as the Driver Station in case it's a Windows task popping up every 64 seconds (which I have seen before)?

William Kunkel
20-02-2014, 18:56
There's nothing on the DS. And we tried deploying just the simple robot template provided by Wind River, which removed the problem, so I think we can rule out the computer (probably).

joelg236
20-02-2014, 18:56
We've had something similar happen, although it seems much less consistent (i.e., the interval could be 10 or 50 seconds). I've put it down to lag, although I've had a sneaking suspicion that it isn't. (How could it even be?)

Deetman
20-02-2014, 21:07
I spent a considerable amount of time troubleshooting with a team at their build site where they were disconnecting over WiFi every 6 seconds, no matter what code was on the robot. Ended up being something with the driver station laptop and swapping it fixed the issue.

Can you post your driver station log file and a screenshot of it? It really is one of the most important pieces for tracking down issues. If you don't know how to get to it, go to the "Charts" tab on the driver station interface and then click the "Launch viewer" button in the bottom right.

Jared Russell
20-02-2014, 21:33
What language are you using?

Could be garbage collection.

geomapguy
20-02-2014, 21:37
Jared Russell wrote: "What language are you using? Could be garbage collection."

I think C++

Wind River

k4mc
20-02-2014, 22:08
Perhaps you have a memory leak that is causing the processor on the cRIO to reset every 64 seconds? That would explain why you don't lose communication but everything else resets, and it seems plausible given that C++ doesn't have a garbage collector.

You could at least easily test if this is the issue by downloading sample code onto the bot and waiting 64 seconds.

Edit: I just noticed you said that sample code removes the problem, so I now highly suspect something in your code is periodically crashing. If it's not a memory leak, you could try first uploading an empty file and slowly adding one class/file/functionality at a time to see which part is causing the crash.

Jared Russell
20-02-2014, 22:32
Do you happen to have a 16-bit unsigned int that is counting milliseconds?
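
For illustration, here is a minimal sketch of the kind of overflow being asked about, assuming a free-running 16-bit millisecond counter. The names `counter_after`, `elapsed_buggy`, and `elapsed_wrapped` are hypothetical and come from neither WPILib nor the team's code. Such a counter wraps every 65,536 ms, about 65.5 seconds, which is close to the interval reported above.

```cpp
#include <cstdint>

// A 16-bit counter holds 0..65535, so counting milliseconds it wraps
// to zero every 65536 ms (about 65.5 seconds).
constexpr std::uint32_t kWrapPeriodMs = 1u << 16;

// Hypothetical model of a 16-bit millisecond counter after `elapsed_ms`
// of run time. Converting to uint16_t reduces the value modulo 2^16.
std::uint16_t counter_after(std::uint32_t elapsed_ms) {
    return static_cast<std::uint16_t>(elapsed_ms);
}

// Buggy pattern (assumes int is wider than 16 bits): both operands are
// promoted to int, so the difference goes negative right after a wrap.
int elapsed_buggy(std::uint16_t now, std::uint16_t last) {
    return now - last;
}

// Wrap-safe pattern: casting the difference back to 16 bits yields the
// true elapsed time, as long as it is shorter than one wrap period.
std::uint16_t elapsed_wrapped(std::uint16_t now, std::uint16_t last) {
    return static_cast<std::uint16_t>(now - last);
}
```

With `last = 65530` and `now = counter_after(65540)`, the buggy version reports a large negative elapsed time while the wrap-safe version reports 10 ms. Any timeout or watchdog logic fed by the buggy value could misbehave once per wrap.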

Greg McKaskle
21-02-2014, 07:08
If the flicker is short, this isn't caused by a crash or reboot. I agree with Jared that something in the code is probably overflowing.

Greg McKaskle

E Dawg
21-02-2014, 09:30
I concur with the above. Go through your code and make sure that there are no overflow warnings (hopefully WindRiver alerts you to that).

MamaSpoldi
21-02-2014, 10:26
Check the NetConsole output. There could be helpful messages being printed there that could point you in the right direction.

William Kunkel
21-02-2014, 10:53
Sorry for not responding quickly. We're using C++, and I'm compiling with -Wall -pedantic -Wextra and not getting any warnings or errors. I agree that it's probably an overflow error of some sort, but 64 seconds seemed like such a specific number that I was hoping it might be something in WPILib. There is no output over netconsole. Looking at the DS Log Viewer, there seem to be spikes in dropped packets that correspond to the flickering.
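
One way to confirm that the dropped-packet spikes really are periodic is to read their timestamps off the log viewer and check the gaps between them. The following is only a sketch: the function names are hypothetical, and the timestamps are assumed to be entered by hand in seconds, not parsed from the DS log format.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Gaps, in seconds, between consecutive event timestamps.
std::vector<double> gaps_between(const std::vector<double>& times_s) {
    std::vector<double> gaps;
    for (std::size_t i = 1; i < times_s.size(); ++i) {
        gaps.push_back(times_s[i] - times_s[i - 1]);
    }
    return gaps;
}

// True if every gap is within `tolerance_s` of `period_s`.
// Returns false when there are fewer than two events to compare.
bool looks_periodic(const std::vector<double>& times_s,
                    double period_s, double tolerance_s) {
    const std::vector<double> gaps = gaps_between(times_s);
    if (gaps.empty()) {
        return false;
    }
    for (double gap : gaps) {
        if (std::fabs(gap - period_s) > tolerance_s) {
            return false;
        }
    }
    return true;
}
```

For example, made-up spike times of 10, 74, 138, and 202 seconds would pass `looks_periodic(times, 64.0, 1.0)`, whereas irregular gaps would point toward lag or interference instead of a fixed timer.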

PandaHatMan
21-02-2014, 11:05
Where is your D-Link on your robot? Is it close to any noisy circuits?

William Kunkel
21-02-2014, 11:16
Our D-link is pretty isolated. I don't think it's the problem.

Greg McKaskle
21-02-2014, 11:53
Does the 64-second hiccup go away when disabled, if you stay in auto for 64 seconds, or in teleop? If not, then what part of your code is running in all of those cases? Start disabling things a chunk at a time. Use your intuition as to what is more likely to be responsible, but if you have no guess, just use a binary search approach. Carefully take half your subsystems out of the system. Did the problem belong to the ones that remain or the ones you took out? Iterate on the right half until you identify it.

Do this in an experimental branch of your code so that you don't commit anything or lose work while doing it. This type of debugging exposition is a very valuable skill in all forms of engineering.

Even if the issue is in WPILib, it may well be due to how your team is using it, and this narrowing of the root cause will be helpful to the people who will debug it. But there is a good chance that you will discover the bug yourself if you take this approach.
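
The halving procedure described above can be sketched as follows. This is an illustration only: `GlitchTest` stands in for the manual step of deploying a build with just the listed subsystems enabled and watching for the flicker, and the sketch assumes that exactly one subsystem is responsible and that it misbehaves on its own. All names are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for "deploy with only these subsystems enabled,
// watch for a bit over a minute, and report whether the flicker appeared".
using GlitchTest = std::function<bool(const std::vector<std::string>&)>;

// Narrows a list of suspect subsystems down to one by repeated halving.
// Assumes a single culprit that reproduces the problem by itself.
std::string find_culprit(std::vector<std::string> suspects,
                         const GlitchTest& glitches) {
    while (suspects.size() > 1) {
        const std::ptrdiff_t half =
            static_cast<std::ptrdiff_t>(suspects.size() / 2);
        std::vector<std::string> first(suspects.begin(),
                                       suspects.begin() + half);
        std::vector<std::string> second(suspects.begin() + half,
                                        suspects.end());
        // Keep whichever half still shows the problem.
        suspects = glitches(first) ? first : second;
    }
    return suspects.empty() ? std::string() : suspects.front();
}
```

With five subsystems this takes at most three test deployments instead of five. If two subsystems only misbehave together, the single-culprit assumption fails and the halves need to be recombined by hand.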

Greg McKaskle

PhilBot
04-03-2014, 21:55
Hi

I just posted a similar issue and got pointed here.

See my post:
http://www.chiefdelphi.com/forums/showthread.php?t=127469

I've pretty much determined that it's related to the PC running the driver station. It's a new HP convertible laptop running Windows 8. My other personal Windows 8 laptop does NOT exhibit the problem.

What laptop are you using, and have you tried a different one?

Phil.

MrRoboSteve
05-03-2014, 00:43
Wifi or wired? Works on one but not the other?

Network drivers fully up to date?

Any difference in antivirus/malware software on working/failing machine?

If you plot these performance counters in perfmon:

\Network Adapter(x)\Output Queue Length
\Network Adapter(x)\Packets Outbound Discarded
\Network Adapter(x)\Packets Outbound Errors

where x is the network adapter you are using, do you see anything unusual? Perfmon instructions at http://technet.microsoft.com/en-us/library/cc749115.aspx

PhilBot
05-03-2014, 08:42
The DS is wireless to the router.

I'll check the other things on Thursday when I get access to the prototype robot.

MrRoboSteve
05-03-2014, 09:29
Some more ideas.

Look in the Event Viewer and see if there are unusual events related to the networking stack.

They'd usually be under Event Viewer -> Custom Views -> Administrative Events in the tree view.

I'd also go look at the manufacturer web site to see whether there is a new network driver.

Can you repro the issue without any of the FRC software in the loop? One easy way to check would be to copy a very large file to a share on another machine by dragging it in File Explorer. Windows 8 has a very nice visualization of the copy throughput that should clearly show if you're getting blocked periodically.

Greg McKaskle
05-03-2014, 12:30
Phil. I sent a PM with some ideas. Let me know.

Greg McKaskle

PhilBot
07-03-2014, 08:28
Hi All.

Well, I narrowed down the symptoms even if I don't know the cause. The periodic glitch only occurs when the DS is connected to the router wirelessly.

That is, if I have the DS computer hardwired to a wireless router (with the DS wireless disabled), and the robot is connected to the same router wirelessly, there is no periodic hiccup.

However if the DS connects to the same router wirelessly (no hard wire) I get the periodic hiccup. I tried two different routers (one old and one really new) and I get the identical effect.

I was wondering if the computer was having a hard time because the Robot wireless network does not have access to the internet, and it's doing something to test the connection once a minute.

The good news is that this shouldn't be an issue during competition.... right ;)

Ken Streeter
07-03-2014, 08:45
Phil,

I haven't been able to track down the root cause yet, but, for what it's worth, you're not alone. We saw this problem (periodic dropout every 68-ish seconds, lasting only a couple seconds each time, according to the DS log) last night on 1519's practice robot. I don't recall having ever seen it before.

I'll investigate further tonight and let you know what we find out. We have many hours of DS logs on the driver station laptop that we can look through (including logs from Week 1 and Week Zero tournaments) to help see if the problem is caused by other things in the environment. I am quite certain we were NOT having this problem on the competition field at Week 1, as I had been looking at the DS logs for our robot at the tournament with an FTA to troubleshoot a different problem and didn't see this behavior in those logs.

For what it's worth, we're in a somewhat different programming environment, so I'm a little skeptical that the problem is in the robot code. It also sounds like you got the problem to go away by switching laptops, so it seems likely that the problem is induced by something on the laptop. Our DS is running Windows 7 and we are running Java on the robot, but are getting a very similar periodic glitch.

--ken

jmartin
07-03-2014, 09:36
I think I have been having the same problem - when connected wirelessly to the robot, there are occasional lapses in communication, and though I haven't measured the frequency, ~60sec seems about right.

I am pretty sure that this never happened when on the field while we were at GSD, but it is very annoying when testing at home.

Hoover
07-03-2014, 10:35
One time at our astronomy lab we were playing with a USB multi-band radio. We hit a frequency that had some noise every 11.5 seconds. We never found out what it was, but someone said it might have been a weather radar sweep.

We are bathed in radio waves. With the right equipment you might be able to find out whether it is coming from an outside source. Try changing your WiFi channel to the lowest one and then the highest one. If you still get the dropout and the cause is external, it has to be a fairly strong signal.

In our lab we have a second training robot with a program that does no more than drive it. A setup like this could isolate the problem to one robot, in which case it would be software or hardware on that robot. Even with one robot, uploading the smallest program that can drive it could implicate or eliminate software.

But what would be the most interesting is if both robots were to drop out at the same time.

Ken Streeter
08-03-2014, 14:09
Ken Streeter wrote: "We saw this problem (periodic dropout every 68-ish seconds, lasting only a couple seconds each time, according to the DS log) last night on 1519's practice robot. I don't recall having ever seen it before. I'll investigate further tonight and let you know what we find out."

We investigated this further last night. We had the problem (lost packets every 64-68 seconds) with 100% consistency when the robot D-Link 1522 was configured as an "access point" with the Driver Station laptop connecting to the D-Link 1522 wirelessly.

We then switched to our normal driving practice configuration, with the robot D-Link 1522 in "bridge" mode, connecting to a Cisco/Linksys WRT-610N serving as the access point, with the Driver Station laptop wired directly to the WRT-610N. This latter configuration worked fine without any of the "64-68 second problem" occurring.

There is no evidence of the "64-68 second problem" in any of 1519's Driver Station logs from the Week Zero or Week 1 tournaments in which we participated.

Greg McKaskle
09-03-2014, 07:55
Ken, do you have a copy of Wireshark? If you can cause this to happen, you can record the traffic on the laptop interface and may be able to determine whether the router is being slow or bombarded by something, or whether the laptop is doing something. Also, I can provide an instrumented DS if that helps.

Greg McKaskle