
View Full Version : Intermittent connection on field only


Joe Ross
17-03-2012, 02:38
We're competing at the LA regional this weekend, and are having nothing but trouble when competing on the field. We lose communication at completely random times (moving, not moving, before a match, during a match, etc.), and for random durations (anywhere from a few seconds to almost the entire match). However, this only occurs when we are on the competition field with FMS, and it has happened in every single match. When we run tethered in the pit, tethered on the practice field, and wireless on the practice field (with the practice field radio), it works perfectly, no hiccups.

We've replaced the following components, with no effect: cRIO, radio, ethernet cable, driver station computer, and radio 12-5v power converter. It occurs with the default LabVIEW code as well as our own code. We pulled every single circuit breaker, so that the only items powered were the cRIO and the radio, and it still occurred. We've had multiple people from our team, other teams, and event staff check our wiring, and wiggle the wires looking for shorts or loose connections, but haven't found anything. Our chassis is open to ground. cRIO is mounted to wood, and the camera is mounted to plastic. Our radio is mounted pretty high, and not overly surrounded by metal. The closest motor is around 12 inches away. We are not using any of the banebots 775 motors.

About the only control system component that we haven't replaced is the PD board. I've used a DMM to look at the 24v and 12v power supplies, and both seemed fine. The 24v supply was at 23.9v and the 12v was at the battery voltage (12.5). I didn't see any change when communications dropped. While everything seems good with the PD board, since it's the only thing we haven't changed, I'm worried it's our only choice tomorrow. The PD board is brand new from the kit this year.

Any ideas or additional troubleshooting steps?

waialua359
17-03-2012, 02:59
Joe,
sorry to hear of your issues as I can only imagine the frustrations.
I didn't get to watch you folks today, but members of our team clearly saw what was happening to both you and 1717 today.

The only hiccup our team had during the end of build season was our laptop/driver station blanking out or blue screening at times. Randy found the issue to be our Bluetooth mouse interfering when we ran LabVIEW. We bought brand new i7 core laptops this season due to the Kinect.

I hope you find the culprit.
You're usually the one finding solutions for others, rather than the other way around.

Joe Ross
17-03-2012, 07:50
A few more data points.

This also happens while setting up prior to a match, while the robot is disabled. We set the robot out on the field, it links up, and then it randomly loses connection while the other robots are being set up.

We are using the camera connected directly to the radio. All the code to access the camera on the robot is disabled.

We have customized the dashboard, and are using our own laptop. However, the issue occurs on two different laptops. Driver Station CPU utilization is low. We are not using the Kinect or the Cypress board. cRIO CPU utilization is around 50%, but this happens with the default LabVIEW code also.

We've had two CSAs, the Field Supervisor, and the FTA all look at both the robot and the field logs, without any obvious answers; the robot just loses communication.

There is no correlation with battery voltage as seen in the FMS logs or the DS charts tab. (unless something is happening too fast to see). It happens with multiple batteries.

s_forbes
17-03-2012, 08:22
In addition to wiggling the wires, have you tightened down the connections (particularly on the PD board)? That bit us one year through an entire regional; one of the wires from the battery to the board was not tightened down really hard. If we wiggled the wire it seemed good, but it wasn't until we tried replacing the wires that we found they weren't tightened properly.

Sean Raia
17-03-2012, 09:10
It looks like this problem happens to the best of em.
We had this same problem in Breakaway: we would lose connection randomly and for a random period of time.

I wish I knew the solution. Have you considered changing both the cRIO and the Driver Station computer?
Edit: Never mind, I see that you have.

Joe Ross
17-03-2012, 09:58
It seems that 364 is having similar issues at the Bayou regional. In that thread, Greg asked for cRIO and radio info. Adding it here for completeness.

We've tried 3 radios, all rev A: two with firmware version 1.21 (one was ours and one from spare parts) and one with firmware version 1.4 (purchased last month). We've tried 2 cRIOs; both have been 4-slot cRIOs (no room for an 8-slot).

Jared Russell
17-03-2012, 10:04
If you unplug the camera from the Wireless Gaming adapter, does the problem go away? Can the FTA tell if your robot is more bandwidth hungry than the others at the event?

We had a situation with the symptoms you describe happen at the Chestnut Hill District Event, and the FTA ultimately traced the problem down to a pair of particularly bandwidth hungry robots. After they changed their camera settings (lowered resolution, lowered framerate, increased compression), I believe the problems went away.
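For a rough sense of why those three camera settings matter, here's a back-of-envelope bandwidth estimate. All the numbers are illustrative assumptions, not measurements from any real robot:

    // Rough MJPEG bandwidth estimate. Every number here is an assumption
    // for illustration; plug in your own resolution, frame rate, and an
    // estimated bits-per-pixel for your compression setting.
    public class CameraBandwidthEstimate {
        public static void main(String[] args) {
            int width = 640;            // frame width in pixels
            int height = 480;           // frame height in pixels
            double fps = 30.0;          // frames per second
            double bitsPerPixel = 0.8;  // assumed average for moderate JPEG compression

            double bitsPerSecond = width * height * bitsPerPixel * fps;
            System.out.printf("Estimated stream: %.1f Mbit/s%n", bitsPerSecond / 1e6);

            // Halving the resolution in both dimensions cuts this by 4x,
            // halving the frame rate cuts it by 2x, and heavier compression
            // lowers bitsPerPixel -- which is why those three knobs were
            // the ones the FTA had that team change.
        }
    }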

I know that there is no worse feeling in FRC than watching your robot die in the middle of the field with no discernible cause.

Travis Hoffman
17-03-2012, 10:26
3193 was having very similar issues to what you were describing in Pittsburgh. For some reason, the staff was reluctant to let them program a new radio at the kiosk - they brought a brand new one with them from home (not sure of the firmware/rev level on that). Ultimately, the staff relented - they programmed the new radio at the kiosk and then had no issues with on-field robot comms the rest of the competition. The team was also making other repairs to the bot inside the base at the time that might have affected wiring connections, etc., so this might have been a coincidence.

It sounds like you've swapped radios many times already, though.

RyanN
17-03-2012, 11:10
We're still at a loss. We're going back to the classmate, eliminating CAN, eliminating the camera. That's the last of the niceties of our robot. We're back to basic framework.

Bill Bluethmann
17-03-2012, 12:03
At Alamo during week 1, the Robonauts had similar problems in a couple matches on Friday. In one match we never started autonomous, and in another we lost connection with 30 seconds remaining.

Changing out the classmate stopped the symptoms. Although I'm not fully convinced the classmate was the root cause, we didn't have any problems afterward. Prior to changing it out, we were seeing delays off the chart (> 100 msec) and approaching 50 lost packets per unit time, even when tethered. This went away after we swapped the classmate.

RyanN
17-03-2012, 12:04
No go still. This is sickening. We're at the point where the only variable left is the FMS, and they're not budging. It's not code, it's not electrical, it's not us.

We also moved the router to an unpopulated part of the robot because of interference worries. I don't know what's left.

Frankly, I'm pissed off because of this. FTA has given little support so far.

Joe Ross
17-03-2012, 12:55
Our problem still occurred even with the camera unplugged and unpowered. At one point we pulled all breakers so only the radio and cRIO had power.

This morning, we swapped the PD board and it ran perfectly in our first match. 2 more qualification matches to go.

Sent from my BlackBerry 9650 using Tapatalk

MrForbes
17-03-2012, 12:58
This morning, we swapped the PD board and it ran perfectly in our first match. 2 more qualification matches to go.

Murphy says it's always the part that can't be causing the problem, that is the problem....

Hope it stays fixed!

Al Skierkiewicz
17-03-2012, 13:42
Joe,
The 12 volt output for the radio should be 12 volts +/- 0.1 volts. It should not read the same as the battery. Since you replaced the PD already, I am going to guess that the 12 volt supply has a problem. However, none of this explains the behavior on the field vs. all other trials. If the radio resets due to a power dump, it takes about 50 seconds. The new FMS dashboard also reports lost packets. Were you told that there were clusters of lost packets in your data stream? They also have the ability to check emissions on your channel; have they looked at that? Normally they only check prior to the event and not during the event. It really sounds like something is interfering with your radio. I know it's not supposed to happen, but it obviously is. What is the possibility that your WPA key is corrupt in the kiosk? Have you always configured at the same kiosk, or did you try the second one?

cgmv123
17-03-2012, 17:32
The 12 volt output for the radio should be 12 volts +/- 0.1 volts. It should not read the same as the battery.

My understanding is that it tracks battery voltage at >12 volts. That may be wrong.

SteveGPage
17-03-2012, 19:08
Joe,

This sounds very similar to what was happening to us at Chesapeake. We, too, thought it had to be the FMS, since we were only having the issues when we connected to the field. The FTA said that, in his experience, the type of connection issues we were having were usually due to hardware issues. We replaced several wires, swapped out our cameras, removed the cameras, etc... all approaches I see above. We re-checked every connection, taped down everything that might move, and did a very thorough shock and vibe set of tests on the bot in the pits. Everything seemed to work in our first quarterfinal, and we thought we had found the issue. The next match, we were playing blind again: our cameras had dropped frames, and the lag times between the camera and the dashboard were killing us. We asked the FTA if the logs showed any connection issues, and he said no. He then said, "You do have, however, a lot of latency showing up. A lot more than I would expect."

We decided to bring the cameras home and run tests with our practice bot to see if we could re-create the issues, but were generally unable to do so. A thread here on CD regarding the 3-13 update had a post from a member of the Killer Bees (33) talking about issues he had found in some of the LabVIEW code, and he posted sample code he had written that had helped with some of the latency issues they were having.
http://www.chiefdelphi.com/forums/showthread.php?t=104580&page=2
It sounded similar to our problems, so I forwarded the info to our programmers. Other than some of the basic code, our team had written a lot of the code ourselves. When they reviewed the info in this post, they found some of the same issues mentioned. They have been optimizing the code as described, and finding that the code is now a lot more efficient. We are continuing to run tests to see if there are other issues, but are hopeful that when we get to DC, we will not see a repeat of the problems we had in Chesapeake.
In previous years we had built rather "simple" machines, with very little reliance on sensors and complex code; this year, we have two cameras, complex image processing, a complex set of code that keeps track of the state of each ball in the system, etc... I'm wondering if this year's general increase in the complexity of the bots teams are building is bringing some of these code issues to light? If you all find anything else that is causing these problems, I would love to hear about it. Thanks!

Steve

Alan Anderson
17-03-2012, 21:03
One team at Boilermaker today had the same symptoms described here: inconsistent loss of wireless communication during a match. There didn't seem to be any pattern other than noting that it never happened on the practice field. When I was called in to help troubleshoot, I discovered the 12v-to-5v converter essentially stuck to the bottom of the D-Link. After I suggested that they move it to a less potentially interfering location, they had a string of successful matches all the way through the elimination rounds.

I remember the same problem with the same fix with another team last year.

I'm going to engage in some speculation here. The 12v radio power from the PDB comes from a boosted supply. The supply should have filtering on the output, but if something is faulty it's going to have some ripple in it from the switching. That could be causing electrical noise and RFI from the 5v converter. It might be that an insufficiently clean 5v input makes the D-Link act a bit flaky. Without taking a few hours with an oscilloscope and a collection of parts to do a careful analysis, I think that's the best theory I'm going to come up with.

jspatz1
17-03-2012, 21:35
Just to document other occurrences of the problem, we had this identical trouble at Greater Kansas City in week 1. We tried all the replacements and fixes mentioned in this thread, including filtering the power from the DC adapter. To this day we do not know the problem. Just to make sure it is not the router itself, we have replaced it with a new one. One possibility is physical damage to the router PCB/components caused by shocks/jarring/vibration. This router certainly was not designed for such abuse. Our older, 2011 router, which has seen more mileage and abuse, exhibits the same problem but more frequently. We have mounted the new router in a location with less shock and more cushioning to try to keep it pristine. Time and another competition will tell if this solves the problem.

MrForbes
17-03-2012, 21:42
We had a dead robot one match in San Diego, because of a problem that we were aware of but hadn't fixed yet.

The ethernet connectors on our bridge seem to have a bad design or have been damaged or something. When the cable is plugged in and latched, it can be pulled out against the latch, and lose connection. We have tried several different cables, they all act the same, no connection if not pushed in all the way. We fixed it with some hot glue.

I doubt this is your problem, but it's worth a look.

Al Skierkiewicz
18-03-2012, 08:35
Max,
The 12 volt radio output is exactly 12 volts from a boost/buck regulator inside the PD. It continues to work until the battery falls to about 4.5 volts at which time it quits altogether. It does not track the battery.

McGurky
18-03-2012, 08:41
Can anyone confirm or deny seeing this problem during practice matches on Thursday?

drwisley
18-03-2012, 11:12
One team at Boilermaker today had the same symptoms described here: inconsistent loss of wireless communication during a match. There didn't seem to be any pattern other than noting that it never happened on the practice field. When I was called in to help troubleshoot, I discovered the 12v-to-5v converter essentially stuck to the bottom of the D-Link. After I suggested that they move it to a less potentially interfering location, they had a string of successful matches all the way through the elimination rounds.

I remember the same problem with the same fix with another team last year.

I'm going to engage in some speculation here. The 12v radio power from the PDB comes from a boosted supply. The supply should have filtering on the output, but if something is faulty it's going to have some ripple in it from the switching. That could be causing electrical noise and RFI from the 5v converter. It might be that an insufficiently clean 5v input makes the D-Link act a bit flaky. Without taking a few hours with an oscilloscope and a collection of parts to do a careful analysis, I think that's the best theory I'm going to come up with.

On behalf of Team 1756, thank you Alan.

PayneTrain
18-03-2012, 11:21
This may sound silly, but I haven't seen anyone with problems explicitly mentioning communications drops with their router plugged into the regulated port for the router.

Our alliance captain apparently got through two rounds of inspection and every qualification match before their communications started dropping Saturday afternoon. The problem was quickly traced to that wiring problem, and was rectified over a series of timeouts, cool-downs, and gentle, gentle driving.

Joe Ross
18-03-2012, 11:35
We ran a total of 5 matches on Saturday with the new power distribution board, without a single problem. After removing the "bad" power supply, we discovered that there were a few pieces of metallic swarf on the outside. It's possible that a piece worked its way inside and caused problems. The mechanical team has been sufficiently reprimanded.

I'm still interested in the explanation of what the PD board could be doing to cause issues with the radio. Like Alan speculated, perhaps some noise that isn't filtered by the power converter.

Max,
The 12 volt radio output is exactly 12 volts from a boost/buck regulator inside the PD. It continues to work until the battery falls to about 4.5 volts at which time it quits altogether. It does not track the battery.

That disagrees with the data sheet for the power distribution board. http://www.usfirst.org/sites/default/files/uploadedFiles/Robotics_Programs/FRC/Game_and_Season__Info/2012_Assets/Power%20Distribution%20Board.pdf It also disagrees with our experience: both the "bad" board and the "good" board output a voltage above 12v when the battery is above 12v.

12V/2A boost supply with on-board 2A PTC for over-current protection (typically for powering a WiFi adapter, the boost supply tracks battery voltage when the battery is fully charged and greater than 12V)

Don’t panic if the 12V power supply output is a bit higher than 12V. The supply tracks battery voltage when the battery is fully charged and greater than 12V.

Can anyone confirm or deny seeing this problem during practice matches on Thursday?

We had a problem in one practice match, which was initially attributed to code, but may have actually been this problem.

This may sound silly, but I haven't seen anyone with problems explicitly mentioning communications drops with their router plugged into the regulated port for the router.

Our alliance captain apparently got through two rounds of inspection and every qualification match before their communications started dropping Saturday afternoon. The problem was quickly traced to that wiring problem, and was rectified over a series of timeouts, cool-downs, and gentle, gentle driving.

Our radio was properly connected through the 12-5v converter and the boosted power supply. On a side note, I'm amazed that every year I find a few teams that made it through inspection without using the boosted power supply.

cgmv123
18-03-2012, 11:41
Max,
The 12 volt radio output is exactly 12 volts from a boost/buck regulator inside the PD. It continues to work until the battery falls to about 4.5 volts at which time it quits altogether. It does not track the battery.

Don’t panic if the 12V power supply output is a bit higher than 12V. The supply tracks battery voltage when the battery is fully charged and greater than 12V.
http://www.usfirst.org/sites/default/files/uploadedFiles/Robotics_Programs/FRC/Game_and_Season__Info/2012_Assets/Power%20Distribution%20Board.pdf

Steve Warner
18-03-2012, 11:56
You said you tried running the robot with and without the camera, but I'll mention this anyway. At Knoxville we had a similar problem with the robot stopping on the competition field but never having a problem anywhere else. We had very high trip times and a lot of lost packets. We were displaying the camera video on the dashboard using one of the older E09 classmates. It turned out the classmate could not handle the video stream load even though it was only 4 or 5 frames per second. Disabling the video stream brought down the trip times and lost packets, and the problem seems to have gone away. Check the stats in the Dashboard Log File Viewer.

rsisk
18-03-2012, 13:14
Joe,
The 12 volt output for the radio should be 12 volts +/- 0.1 volts. It should not read the same as the battery. Since you replaced the PD already, I am going to guess that the 12 volt supply has a problem. However, none of this explains the behavior on the field vs. all other trials. If the radio resets due to a power dump, it takes about 50 seconds. The new FMS dashboard also reports lost packets. Were you told that there were clusters of lost packets in your data stream? They also have the ability to check emissions on your channel; have they looked at that? Normally they only check prior to the event and not during the event. It really sounds like something is interfering with your radio. I know it's not supposed to happen, but it obviously is. What is the possibility that your WPA key is corrupt in the kiosk? Have you always configured at the same kiosk, or did you try the second one?

I was the CSA that checked the logs. In a couple of cases, there was a flurry of dropped packets right before the robot was disabled. In other cases, there were some lost packets, but nothing unusual. There really was no pattern we could determine in the logs. Sometimes they were disabled for 4 seconds, a couple times for 30 seconds and once for the remainder of the match (> 45 secs).

Al Skierkiewicz
18-03-2012, 18:07
Sorry to mislead; I meant the 12 volt supply does not track the battery below 12 volts. The maximum output of the supply, according to the sheet, will not be more than 13 volts even with a 15 volt input.

Joe Ross
18-03-2012, 18:45
For anyone who's interested, I've attached ds log files from our second match on Friday, where we dropped out once prior to the match and twice during the match. It appears that upon reconnecting, the DS creates a new log file, so there are 4 files in the attached zip file.

dbarr4011
19-03-2012, 16:28
We have also been getting D-Link dropouts on the field. During our competitions at Traverse City and at the Western Michigan District we had several communications dropouts. Our robot was not driving hard and there were no major current draws on the battery. The field controls personnel said our radio reset. I am questioning this answer because I see the radio takes about 1 1/2 minutes to power up and start its wireless communications. We lost communications for about 40 seconds. After 40 seconds we were off and running again. My thought is that we are having a wireless access point/network problem with the field control access point. We saw several other teams with the same problem at these two events.

We have checked the power to the 12V/5V converter. We have replaced the Ethernet cable with a shielded Ethernet cable. We added (4) 470uF 35V capacitors across the 5V power into the D-Link radio. We are running LabVIEW and a camera on our robot network back to our DS. We process the camera image on the DS and send data back to the robot through a UDP packet.
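For anyone curious, a minimal sketch of the kind of DS-to-robot UDP send described above. The address, port, and payload format here are placeholders (10.TE.AM.2 was the usual cRIO address in this era); they just have to match whatever the robot-side listener expects:

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;

    public class VisionSender {
        public static void main(String[] args) throws Exception {
            DatagramSocket socket = new DatagramSocket();
            // Placeholder robot address (10.TE.AM.2 convention) and arbitrary port.
            InetAddress robot = InetAddress.getByName("10.12.34.2");
            // Example payload: processed vision results as a small text message.
            byte[] data = "distance=3.2,angle=-4.5".getBytes("US-ASCII");
            socket.send(new DatagramPacket(data, data.length, robot, 1130));
            socket.close();
        }
    }

UDP is lossy by design, which is fine for this kind of streaming update: if a packet is dropped, the next one a few milliseconds later replaces it, and nothing blocks waiting for retransmission.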

HELP.... We ordered a new D-Link radio and power converter today but I think there is an issue on the Field Control Access Point.

Racer26
19-03-2012, 16:59
I think I've noticed a pattern in who these failures happen to.

They're ALL using LabVIEW... Has there been a single failure of this type with a C++ or Java team?

pathew100
19-03-2012, 16:59
HELP.... We ordered a new D-Link radio and power converter today but I think there is an issue on the Field Control Access Point.

Your symptoms do point to the radio power dropping out momentarily. If you reconnect after 40 seconds then the power to the radio was interrupted.

Replacing the radio and converter are a good start, but don't overlook the Power Distribution Board. Also, check the connection to the barrel plug on the D-Link. (Many teams use hot glue, tape, etc to ensure a somewhat better connection).

I am wondering if PWBs in the radios are being damaged internally this year due to the harder shocks that robots are taking (falling off the bridge, going over the bump). The D-link radios were designed to sit on a desk, not to take high-G impacts. Might be worth it to make sure the radio isn't rigidly mounted to the frame.

Alan Anderson
19-03-2012, 17:13
The field controls personnel said our radio reset. I am questioning this answer because I see the radio takes about 1 1/2 minutes to power up and start its wireless communications. We lost communications for about 40 seconds. After 40 seconds we were off and running again.

While a DAP-1522's expected link-up time is 50+ seconds after a reset, 40 seconds is almost exactly right for communication to be reestablished after a cRIO reset.

I watched one D-Link DAP-1522 consistently work in less than 30 seconds after a complete power cycle of the robot. It was, to put it mildly, unexpectedly quick. This was not using wireless, but it's still faster than I generally see with a wired connection.

Dale
19-03-2012, 18:21
I think I've noticed a pattern in who these failures happen to.

They're ALL using LabVIEW... Has there been a single failure of this type with a C++ or Java team?

We're running Java and had the same problem multiple times at the Oregon Regional. Other teams were having problems as well.

MrClintBarnes
19-03-2012, 18:33
I was FTAA at the regional at Utah this last week, and we were having similar problems. The FTA found that every team on the field had high trip times (>30ms) when there was a team with a D-Link running firmware version 1.4.

We had the CSA downgrade those bridges to v1.2 of the firmware, and all of the trip times went back down to normal. I am not sure if this is the same problem that everyone else was having, but downgrading the firmware on any of the older bridges running 1.4 seemed to help.

-Clint

Jared Russell
19-03-2012, 19:28
I was FTAA at the regional at Utah this last week, and we were having similar problems. The FTA found that every team on the field had high trip times (>30ms) when there was a team with a D-Link running firmware version 1.4.

We had the CSA downgrade those bridges to v1.2 of the firmware, and all of the trip times went back down to normal. I am not sure if this is the same problem that everyone else was having, but downgrading the firmware on any of the older bridges running 1.4 seemed to help.

-Clint

I hope FRC ensures that every robot on the field is running v1.2 going forward.

MrClintBarnes
19-03-2012, 19:33
I hope FRC ensures that every robot on the field is running v1.2 going forward.

I am sure that the FTAs and FIRST staff will look into this and, hopefully, we will see a response from them at the next events.

dudefise
19-03-2012, 19:54
We also had this issue at the LA regional, and were advised that there was a "general issue with the field" by the FTA. I was informed that it was causing control problems and sending reset commands to robots.

I'm not sure if this was the cause, but I think it may explain some of the problems that occurred to various teams, including 2637.

n8many
19-03-2012, 22:54
I think I've noticed a pattern in who these failures happen to.

They're ALL using LabVIEW... Has there been a single failure of this type with a C++ or Java team?

Our team at Sacramento/Davis coded in C++ and we had this problem.

The odd part was that we were only hit by dropped packets on Saturday. Team 3256 was also hit by this; however, they might have fixed it by halving the resolution on both of their cameras.

Wetzel
21-03-2012, 10:22
While a DAP-1522's expected link-up time is 50+ seconds after a reset, 40 seconds is almost exactly right for communication to be reestablished after a cRIO reset.

I watched one D-Link DAP-1522 consistently work in less than 30 seconds after a complete power cycle of the robot. It was, to put it mildly, unexpectedly quick. This was not using wireless, but it's still faster than I generally see with a wired connection.
We had a problem at VCU where we would lose comms during the match, and the cycle time was about 40 seconds, so we blamed the cRIO rebooting but couldn't figure it out. No problems showed up when tethered in the pits, and we couldn't get a slot on the practice field. It turned out that the radio was not connected to the regulated supply on the PD board and was cycling when the voltage dropped. This was noticed by our FTA, who was able to watch the robot from a close vantage point and saw the radio reboot.

We had a handful of problems plague us throughout Friday, but they were all traced to robot issues and not FMS (bad Jag, bad motor, improper wiring).

Wetzel

Mark 3138
21-03-2012, 11:14
We had issues last year with the D-Link disconnecting. We competed in two regionals before tracing it to the barrel plug, which we had purchased from Radio Shack. The ID of this barrel plug was slightly larger than that of the plug on the AC adapter cord that came with the D-Link. We changed the barrel plug to the one that came with the D-Link, and the problem went away.

techhelpbb
02-04-2012, 18:54
I actually had an oscilloscope at the MAR Mount Olive District event, and someone took the time to check the power on a team's robot that was experiencing these symptoms. This was a 25 MHz analog scope, and there were no issues.

Immediately I suggested it could be the camera feed, but it seemed like the other teams who were helping them were using it with better success. I also suggested it could be the CAN bus or somehow related to an interaction with the CAN bridge in the Jaguar....not to toss any blame on the Jaguars...I was merely speculating.

They were using LabVIEW. We don't use LabVIEW, and I am not clear on how LabVIEW handles exotic exception situations during communications.

When they turned off their video (which was a serious handicap for them) they could reliably control their robot on the field. As soon as they started using it again they had problems again.

I would like to point out that I was told our field was apparently configured with 802.11G and everything was sitting on channel 6. I was told that's how it was set up and we all had to deal with it (even Team 11, and we were the host). That would mean that the most you can hope for is 54 Mbps. However, with distance, interference, and TCP/IP loading issues (it's a reliable protocol, and if you don't understand the implications, do some research or this will get too complicated to follow), you don't really have 54 Mbps.

If what I was told was correct, this is going to be a problem if your robot is bandwidth hungry.

daniel_dsouza
02-04-2012, 19:07
I think I've noticed a pattern in who these failures happen to.

They're ALL using LabVIEW... Has there been a single failure of this type with a C++ or Java team?

We're running Java and had the same problem multiple times at the Oregon Regional. Other teams were having problems as well.

Yup, our team runs a Java bot, and we had the same problem during our practice matches. Our problem was that there were a few inappropriately handled errors and too many output statements causing major peaks in our cRIO's CPU usage. Having a robot project where the loop iterates as fast as possible can also drive up that CPU usage.
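For illustration, something like this keeps a teleop loop from spinning flat out and pegging the cRIO CPU. This is a sketch based on the 2012-era WPILibJ SimpleRobot template as I remember it; the PWM and joystick channel numbers are placeholders for whatever your robot actually uses:

    import edu.wpi.first.wpilibj.Joystick;
    import edu.wpi.first.wpilibj.RobotDrive;
    import edu.wpi.first.wpilibj.SimpleRobot;
    import edu.wpi.first.wpilibj.Timer;

    public class PacedRobot extends SimpleRobot {
        // Channel numbers are placeholders, not anyone's real wiring.
        private final RobotDrive drive = new RobotDrive(1, 2);
        private final Joystick stick = new Joystick(1);

        public void operatorControl() {
            while (isOperatorControl() && isEnabled()) {
                drive.arcadeDrive(stick);
                // Without a delay the loop runs as fast as it can and eats CPU;
                // 10-20 ms per iteration is plenty for driver control.
                Timer.delay(0.02);
            }
        }
    }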

RufflesRidge
02-04-2012, 19:17
I would like to point out that I was told our field was apparently configured with 802.11G and everything was sitting on channel 6. That's how it was set up and we all had to deal with it (even Team 11, and we were the host). This means that the most you can hope for is 54 Mbps. However, with distance, interference, and TCP/IP loading issues (it's a reliable protocol, and if you don't understand the implications, do some research or this will get too complicated to follow), you don't really have 54 Mbps.

If what I was told was correct, this is going to be a problem if your robot is bandwidth hungry.

It wasn't. The field setup uses 5GHz as you can see with a 5GHz NIC or by looking at your radio setup after programming.

Alan Anderson
02-04-2012, 19:31
It wasn't. The field setup uses 5GHz as you can see with a 5GHz NIC or by looking at your radio setup after programming.

The one field I have seen up close had all the robot networks on channel 60, not 6.

rbtying
02-04-2012, 19:51
Team 846 had similar problems at New York Regional, which were resolved by swapping the power distribution board. Though it will not pass inspection, you can try mounting a spare PDB as a custom circuit off of one of the 40A breaker slots, and powering the radio off of that.

We will likely have more data on this phenomenon later, as we plan to scope out the failed power distribution board under load. It is perhaps notable that we have measured large ripple voltages (up to 0.6v on a WORKING board) on the 5v supply output, and that this is likely a marginal condition for the D-Link, which expects 5v +/- 5%.

techhelpbb
02-04-2012, 21:49
Team 846 had similar problems at New York Regional, which were resolved by swapping the power distribution board. Though it will not pass inspection, you can try mounting a spare PDB as a custom circuit off of one of the 40A breaker slots, and powering the radio off of that.

We will likely have more data on this phenomenon later, as we plan to scope out the failed power distribution board under load. It is perhaps notable that we have measured large ripple voltages (up to 0.6v on a WORKING board) on the 5v supply output, and that this is likely a marginal condition for the D-Link, which expects 5v +/- 5%.

Aren't you supposed to power the D-Link AP with the DC-DC converter?

I know you can power the D-Link from that 5V source on the PDB, but you're pushing your luck as you've noticed.

I am absolutely sure this was a requirement last year, because we showed up to Palmetto minus that DC-DC converter and the electrical inspectors tried to get me to overnight one (luckily they found a donor).

techhelpbb
02-04-2012, 21:58
It wasn't. The field setup uses 5GHz as you can see with a 5GHz NIC or by looking at your radio setup after programming.

Thanks. I was basically the only living body in spare parts and I couldn't stop long enough to test that myself. Unfortunately I normally have one of my A/B/G/N 'war driving' kits with me, but I had to move quickly to absorb spare parts and get all my tools so it ended up sitting at home on the floor.

I did have my Droid-X with the WiFi Analyzer app, and towards the end, during eliminations (while I was moving batteries), I did see 4 networks that are not normally present, and they were definitely G networks (my phone does not do A/N).

Are there G networks for field controls, or do these access points continue to transmit the G networks even when you're not using them? I don't know so I'm asking.

Al Skierkiewicz
03-04-2012, 07:47
Team 846 had similar problems at New York Regional, which were resolved by swapping the power distribution board. Though it will not pass inspection, you can try mounting a spare PDB as a custom circuit off of one of the 40A breaker slots, and powering the radio off of that.

We will likely have more data on this phenomenon later, as we plan to scope out the failed power distribution board under load. It is perhaps notable that we have measured large ripple voltages (up to 0.6v on a WORKING board) on the 5v supply output, and that this is likely a marginal condition for the D-Link, which expects 5v +/- 5%.

Yep, one and only one PD. However, ripple is not normal and the five volt output is to be used for the camera only. The correct wiring is to connect the five volt convertor to the +12 volt 'radio' output on the PD. The output of the convertor then feeds the radio.
Any idea what frequency the ripple was?

rbtying
03-04-2012, 17:20
Yep, one and only one PD. However, ripple is not normal and the five volt output is to be used for the camera only. The correct wiring is to connect the five volt convertor to the +12 volt 'radio' output on the PD. The output of the convertor then feeds the radio.
Any idea what frequency the ripple was?

We weren't able to get a good measurement on the ripple due to difficulty getting the oscilloscope to trigger, but it was approximately 2 kHz by our reckoning.

The 5v I mentioned isn't the 5v output on the PDB itself; it's the output of the DC-DC converter fed from the +12V radio output on the PDB. We don't use an Axis camera on our robot, and so the dedicated 5v output on the PDB is disconnected. (Sorry for the confusion, I'd completely forgotten that there was a 5v output on the PDB.)

DMike
04-04-2012, 08:31
We had the same issue in CT. In 4 out of 10 matches we were fine, 5 had intermittent control, and 1 had no robot control post-autonomous, in no particular order. The robot ran fine on the wire and also fine on test radios in the pit and on the practice field. After much testing (on and off the field) we discovered that our four Jaguar controllers on the drive were overloading. It appears there was a delay or lag in signal quality; our driver would unknowingly increase the commanded voltage through joystick position. When the signal returned, the commanded voltage was high and the Jags would overload. The process would repeat itself: 3 seconds off, 15 seconds on, 4 seconds off, 15 seconds on, over and over. If the driver waited 20-30 seconds the robot would drive again, only to suffer the same issue soon after. Very frustrating.

Al Skierkiewicz
04-04-2012, 09:04
Mike,
What you describe is a typical Jag over-current trip. Once they go into fail mode, they wait about 3.5 seconds before returning to normal operation. Most teams using the Jags build in a ramp up to full throttle to help reduce current through the device when the motors are in stall.
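For illustration, a ramp like that can be as simple as a slew-rate limit applied to the joystick value each loop, so the commanded output can only change by a fixed step per iteration and a stalled drivetrain never sees an instant jump to full power. The step size here is a placeholder to tune on the real robot:

    // Sketch of a software throttle ramp (slew-rate limiter).
    // maxStepPerLoop is a placeholder to tune; with a 20 ms loop,
    // 0.04 per loop would take roughly half a second to go 0 -> 1.
    public class ThrottleRamp {
        private final double maxStepPerLoop;
        private double current = 0.0;

        public ThrottleRamp(double maxStepPerLoop) {
            this.maxStepPerLoop = maxStepPerLoop;
        }

        // Move the output toward the requested value, at most one step per call.
        public double apply(double requested) {
            double delta = requested - current;
            if (delta > maxStepPerLoop) {
                delta = maxStepPerLoop;
            } else if (delta < -maxStepPerLoop) {
                delta = -maxStepPerLoop;
            }
            current += delta;
            return current;
        }
    }

Feeding the joystick value through something like ramp.apply(stick.getY()) before it reaches the drive call is usually all it takes.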

DMike
04-04-2012, 09:59
We had a ramp-up program, and the jumpers on the Jag were moved to the appropriate position. It was definitely a problem with tripping the Jags; the question is what was the source of the problem. In any other drive condition the Jags worked fine; on the field they failed at random intervals. If there was an overloaded bandwidth issue and the controller was slow to react, would this create a situation where the joystick was in a full throttle position when the signal was processed? If the bot was stationary and regained consciousness at full power, would this overload the Jag?

Al Skierkiewicz
04-04-2012, 10:06
Mike,
Conditions for drivers on the field are rarely what they are back in the pit, back at your school, or on the practice field. The drivers are pumped up for competition and adrenaline is flowing. The Jag fail state is not reported back to the cRIO, and therefore not through the FMS to the Driver Station. If your student is at full throttle when the Jag re-enables itself, your DS will be sending the full throttle command to your robot during a stall condition on your drives. (Stall occurs anytime the motors are not moving and you apply current.)

DMike
04-04-2012, 10:17
Agreed. The nagging question is the root cause of the initial over-current condition. Why, with acceptable programming, were we entering the over-current state? What is the protocol for recovering or re-entering the normal function state?

techhelpbb
04-04-2012, 10:46
Agreed. The nagging question is the root cause of the initial over-current condition. Why, with acceptable programming, were we entering the over-current state? What is the protocol for recovering or re-entering the normal function state?

A few possibilities I can offer that a driver might do in competition that you might not see in another environment:

1. What happens if you drive your robot forward and backwards quickly several times?

2. What happens if you push another robot with your robot (put 130lbs of bricks on something with wheels and try it)?

3. What happens if you go full throttle up the bridges?

4. What happens if you skate along a wall with the robot?

5. If your robot can turn in place, what happens if you do that then suddenly reverse rotation?

6. What happens if you run your end effector and drive train at the same time (does the battery brown out)?

Just a few situations that easily crop up during matches but that merely driving around might not simulate.

Al Skierkiewicz
04-04-2012, 11:07
Mike,
The only way to recover is to wait. The way to prevent an over-current fault is to limit the demand, either in software or through driver training, under all conditions. Without watching your robot or knowing your drive train I can only speculate. Some of the more common factors are improper gear ratio choice, sticky tires, sharp turns, transmission losses, misalignment of chains and sprockets, binding drive parts, improper use of bearings, or cantilevered drive shafts.

MrRoboSteve
04-04-2012, 12:22
At 10,000 Lakes, our robot regularly had issues with getting connected to the FMS on-field.

The access point is mounted high in the robot, on Plexiglas at least 2" from any metal. We installed a new Ethernet cable as well. We have the DC-DC converter installed.

The solution that finally worked for us was cycling the power on the access point as we entered the field.

Has anyone had any experience using a shielded Ethernet cable? Was reading a bit about it here: http://www.l-com.com/content/FAQ.aspx?Type=FAQ&ID=4803

techhelpbb
04-04-2012, 13:02
We had the same issue in CT. In 4 out of 10 matches we were fine, 5 had intermittent control, and 1 had no robot control post-autonomous, in no particular order. The robot ran fine on the wire and also fine on test radios in the pit and on the practice field. After much testing (on and off the field) we discovered that our four Jaguar controllers on the drive were overloading. It appears there was a delay or lag in signal quality; our driver would unknowingly increase the commanded voltage through joystick position. When the signal returned, the commanded voltage was high and the Jags would overload. The process would repeat itself: 3 seconds off, 15 seconds on, 4 seconds off, 15 seconds on, over and over. If the driver waited 20-30 seconds the robot would drive again, only to suffer the same issue soon after. Very frustrating.

I am not a fan of suggesting it but I should tell you anyway.

If push comes to shove and you can't locate a reasonable resolution to your intermittent overload on the Jaguars, perhaps you could consider using Victors for at least the most likely overloaded systems.

Using the Victors won't actually stop you from overloading the system, but the Victors will just brute-force through until they self-destruct, and if all else were to fail, that might be sufficient to get you through until you can adjust either the software or the hardware to resolve this issue.

Keep in mind, what you'd be doing is not a good idea with the Victors either, but if it's rare and intermittent they'll probably survive it. If it's not a rare overload then you might literally be playing with fire (or at least smoke).

Obviously this would mean possibly adding PWM for the Victors to a robot that might not be using PWM to control the Jaguars, and this might mean retuning the controls for the differences. This could also involve tinkering with some wiring. However, you could prepare for such a change in advance of competition.

Mark McLeod
04-04-2012, 13:32
I think the Victors will survive.
I actually tested this hypothesis a couple of weeks ago and the 40amp breaker on the circuit will trip before the Victor suffers harm.

The order seems to be:

Jaguars trip first (most sensitive)
40 amp breakers trip second

What this means is that swapping the Jag out for a Victor will help in marginal cases, but won't do any good if the problem is more than borderline.
Some gear ratios just need to be recalculated.

techhelpbb
04-04-2012, 14:40
I think the Victors will survive.
I actually tested this hypothesis a couple of weeks ago and the 40amp breaker on the circuit will trip before the Victor suffers harm.

The order seems to be:

Jaguars trip first (most sensitive)
40 amp breakers trip second

What this means is that swapping the Jag out for a Victor will help in marginal cases, but won't do any good if the problem is more than borderline.
Some gear ratios just need to be recalculated.

I generally agree. The risk is not so much the incidental overload, but the cumulative damage. The auto-reset breakers are not very fast, and they often finally open the circuit at currents much higher than their ratings. The Victors are well built and we've been using them for years with very low failure rates, but I can see how repeatedly banging on those MOSFETs could result in eventual failure.

DMike
04-04-2012, 15:23
Gearing and gearboxes were mostly KOP; one issue that could have occurred was misalignment under full load, potentially binding a gear. I don't think this was the issue, but we will test and answer that. We had all four Jags overload simultaneously. With four-wheel drive, I think the bot would continue to move even if only 1 motor was active, albeit in a circle. Victors and direct drive might be the solution.

techhelpbb
04-04-2012, 15:47
Gearing and gearboxes were mostly KOP; one issue that could have occurred was misalignment under full load, potentially binding a gear. I don't think this was the issue, but we will test and answer that. We had all four Jags overload simultaneously. With four-wheel drive, I think the bot would continue to move even if only 1 motor was active, albeit in a circle. Victors and direct drive might be the solution.

Can you clarify what you mean by direct drive?

Al Skierkiewicz
04-04-2012, 15:48
The Victor FETs are rated at around 40 amps each with three in parallel for each leg of the bridge for about 120 amps. This figure is continuous with peaks above that but must be derated as junction temperature goes up. Remember that neither controller has any heatsink for the power devices. The 40 amp circuit breaker can withstand up to 240 amps for short durations but will likely first trip between 100 and 120 amps. The trip point is reduced for each successive trip as the breaker heats up internally. When the breaker trips repeatedly, it buzzes and gets HOT!
The stated trip for the Jag is greater than 90 amps for just under 1 second, with 100 amps stated as the maximum current through the device. Following an over current trip, the Jag will wait about 3.5 seconds before re-enable. While the series resistance for the FET is lower than the one used in the Jag, there are only two per leg.

techhelpbb
04-04-2012, 16:55
The trip point is reduced for each successive trip as the breaker heats up internally. When the breaker trips repeatedly, it buzzes and gets HOT!

I just want to add that this means it's hard to repeatedly overload the Victors in a short duration of time when the 40A auto-reset breaker is protecting the circuit, because the breaker heating causes it to trip at lower and lower currents.

However, each time you let the breaker cool off you run the risk again that the next overload will get a shot at cumulatively damaging the MOSFETs in the Victors because it's back to 240A max and 100A-120A normally.

The time frame between matches can reach that point. Also if you blow cool air on the breakers you increase the risks of this happening.

Al Skierkiewicz
05-04-2012, 07:55
Mike,
How did you couple to the wheels from the KOP transmissions? When you reached trip point were you driving straight or turning? Were you using the standard KOP frame with bolts for wheel axles?

DMike
05-04-2012, 09:12
All Jags were independently powered from the distribution board, fused at 40 amps, and wired with 10 ga. Power consumption on each circuit should have been the max of 1 motor. The two front motors were CIMs with a simbox and KOP gears and sprockets, with #35 chain and 8" Plaction wheels. The rears were the 16,000 rpm AM motor coupled to an AM planetary, with the simbox and KOP gears and sprockets, 8" omni wheels, and #35 chain. We realize there is a difference in RPM between front and back; testing under all foreseeable conditions proved it to be an acceptable setup. Our acid test was a 500% game cycle test at accelerated driving speeds. We are going to set the bot up and load test each circuit under varying conditions; this should provide us with some good data.

Al Skierkiewicz
05-04-2012, 09:46
Mike,
I am going to bet that the 8" wheels are really the big effector here. Have you calculated your final gear ratio and motor RPM?
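For illustration, the back-of-envelope check looks something like this. The free speed and reduction below are assumptions to be replaced with your real numbers; only the 8" wheel comes from the post above:

    // Rough free-speed check. The motor free speed (~5300 RPM for a CIM)
    // and the total reduction are illustrative assumptions.
    public class DriveSpeedCheck {
        public static void main(String[] args) {
            double motorFreeRpm = 5300.0;   // assumed CIM free speed
            double totalReduction = 8.0;    // assumed overall motor-to-wheel ratio
            double wheelDiameterIn = 8.0;   // the 8" wheels in question

            double wheelRpm = motorFreeRpm / totalReduction;
            double feetPerSecond = wheelRpm * Math.PI * (wheelDiameterIn / 12.0) / 60.0;
            System.out.printf("Theoretical free speed: %.1f ft/s%n", feetPerSecond);
        }
    }

With these assumed numbers the result comes out above 20 ft/s, far faster than a typical drivetrain; a robot geared that tall spends a lot of time near stall when pushing, which is exactly when Jaguars trip.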

Alan Anderson
06-04-2012, 21:46
I am going to bet that the 8" wheels are really the big effector here.

Good point. There's a team at Queen City this weekend having Jaguar cutouts on their drive motors, and they have CIMpleBoxes driving 8" wheels.

Captaindan
09-04-2012, 23:31
Team Fusion did not change a single thing on their bot since the Bayou regional. We went to Lone Star, unbagged the robot, and put it on the field; it connected and had no problems the entire regional.

tagayoff
11-04-2012, 12:21
At 10,000 Lakes, our robot regularly had issues with getting connected to the FMS on-field.

The access point is mounted high in the robot, on Plexiglas at least 2" from any metal. We installed a new Ethernet cable as well. We have the DC-DC converter installed.

The solution that finally worked for us was cycling the power on the access point as we entered the field.

Has anyone had any experience using a shielded Ethernet cable? Was reading a bit about it here: http://www.l-com.com/content/FAQ.aspx?Type=FAQ&ID=4803

At the Central Valley Regional we had connection issues in all the matches except two. After changing all the electrical components and then some, in the 9th match things started working, but not at first. When our robot was brought onto the field, the same connection issues were apparent (no connection, dropped packets). Two FTAs went onto the field, and after they came off we were connected. After the match I asked the FTA what he had done, and he said they connected the driver station to the robot when they saw no connectivity, and it connected okay through a cable. But when he disconnected the driver station, the radio would not connect to the field, so he powered the robot off to reboot it. When it came back up we had connectivity for the entire match. For the next match we made sure we did not connect the robot to our driver station by cable, or if we did, we powered the robot off and then on, and so on. In our last match we ran again, not perfectly because of the code we had running, but we ran the whole match with few lost packets. My guess is we were actually cycling power to the bridge as we entered the field, as you stated above.

techhelpbb
13-04-2012, 18:16
I am in NYC right now, not Philly, but students from Team 11 at the MAR regional are reporting significant connectivity issues to me.

Apparently the issues are impacting even our team, and we've participated in 3 other events without any problems.

Can someone confirm these reports?

Deetman
13-04-2012, 19:48
I am in NYC right now, not Philly, but students from Team 11 at the MAR regional are reporting significant connectivity issues to me.

Apparently the issues are impacting even our team, and we've participated in 3 other events without any problems.

Can someone confirm these reports?

I can confirm this, and the issues all seem to have centered around Red Driver Station 1. I know 1712, 11, 25, 1676, and others have all had issues at Red Driver Station 1 and no others.

1712 has not had ANY connectivity issues in 36 of our 37 matches to date, with the only failure today at the MAR championship. Our driver station laptop could not connect to the robot, but as soon as we swapped out to the FTA's classmate we had no issue. This of course hurt us in the match, as the drivers did not have their normal feedback for shooting, but we didn't have any other issues. Returning to the pits, we tethered up and had no issues at all with our normal driver station laptop and two different Ethernet cords. Unfortunately the pits closed immediately after the award ceremony (huh?) and we were not allowed out to the field to work with the FTA to further troubleshoot/check the problem.

I have noticed a LOT more connectivity issues here at the MAR Championship than I saw at Hatboro-Horsham, Chestnut Hill, and Lenape, all with the same MAR field (as far as I know). I'm not ready to rule out that something between our driver station laptop and the FMS disagreed, but the fact that this has only happened in one match, in one driver station, at one event, and to other teams in the same driver station really makes me wonder.

techhelpbb
14-04-2012, 06:57
Thanks, I'll forward the observations to them. Team 11 has communicated that this has played a role in costing them 3 losses (when I last looked they had lost 4 times...so it's safe to say they aren't over-estimating the impact of this 'phantom' problem). We've sent out the spare 4-slot cRIO FRC2; they are either close to swapping, or have already swapped, the DC-DC converter for the D-Link AP. They've swapped the AP. They have reflashed the cRIO. They can swap the cRIO. The logs show communication and an 11-12V battery till the communication simply stops. I know for a fact from many test runs that the robot will remain in communication down to just over 9V on that measurement. They are rebooting the entire robot when they load it onto the field as a standard process (this includes the AP and the cRIO).

I've given them some TCP/IP-related suggestions to try, and now that you tell me this, I'm gonna ask them to dump the registry settings on that laptop to get a comparison with one that works.

I don't believe in 'phantom' problems, and while I can't say for certain that the problem isn't in that robot, extremely little has changed between the MORT district event we hosted and this event. We are not underfunded, and a bad battery is extremely unlikely because of it. Therefore there's likely something wrong in the wireless, and if it's going to cost a place in eliminations at a championship, I should think it would be a priority to find out what.

It's quite frustrating that we've had no apparent problems up until this point, that we've done basically nothing to the robot that would start the problems, and that we're not even using any high-bandwidth applications (no sending video), and yet here we are with problems. The control system on that robot is basically using PWM, and if it's not communicating, to put it bluntly, that means there is essentially no safe design to escape this mess. It's even more frustrating that I pointed out in these forums, before these competitions began, that we had problems like this just once before (with last year's robot) at a field at a small off-season event, and during that time the process for troubleshooting it was hopelessly ineffective because the tools to do the troubleshooting simply do not exist within the context of the FMS.

Having had access to the tools we were offered to do this troubleshooting this year, I can say again that the tools we've been offered are clearly hopelessly ineffective. In each case I've seen, at the MORT district event and now here, the only resolutions have come from trial and error, not quantitative measurement. Trial and error means you get to forfeit games to test one thing after another, and since this seems to mostly affect the actual playing field and not the practice fields, you are forfeiting games you and your sponsors paid to play in order to troubleshoot problems you can't replicate on your own.

I'm not sure what frustrates me more: that we can't use this adversity to teach the students more about troubleshooting and the technology, or that we don't get to compete in the competition.

rsisk
14-04-2012, 10:58
Have you swapped the PDB yet? I know of at least three teams for which swapping the PDB solved intermittent connection issues. Try to find a non-2012 PDB.

Alan Anderson
14-04-2012, 11:00
The logs show communication and an 11-12V battery till the communication simply stops. I know for a fact from many test runs that the robot will remain in communication down to just over 9V on that measurement.

If communication is lost at 9 volts, something is wrong with the robot. The cRIO and wireless bridge power outputs on the Power Distribution Board are designed to maintain full voltage down to below 5 volts on the battery input.

techhelpbb
14-04-2012, 12:01
If communication is lost at 9 volts, something is wrong with the robot. The cRIO and wireless bridge power outputs on the Power Distribution Board are designed to maintain full voltage down to below 5 volts on the battery input.

I have no idea how low it can really go till it totally loses communications, actually (though I understand how you could interpret that sentence like that). At 9V we don't consider that robot to be fully functional, as it impacts the shooter performance quite noticeably.

If there were something terribly wrong like that, the odds are it wouldn't have gotten into elimination or near-elimination matches at 3 (make it 4) events.

Also, I don't think you can claim that the robot can communicate with a 5V battery, despite the rating of the converter, because the moment the robot starts to move any actuator, a battery at that level of discharge will cave and you will be outside the nominal voltage range of the DC-DC converter. Heck, that's less than 50%, so really at that point you've deep-cycled the battery, and you really would have something wrong if your cRIO was reading a battery voltage of just 5V even for a short period. That would be indicative of a very heavy load, a battery issue, or a discharged battery.

We tested this design quite a bit. We don't have discharged batteries because we have plenty of them, constantly tended. We don't have bad batteries because we'd likely have discovered them in the batch some time ago. We might suddenly have 1 or 2 marginal batteries, but across that many matches that shouldn't happen, and so far as I know we never created any bad batteries with this robot. Additionally, even driven incorrectly, with all the mechanisms moving under full power (and the compressor running), the robot can easily sustain activity for longer than a match; we did that over and over (at least 50 times). Its performance will suffer, but it has never...not even once...lost communications under those circumstances, and we did have a field we constructed with bridges, other robots, and targets to test it.

Additionally, as I've noted elsewhere, when this was happening at the MORT district event someone took my oscilloscope and put it on the affected robot in the pit (an entirely unrelated team, as I was in charge of spare parts and I let anyone use my tools). The battery back from that failure was not having any issues that would cause the DC-DC converter to drop its output voltage. That was the same battery, on the same robot, right after a failure. So it would seem that this can't all be battery supply related. Additionally, it wouldn't explain why, with the same model bridge and again a well tested battery, we lost communications several times last year during a single off-season event (an event with countless field operations issues, I might add).

techhelpbb
14-04-2012, 12:08
Have you swapped the PDB yet? I know of at least three teams for which swapping the PDB solved intermittent connection issues. Try to find a non-2012 PDB.

As it stands they've swapped the DC-DC converter and for the moment they've avoided any additional issues since doing that. However, other teams are apparently still having communications problems. It might be related, or as I've already suggested it might not be related at all.

Thing is, this is a terrible way to troubleshoot. Firstly, the field itself has probably been power cycled since yesterday. Secondly, we have no idea what has happened to the field since it was left overnight. Thirdly, we had (I think) another 3 matches (before eliminations), but each time you're operating in different locations in that environment.

The only good part about this is that it's not just us. Okay, that's not really a good part either, as we want to compete on a level communications playing field. At least it's acting as a leveling effect for the tournaments. If your robot has other issues, then as long as this issue rotates its way around the ranks, the rankings should somewhat self-level.

If luck holds out they'll avoid any more surprises like this. If not, when that DC-DC converter gets back here I'm going to see if I can find any reason it might be contributing to issues. Being a power component, it's not like we have lots of choices in how to handle it.

I will add this anecdote. Early this year, with a prototype robot, one mentor constructed an LED light source and put it on the 2012 PDB. It was drawing less than the PDB specifications, but that PDB was literally making a whistling noise. That PDB is not on the robot they have. I chalked it up to loose magnetic components, but as the load changed the pitch changed (he was gonna make it play a little tune). The LEDs were not a switched load, as he was simply limiting their current with a resistor. Still, that was the first time I've ever seen one of those PDBs do that, and there was no excuse because that was a fraction of the available current.

techhelpbb
14-04-2012, 15:50
At the end of the day, after we replaced the DC-DC converter the problem did not come back, at least not for us. It looks like 1403 lost communications on our red alliance, and it helped cost us the elimination match.

It was great fun as always, the communications issues aside, and as with the other competitions before this where we didn't have communications problems, we placed highly. So as it turned out it wasn't the defining limit of our game; instead things came down to more regular issues.

Thanks to everyone for your input on the matter, and to Team 11, still out at the event: great job!
Additionally, thanks to the field folks and MARs for the game and support!

Alan Anderson
14-04-2012, 18:11
Also, I don't think you can claim that the robot can communicate with a 5V battery despite the rating of the converter because the moment the robot starts to move any actuator that battery will cave at that level of discharge and then you will be outside the nominal voltage range of the DC-DC converter.

1) When the battery voltage drops that low, the cRIO will have disabled all the robot actuators anyway in order to maintain power for communication.

2) When you say DC-DC converter, are you talking about the internal circuitry of the Power Distribution Board, or do you mean the 12v-to-5v converter for the D-Link?

Deetman
14-04-2012, 18:53
Here are my final observations from the MAR Championships...

1) This was apparently one of FIRST's fields and not MAR's, possibly one of the advance/emergency fields. As a result I'm not sure what, if any, events this field was used for.

2) I saw far fewer communications issues Saturday, but didn't really watch all the qualifying matches.

3) Comms issues throughout the tournament seemed random and didn't single out any team specifically that I am aware of, other than 1676.

4) In the elims the FTAs/1676 would allow the robot to connect to the field and then wait for some amount of time. If it didn't lose communications they'd be good for the match, otherwise they'd wait again (that only happened once). While I don't know the details, something specific to their robot/driver station was having more issues with the field than everyone else's. I'd really like to hear their experiences and what they found out.

I must say, the FTAs and event staff did a great job being patient and troubleshooting in an attempt to make sure the matches started with all robots connected to the field.

mdrouillard
14-04-2012, 19:36
Check for metal shavings in your crio bay and ports. Someone may have drilled above your board and the electronics are complaining. Also check your radio version. Make sure it is version 1.2 and not 1.41.

Md

techhelpbb
14-04-2012, 20:10
1) When the battery voltage drops that low, the cRIO will have disabled all the robot actuators anyway in order to maintain power for communication.

2) When you say DC-DC converter, are you talking about the internal circuitry of the Power Distribution Board, or do you mean the 12v-to-5v converter for the D-Link?

I'm talking about the module you are supposed to connect to the D-Link.
Sorry am on my phone.

techhelpbb
14-04-2012, 20:13
Check for metal shavings in your crio bay and ports. Someone may have drilled above your board and the electronics are complaining. Also check your radio version. Make sure it is version 1.2 and not 1.41.

Md

Always a good thing to check but it can't be the cause in this case.
The design would make it quite improbable.

If this was a robot issue at all, replacing that DC-DC module seems to have been the fix.

Alan Anderson
14-04-2012, 23:00
Also, I don't think you can claim that the robot can communicate with a 5V battery despite the rating of the converter because the moment the robot starts to move any actuator that battery will cave at that level of discharge and then you will be outside the nominal voltage range of the DC-DC converter.

2) When you say DC-DC converter, are you talking about the internal circuitry of the Power Distribution Board, or do you mean the 12v-to-5v converter for the D-Link?

I'm talking about the module you are supposed to connect to the D-Link.

You don't seem to understand the purpose of the boost-regulated 12 volt output from the Power Distribution Board. It stays high enough for the wireless bridge to function even if the battery sags to a ridiculously low voltage. If the PDB and 12v-to-5v converters aren't faulty, the system will maintain the bridge's operation while the battery is supplying only five volts.

techhelpbb
15-04-2012, 00:32
You don't seem to understand the purpose of the boost-regulated 12 volt output from the Power Distribution Board. It stays high enough for the wireless bridge to function even if the battery sags to a ridiculously low voltage. If the PDB and 12v-to-5v converters aren't faulty, the system will maintain the bridge's operation while the battery is supplying only five volts.

I am confused as to how you arrived at this conclusion that I don't understand what you're trying to communicate.

1. You are saying that the D-Link will communicate down to 5V, and I'm telling you that the system battery voltage was nowhere near that low, and the logs demonstrate that. So even if you're correct...it doesn't matter to this situation with this robot. If that battery is reading below 9V nominal on the driver's station, this robot needs a charged battery, and there's no evidence to support the claim that it was that low. Not in the logs or in any set of measurements I've got from the team or the field personnel. It doesn't matter just how low a battery voltage the D-Link AP may, or may not, be able to operate at; that robot is designed to shoot baskets, and the system can't function within the parameters the drivers expect with a battery anywhere near that low.

2. I've also mentioned at least twice that replacing the DC-DC converter seems to have fixed it. Only if the DC-DC converter connected to the D-Link AP was actually bad would that be the whole problem, and I still believe it's either the scapegoat or just part of the overall problem. I will gladly bench test the suspected bad DC-DC converter when it's available to me and, if you like, provide the results (I can probably even send it to you to test). The point here is that we might not be talking about a properly functional DC-DC converter (see my note in the next post, because there's a problem with this idea).

3. I've already explained in detail that the battery being low or damaged was (as this is now past tense) not the case.

4. I've also explained that at several previous competitions we operated in dozens of matches with zero events like this. That doesn't include dozens of matches on our test field and some matches on the practice fields. So what I'm saying here is that if that DC-DC converter is the problem, it literally just went bad. That would *really* bother me, because that particular part is pretty well protected from swarf and its load should be predictable (again, see my next post below).

So what am I not understanding here? That we have a possibly bad DC-DC converter? I didn't miss that we replaced it, at my recommendation, just before the problems stopped.

I understand you're trying to help. However, this is now literally beyond help. After the DC-DC converter to the D-Link was replaced, we continued our remaining matches without problems. However, other teams on our alliance failed to be able to communicate and, short of their ample assistance, we seem to have done the best we could.

I'm not sure how what I may, or may not, understand affects the other robots on the field.
Perhaps we all just have a lot of unrelated power problems that affect the D-Link AP.
However, again, that would mean that something that defies the expectations of the specifications is at work here.
We are all a little too far along in these competitions for really defective robot designs.

techhelpbb
15-04-2012, 00:54
So here's the thing with these D-Link power issues. Let's assume that we can power this D-Link AP well below a battery voltage of 5V in a properly functional system. I respect you Alan and I haven't tried to push this limit myself so let's take this as gospel.

It really does not explain why these problems are so hard to troubleshoot off the competition field, seemingly regardless of the number of tests run or the equipment. It's not just me, and especially not just me since I have attended exactly one competition this year, and I spent that as the spare parts volunteer. We've had MARs folks, FIRST folks, trained engineers and teams struggling with this (and my heartfelt thanks to each and every one of you).

1. If we had the ability to measure the voltage feeding the AP on the robot at all times, that would help. I have a circuit that could perform this function, but because it touches those power leads it's probably illegal on the field. So we can't really test the power to that D-Link AP, either from the PDB or from the DC-DC converter connected to the D-Link, during a competition match. It's not a feature we have in our current system, and the rules prohibit the inclusion of circuits that would add the capability.

2. I wonder what the FIRST configuration of the D-Link AP does to its power requirements. I can easily test the power requirements of the D-Link AP when it's communicating with a B/G network or an N network, but not so easily with the competition field, because of issue one above. Perhaps the reason these problems tend to favor the competition field is that the field settings for the D-Link AP somehow increase its power requirements in a way other environments don't. That change might not even be the sort of thing you can test with a voltmeter or ammeter when the robot is stationary. If that's the case, then we have an issue that makes it hard to test elements of the system unless we simply load the power feeding the D-Link to its maximum rating as a test.

If this were the case it might explain a great number of things, and would mean the field itself isn't the issue. It would even touch on the reason that some teams get nailed by the PDB and some teams get nailed by the DC-DC converter connected to the D-Link (assuming they have a decent power supply from the battery). Perhaps an overload situation is created beyond the specifications of the systems, or perhaps those specifications are not being met in production for these parts.

In any case, did anyone who actually had problems fully load test the system at the point where the D-Link AP is connected? Can anyone confirm that they actually tested this, regardless? I will confirm that, to my knowledge, we did not fully load that D-Link power supply. We tested it anecdotally by simply using it without issue...till we had an issue on this particular competition field after many successful previous uses. Perhaps this thing is a bit of a ticking time bomb.

jteadore
15-04-2012, 14:10
We also had communications issues all season, but what happened at the MAR championship was interesting. We would connect to the field when we were setting up for a match. When all the robots were connected, about a minute later we would disconnect. After a reboot we would connect and run through the match with no issues. The FTA and NI engineer were as puzzled as we were. Fortunately, since we had a workaround with the reboot, we were allowed to run through that sequence for each match.

Just as a side note I want to thank the FTAs and NI engineers at the event. They really went above and beyond to help.

techhelpbb
15-04-2012, 14:35
We also had communications issues all season, but what happened at the MAR championship was interesting. We would connect to the field when we were setting up for a match. When all the robots were connected, about a minute later we would disconnect. After a reboot we would connect and run through the match with no issues. The FTA and NI engineer were as puzzled as we were. Fortunately, since we had a workaround with the reboot, we were allowed to run through that sequence for each match.

Just as a side note I want to thank the FTAs and NI engineers at the event. They really went above and beyond to help.

At the MAR Mount Olive District event, a team sending video back to their driver's station only seemed to have issues when they sent that video stream. If they turned off the video, the robot worked fine but was at a big disadvantage because a key feature was turned off.

Here's something I've been thinking about:

In my post above I proposed that perhaps the field settings for the D-Link AP raise their power requirements to a marginal or overload situation. A situation that only exists on a competition field when that particular situation is created and can't easily be measured with robots moving.

I would think that there's a non-linear but generally increasing relationship between the D-Link AP's use of power and its attempts to communicate. I would propose that if the AP's general power requirements are already increased, sending additional payload is the equivalent of fuel on a fire. Sending that traffic might push what is otherwise a not-very-good power situation (one that might eventually damage the power components) into a situation where the odds are the D-Link AP's power is insufficient and it'll misbehave. That situation might also leave the D-Link AP fully powered up once the robots stop moving, and therefore can't be observed easily after the fact.

This possibility fits a few situations. Sending video streams from the cameras over TCP/IP to the driver's station should and does work on non-competition fields and with no field at all; N wireless has more than enough bandwidth for that application. However, this proposal would apply here and create this issue only on the competition field (on the robot side).

Additionally, when you first enable the D-Link AP as things normalize you're prone to have a burst of very busy transactions. Again, another shot of high traffic that could raise the D-Link AP power requirements so it could also apply in this case. You then reboot the D-Link AP as described in the post immediately above and you clear the immediate consequences of the overload you induced in that short period. Which would mean that all it would take is a good hard shot of communications to cause malfunction again.

Did anyone fully load test the D-Link AP power system?

Did anyone measure the D-Link AP power requirements when the robot was running and moving on a competition field (using 2 multimeters, one as a voltmeter and one as an ammeter on MAX/MIN would be a good start...be aware that's not a perfect test)?

rsisk
15-04-2012, 15:41
Was talking with 846 at SVR and they made the comment that the radio increases its power requirement when it starts to detect dropped packets. They supposed that was part of their issue with dropped communications. The increased power requirement was causing a marginal component (the PDB in their case) to fail.

techhelpbb
15-04-2012, 16:12
Was talking with 846 at SVR and they made the comment that the radio increases its power requirement when it starts to detect dropped packets. They supposed that was part of their issue with dropped communications. The increased power requirement was causing a marginal component (the PDB in their case) to fail.

It would only get worse with packet loss. TCP is a reliable protocol, so it'll keep retrying delivery of the packets you've asked to be sent until it times out. So basically if something starts to cause issues, it'll try again, causing congestion, then it'll try some more. Eventually it's a storm.

One of the things I suggested to Team 11 was to change the timing behavior of the driver's station TCP stack so that it would either wait longer or give up sooner on a successful delivery. They could also do that on their robot-mounted laptop if they were using it. This would help reduce the load on the network, but with so little time to test it might have caused unpredictable control issues for them. Of course, in their case the damage to the DC-DC converter could already have been done, so while it might have helped before that damage happened, in this case possibly not.

A camera using TCP would also set up this situation. It's really not the best protocol for streaming live video content (if you lose live video data...give up and get more). So if the D-Link AP does draw more power when packets are lost, then this would be a great way to cause a problem.
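As an aside, if you are streaming camera video and just want to lighten the load while the root cause gets sorted out, you can ask the Axis camera for a smaller, more compressed image from robot code. A minimal sketch, assuming the 2012 WPILib C++ AxisCamera class and its Write* setters (the camera IP is a placeholder; check your own WPILib version before leaning on this):

#include "WPILib.h"

// Hedged sketch: shrink the camera stream so any retransmit storm has less to chew on.
// Assumes the 2012-era WPILib C++ AxisCamera API (GetInstance/WriteResolution/etc.).
class CameraTuningRobot : public SimpleRobot
{
public:
    CameraTuningRobot()
    {
        AxisCamera &camera = AxisCamera::GetInstance("10.xx.yy.11"); // placeholder camera address
        camera.WriteResolution(AxisCamera::kResolution_320x240);     // smaller frames
        camera.WriteCompression(30);                                  // more JPEG compression
        camera.WriteMaxFPS(15);                                       // fewer frames per second
    }

    void OperatorControl()
    {
        while (IsOperatorControl())
        {
            Wait(0.02); // normal teleop loop; the camera settings above persist
        }
    }
};

START_ROBOT_CLASS(CameraTuningRobot);

Less data in flight means fewer retransmissions when packets do get dropped, which at least keeps the storm smaller.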

I'm gonna leave my questions on the floor for other input:

Did anyone fully load test the D-Link AP power system?

Did anyone measure the D-Link AP power requirements when the robot was running and moving on a competition field (using 2 multimeters, one as a voltmeter and one as an ammeter on MAX/MIN would be a good start...be aware that's not a perfect test)?

Deetman
15-04-2012, 16:17
Expanding on the potential of an increased load on the D-Link power circuit...

We know that teams have had a hard time duplicating this issue at their own facilities. We also know that some events have not had as many reported issues (FiM and MAR district events for example).

Looking at the events where issues have occurred they seem to all be the traditional regional-type event in an arena/larger type venue, perhaps largely taking place in a city. In contrast, smaller events are taking place in high schools and other such venues where the environment is much more controlled.

Are we looking at a situation in which the D-Link AP is exceeding its published power specifications due to dealing with "normal" interference from various sources such as campus/venue WiFi, a large number of teams with their routers on in bridge mode, and other devices such as Bluetooth? As a result of this increased load, are we damaging the DC-DC converter or PDB, pushing them out of spec, or degrading them out of spec over time? Are we, or specific teams (for whatever reason), hitting some issue in the firmware/baseband of the D-Link? I doubt anyone could argue that we are using the D-Link in its designed environment...

I'm going to see if I can get one of our DC-DC converters and one of our "not for competition use" PDB and attempt to load them down to their specification and beyond.

Alan Anderson
15-04-2012, 16:41
I am confused as to how you arrived at this conclusion that I don't understand what you're trying to communicate.

Two things you said gave me that impression. You challenged the claim that communication can be maintained at a battery voltage of 5 volts, and you responded to a mention of the PDB's boost-regulated 12 volt output with a reference instead to violating the input specification of the 12v-to-5v converter.

I only brought up the low-voltage capabilities of the system because it looked like you said the robot was losing communication at slightly above 9 volts, and that would be a symptom of either a faulty PDB or incorrect wiring.

It really does not explain why these problems are so hard to troubleshoot off the competition field seemingly regardless of the number of tests run or the equipment.

I haven't had the opportunity to test things using a known-faulty setup, but I have seen hints that the wireless portion of the D-Link quits working if the power drops even a little below 5 volts, while the wired portion continues to function through the sag. If that's indeed the case, then a bad PDB or CPR360 (or incorrect wiring) could result in loss of communication on the field but perfectly good operation in the pit. It's also true that a robot is rarely working as hard when it's up on blocks as when it's trying to turn on a carpeted surface, so a marginal power system is much more likely to reveal itself while the robot is actually running a match.

1. If we had the ability to measure the voltage feeding the AP on the robot at all times, that would help. I have a circuit that could perform this function, but because it touches those power leads it's probably illegal on the field.

What rule would keep you from doing this on a competition robot?

DonRotolo
15-04-2012, 16:50
If communication is lost at 9 volts, something is wrong with the robot.

Well aware of that, Alan. I can't speak for others, but low voltage was not a factor for us.

I'd really like to hear their experiences and what they found out.

See my post in some other related thread, and jteadore's post above.

Check for metal shavings in your crio bay and ports.

I can't speak for others, but this was not an issue for us, absolutely positively.

As I mentioned in my other post, we are heck-bent on duplicating this at home - after CMP though.

techhelpbb
15-04-2012, 19:22
Two things you said gave me that impression. You challenged the claim that communication can be maintained at a battery voltage of 5 volts, and you responded to a mention of the PDB's boost-regulated 12 volt output with a reference instead to violating the input specification of the 12v-to-5v converter.

I only brought up the low-voltage capabilities of the system because it looked like you said the robot was losing communication at slightly above 9 volts, and that would be a symptom of either a faulty PDB or incorrect wiring.


Well, the Team 11 robot is >not< losing communication at slightly above 9 volts nominal on the driver's station. Our robot just isn't a useful system when the driver's station reading nominally gets that low, so it doesn't really matter whether you can communicate with a robot that won't perform. Again, that voltage was never that low. I respect you and the point you thought you were making, so let's move on.


I haven't had the opportunity to test things using a known-faulty setup, but I have seen hints that the wireless portion of the D-Link quits working if the power drops even a little below 5 volts, while the wired portion continues to function through the sag. If that's indeed the case, then a bad PDB or CPR360 (or incorrect wiring) could result in loss of communication on the field but perfectly good operation in the pit. It's also true that a robot is rarely working as hard when it's up on blocks as when it's trying to turn on a carpeted surface, so a marginal power system is much more likely to reveal itself while the robot is actually running a match.


Wouldn't be applicable to Team 11. As I've stated, we've driven both our robots (we have 2) around on our regulation carpeted test field, on other competition fields, and on other practice fields for full tests of operation more than 50 times, sometimes quite aggressively. If the carpet-induced load were the prime factor, we'd have seen it quite a long time ago. We did not see this problem before this event, and that's a whole lot of hard driving by a wide variety of drivers with a wide variety of experience driving our robots (and cars, I hope).

Though I absolutely agree that if the loading of the system were the factor, not testing on the regulation surface would be a bad idea. I can assure you, at least with our design on Team 11, it's been tested very well. So this isn't the hidden cause for us, and I personally know it's also not the issue for several other affected teams.

Additionally, when my oscilloscope was used to test another team's robot at the MAR Mount Olive event, they did try loading the wheels while the robot was off the floor, along with powering up other devices in different ways. From what I was told, the power from the battery did not sag anywhere into the territory you're describing.

What rule would keep you from doing this on a competition robot?

Please see rule R42, part B:

"The wireless bridge power feed must be supplied by the 5V converter (model # TBJ12DK025Z) connected to the marked 12 Vdc supply terminals located at the end of the PD Board (i.e. the terminals located between the indicator LEDs, and not the main WAGO connectors along the sides of the PD Board). No other electrical load can be connected to these terminals (please reference any 2012 Robot Power Distribution Diagram posted on the Kit of Parts site for wireless bridge wiring information."

If you look only at that rule, you'd be breaking it if you inserted a current sense resistor into the path of the wireless bridge's power or put a high impedance circuit in parallel with its input. However, I'm aware of this:

Rule R47:

"Custom circuits shall not directly alter the power pathways between the battery, PD Board, speed controllers, relays, motors, or other elements of the Robot control system (including the power pathways to other sensors or circuits). Custom high impedance voltage monitoring or low impedance current monitoring circuitry connected to the Robot’s electrical system is acceptable, if the effect on the Robot outputs is inconsequential."

Problem is, if you're a real stickler it doesn't actually call out the wireless bridge.
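Just to put a number on what "high impedance" and "inconsequential" would look like here (a rough example with an assumed divider value, not a measurement): a 100 kOhm monitoring divider across the nominal 5V feed draws 5 V / 100 kOhm = 50 uA, which is about 0.25 mW. Against a bridge that draws on the order of an amp, that's a few thousandths of one percent of the load, which is hard to read as anything other than inconsequential to the robot's outputs.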

If you disagree with my interpretation of this, then we are back to my original question, since you'd interpret it as an allowed thing to do:

Did anyone measure the D-Link AP power requirements when the robot was running and moving on a competition field (using 2 multimeters, one as a voltmeter and one as an ammeter on MAX/MIN would be a good start...be aware that's not a perfect test)?

So if you think that rule allows you to measure that information, you could go further and put a custom circuit on the D-Link AP power and log the data (probably into the control system).

Now, additionally, let me point this out: if you create this custom monitoring circuit before build season...like I did...you're dancing with Rule R18, because I don't sell it yet:

"Please note that this means that Fabricated items from Robots entered in previous FIRST competitions may not be used on Robots in the 2012 FRC. Before the formal start of the Robot Build Season, teams are encouraged to think as much as they please about their Robots. They may develop prototypes, create proof-of-concept models, and conduct design exercises. Teams may gather all the raw stock materials and COTS Components they want."

The open source exclusion might not apply because I never fully put up the schematics either:

"Example: A different team develops a similar solution during the fall, and plans to use the developed software on their competition Robot. After completing the software, they post it in a generally accessible public forum and make the code available to all teams. Because they have made their software generally available (per the definition of COTS, it is considered COTS software and they can use it on their Robot)."

Then there's the whole 'it was built by a mentor, not a student' issue, which bothers me but isn't against the rules.

If you use commercial multimeters then you avoid this. However, then you have loose test equipment on your robot. You could get it reinspected...but again...has anyone done this? Oh, and by the way, multimeters reading DC aren't the best choice anyway, because they check very infrequently compared to, say, an oscilloscope, so a short surge or drop might slip right past. I can't say how fast the cRIO can monitor that information; it depends on a lot of factors.

techhelpbb
15-04-2012, 19:44
Double post (lost Internet access sorry).

EricVanWyk
15-04-2012, 20:43
If you would like to add such a monitoring device, you may be able to get an exception with the consent of the LRI and the FTA. They will need to contact FRC HQ, so ask well in advance.

techhelpbb
15-04-2012, 21:17
If you would like to add such a monitoring device, you may be able to get an exception with the consent of the LRI and the FTA. They will need to contact FRC HQ, so ask well in advance.

Thank you. I'm also hoping to get someone's attention at FIRST in the near future to discuss making something like I am describing clearly acceptable in a general sense without the exception.

It might mean I have to make some more and give them away to achieve that but first I need to be clear on the process involved. I'm always happy to help but I'd rather not just blow money and time into the breeze. I have, I hope, already started this process.

Al Skierkiewicz
16-04-2012, 08:10
Guys,
A few thoughts here. The radio boost/buck +12 volt regulator on the PD is capable of a few amps. There is a failure mode on the external 5 volt regulator that effectively turns it into a big resistor, making about 7 volts when the battery is at a normal level and then drawing enough current to drag the +12 volt regulator down as the battery varies. (I do not have accurate data on this phenomenon, since a replacement 5 volt regulator fixes the problem.) When connected to a battery instead of the regulated +12 volt output, the failed +5 volt regulator will actually follow the battery voltage when it falls below 7 volts.
The radio is designed to operate at 1.5 amps (less on the newest model), which both the PD and the 5 volt regulator are designed to supply. To my knowledge the bridge does not dynamically adjust power output, but it does have a power setting in one of the setup screens (see the manual for details). I am under the impression that the return to factory defaults and the WPA encryption routine set the power output to a normal value. There is a feature of this device that adjusts power delivered to the ethernet ports, and that might be where the confusion lies with varying output power. I could not find any reference to RF output power on this device.
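For a rough sense of the margin (a back-of-the-envelope estimate, assuming roughly 85% converter efficiency, which is an assumed figure): 1.5 A at 5 V is 7.5 W at the radio, which works out to about 7.5 W / 0.85 / 12 V, or roughly 0.74 A drawn from the +12 volt boost output. That is comfortably inside a "few amps" rating, so the nominal power budget is not the tight spot.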

techhelpbb
16-04-2012, 11:09
Thank you. I'm also hoping to get someone's attention at FIRST in the near future to discuss making something like I am describing clearly acceptable in a general sense without the exception.

It might mean I have to make some more and give them away to achieve that but first I need to be clear on the process involved. I'm always happy to help but I'd rather not just blow money and time into the breeze. I have, I hope, already started this process.

I just wanted to update this topic as I successfully made contact with U.S. FIRST regarding what I was proposing as a 'custom test circuit' and I now have a bearing on how to get it into at least the approved parts.

Basically, I need to get a sample as close to production as possible to the US FIRST KOP team before the end of August. They'll check it over a couple of weeks, and once it's been wrung out along with all the business details (how it's getting made, in what quantity, how teams can get it, what the cost is, etc.), they'll issue an approval, if it's warranted, for inclusion in the list of approved hardware they finalize in September.

If somehow it ends up in the KOP I would need to deliver product to them by October for distribution purposes. Otherwise I could sell or give away product without being able to declare it approved by FIRST for competition usage. Obviously the NDA for the approval of the hardware would prohibit any discussion before January of whether it's approved or not.

My appreciation to FIRST for providing me a great place to start working from.

In the mean time I'm considering making a few simple test versions to send out for testing on real fields. Obviously per Eric's post since these aren't approved you'll probably need some approval from US FIRST if you want to try something like this on a live competition field.

techhelpbb
16-04-2012, 13:14
What is the nominal minimum voltage that can be provided by the DC-DC converter connected to the D-Link AP before things start to become a problem?

Keep in mind that if there's ripple on the DC power supply I need the voltage at the lowest peak of that ripple from the robot 'ground'.

I know it should be 5V, but does anyone know just how far below 5V you can go before you might start having problems?

There must be a window of regulation that is acceptable (for example 4.95V - 5.1V).

Al Skierkiewicz
16-04-2012, 14:44
You know, we haven't correlated the possibility of RF pickup on the power wiring to the DAP yet. I haven't thought to check for that, but I have seen many robots using the long power cord simply wound up/folded up and secured with a ty wrap. It is possible that large amounts of RF energy are simply walking in on the power wiring. Maybe what we should do is bring a bunch of ferrite chokes with us to St. Louis and give them a try. We should also not rule out intermittent power connectors putting noise on the power input that actually makes it through to the data circuitry.

RufflesRidge
16-04-2012, 14:53
Rule R47:

"Custom circuits shall not directly alter the power pathways between the battery, PD Board, speed controllers, relays, motors, or other elements of the Robot control system (including the power pathways to other sensors or circuits). Custom high impedance voltage monitoring or low impedance current monitoring circuitry connected to the Robot’s electrical system is acceptable, if the effect on the Robot outputs is inconsequential."

Problem is, if you're a real stickler it doesn't actually call out the wireless bridge.


Can't answer your other questions unfortunately, but I do believe that the wireless radio is included in R47, as it broadly describes the "Robot's electrical system" as the place these devices may be connected, and the previous sentence refers to "other elements of the Robot control system", which would almost certainly include the wireless radio.

techhelpbb
16-04-2012, 15:09
Can't answer your other questions unfortunately, but I do believe that the wireless radio is included in R47, as it broadly describes the "Robot's electrical system" as the place these devices may be connected, and the previous sentence refers to "other elements of the Robot control system", which would almost certainly include the wireless radio.

I've got a couple of things that I'm trying to finish for Championships. Apparently Team 11 is on the way out there. If I have the time I'll clean up what I have and send it out that way.

I'm a little dubious right now of what I can send out there. I have a circuit I made that can be set with a small handheld setup I made, but you don't really need all that stuff if you just use potentiometers and resistors for setting it up. Even with all those bells and whistles the robot mounted part is tiny and light.

Basically I'm trying to get a grasp on the range of input voltages it needs to accept. If the range is large then I should try the digital setup I made. If it's small then I can probably trim up a few common settings and use that.

If I just stick with working around the 5V supply going into the D-Link AP, it comes down to just how low that voltage can go before we need to call it a problem, hence my question above. I should think that number is within tenths or hundredths of a volt of 5V.

So if we read this as perfectly acceptable then it comes down to actually doing it during a competition. I basically need to move my butt. Otherwise it'll have to be tested off season on a real competition field (I know it works I just never used it in a competition).

The good news is that this thing is at its heart a latching analog comparator. So even if noise on the DC power supply reduces the voltage for a split second, this will see it (and if not, things will need to get expensive to test, because that would mean it exceeds the performance of the integrated op-amps I used). At the moment all this thing does is constantly watch for the voltage to drop below the setting, and if that happens it lights an LED and keeps it lit. Seems utterly trivial, but it can do that on a moving robot, and much more completely than a DMM.

Alan Anderson
16-04-2012, 16:10
What is the nominal minimum voltage that can be provided by the DC-DC converter connected to the D-Link AP before things start to become a problem?

Besides going far enough below 5 volts to cause the bridge to shut down entirely, we don't know what power deviations can cause problems. That's what your proposed datalogger circuit can help us determine.

The good news is that this thing is at its heart a latching analog comparator...

Oh. That's not going to be very useful in characterizing things. It can only give a useful answer if we already know exactly what the question is.

techhelpbb
16-04-2012, 16:14
Besides going far enough below 5 volts to cause the bridge to shut down entirely, we don't know what power deviations can cause problems. That's what your proposed datalogger circuit can help us determine.

Okay then I guess we're gonna need some room in settings to try a few and see if any work out.

Be aware though, it doesn't log data as much as turn on an LED when the threshold is reached so it has only one equivalent bit of storage.

If that bit is on you went below the calibrated voltage.

If that bit is off you did not.

All the digital attachment does is calibrate the set of digital potentiometers attached to a precision reference.

(Sorry I saw only the top part of your post so I'll catch the rest next post.)

techhelpbb
16-04-2012, 16:19
Oh. That's not going to be very useful in characterizing things. It can only give a useful answer if we already know exactly what the question is.

I was under the impression that someone somewhere had tried powering the D-Link AP and messed around with it long enough to characterize its voltage and power requirements (I should think FIRST did).

The problem then is that a digital measurement always samples at discrete points, so a digital data logger always runs the risk that something happens between the samples. Most of the cheap commercial units use SD memory, so they are really not much faster at catching transients than DMMs (most are actually quite a bit slower).

The analog circuit doesn't have a quantized 'blind spot' as much as a response time.

I do have an FPGA based oscilloscope I made with a bunch of computer RAM, but that's not nearly as small or as light. It has a mode to essentially log data into a bucket to make a digital strip recorder (for a short time; beyond that it needs to dump to USB2). It's also not nearly set up as a production test piece. I did specifically design it to be driven around on the robot.

With this in mind, then I can probably rig up some small DMMs with Max/Min functionality that you could robot mount till you figure out the limits. Either that or I'm really gonna have to move my butt and try to get that FPGA oscilloscope into something I can send unsupervised.

techhelpbb
16-04-2012, 16:35
Another possibility is I have a circuit I use in a production item I worked on that can essentially hold the lowest voltage you present to it till you reset it (within reason).

It's essentially an analog minimum detector. Sort of a minimum voltage sample and hold.

If I slap that together stand alone you could put it in the robot and drive around then measure the output with a voltage meter. It would tell you how low it went at any time it was operating.

It wouldn't tell you so much how low the D-link AP supply went before it malfunctioned but it could tell you how low that D-Link AP supply got during the match.

Actually perhaps with a few of those someone could see what the robots that kept working produced for power to the D-Link AP and what the robots that did not keep working produced for power to the D-Link AP.

I think you'd have a similar process anyway with a datalogger, would you not?
This would be much cheaper and probably collectively faster to quantify the results.
I'll dig into my stuff tonight and see what's involved with making them.
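In the meantime, for anyone who wants a purely software approximation of the same idea (with the sampling-rate caveat I keep mentioning), here is a minimal sketch, assuming 2012 WPILib C++ and a spare analog input wired, with inspector approval, to the supply being watched. The channel number is just a placeholder.

#include "WPILib.h"

// Hedged sketch of a software "minimum voltage hold": poll an analog input as often
// as the loop allows and remember the lowest reading seen since the last reset.
// It will still miss anything faster than the sampling interval.
class MinVoltageTracker
{
public:
    explicit MinVoltageTracker(UINT32 channel)
        : m_input(channel), m_minSeen(99.0) {}

    void Sample()
    {
        float v = m_input.GetVoltage();  // instantaneous reading, not a true transient catcher
        if (v < m_minSeen)
        {
            m_minSeen = v;
        }
    }

    float GetMinimum() const { return m_minSeen; }
    void Reset() { m_minSeen = 99.0; }

private:
    AnalogChannel m_input;
    float m_minSeen;
};

You'd call Sample() from a fast periodic loop and print or log GetMinimum() after the match. It's the poor man's version of the analog circuit: the same one-number answer, with much slower eyes.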

mikets
16-04-2012, 16:48
This is a very long thread. I did not follow it very closely. So I apologize if I am missing the point. If the goal is to log the voltage data of the Wireless bridge during competition then why can't we do it in software? Assuming the cRIO is still up and running and only wireless bridge is losing communication, can one feed the wireless bridge's power (5V) to an analog channel and write code to log the data? Heck, even the SmartDashboard can be used for that. This year, our team has written a generic datalogger that allows us to do post mortem analysis of anything we want to log. So far, we are using it to tune PID control and evaluate data filter algorithms. But it can be used for anything.

techhelpbb
16-04-2012, 17:05
This is a very long thread. I did not follow it very closely. So I apologize if I am missing the point. If the goal is to log the voltage data of the Wireless bridge during competition then why can't we do it in software? Assuming the cRIO is still up and running and only wireless bridge is losing communication, can one feed the wireless bridge's power (5V) to an analog channel and write code to log the data? Heck, even the SmartDashboard can be used for that. This year, our team has written a generic datalogger that allows us to do post mortem analysis of anything we want to log. So far, we are using it to tune PID control and evaluate data filter algorithms. But it can be used for anything.

The only reliable place you could put the data would be the cRIO itself. At the point you need it most, you very probably won't have wireless access back to the driver's station (a chicken and egg problem). Additionally, I did go looking for a way to write to the non-volatile storage on the cRIO in Java but didn't get very far; I'm not sure you have a lot of storage to play with there.

It would require someone to tinker with the code in the cRIO and that would add another task for the cRIO to handle (which might alter the performance of the system in general).

So you'd probably need to write that software in each language you might encounter.

Not to say it can't be done, but the cRIO still suffers the same issue that you don't know just how often it can check that voltage.

What I'm proposing is able to watch the voltage *much* faster and without tinkering with the robot's software.

Though I fully admit there are circumstances in which it might not matter. For example, if you're measuring an analog voltage used as PID loop feedback, it probably won't matter, because that voltage changes as a consequence of mechanical movement, which compared to a power transient is very slow.

mikets
16-04-2012, 17:15
Since we already have the datalogger (written in a C++ class), if we encounter communication issues in St Louis, we can certainly hook up an analog channel, turn on the logger and see what we get.

techhelpbb
16-04-2012, 17:17
Since we already have the datalogger (written in a C++ class), if we encounter communication issues in St Louis, we can certainly hook up an analog channel, turn on the logger and see what we get.

I'm curious, where does it store the data it collects?

mikets
16-04-2012, 17:24
Though I fully admit there are circumstances in which it might not matter. For example, if you're measuring an analog voltage used as PID loop feedback, it probably won't matter, because that voltage changes as a consequence of mechanical movement, which compared to a power transient is very slow.
If the power transient is too fast, I expect the wireless bridge would have some sort of regulator or capacitor inside that will filter that out. Also, assuming the 12V to 5V DC-DC converter is not defective, it should filter out some amount of high frequency voltage glitches. So I would assume we are only interested in a relatively sustained drop in voltage, which a decent software datalogger should be able to pick up. I would be surprised if the DLink bridge cannot take some minor power glitches.

mikets
16-04-2012, 17:25
I'm curious, where does it store the data it collects?
It stores the data as a CSV file on the cRIO. Post mortem, we FTP the file out to our laptop.
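In case it helps anyone else, the core of the idea boils down to something like this (a heavily simplified sketch, not our actual class; the analog channel and file path are just examples):

#include "WPILib.h"
#include <fstream>

// Simplified sketch of a CSV voltage logger running on the cRIO.
// Assumes 2012-era WPILib C++; channel number and file path are examples only.
class VoltageCsvLogger
{
public:
    VoltageCsvLogger(UINT32 channel, const char *path)
        : m_input(channel), m_file(path)  // e.g. "/bridgePower.csv", FTP it off afterwards
    {
        m_file << "time_s,volts\n";
        m_timer.Start();
    }

    void LogSample()
    {
        m_file << m_timer.Get() << "," << m_input.GetVoltage() << "\n";
    }

    void Flush()
    {
        m_file.flush();
    }

private:
    AnalogChannel m_input;
    std::ofstream m_file;
    Timer m_timer;
};

Call LogSample() from the periodic loop and Flush() when disabled. As noted above, it only sees what it samples, so fast transients can still slip between readings.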

techhelpbb
16-04-2012, 17:33
If the power transient is too fast, I expect the wireless bridge would have some sort of regulator or capacitor inside that will filter that out. Also, assuming the 12V to 5V DC-DC converter is not defective, it should filter out some amount of high frequency voltage glitches. So I would assume we are only interested in a relatively sustained drop in voltage, which a decent software datalogger should be able to pick up. I would be surprised if the DLink bridge cannot take some minor power glitches.

Have you had communications issues at other events and would you mind performing that test anyway?

I won't disagree about the filtering I just want to point out that Team 11 apparently already had a defective DC-DC converter powering the D-Link AP that caused us quite some headaches. That was among the points being tested (it's back a few pages in the topic).

Ultimately this possibility was already mentioned before and if you've got it working for you, I certainly won't suggest you shouldn't give it a shot. The more data and eyes the better.

If we knew the sorts of power issues that aggravate the D-Link AP's functionality we could detect and prevent them.

mikets
16-04-2012, 17:42
Have you had communications issues at other events and would you mind performing that test anyway?
No, our regional event went very smoothly. There were very few field issues. Our last event was end of March. So our next chance to test it is St Louis.

techhelpbb
16-04-2012, 18:16
Besides going far enough below 5 volts to cause the bridge to shut down entirely, we don't know what power deviations can cause problems. That's what your proposed datalogger circuit can help us determine.



Oh. That's not going to be very useful in characterizing things. It can only give a useful answer if we already know exactly what the question is.

I just realized that this may not entirely be applicable to the problem.

Even if we don't know the minimum voltage that causes problems with the AP, we do know that many teams are not having problems with their APs when they are not on the competition playing field.

We can certainly get the output voltages to the D-Link AP from a bunch of robots' AP power supplies and get a feel for the deviation from one AP power supply to another between robots, and we can do that in the pits. If someone was willing, you could even collect AP power supply ripple data in the pits.

The voltage shouldn't really go dramatically below that measurement, especially if we all agree that the power supply for the AP should be able to operate down to a battery voltage of around 5V.

Anything wrong with this idea?

Basically I could set up my little low voltage indicator board to be just below that measurement voltage (to account for some ripple).

Could even do a little statistics on the measurements from a bunch of robots and reduce the limit till, statistically, it would be an outlier if your AP power supply was that low. You might get a few false positives, but I would think that would even out really fast. The bigger the set you work with, the faster you should be able to weed out minor variations in the systems, especially if you start with robots both with and without AP power supply issues.

Actually, come to think of it, we could probably collect that data from just about all the teams willing to provide it, even if they have their robots at home right now and are done for the season. In that case, however, it's probably best to also poll whether they ever noticed any communications issues, and where.
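To be concrete about the statistics part, here's the flavor of calculation I have in mind (a sketch with made-up numbers; the mean-minus-two-standard-deviations cutoff is just one reasonable convention):

#include <cmath>
#include <cstdio>
#include <vector>

// Sketch: given AP supply voltages measured across a pile of robots, flag readings
// that are statistical outliers on the low side (below mean - 2 standard deviations).
int main()
{
    // Example data only; real values would come from the collected measurements.
    std::vector<double> volts;
    volts.push_back(5.02); volts.push_back(4.98); volts.push_back(5.01); volts.push_back(4.99);
    volts.push_back(5.00); volts.push_back(4.97); volts.push_back(5.03); volts.push_back(4.71);

    double sum = 0.0;
    for (size_t i = 0; i < volts.size(); ++i) sum += volts[i];
    const double mean = sum / volts.size();

    double sqDiff = 0.0;
    for (size_t i = 0; i < volts.size(); ++i) sqDiff += (volts[i] - mean) * (volts[i] - mean);
    const double stddev = std::sqrt(sqDiff / volts.size());

    const double lowLimit = mean - 2.0 * stddev;
    std::printf("mean=%.3f V  stddev=%.3f V  low limit=%.3f V\n", mean, stddev, lowLimit);

    for (size_t i = 0; i < volts.size(); ++i)
    {
        if (volts[i] < lowLimit)
            std::printf("sample %u (%.2f V) looks like a low outlier\n", (unsigned)i, volts[i]);
    }
    return 0;
}

With enough submissions the limit settles down, and a supply that reads well below it is worth a closer look regardless of whether that team has noticed problems yet.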

Alan Anderson
16-04-2012, 18:33
If the power transient is too fast, I expect the wireless bridge would have some sort of regulator or capacitor inside that will filter that out.

That's a reasonable assumption, but it would be good to examine the actual circuitry to make sure.

Also, assuming the 12V to 5V DC-DC converter is not defective, it should filter out some amount of high frequency voltage glitches.

Even if it is operating as designed, it could conceivably be creating some high-frequency ripple on its output. It's not a simple linear regulator. Again, some examination is in order.

Deetman
16-04-2012, 18:47
I see nothing wrong with that train of thought techhelpbb.

Regarding the D-Link's immunity to power transients, I would agree that it should have appropriate filtering on the input to handle short transients. What we don't know is how it handles any kind of ripple on the power input, or what the characteristics of the regulator's output are under various loads. Any noise we are getting from both the PDB and the regulator is not characterized (publicly), nor is the effect, if any, of the length of the wire runs from the PDB to the regulator to the AP. The D-Link's stock AC-DC power supply was presumably chosen to meet its input power requirements, but are we meeting those under all robot operating circumstances?

We know that there is an apparent failure mode of the regulator that causes the D-Link AP to behave anomalously. What I'm unsure of is what failure modes we are introducing in the D-Link over time. The environment we are subjecting the D-Link to is hardly what I assume to be its design environment of sitting on a shelf providing wireless connectivity. Are these the only factors, or is the power input also causing a negative effect over time?

Are teams that are seeing communications issues using an old D-Link from previous years or have they seen issues with a new regulator and a new D-Link?

I'm really interested in performing some bench testing with the D-Link and at a minimum the regulator to help iron out some of these unknown characteristics. I have most of the measurement equipment I'd want here at home, but my one remaining issue is getting a good enough power supply to test various input power levels with differing amounts of noise, etc. I have everything I'd need to run the tests at work, but I'm unsure of my ability to do any of them (on my own time of course).

DonRotolo
16-04-2012, 18:55
Maybe what we should do is bring a bunch of ferrite chokes with us to St. Louis and give them a try.

Good idea Al, I'll be doing that for certain.

If the power transient is too fast, I expect the wireless bridge would have some sort of regulator or capacitor inside that will filter that out.

I'd be careful with that expectation.

But adding a big honkin' capacitor on the power input to the radio should filter out any millisecond transients. I need to verify that the rules permit this. I'm thinking several dozens to a hundred microfarads (whatever's in the junk box)
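As a rough sizing sanity check (assuming the radio really does draw on the order of 1.5 A): hold-up time is roughly t = C x dV / I, so 100 uF with a tolerable droop of 0.5 V buys about (100 uF x 0.5 V) / 1.5 A, or roughly 33 us. Riding through a true millisecond-scale event at that current would want something closer to a few thousand microfarads, so the junk-box cap is more of a high-frequency band-aid than real ride-through.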

EricVanWyk
16-04-2012, 19:15
But adding a big honkin' capacitor on the power input to the radio should filter out any millisecond transients. I need to verify that the rules permit this. I'm thinking several dozens to a hundred microfarads (whatever's in the junk box)

Depending on the control mode of the regulator, adding a large amount of load capacitance might cause the control loop to go unstable and create additional noise. Proceed with an Oscilloscope.

techhelpbb
17-04-2012, 05:13
I'm going to put up a website to collect measurements from the D-Link AP robot power supplies so we can perform a statistical analysis. I'll put some directions on the website so people understand what the measurement process should be, so that they provide us the most practical data when they enter it into the website.

When I'm done I'll put the link in here. This way everyone with a 2011 or 2012 robot can help with the testing. That could net us a few thousand points of data while more evenly distributing the work load.

There will obviously be some issues with test instrument calibration but odds are the issues from that will be statistically small.

Al Skierkiewicz
17-04-2012, 08:00
Tech,
I think we need to slow down here. Most of the problems are not pointing to a power supply problem with the radio. It has been our experience that power supply problems with this radio manifest themselves by total radio reset which by most accounts lasts nearly 50 seconds. With the thousands of PD's in use thus far, only a handful have ever had a real +12 volt power supply issue and of those some were from mishandling of the PD. Those power supplies are very well designed and do more than they were originally intended to do. Remember that this power supply is designed to source a lot more current than the demand from the regulator and radio combined. When the supply faults it doesn't brown out, it just goes away. As to the 5 volt regulator, this is a 25 amp device if memory serves, and it won't brown out either. What ripple might be present on the PD power supply is high frequency and easily shunted by the high frequency impedance of the input cap(s) on the DLink.
The fixes I discussed earlier come from the knowledge that teams using the long power wiring that comes with the radio may run it near some high noise sources that are generating some very high frequencies. This noise can and does couple into wiring and in some cases, will actually travel on the outside of the wire and right into the radio bypassing everything you are suggesting.
At this point, the really offensive failures are either repetitive (and seemingly regular) losses of packets or complete loss of communications for no other obvious reason. I have seen a few robots where all of the data tell us that the robot is working; there simply aren't any robot commands. Remember that the FMS this year is watching far more things than ever before. They know what the robot battery is doing, what packets are being transferred, and when the robot is communicating with the field, even if the robot is doing nothing.
If you want to look at power supply issues, you must first rule out, bent contacts in the radio, poor solder jobs on the wiring (if soldered at all), poor wire restraint and poor mounting of the radio. More than half of all radio complaints that are brought to inspectors are obvious problems with the coaxial connector supplying power to the radio. It is either loose, the wrong size, not secured, improperly insulated, the radio connector has pulled away from the board, or there is an improper termination to the 5 volt regulator. In the majority of the remaining issues (when pressed), the team had failed to use the regulator for some period of time. While in many cases the radio continues to function, the power supply has been stressed or damaged. Without doing some physical inspection of these radios, all statistical data becomes flawed.
As to the application of devices between the PD and radio, the answer is there is no rule that allows anything in this path when used on the robot. I know that there are people working on the problem and that bench testing will reveal issues if they exist.

FrankJ
17-04-2012, 08:16
Has anybody considered that the DAP in question has a cracked PC board? Cracked boards manifest all kinds of strange behavior. Since the DAPs really were not designed for the robot environment, maybe they are more fragile than anybody thought.

Gdeaver
17-04-2012, 08:46
We have a DC-to-DC switching power supply feeding a DC-to-DC switching power supply feeding, most likely, another switching power supply inside the D-Link. I believe this sets up a strong possibility that under high load an instability develops and a transient of very short duration gets through to the digital parts in the D-Link. Any part that is out of spec in the chain would greatly increase the chance of this happening. Probably the only way to catch this is with a very high speed circuit or a scope. Caps are a prime source of this; that's why motherboard manufacturers boast about Japanese caps in their power supplies. I may be wrong, but doesn't most 802.11n gear adjust gain based on background RF levels? Why shout if it's quiet. This could explain why the problems are seen on the field. Philly was a very loud RF environment based on the behavior of my iPhone while down on the floor. There are 6 robots all shouting in a loud RF environment. The D-Links most likely were at max gain and highest current draw. If the power supplies do turn out to be a point of failure, then there are automotive power supplies that are designed to handle the nasty automotive environment (load dump, static, etc.).

MaxMax161
17-04-2012, 09:04
This has gotten to be a very long thread, so I'd like to try to summarize the important points for anyone new who comes along. If any of this is inaccurate, incomplete, or overly complete, please tell me so I can edit this post.

Thread Summary

Main Symptom: On the field, after the auton bell rings and before teleop ends, robots lose communication with the driver station.

Main Cause 1: Camera stressing bandwidth. Manifests as many lost packets and high trip times. Very rare when getting a 320x240 image at 30fps and 30 compression. The un-updated SmartDashboard defaults to the settings on the camera, which default to a 640x480 image; the default dashboard defaults (say that three times fast) to 320x240 settings.
Fix: Smartdash- reconfigure the settings on your camera. Stockdash- make sure you don't ask the camera for anything specific unless it's less than the default.

Main Cause 2: Power problem to the radio. Manifests by taking around 40s-50s for communication to come back while the radio lights turn off and then go solid while rebooting. Most often a problem with the 12V-to-5V converter or the PDB.
Fix: Replace the 12V-to-5V converter and/or the PDB.

Other causes: ?


Other Symptom 1: Robot connects to field, robot comm dies, we reboot robot, everything is fine indefinitely.

Causes: ?


Other Ideas Currently Floating Around:

Idea 1: ?

techhelpbb
17-04-2012, 09:34
Tech,
I think we need to slow down here. Most of the problems are not pointing to a power supply problem with the radio. [...]

Why is it a problem to create a trivial 20-30 minute exercise (per participant) to get an idea of this value?

If I knew what voltage counts as too low, we could easily test virtually all supply issues up to the radio with those little LED modules I have. Actually, come to think of it, I don't know how you can test that voltage at all if you don't know what the requirement actually is.

It wouldn't tell you exactly where the problem was (you could further isolate with more of those little modules), but it would eliminate that issue or indicate its presence just by looking at it.

So given how small the effort why not? Certainly there's no reason I can't make the site.

To be clear, I'm not saying you're wrong, I'm just asking why not be sure there's no problem hidden there?

As it stands, I have been watching the current method of troubleshooting, and it is dramatically extended as we hunt through that great big list you offered, dancing back and forth between things that affect power quality to the D-Link AP (and may be intermittent) and things that could additionally be wrong.

If we knew we had a power quality issue during the match, that's probably half the effort right there, and it's entirely possible to have a power quality issue during the match but not at any other place you can measure it during a competition.

(I've sent a question about rule R47 and the legality of using a circuit to measure that D-Link AP supply voltage to someone at FIRST. Let's see what they say.)

Al Skierkiewicz
17-04-2012, 12:27
I just don't want people to get the impression that there is a power supply issue when we are quite sure that this is not the case. The issue may not even be with the radio.

techhelpbb
17-04-2012, 12:58
I just don't want people to get the impression that there is a power supply issue when we are quite sure that this is not the case. The issue may not even be with the radio.

There are really two ways you can look at checking the D-Link AP power supply. Looking because you think it's the problem or looking because you don't think it's the problem.

If the voltage is too low it could still be an indicator of a wiring issue, not a power supply module issue (a module being the PDB or DC-DC converter). However, first you have to know whether or not there's a problem there.

It's that ambiguity that lets the imagination roam. I'm not looking to offload blame, just to find a process that's tangible and quantitative.

EricVanWyk
17-04-2012, 15:43
I just don't want people to get the impression that there is a power supply issue when we are quite sure that this is not the case. The issue may not even be with the radio.

I agree. I'm much more interested in things like firmware versions, rogue Windows update services, and proximity to noise sources. Now it's a matter of finding why some `bots are more sensitive than others.

The power supply issue you (Tech) are tracking, transient undervoltage, has a known fault signature that does not match reported issues. It is possible that a transient spike could affect things, but I'd expect more of a buzz than a spike in this situation. Gdeaver might be on the right track with his "switching cacophony" theory. Please confirm with your own measurements - we'd love to see the numbers. But please, no more back-to-back-to-back posts!

dcherba
17-04-2012, 15:53
For three years we have been running C++ code built from the default program offered by FIRST. In match 10 at the Michigan championship we moved in hybrid and came to a complete stop with loss of packets from the FMS. The only change was an added camera at a low FPS rate. Turns out the cRIO was throwing away the FMS control packet because it was overloaded with the empty code stubs for the continuous teleop. This was a totally unexpected consequence of adding a camera that had nothing to do with the control program, except that the cRIO would see those packets and throw them away. That was enough. After reading this set of symptoms I wonder if the same continuous blocks are also present but unused in LabVIEW and are having the same intermittent effect.

Considering how many years I have been doing programming it was a sobering revelation to see something so simple cause the problem.

Every program I help the students write has performance timers in it, and according to our data the code was being executed every 20-100 ms, but not the TCP packet handler.

When everything has been ruled out look at the impossible..
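The performance-timer habit described above is worth copying. Purely to illustrate the pattern (plain Python here, not the LabVIEW or C++ that actually runs on the cRIO), this sketch timestamps each pass of a periodic task and flags any iteration that runs long, which is how an overloaded loop or a starved packet handler shows up in a log.

```python
# Sketch of the loop-timing instrumentation described above, in plain Python
# (the real robot code would be LabVIEW or C++; only the pattern matters here).
import time

def run_periodic(task, period_s=0.02, overrun_factor=5.0, iterations=500):
    last = time.monotonic()
    for _ in range(iterations):
        task()
        now = time.monotonic()
        elapsed = now - last
        if elapsed > period_s * overrun_factor:
            # An overrun this large is the kind of stall that can starve
            # lower-priority work such as a packet handler.
            print(f"loop overrun: {elapsed * 1000:.1f} ms")
        last = now
        time.sleep(max(0.0, period_s - (time.monotonic() - now)))

if __name__ == "__main__":
    run_periodic(lambda: None)
```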

jteadore
18-04-2012, 08:47
Dave,
How is your camera connected? Do you connect it through the crio or direct to the radio?

dcherba
18-04-2012, 11:02
Our cameras were connected to the wireless bridge and not through the cRIO.
It was only when we added the second camera that the problem really appeared.

MaxMax161
20-04-2012, 09:08
Was your 2nd camera connected to your c-rio or did you configure it as something like 10.32.34.12 and also get it through the dashboard?

Steve Warner
21-04-2012, 16:02
This subject has been covered pretty thoroughly but just to make sure I understand: Can a Dlink restart be caused by ANYTHING other than loss of voltage or low voltage at the DC input to the Dlink?

techhelpbb
21-04-2012, 17:21
This subject has been covered pretty thoroughly but just to make sure I understand: Can a Dlink restart be caused by ANYTHING other than loss of voltage or low voltage at the DC input to the Dlink?

Of course. Both Al and Eric above have touched on some ways that the power quality might be fine right up until you get into the connector on the D-Link AP.

Others have mentioned a few more communications traps that lie in the software. All of these are applicable and of merit as well (let's not assume that all communications disconnects are D-Link AP power induced issues).

Also let us all remember that voltage is just part of DC power, current being the other aspect (obviously time plays a role in AC power which we aren't discussing). Right now voltage is easier for us to tinker with without risking adding trouble that might not be obvious.

I did follow up on my question with FIRST, and it was decided that for many valid reasons it needs to be in the official Q&A, so that's exactly where we submitted it by Friday:
https://frc-qa.usfirst.org/Questions.php
(Just click search; it's not officially answered yet.)

At the moment it appears that using the cRIO to monitor the D-Link power supply input is not permitted, based on what both Al and others have said. Then again, there is clearly some dispute on this based on some of the discussions I've had. The use of my voltage monitor sensors could be considered a custom circuit, but the outlook for that being permitted at Championship seems bleak. I remain in discussions with FIRST KOP to get them approved for future application.

The entire point of all that discourse was to find a way to accelerate the diagnosis and remedy of tricky power quality issues (some of them might be wiring induced), so that when issues other than power quality are present you can more quickly and clearly narrow in on them. Right now we often don't know what's going on with the power quality to certain accessories when the robots are moving. This is the sort of problem that comes up with Jaguars (because of the added complexity and because they draw the most power while moving; that's why I made those units in the first place) and again with the D-Link AP because of the complexity and issues related to moving.

I will abide by whatever FIRST's response is as that is only proper on the competition field. Off the competition field we have much more latitude.

rsisk
21-04-2012, 18:53
This subject has been covered pretty thoroughly but just to make sure I understand: Can a Dlink restart be caused by ANYTHING other than loss of voltage or low voltage at the DC input to the Dlink?

Going over the bumps hard in 2010 was found to cause the DLINK to reset. I assume the same thing could occur going across the bump in 2012, or flying off the bridge.

Steve Warner
21-04-2012, 20:05
I guess I meant to ask, if we know the Dlink is restarting does that mean there is a problem somewhere on the robot itself and not on the driver station? I am asking about our practice bot so the FMS is not involved.

Mark McLeod
21-04-2012, 20:55
If the DLink is restarting, then yes it's very, very likely a problem on the robot itself.
There may be a rare case where the DLink can be made to reboot by an overwhelming flood of network traffic, but it's not common.

techhelpbb
21-04-2012, 21:08
I guess I meant to ask, if we know the Dlink is restarting does that mean there is a problem somewhere on the robot itself and not on the driver station? I am asking about our practice bot so the FMS is not involved.

If the D-Link restarts you usually lose access to the robot for 45-50 seconds.

Does that describe the symptom of your practice bot issue?

Assuming you know your battery is fully charged:

If you have a decent DMM, put it on voltage min/max and attach it in parallel with the output of the DC/DC converter. Even without knowing the exact voltages at which things become a problem, you shouldn't be reading less than 4.5V as a minimum. If you see something lower, check your wiring over and make sure you've got the DC/DC converter plugged into the proper power output on the PDB, not a breaker-protected one. If all your connections are correct, then put the DMM on min/max in parallel with the output of the PDB feeding the DC/DC converter; that shouldn't be less than 11.5V (again I'm approximating). If none of that finds the problem, try a different D-Link AP (easy, but it means you need another unit), then a different DC-DC converter (a few wires, and again another unit that is a little cheaper than the D-Link AP), then a different PDB (not so easy, and again you need another unit).
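Those two checks can be written down as a tiny decision helper. The 4.5 V and 11.5 V figures below are the approximate thresholds from the paragraph above, not official specifications, and the suggested next steps just restate the swap order given there.

```python
# Sketch of the DMM min/max checks above. The 4.5 V and 11.5 V thresholds are
# the rough numbers from this post, not official specifications.
DCDC_MIN_V = 4.5   # approximate acceptable minimum at the DC-DC converter output
PDB_MIN_V = 11.5   # approximate acceptable minimum at the PDB output to the converter

def next_step(dcdc_min, pdb_min):
    if dcdc_min >= DCDC_MIN_V and pdb_min >= PDB_MIN_V:
        return ("Supply minimums look OK; swap the D-Link AP, then the DC-DC "
                "converter, then the PDB.")
    if pdb_min < PDB_MIN_V:
        return ("PDB output sagged; check wiring back to the PDB and the battery "
                "before blaming the converter.")
    return ("Converter output sagged with a healthy PDB feed; check the converter "
            "wiring and connections, then swap the DC-DC converter.")

print(next_step(dcdc_min=4.8, pdb_min=12.1))
print(next_step(dcdc_min=4.1, pdb_min=12.0))
```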

Al Skierkiewicz
21-04-2012, 21:48
Going over the bumps hard in 2010 was found to cause the DLINK to reset. I assume the same thing could occur going across the bump in 2012, or flying off the bridge.

I am going to bet that the majority of these restarts were related to problems with the power connector on the radio itself or improper termination of the gray connector on the PD. We tried to correct as many of these problems as we could during inspections. So many teams stretch the power cable tight to the point where a little vibration will open the connection to the outside of the coaxial plug. The termination problems usually result from too large a wire that is either stripped too short or too long. The short strip will actually push out of the connector while the long strip will invariably short due to exposed conductors.

techhelpbb
21-04-2012, 23:17
Well, another point to make is that if we knew what the nominal current going to the D-Link AP from the DC-DC converter was, we could monitor it with the min/max function on a suitable DMM set up as an ammeter, and that would actually tell us whether the connector on the D-Link AP, and even where it meets the PCB inside the D-Link AP, is making good connection. If it wasn't making good connection, the current flow would be reduced momentarily or longer. Of course this requires putting the DMM in series with one side or the other of the circuit between those points, so it's intrusive unless you can find a suitable clamp-on DC current meter.

(For a real-world example of this technique, consider the resistor at the end of alarm system contacts. That resistor serves to draw a relatively fixed current from the alarm contact loop. Too little current and either the resistor is the wrong value, or there's a bad connection adding to the circuit resistance or impedance, or a contact is open (alarm). Too much current and either the resistor has had another resistor of near or lower value put in parallel with it (or there's a short), or a contact is closed (alarm). With the general exception of a contact being in alarm, the resistor would warn of a tamper with that circuit. Same idea here, except we're not doing it for security reasons.)

I just can't advocate right now that at Championship we should measure current to the D-Link AP on the robots while they move. To me that introduces the risk that someone might not realize that the AndyMark breakout module could be a single-ended input. I've seen a ton of data acquisition tools wrecked by this simple error. In order to measure current through a current sense resistor by measuring voltage in parallel with that sense resistor in the path of the circuit to the D-Link AP (then using E=I*R, which is Ohm's law), you'd need to make sure the other side of the voltage input across the sense resistor was robot ground. In theory you could put the current sense resistor on the ground side of the circuit between the DC-DC converter and the D-Link AP, but that assumes the DC-DC converter is also properly connected to robot ground, and as Al points out there are some issues with the connector that feeds that DC-DC converter from the PDB (which is where the robot ground for the DC-DC converter comes from). A DMM avoids this issue because the sense resistor is within the DMM and the voltage inputs are differential across that sense resistor within the DMM. My modules avoid this issue because their input is also differential.
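For anyone who wants the arithmetic spelled out, here is a minimal sketch of the sense-resistor idea: measure the differential voltage across a known resistor in the D-Link feed and convert it to current with Ohm's law. The resistor value and the minimum-current threshold are placeholders, since the normal minimum is exactly the number this thread has not been able to pin down.

```python
# Sketch of sense-resistor current monitoring (Ohm's law: I = E / R).
# R_SENSE_OHMS and MIN_EXPECTED_A are placeholders; the real minimum current for
# the D-Link AP is exactly the number this thread is trying to characterize.
R_SENSE_OHMS = 0.05        # hypothetical low-value sense resistor in the supply lead
MIN_EXPECTED_A = 0.10      # hypothetical "radio is awake" minimum current

def current_from_sense(v_diff_volts):
    """Differential voltage across the sense resistor -> current through it."""
    return v_diff_volts / R_SENSE_OHMS

def check(v_diff_volts):
    amps = current_from_sense(v_diff_volts)
    if amps < MIN_EXPECTED_A:
        return f"{amps:.3f} A - below expected minimum, suspect wiring/connector/reset"
    return f"{amps:.3f} A - within expected range"

print(check(0.015))   # 0.3 A through 0.05 ohm: plausible operating draw
print(check(0.002))   # 0.04 A: suspiciously low
```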

I am also unclear whether a DMM somehow strapped to a competition robot would be reasonable. It would likely have its own battery, and that starts running into odd territory as well. I know a COTS computing device can have its own battery, but can it connect to the robot like a custom circuit might be able to? Does the additional weight of the DMM (much heavier generally than my little circuits) count against the robot's fielded weight limit (because a COTS computing device does)? I would think so... but again I'm unclear on it. Really these are questions all their own. However, if FIRST says no to measuring the voltage to the D-Link AP then in spirit, for the sake of this topic, there's no reason to bother asking the additional questions except to set a precedent for the future (besides, even if we ask this today, it's likely that by the time we get an official answer it'll be too late to apply at the Championship).

Al Skierkiewicz
22-04-2012, 11:23
Again,
No valid data would be collected; as the radio RF output changes, so does the current input.

techhelpbb
22-04-2012, 11:31
Again,
No valid data would be collected; as the radio RF output changes, so does the current input.

You are correct about the current draw changing over time. However, that's why you need to know the maximum and minimum current the radio should draw in proper operation before this is useful. Again, I can't find that information anywhere, and the only way to really get it is to measure it on a robot moving in that environment. That means moving in both the FIRST field environment and a non-FIRST field environment, so we eliminate any possible difference in current draw from that change in the robot's operating environment. (At least the official field dimensions limit how far from the field radios the robot could be.)

Sure, there is the chance that the connection could be just loose enough that when the current draw for the radio should be high, the radio can only draw a current that's not lower than the lowest current it might draw normally. However, we cannot merely assume that such a complexity exists; we can only deal with that issue when we can quantify exactly how much current, maximum and minimum, we are talking about.

I would suspect, since the D-Link AP's primary purpose is to power that radio, that there is a relatively finite upper and lower limit to how much current it draws for the RF amplifier once that RF amplifier is in operation. I would think that the RF amplifier would be disabled for a period during resets. Since you need to test with a properly operating robot to find normal limits, one could reset the min/max function on a DMM before moving, when the D-Link AP is already connected and operating. Also, we can tell when the D-Link AP resets because we lose communication, so we can ignore that event if we use the cRIO to check for communications. I realize that the cRIO would have no effect on a stand-alone DMM, but as a data collector we could use it that way, and with my circuits it can be used that way as well. Therefore we could monitor the current draw during non-reset conditions only, and that should greatly reduce the variation between the upper and lower measured currents, thereby making the troubleshooting value much greater.

If a connection is loose enough to prevent the D-Link AP from being able to draw the current it would normally draw (intermittently or otherwise), it should be loose enough that it pushes the current draw below the normal minimum at some point, perhaps not immediately, but definitely when it exerts itself as a problem. Otherwise the D-Link AP would always be starving for power, and we know that the ratings of the FIRST-provided power supply gear say that situation should not happen. Of course, the FIRST-provided power supply parts are voltage regulated, so if we measure the voltage at the same time we measure the current and we see the voltage move, we should suspect a power supply component malfunction, because those parts regulate the voltage and are supposed to accommodate the varying current draw while keeping the voltage where it was designed to be (within reason).

I know that this methodology works because I use it troubleshooting components all the time. The key is to characterize the power requirements of the components under normal operating conditions so we know what abnormal operating conditions would look like. If we don't characterize the components then obviously we can't know what we should not see either. So we have 2 needs. We need the characterization in the relevant environments and then we need a good reliable way to catch the exceptions. My little circuits and to a lesser extent the cRIO offer the (possibly legal if FIRST says so) means to catch the exceptions on the moving robot. With proper statistics we could use both methods to characterize the components as well, never mind simple DMM readings (the more readings the more applicable the baselines to all the FIRST robots).

The real problem is that to be thorough we need to do this before, between and during events. Right now usually we focus the troubleshooting during the events and that timetable degrades our ability to apply this method by lack of time, resources and quantifiable information. Once we characterize those D-Link AP power supply requirements then it's a simple question of whether the problem system falls within the requirements or outside of them and where. That's the sort of information you can tell by looking at a lit LED.
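As a concrete illustration of that characterization step, here is a small sketch assuming you already have a log of (time, current, link-up) samples from whatever monitoring turns out to be legal. Samples taken while the link is down are excluded, per the suggestion above, so reboots don't drag the observed minimum toward zero; the numbers in the example log are made up.

```python
# Sketch of characterizing "normal" D-Link supply current from logged samples.
# Each sample is (seconds, amps, link_up); samples with the link down are skipped
# so reboots don't drag the observed minimum toward zero, as suggested above.
def characterize(samples):
    running = [amps for _, amps, link_up in samples if link_up]
    if not running:
        return None
    return {"min_a": min(running), "max_a": max(running), "n": len(running)}

log = [
    (0.0, 0.02, False),   # power-on, radio still booting
    (5.0, 0.31, True),
    (6.0, 0.45, True),
    (7.0, 0.38, True),
    (8.0, 0.01, False),   # comm lost (reset) - excluded from the baseline
    (60.0, 0.33, True),
]
print(characterize(log))  # -> {'min_a': 0.31, 'max_a': 0.45, 'n': 4}
```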

Al Skierkiewicz
22-04-2012, 17:50
tech,
What is your name? It just doesn't seem right not to address you with a common name. There are so many conditions that are variable with the radio you cannot come up with a 'normal' current draw, i.e. how many LEDs are lit, how often are they flashing, is there a high or low data rate on the ethernet ports, is there a high SWR on either or both of the antennae, is the radio searching for clear channels, how often is the router searching, is the 5 volt convertor at the high end of its specified output or the low end, what frequency is the internal switching supply actually running at, is the radio deciding to increase output RF power, what is the internal temperature, how often is it resending packets, how many null packets are being sent? Need I go on? In any combination of the above, I would bet that the current demand varies at least 250 ma and maybe as much as 500 ma. With all of these things changing at any one point in time, how can anyone collect valid data?
How will you interpret the data you collect? If the radio resets, is the drop in current caused by the reset or the reset by a drop in current? If the current drops, is it caused by the PD input, the PD power supply, the 5 volt convertor, bad wiring in any or all of the connections, or an intermittent coaxial plug? Or is it a bad battery? How do you know the software hasn't caused a reset due to something buried in the code?

DonRotolo
22-04-2012, 18:40
Although our problem could be something surrounding the power supply, we're fairly sure it's not. We've replaced that entire chain, have never actually seen the power cycle on the radio, and are now chasing down something in cRIO software that could cause the radio to reset as though power was cycled.

Yes, we'll try a ferrite on the power lead, and perhaps a moderate capacitor too (Eric convinced me that Big Honkin' might not be good).

To Steve Warner's question: We didn't think so, but now we are not so sure.

See MaxMax161's post for a description of what we're seeing now: Always fails when first connected to the field, reboot just prior to the match start and it is always good. :confused:

techhelpbb
22-04-2012, 18:55
tech,
What is your name as this just doesn't seem to be right not to address you with a common name?


It's Brian. Like the brain in the jar only backwards in the middle ;) .


There are so many conditions that are variable with the radio you cannot come up with a 'normal' current draw. i.e. how many LEDs are lit, how often are they flashing, is there a high or low data rate on the ethernet ports, is there a high SWR on either or both of the antennae, is the radio searching for clear channels, how often is the router searching, is the 5 volt convertor at the high end of it's specified output or the low end, what frequency is the internal switching supply actually running at, is the radio deciding to increase output RF power, what is the internal temperature, how often is it resending packets, how many null packets are being sent? Need I go on? In any combination of the above, I would bet that the current demand varies at least 250 ma and maybe as much as 500 ma. With all of these things changing at any one point in time, how can anyone collect valid data?


The power supply input of the D-Link AP should have some finite capacitance, as was discussed earlier. That finite capacitance should remove most of the quick high frequency noise, and because you'd be measuring with the DC-DC converter still connected to that circuit, its output energy storage will stomp the high frequency spikes as well. Between the two, if the energy storage is not enough and the circuit current reaches basically zero, then we have a problem anyway, because that would mean the voltage regulation of the DC-DC converter and/or the 12V supply before it in the PDB had failed to perform.

More to the point, we don't need to do millisecond-by-millisecond data logging. If the core of the D-Link is powered up it should draw at least a minimum finite amount of power as long as anything at all is running. As long as we catch that minimum, anything more than that is adequate to a point (no, we don't have intimate knowledge of all the current goings-on in the D-Link AP, but we know we delivered at least the minimum it should normally require). It wouldn't matter if more than that was 1A more or 500mA more, unless that extra current exhausted the energy storage needed to make the DC-DC converter function, in which case the voltage regulation would suffer. We just need to know how low that current can go normally. If we were using this to diagnose the specific functions within the D-Link AP then we'd need to worry about making all those extra measurements. I just want to use this to make sure that the D-Link is getting at least the minimum current it needs to even sit idle. I'm willing to bet that the minimum current it normally draws with the radio powered up is still a pretty noticeable amount above nearly zero current, perhaps 100mA but probably more (and again, we're not talking during the initial power-on reset when the radio is likely entirely off). Obviously the number of ports in use will have some effect, but it should be towards the maximum, not the minimum.

Ideally this sort of characterization testing should be done with a current probe on a high speed oscilloscope suitable for mixed signal analysis. Obviously that's asking a lot unless someone has access to a bunch of robots and the tools. However, if you use even a DMM a bunch of times on different working robots, a statistical analysis should reveal a good starting point.


How will you interpret the data you collect? If the radio resets, is the drop in current caused by the reset or the reset by a drop in current?


In theory, if we're talking about the minimum current with the radio on, the drop in current below that minimum should happen only when you first power on the robot and the radio in the D-Link AP is off. After that, any drop below that point signals one of:

1. A wrongly characterized minimum (the fix for this is more checking of the minimum, the more the better).
2. The D-Link AP has reset from a software issue (more on this later).
3. The D-Link AP has bad wiring to it possibly including the barrel connector.
4. The D-Link AP is actually defective (bad connector, damaged internals).
5. The current output from the DC-DC converter dropped, but if you're watching the voltage output of the DC-DC converter at the same time you could see if it's voltage regulation related.


If the current drops is it caused by the PD input, the PD power supply, the 5 volt convertor


It's a voltage regulated power supply. Check the voltage between the DC-DC converter and the D-Link AP while you check the current. If the voltage stays where it should be (and you know, because you characterized that as well) and the current doesn't, then your problem is beyond the DC-DC converter. You might have a bad D-Link AP, but swapping that is pretty simple. A properly functioning voltage regulator should have a very easy time hitting its target voltage with less load.

A bad DC-DC converter would have a hard time maintaining its output voltage with the D-Link AP connected and would fail to source at least the minimum current. At first you'd see ripple, then you'd see the underlying switching clipped. I could see how a good DC-DC converter could have a hard time maintaining its output voltage at the maximum current and above... but that's not what we're checking when we look at the minimums. We can of course also check for exceeding the maximum current, which would rule out an overload. The current power supply is supposed to be able to provide more current than the D-Link AP should draw. With that knowledge we can find sloppy power towards the minimum and overloads at the maximum. An overload would show a lower than reasonable output voltage and a higher than normal current.

If you really want to push the current draw of the D-Link to a high point, stick it in the best approximation of a Faraday cage you can make. It's gonna have to turn the RF amplifier's power all the way up eventually to try to communicate. If you want to really push the link and the Ethernet switch, patch the cables back into the ports (that's going to push the traffic loading through the roof unless they've trapped a loopback in the switch firmware). Do the loopback on all the ports. If that is trapped in the switch firmware, stick a laptop and another switch in the box, cross-patch the switches, and smurf (hacking term) the administration page of the D-Link AP.

, bad wiring in any or all of the connections or an intermittent coaxial plug?


I can readily see 5 things that would reduce a current below the characterized minimum:

1. A wrongly characterized minimum.
2. Bad wiring or a bad coaxial plug (obviously if the plug is shorted the current will be large and then you'll see the voltage regulation collapse).
3. A bad D-Link AP which is relatively easy to swap.
4. A rebooting D-Link AP which the cRIO would notice and we could log.
5. A rebooting D-Link AP because of a software problem.

(There's a trend here, LOL.)


Or is it a bad battery?


If you monitor the voltage at the same time as the current, you'll see either or both slip, and you'll know the problem is at or behind the DC-DC converter if you discount the battery. If you also characterize the DC-DC converter's input and the PDB's output, you can run the same tests with the suitable ranges for that circuit, at the same time or later.

At the moment I assume that a battery collapse issue shouldn't happen, because the FIRST field monitors the battery, but of course we can already monitor the robot battery with the cRIO analog monitor (communications or not) and we could log it. If you wanted, my minimum voltage monitors could be put on the battery as well and trap that regardless of the onboard robot hardware (they can be powered by the robot battery or another battery). Of course a DMM set as a voltmeter with a min/max function would work to some extent as well.


how do you know if the software hasn't caused a reset due to something buried in the code?

Assuming that you've been watching the voltage to the D-Link AP and the voltage regulation hasn't dipped below the expected minimum, then when the D-Link reboots from software, you're correct, you'll see the current drop, even if the software causes the D-Link to reboot. However, you'll be armed with the knowledge that the voltage to the D-Link AP did not drop. If you monitor over-current as well, you'll have eliminated an overload of the power requirement too. As I've stated above, you can test out or remove the rest of the system from the DC-DC converter back if you want (it's unlikely to be the cause, as I've explained at length). While that's the same failure mode as a bad connection between the DC-DC converter and the D-Link AP (up to or inside it), you could swap the D-Link AP and the cable from the monitoring hardware attached to the DC-DC converter to that D-Link AP all at the same time. With a new connector and a new D-Link AP, it's unlikely that the problem continues from hardware with those circuit conditions; if that wasn't clear on the first attempt, it'll be real clear by the second. So at that point you've got a problem common to D-Link APs, common to the environment around the D-Link APs (that's not robot power related), or your software. If you want to claim it's a design fault in the D-Link AP or an issue with the environment, then we can consider collected data from the environment and see if there's any consistent evidence. Since we would know there's been no power quality overload, that would be eliminated. If there's little evidence of those issues, that leaves your software.

Again, the point was to quickly check the power quality, not dismantle and reverse engineer a D-Link AP in the field. I can speak from experience that FIRST provides several spare D-Link APs in the spare parts, and we know the cables are trouble, so there should be spares for those as well in there (not so sure, didn't need any at MARS Mount Olive so I did not look). In any case, we now form a dividing line in the power quality issue between the D-Link AP and the DC-DC converter. We can draw another between the PDB and the DC-DC converter. We can draw yet another at the battery. With my little modules that's easy and not much weight.

Again, all of this requires characterization of the components. I thought FIRST did that before they shipped at least the KOP, but maybe not. So we can do it now using data collection and all these robots, with the obvious additional issue of excluding data from teams that might measure with bad tools or not understand how to do it. Enough statistics from robots that are fairly stable will reveal teams that erred in their measurements or have issues they don't know about. Not much we can do about these measurement issues except confirm measurements at events.

I can see how a great number of issues people report could be power quality related. Obviously some are not. Anything that makes the likelihood of a power quality issue being or not being the problem would seem valuable to cut down the troubleshooting and guessing.

stingray27
22-04-2012, 19:05
Not necessarily corresponding to the recent previous posts of the D-Link, we have finally solved our weird and unique problem with losing connections.

Through the districts and the Michigan State tournament, Team RUSH had been experiencing some odd connection issues with the FMS. Not affecting our performance tremendously, we had a large trip time and ms loop going back and forth between the field and the driver station. In addition, we had some lag issues too (although we never knew about it because, as drivers, that is what we practiced with, so we assumed it was normal). After Hybrid was over, we would experience a 500ms to 1000ms drop time in which we would lose all comms and code but then regain it for the rest of the match. We also had some issues with "flickering" comms during end game at MSC.

At first, we assumed it was because we were using 2 cameras and displaying the views on the dashboard. We also were using the classmate as the driver station cpu - big mistake.

The Classmate with 2 camera feeds did not have nearly the processing power to do this. The switch from hybrid to teleop spiked the CPU% to 100%, momentarily losing communication. Also, our lost packets were around 40 the entire match. Switching to a different, larger, newer computer solved all of our lag problems (controls and video streaming) as well as the communication issues.

For those teams out there still using the Classmate and struggling with communication, I would suggest switching the computer you use for your driver station (based on my experience this year).

techhelpbb
22-04-2012, 21:39
I'm going to summarize my post above to make it quick to follow:

Given:

1. Ability to monitor maximum and minimum voltage and current between DC-DC converter and D-Link AP.
2. Characterization of maximum and minimum voltage and current for D-Link AP (and really all FIRST parts should be characterized for at least this).
3. Reasonable characterization of the DC resistance of a properly functional D-Link AP robot power system as measured from the D-Link connector with a resistance meter and the D-Link disconnected.
4. A good reliable connector on the current and voltage monitor that allows swapping the tail that goes to the D-Link AP reliably.
5. Battery voltage monitored to be proper (logged to avoid missed measurements due to lost communication, or trapped by voltage monitoring with a DMM or custom circuit).

Possible issues:

Exceeded high voltage or low voltage
Exceeded high current or low current
Exceeded nothing and communications issues remain

Terminology:

Power supply: The delivery method of power from the battery on the robot to the outputs of the DC/DC converter, including all components and wiring in between. Diagnosis requires a different troubleshooting flow.

Troubleshooting flow:

Symptom: Exceeded high voltage and high current / exceeded high voltage and normal current / exceeded high voltage and low current / exceeded low voltage and normal current / exceeded high voltage and exceeded low voltage during the same run with high current.
Probable cause: Regulation issue.
Fix: Diagnose power supply. If the proper functioning power supply still has issues replace D-Link AP.

Symptoms: Exceeded low voltage and low current / Exceeded high voltage and exceeded low voltage during the same run with low current / Exceeded high voltage and exceeded low voltage during the same run with normal current.
Probable cause: Regulation issue, wiring and/or D-Link AP issues.
Fix: Diagnose power supply then check wiring. If power supply returns to normal voltage and still have low current, replace D-Link AP and connector tail.

Symptoms: Exceeded low voltage and high current / normal voltage and high current
Probable causes: Overload due to short or unusually high D-Link AP current usage.
Fix: Remove the D-Link AP and measure circuit resistance. If it is as characterized and there's nothing like swarf floating around, replace the D-Link AP with/without a new connector tail.

Symptoms: Normal voltage and low current
Probable causes: Bad power connections (outside the D-Link AP back to the DC-DC converter or inside D-Link AP), bad D-Link AP (draws uncharacteristically low current when radio is on), D-Link AP software induced reset.
Fix: Check connections between DC-DC converter and D-Link AP. Replace D-Link AP and connector tail. Analyze field behavior with other robots for common issue. If all that fails you probably had a software induced reboot.

Symptoms: Normal voltage and normal current but communications problems
Probable cause: Not a power quality issue.
Fix: Check Ethernet connections. Replace the D-Link AP with/without connector tail. Analyze field behavior with other robots for common issue. If all that fails you probably had a software issue.
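Read as a lookup table, that flow is easy to transcribe. The sketch below is just the table above in code form; the 'high'/'low'/'normal' flags are relative to characterized limits that still have to be measured, and the cases where both voltage limits are exceeded in one run aren't modeled.

```python
# Sketch of the troubleshooting table above as a lookup. The flags are relative
# to characterized limits that still need to be measured on real robots.
def classify(v_flag, i_flag):
    # v_flag, i_flag: "high", "low", or "normal" versus the characterized limits
    if v_flag == "normal" and i_flag == "normal":
        return ("Not a power quality issue: check Ethernet, swap the AP, compare "
                "with other robots, then suspect software.")
    if v_flag == "normal" and i_flag == "low":
        return ("Bad connection, bad AP, or software-induced reset: check "
                "connections, then swap the AP and connector tail.")
    if i_flag == "high" and v_flag in ("normal", "low"):
        return ("Overload: remove the AP, measure circuit resistance, look for "
                "shorts or swarf, then swap the AP.")
    if v_flag == "low" and i_flag == "low":
        return ("Regulation, wiring and/or D-Link issue: diagnose the power supply, "
                "then wiring; if voltage recovers but current stays low, swap the AP and tail.")
    # Remaining combinations (any exceeded-high voltage, or low voltage with normal current)
    return "Regulation issue: diagnose the power supply; if it checks out, replace the D-Link AP."

print(classify("normal", "low"))
print(classify("high", "normal"))
```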

EricVanWyk
22-04-2012, 21:44
See MaxMax161's post for a description of what we're seeing now: Always fails when first connected to the field, reboot just prior to the match start and it is always good. :confused:

This is different than what we thought we were looking for. Has it always been like this, or when did it change?

You should track down Greg or myself next week.

Deetman
22-04-2012, 21:55
Eric-

Since you know the design of the FRC control system components extremely well, is it fair to say that the current PDB 12V -> 5V DC-DC regulator -> D-Link AP chain is well characterized (publicly or privately), such that all the components are appropriately spec'd when it comes to power, ignoring failed components, bad wiring, etc.? I'd hate for people (myself included) to go down the path of characterizing something that has already been done. I assume the homework has been done, but I know things have changed slightly in the control system setup since it was originally designed.

Steve Warner
22-04-2012, 22:48
We tried the idea of putting a volt meter on the output of the DC-DC converter to catch the minimum voltage. With the meter connected we could not reproduce the problem of the Dlink restarting. We were also using a different driver station PC. Could the meter have smoothed out any voltage spikes that might have been present? What size ferrite and/or capacitor should be used to test this? Would they be legal?

Deetman
22-04-2012, 23:08
We tried the idea of putting a voltage meter on the output of the DC-DC converter to catch the minimum voltage. With the meter connected we could not reproduce the problem of the Dlink restarting. We were also using a different driver station PC. Could the meter have smoothed out any voltage spikes that might have been present? What size ferrite and/or capacitor should be used to test this? Would they be legal?

The meter should not have much, if any, effect on the circuit given its high input impedance. Depending on the quality of the meter, however, you probably will not be able to catch any quick transients with it. An oscilloscope would be much better at catching those.

I see no reason why ferrite beads would be illegal, since they are passive and don't physically connect to the actual conductors in the wire. Appropriately sizing them would depend on the noise you are looking to reject, etc., and I couldn't provide a recommendation at the moment. Capacitors, on the other hand, may be illegal under the current rules.

Unless you've had power-related issues with the D-Link, I would think carefully before adding or changing anything at this point.

Alan Anderson
23-04-2012, 00:53
A bad DC-DC converter would have a hard time maintaining its output voltage with the D-Link AP connected and would fail to source at least the minimum current. At first you'd see ripple, then you'd see the underlying switching clipped.

The usual symptom of a bad 12v-to-5v converter is a failure to regulate its output at all. With 12 volts on the input, it puts approximately 7 volts on the output. There is no immediately obvious effect on the communication, but it's probably stressing the D-Link's power input and I would expect it to fail, perhaps slowly.

techhelpbb
23-04-2012, 05:04
The usual symptom of a bad 12v-to-5v converter is a failure to regulate its output at all. With 12 volts on the input, it puts approximately 7 volts on the output. There is no immediately obvious effect on the communication, but it's probably stressing the D-Link's power input and I would expect it to fail, perhaps slowly.

When you say 7V are you measuring that with a DMM on DC volts?

I would expect a bad DC-DC converter pumping out its switching frequency instead of DC to drive the DC reading of a DMM into strange territory. A low frequency signal like 60Hz AC usually under-reads on the positive side and you might see the negative side occasionally (an artifact of the DMM's DC voltmeter sampling), but I seriously doubt the internal switching frequency of the DC-DC converter would be lower than 1kHz, and it might be much higher. Also there's the chance that as the DC-DC converter reaches beyond its ability to pump energy into its output storage, internal feedback will allow its regulation to creep up, because it'll keep saturating its switching MOSFET for extra cycles to feed the energy storage. Depending on what safety systems are in the DC-DC converter, you might even see some slow peak modulation imposed on the output as it tries to restrain its output voltage, then gives up a bit and tries again in a loop. That would have the effect of bringing the output ever closer to the input (so you'd see peaks between 12V and 5V... like 7V). However, if you looked with an oscilloscope you'd notice that it's not 7VDC, it's really more of an AC small signal. As the D-Link's input capacitance is already within the circuit, it won't easily be able to smooth such a wild input. In all probability, however, the D-Link AP has at least a standard linear regulator in there with some capacitance beyond it, and that'll provide at minimum a few dB of damping on the AC improperly feeding it. I would expect misbehavior... perhaps due to complexity it might not be apparent, but I bet a really thorough review would find it.

There's also the possibility of a measurement difference between a cheap DMM and a true-RMS DMM, even though true-RMS capability usually affects the AC performance of the test equipment, not the DC (I've seen some really cheap DMMs in my time).

I'm still chasing down the DC-DC converter Team 11 found damaged on their robot and when I find it, I'll put an oscilloscope on it. I'm not sure, however, that the voltage it was producing was too high on a DMM. So if someone does encounter a unit that produces 7V according to a DMM, please put an oscilloscope on it and describe the output signal for us.

If I'm correct and the peaks are merely 7V, the effect of the low-pass filtering by the D-Link's internal regulation (which would probably be low-dropout) would be to drop that 7V (not really DC) by a few dB. In effect, a lot of the time it'll be like pumping more like 5VDC or so into the input instead. Occasionally it might peak over, but probably not all that often, which would explain why it keeps running at all... Of course, doing this is still not a great idea, and you're probably right: eventually either the DC-DC converter might give it a hard shot of 9V-12V, or the regulation within the D-Link AP will fail (if it's linear, the result would be thermal damage).

As that applies to your quote of my earlier post: since I suspect that the DC-DC converter's output would then be an AC small signal and not DC under those circumstances, I feel I stand by the idea I presented. The best voltage and current monitoring solution for the robot would approximate an oscilloscope, not a DMM DC voltmeter. Something close to an analog oscilloscope, like a voltage comparator, would see the instantaneous low current and low voltage; it would not effectively take snapshots a few thousand times a second like a DMM would. Since the input capacitance essentially forms part of the output energy storage of the DC-DC converter, over the length of a time frame we could see currents and voltages exactly as I've stated. Also, over the length of the waveform, we could see them as too high in voltage and still too low in current, which if you look back is still a regulation failure symptom.

I will concede, however, that if you put in a circuit with an LED for exceeding the voltage maximum and an LED for exceeding the voltage minimum, fed by an analog comparator, you could see both voltage-monitoring LEDs lit and a current that's too low with a failure mode like this. In effect, the comparator would be fast enough to see one fault at one moment and the other shortly later. Latching the results would produce both lit. I'll add that mode to my cheat sheet on the page before, but there's already a course of troubleshooting for it, starting with diagnosing the robot power supply up to that monitoring.

In any event, it doesn't negate the value of at least attempting to investigate the power quality feeding the D-Link AP; it just adds another possible observation you might encounter.

Alan Anderson
23-04-2012, 09:33
When you say 7V are you measuring that with a DMM on DC volts?

I don't know what instrumentation Al Skierkiewicz (http://www.chiefdelphi.com/forums/member.php?u=172) was using when he characterized the failure (http://www.chiefdelphi.com/forums/showpost.php?p=1143699&postcount=16).

I would expect...you might see...I seriously doubt...there's the chance...In probability...perhaps...There's also the possibility...If I'm correct...I suspect...

It might help if you did a little more reading of what people are actually reporting and a lot less wall-of-text speculation.

techhelpbb
23-04-2012, 09:48
It might help if you did a little more reading of what people are actually reporting and a lot less wall-of-text speculation.

I know that characterization of those values should have been done already.
Now all we have is speculation and swapping parts until we remedy the initial engineering process oversight (and that's all we can have given the missing quantifiable information).
Given that I got the question into the FIRST Q&A forum, I'm well on my way to removing the impediment to correcting that oversight.

Also, I've read every post in this topic, so that point is a straw man argument (http://en.wikipedia.org/wiki/Straw_man).

Never mind that it ignores that my team did have this problem, and it cost us 3 matches chasing phantoms with the help of field personnel while not being able to really do much on the field (and finally, after chasing software and other unrelated issues, they replaced the bad DC-DC converter they found, in part because of my suggestion).

Never mind that reading Al's post does not tell us whether the DC/DC converter was loaded or not, or what tool was used.

Must have set foot in a political forum by mistake. Sorry.
Thanks for letting me know what my time is worth. Here I was thinking we were trying to be thorough.

EricVanWyk
23-04-2012, 10:49
It might help if you did a little more reading of what people are actually reporting and a lot less wall-of-text speculation.


Must have set foot in a political forum by mistake. Sorry.
Thanks for letting me know what my time is worth. Here I was thinking we were trying to be thorough.

We are trying very hard to find the needle in the haystack. Brian, you are adding a lot of extra hay to the top of that stack: the signal-to-noise makes it really hard to follow. Please constrain yourself to posting data. When you voice speculation as fact, it impedes the ability of the community to find the needles.

We value your input and your time - we need to make effective use of everyone's time. To do that, I'll pull from one of my mentor's phrasebook: Be Brief, Be Brilliant.

techhelpbb
23-04-2012, 10:56
When you voice speculation as fact, it impedes the ability of the community to find the needles.

Point of reference: Alan just seemingly explained to me that my posts are full of speculative statements (complete with quoted evidence from them), so I fail to see how I can be voicing the precise opposite. If the details of my concerns are challenged, why should I not explain them, before the next tactic is to claim that my failure to respond indicates my initial proposition was without merit?


We value your input and your time - we need to make effective use of everyone's time. To do that, I'll pull from one of my mentor's phrasebook: Be Brief, Be Brilliant.

Okay, here's the brief solution: someone provide the quantified numbers I've asked for (and that FIRST should already have) and I won't have to defend my reasoning in lengthy posts for those who read things into every detail looking to find error, like claiming over and over that Team 11's robot battery was discharged despite me saying otherwise multiple times. I'll just do it and hand you the simple quantified results (which for all we know vindicate everything you've all been saying, and the power quality issues are infrequent).

The question above remains the same. No straw man argument can distract from it. Do we have access to the quantified data I seek about the D-Link AP, or must we as a community engage in a process to get that data? I was fully prepared to start community data collection at my own financial cost before it was suggested I 'slow down'.

All the other issues, from the software and otherwise, are valid. They deserve recognition, but the format of this forum is not within my ability to control. If you're concerned that I'm burying these additional topics, can someone (because I am unable to) provide a topic split?

EricVanWyk
23-04-2012, 11:12
Eric-

Since you know the design of the FRC control system components extremely well, is it fair to say that the current PDB 12V -> 5V DC-DC regulator -> D-Link AP chain is well characterized (publicly or privately), such that all the components are appropriately spec'd when it comes to power, ignoring failed components, bad wiring, etc.? I'd hate for people (myself included) to go down the path of characterizing something that has already been done. I assume the homework has been done, but I know things have changed slightly in the control system setup since it was originally designed.

My responsibilities shifted more towards FLL and FTC, so I'm not as involved in the day to day of FRC as I used to be. I designed the PDB, but FRCHQ did the verification work for the 5V DC -> D-Link this year.

The 12->12 is very well known, the 12->5 is well specified, but the D-Link's input stage doesn't have the same level of documentation available. We've cracked them open and V&V-ed from the inside, but unfortunately reverse engineering can never create a datasheet: we can only say that all the samples we tested worked well with good margin.

It may be worth while to take an oscilloscope to a working and a non-working setup. The frustrating part is that we haven't been able to reliably reproduce the issue in a lab, so we have to drag a subset of our equipment to competitions.

techhelpbb
23-04-2012, 14:18
My responsibilities shifted more towards FLL and FTC, so I'm not as involved in the day to day of FRC as I used to be. I designed the PDB, but FRCHQ did the verification work for the 5V DC -> D-Link this year.

The 12->12 is very well known, the 12->5 is well specified, but the D-Link's input stage doesn't have the same level of documentation available. We've cracked them open and V&V-ed from the inside, but unfortunately reverse engineering can never create a datasheet: we can only say that all the samples we tested worked well with good margin.

It may be worth while to take an oscilloscope to a working and a non-working setup. The frustrating part is that we haven't been able to reliably reproduce the issue in a lab, so we have to drag a subset of our equipment to competitions.

Part of what I write is software used by gallium arsenide wafer fabricators to characterize their wafer lots and their prototype fabrication runs. There are the design expectations, and then there are the actual data results for thousands and thousands of units on several wafers. Unless changes are made because the actual data fails to satisfy some critical parameter of the design expectations, they are reverse engineering to create what becomes the supporting evidence for those semiconductors' data sheets.

It might be true that you can't necessarily reverse engineer every aspect of the original design expectations. It's often the case that the actual product behaves sufficiently to serve the original design priorities but parts of it perform outside of the original expectations.

The point being that we don't hand a customer the design expectations and tell them this must be what they got because the product seems to meet the critical design priorities. We test, and hand data backed by real measurements to the customer (hundreds of different measurements). Sometimes we don't perform enough tests, so we do some more because there's a problem with the RF modules after the fact, often involving the amount of power these RF modules can consume when added to other circuits which toggle their states of operation (these are usually cell phone RF circuits and WiFi chipsets).

The veracity of all these test results effectively requires confirmation by quantity. The more of the things you test, the more likely you are to notice the tolerances. That's where statistics come in.

Nonetheless, the problem is that now that I've slowed down to let other ideas pass, I've lost the time to collect data relevant to the concerns I expressed from a wider selection of working and non-working robots. It's not likely that by the middle of this week a whole lot of teams will take the time to perform a bunch of measurements to add to the measurements we can get at competition. In a real way the delay has removed a key validation aspect from the characterization process by limiting both the time and the quantity of testing at the Championship event.

Between the fact that I need a characterization to use my modules and the fact that FIRST has yet to even answer the question about the legality of the modules for this year in their forum, there's little sense in sending these items to the Championship. I grow increasingly concerned that the necessary characterization parameters will not be sufficiently determined as a result of this contrived delay's impact, and that my circuit will therefore produce dubious results through misapplication.

It greatly bothers me that FIRST might have tested an unspecified number of D-Link APs as samples from unspecified lots from D-Link and merely tested for the margin from overloading the DC-DC converter on the bench. It has partially blinded us at the troubleshooting phase and now I have basically run out of time to remedy this using the community and we are left merely with the same troubleshooting process that left us pecking and hunting for 3 matches while stuck like a brick on the field.

I will be asking FIRST to approve my units for testing. It seems (contrary to forum commentary I read when I proposed making these very modules for the Jaguar in the first place) that FIRST does not in fact have suitable test equipment to test on any arbitrary moving robot. That's what these circuits I made addressed as a primary design attribute and I want them to get a fair evaluation in the environment of quantified results that they deserve.

In the meantime, the good news is that if FIRST does answer my question in the official FIRST Q&A forum, then they'll open the door to allowing the cRIO to act as a datalogger as well (at least for voltage) while connected to the D-Link AP's power connection. This would make it entirely clear whether or not Mikets's team's C++ class can be used on a fielded competition robot (see page 8 of this topic). So while that's not the most comprehensive solution to my concerns, it's something we can at least do for robots that run C++ as he has offered it (in theory all FIRST robots can do this with their software, but you'll need to figure out how to do it for each language on each robot). Luckily, as a datalogger, the actual characterized component information won't be too critical. Too bad we have no idea how fast the cRIO can actually sample that voltage, and my question to FIRST doesn't include sampling current (my reasons for that decision already stated).
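
To make the datalogger idea concrete, here is a minimal sketch of that kind of voltage logging, assuming the 2012-era WPILib C++ API (AnalogChannel, Timer) and a spare analog input wired to the converter's 5 V output; the channel number and file name are made up for the example, and this is not Mikets's class:

// Sketch only, under the assumptions stated above.
#include <cstdio>
#include "WPILib.h"

class RadioRailLogger
{
public:
    RadioRailLogger() : m_rail(1), m_log(fopen("rail_log.csv", "w")) {}
    ~RadioRailLogger() { if (m_log) fclose(m_log); }

    // Call from any periodic routine (each DS packet, a Notifier, etc.);
    // the achievable sample rate is limited by how often this gets called.
    void Sample()
    {
        double t = Timer::GetFPGATimestamp();   // seconds
        double v = m_rail.GetVoltage();         // volts seen at analog channel 1
        if (m_log) fprintf(m_log, "%.6f,%.3f\n", t, v);
    }

private:
    AnalogChannel m_rail;   // input wired (if legal) to the 5 V feed of the AP
    FILE *m_log;            // plain CSV on the cRIO's file system
};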

Al Skierkiewicz
23-04-2012, 15:33
OK Brian,
Let me repeat in a different way. Modeling the current drawn by the Dlink is a waste of time for the reasons I already outlined. When I spoke of the variable current demands I was not thinking in terms of msec changes but in terms of average current drawn. The reality is that the type of data you want to collect in a real world test would give you absolutely nothing useful. The average current delivered will (I guarantee) change with variables that are not under your control (and there is no way for many people to control those variables outside of the manufacturer). The most variable of these is input RF signal processing, which depends on the number of active channels, output power, SWR, modulation usage, packet repeats, and data transfer. Do you have an RF proof anechoic chamber at your disposal?
The radio link does not fail with changes in current except when the wiring has high series resistance. I am glad that a suspect 5 volt converter turned out to be the issue. However, let's face facts: it is far easier to replace a suspect device with a known failure mode than to attempt to shoot holes in an otherwise working system.
To remind those who are lurking.
1. A failure of the 12 volt PD power supply occurs when the battery supply falls to 4.5 volts or less. With poor mechanical design and/or electrical primary wiring, this is a real world possibility. The result is a full radio reboot of 50 sec, a full cRIO reboot of more than 30 sec, or both. This supply is capable of three times the current of the Dlink.
2. The five volt supply will fail when the input voltage is less than 7 volts. It is designed to regulate at least 20 times the current demand of the Dlink.
3. The permanent failure mode of the 5 volt regulator makes it look like a big resistor. The output is about 7 volts with no movement on the robot and it tracks the input when it is pulled down under varying load. There is no strict data on this failure as it is simply the series resistance of molten silicon in an individual device.
4. It is well documented that there are a few things that can take the data flow down. This is not an FMS disconnect as many people report. It is either a maximum CPU usage in the Crio, the DS computer, or an overload due to high data throughput experienced when the camera(s) is used at high resolution.
5. Let us not forget that there are certain safety protocols in use to insure that when the field is no longer capable of passing control data (i.e. robot enable) or the match is stopped, all robot function must also stop. This takes priority over everything else.


Of the odd failure modes, many are attributable to mishandling by the team. Number one of these is wiring the radio backward, followed by wiring the 5 volt converter backwards. The next most common is the team who failed to wire the 5 volt converter at all. While the radio may work for a while with no regulator, it will start to fail in unusual ways even when the regulator is correctly wired in place.
As to cRIO related failures, these are very common and many are well documented. Please search for things like "feeding the watchdog" and odd behavior with certain LabView implementations that are marked as unused. Finally, check to see if your cRIO is logging activity and how often this is occurring. High loop times are a good indicator that something is wrong in this area. If you suspect something in the radio circuitry, first watch the RSL to see what is happening on the robot. If the cRIO reboots, you will see it there. If your robot always fails on a particular robot move, watch the lights on the PD (this is why you should mount it where we can see it) and see if they go dim or go out. Watch the LEDs on the Dlink to see if they go out or if they go dim. The robot will talk to you, but you have to listen. Finally, check with the FTA when failure occurs on the field. They are logging things like battery voltage, lost packets, and field connection. I have had only one failure in the last three years that I have seen and couldn't explain. However, that was when we didn't have CSAs checking to see if teams have shot themselves in the foot.
All the people I respect on this forum tell me that the power supply to the radio cannot be the problem and my gut tells me the same thing.

As to what ferrite beads may work in keeping RF out of the radio when used on the power wiring or ethernet connections: I know that there are a couple of snap-on chokes available from Radio Shack that could work. Others that I know of are available from ham radio Internet sites, but it is unlikely at this late date that you would get them in time.

techhelpbb
23-04-2012, 16:04
OK Brian,
Let me repeat in a different way. Modeling the current drawn by the Dlink is a waste of time for the reasons I already outlined. When I spoke of the variable current demands I was not thinking in terms of msec changes but in terms of average current drawn.


Average current drawn wouldn't be of value. That's why I designed a circuit fast enough to see momentary changes in the input voltage like an oscilloscope.


The reality is that the type of data you want to collect in a real world test would give you absolutely nothing useful. The average current delivered will (I guarantee) change with variables that are not under your control (and there is no way for many people to control those variables outside of the manufacturer). The most variable of these is input RF signal processing, which depends on the number of active channels, output power, SWR, modulation usage, packet repeats, and data transfer.


Turn on or off all the features you like. There's still a maximum and minimum current the unit can draw. I'm not characterizing what it draws at each operation, only what the maximum or minimum peak is.


Do you have an RF proof anechoic chamber at your disposal?


Don't need anything like this to test what I've asked. Could do it in an old microwave with the electronics removed and even reuse the metalized stirrer fan from the magnetron section if you want to get funky with it.

The inside of a microwave oven is an RF shielded box (with some nice leaks around the door). It'll reflect and absorb 2.4GHz well, on up to 5GHz. The metalized plastic stirrer fan would do what it does to the microwaves from the magnetron and form an interference pattern.

You don't need that sort of test equipment to characterize the current input on the DC power supply. You just need to max out the radio's current draw. You'd need an anechoic chamber if you wished to reverse engineer the details of the radio to specifics, and I've never asked for that at all.


The radio link does not fail with changes in current except when the wiring has high series resistance.


I wasn't looking to test the details of the radio link; this is another straw man. I was looking to see whether the radio is on and how much it and the rest of the unit can draw above that point, to look for wiring issues and to generally make power quality issues locatable. I already detailed how, concisely, and clearly it was all ignored.


All the people I respect on this forum tell me that the power supply to the radio cannot be the problem and my gut tells me the same thing.


I bet you didn't intend this statement the way it appears from my perspective. Quantifiable data isn't a popularity contest. Making it into one and silencing alternate opinions isn't a great idea. In fact it's the worst possible idea once we know that the data we need, from the environment we need it in, doesn't exist, so you are speculating as much as you might claim I am (and more importantly, the very people whose opinions you respect are, by their own communications in this topic, effectively the reason I know the data I seek wasn't considered).

I feel I'm being effectively bullied on this subject in a way that can't produce anything of value for FIRST. I've publicly asked for the topic to be split (and I can't even start a topic myself, as I've pointed out) to remove the cover distraction that I'm interfering in the discussion, and my request has been ignored. It appears that the parties involved with the most access don't have the data I require. I have now been stalled to prevent my use of the community to get that data. Some seem intent on ignoring my posts while complaining they are too long. So I'm going to just drop the subject, which was apparently the mission statement of the day here. That doesn't really serve the engineering or scientific interest of FIRST, but it does let me get some of the information I wanted without any further opportunities for interference.

If FIRST allows the use of the cRIO to log that data on the Championship field I hope someone actually does it.

You know the sort of funny part is that it doesn't matter if FIRST uses my little comparator modules or not officially. They have so many applications that getting money to develop them seems no issue at all. I wonder why there's so much interest in an idea people claim can't be applied.

Al Skierkiewicz
23-04-2012, 17:49
Brian,
You already posted that high frequency changes in current would not be visible due to all the low frequency filters in circuit. I agree. You won't see them so that gives you no data as I pointed out above.
There are features that you cannot control and you cannot predict so again there is no way to collect meaningful current data.
Microwave oven? It is not an RF proof chamber; it is simply a cavity with an opening on one side. Most ovens use a 1/4 wave shorted stub to prevent leakage. And you will not be able to test real world activity since you won't have another radio that is connected to the wireless. The SWR that the Dlink will experience in this environment is likely to be extreme and changeable with a small movement within the cavity.

Above all, I don't want it to seem that I am bullying you. I do have a vested interest in preventing the others who read this discussion from being distracted from collecting meaningful in situ data and understanding the real problems when they exist. If there is a real problem I (and FIRST) need more than the sampled data of the current draw on the radio collected in a sea of variables. We (Inspectors, First Engineering, National Instruments and CSAs) require hard data. i.e. did the radio stop transmitting, did the Crio reboot, did the radio reboot, did the CPU hit max usage, did the battery fall below 4.5 volts, what language is the team using to program the radio, can we see the code?
Collecting this data will point us to a particular component failure or to a software issue that hopefully will be repeatable under real conditions.

techhelpbb
23-04-2012, 18:04
Brian,
You already posted that high frequency changes in current would not be visible due to all the low frequency filters in circuit. I agree. You won't see them so that gives you no data as I pointed out above.


The high frequency isn't what I'm worried about; it's the transients that drop the current because for a moment a loose wire disconnects the circuit or blocks it with a high resistance. That's why I used the alarm circuit example. I'm not interested in measuring RF bleed-in or anything exotic like that; if I were, I'd bring my spectrum analyzer and the near field probes.


There are features that you cannot control and you cannot predict so again there is no way to collect meaningful current data.


So far as I'm concerned, given all the filtering at the point in the circuit I'm interested in, all those features do is consume power, and I'm only looking at the maximum and minimum peaks... I don't care whether the window between them is 1A or not. I don't need to know exactly when anything in the D-Link turns on or off *except* when the radio first turns on during the initial robot power-up sequence (I only need to know about that because for a bit of time the D-Link as a unit will draw much lower power).


Microwave oven? It is not an RF proof chamber; it is simply a cavity with an opening on one side. Most ovens use a 1/4 wave shorted stub to prevent leakage. And you will not be able to test real world activity since you won't have another radio that is connected to the wireless. The SWR that the Dlink will experience in this environment is likely to be extreme and changeable with a small movement within the cavity.


I know; that's the point. I want to max it out so it draws maximum current. When you remove the microwave's outer case and the magnetron assembly, you'll be left with an opening in the cavity (usually a grid of holes) and sometimes a metalized plastic stirrer fan. The D-Link AP inside that cavity would have to drive its RF section into the max region to maintain a connection to a wireless device outside the microwave, past the stirrer fan, which itself would cause interference. Of course some microwaves don't have metalized plastic stirrer fans, but any metal-bladed fan would work, even moving tin foil on a motor. In any case, during the test you leave the outer case of the microwave off... the point is not to contain all the RF signal but to make the D-Link AP work very, very hard to communicate with the wireless network outside (a point not served by putting the outer lid back on and effectively building an RF shield around the interior shield).


Above all, I don't want it to seem that I am bullying you. I do have a vested interest in preventing the others who read this discussion from being distracted from collecting meaningful in situ data and understanding the real problems when they exist.


I completely understand, and hence my request to be offered a different topic so we can discuss this specific issue without in any way ignoring the other valuable advice you wish to ensure is communicated.


If there is a real problem I (and FIRST) need more than the sampled data of the current draw on the radio collected in a sea of variables. We (Inspectors, First Engineering, National Instruments and CSAs) require hard data. i.e. did the radio stop transmitting, did the Crio reboot, did the radio reboot, did the CPU hit max usage, did the battery fall below 4.5 volts, what language is the team using to program the radio, can we see the code?


If you look back in this topic to page 10 towards the bottom I made a nice summary of exactly what I thought this data would do for you. I prefer to use my circuit to do this test for reasons I've stated repeatedly including there's no data to trawl through just maybe 4 LEDs...but if I can't get that I'd love to see the data log later and I'll be interested.


Collecting this data will point us to a particular component failure or to a software issue that hopefully will be repeatable under real conditions.

Exactly and what I'm doing is collecting one subset of data that is particularly of value in the context I already feel I summarized quickly last night back towards the bottom of page 10.

mjcoss
23-04-2012, 18:23
Well let me add some more hay :) And this thread seems like a good place to report some information.

I've been concerned about the field network for the past few years because it *seemed* that *which* robots were on the field was a contributing factor to the overall responsiveness, jitter and lag of the network. For example, the number of robots on the field that had onboard ip cameras seemed to be important. So this year, after our first district event, I started asking questions, and then at the MAR championship I brought a hardware packet sniffer to take a look at the field network, and I feel I should share what I discovered.

I originally believed that each robot was going to be assigned a separate channel and that they were using 802.11n @ 5 GHz. The network at the MAR championship was indeed using 802.11n @ 5 GHz, and they were using a wide channel; however, all robots were sharing the same channel and as such were sharing a total theoretical bandwidth of 300 Mbit/s.

At the Mt. Olive competition, I was told that the robots were running on channel 6. If true, this would have meant that they were running in the 2.4 GHz range with over a dozen other 802.11g networks, which would have considerably reduced the theoretical bandwidth.

In general, due to a number of different factors, if you get half the theoretical bandwidth on a wireless network, you're doing well. So let's assume that 150 Mbit/s is our expected available bandwidth on the field, if you're using a wide 802.11n channel @ 5 GHz. It's much less if you're using 802.11n @ 2.4 GHz, as the interference will be awful.

I've asked but haven't heard whether or not they have established any QoS or bandwidth limitations on each SSID in the Cisco access point that they are using. Without any controls, it will be a free-for-all for the available bandwidth.

This bandwidth is used by the robots in a number of different ways, and here I’m just talking about communication between the robot and the driver station laptop, as I’m not sure what the FMS is using:
VIDEO streaming
If you have an onboard camera and send a video stream from robot to driver station at 640x480 in 24 bit color @ 30 frames a second, that's roughly 220 Mbit/s of uncompressed raw bits (see the back-of-the-envelope sketch after this list). MotionJPEG will compress that down on average to around 10 - 15 Mbit/s; H.264 will do better. I've heard that some teams have 2 cameras onboard. The Axis camera supports a number of options to reduce the bits, but there is no rule about how you configure the cameras, and they would work fine in your local build environment and even on the practice fields at competitions where you're the only user of that wireless channel.
Dashboard data
There is the "normal" dashboard that is part of the default code, and the default dashboard sends data, if I remember correctly, at about 10 times a second. For reasons that I can't remember at this point, we are actually sending the data at 40 times a second from our robot. This is a relatively small amount of data, but it doesn't have to be. With the addition of Smart Dashboards, and other custom dashboards, and no guiding principle on the volume of data this could be a significant amount of data or just a dribble. In our case, we're sending ~1 Kbits per update, or 40 Kbits per second.
Driver station data
This is data packaged by the driver station application provided by FIRST and sends the values for the input devices attached to your driver station to the robot. I've never looked into how much data is being sent, or the frequency with which it's sent but it's not a lot of data probably on the order of 40 - 50 Kbits per second.
Other network traffic
There are several network ports that are open for teams to use for communication between the robot and the driver station. In our case, we ran a UDP server on our robot to collect the results of vision processing performed by our driver station. We sent the results of the calculations back to the robot at the rate of 10 times a second. The data is small (72 bits) so we're sending only 720 bits per second.
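
For reference, here's a quick back-of-the-envelope check of the figures above; it's a sketch only, and the inputs are just the assumptions stated in this post, not measurements:

// Back-of-the-envelope bandwidth estimates for the traffic classes above.
// The input figures are the assumptions stated in this post, not measurements.
#include <cstdio>

int main()
{
    // Raw (uncompressed) camera stream: 640x480, 24-bit color, 30 fps.
    const double rawCameraBps = 640.0 * 480.0 * 24.0 * 30.0;   // ~221 Mbit/s
    // MotionJPEG typically brings that down to the order of 10-15 Mbit/s.

    // Custom dashboard: ~1 kbit per update, 40 updates per second.
    const double dashboardBps = 1000.0 * 40.0;                 // 40 kbit/s

    // Vision results over UDP: 72 bits, 10 times per second.
    const double visionBps = 72.0 * 10.0;                      // 720 bit/s

    std::printf("raw camera: %.1f Mbit/s\n", rawCameraBps / 1e6);
    std::printf("dashboard : %.1f kbit/s\n", dashboardBps / 1e3);
    std::printf("vision UDP: %.0f bit/s\n", visionBps);
    return 0;
}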

So for us, our network utilization was small: ~1 Mbit/s for the camera (we're using gray scale, 320x240, and 30 frames a second with MotionJPEG compression) and at best another 1 Mbit/s for the remaining traffic. But that is due to choices that we made. I could easily imagine making other choices, and given that I was operating under the belief that we had a full channel to ourselves, I might have gone down a totally different path.

The thing about this is that while there is the catastrophic failure mode where the field network crashes, there are many other situations where the latency and jitter can spike and dip badly. VxWorks' IP stack is not particularly robust, and for some teams that stack has to handle all of the time sensitive CAN bus traffic as well as driver station, dashboard and custom traffic.

Further, unless you change your default Iterative robot code (at least in C++), your periodic functions are synchronized with the arrival rate of the packets from the driver station. Now in a well behaved network, the arrival rate should be pretty stable. But if your code assumes stable packet arrivals, you can run into all sorts of timing issues; a sketch of one way to decouple the two is below.
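
As a minimal illustration of that decoupling (plain standard C++, not WPILib code; the structure and names are hypothetical), the idea is to pace the control loop from a steady clock and treat the most recently received driver station values as cached data:

// Sketch only: a fixed-rate control loop that consumes the latest cached
// driver-station values instead of letting packet arrival pace the loop.
#include <chrono>
#include <cstdio>
#include <thread>

struct DriverInputs { double throttle; double steer; };

// In a real robot program the packet-receive path would update this (with
// proper synchronization); here it simply holds the latest values.
DriverInputs g_latest = {0.0, 0.0};

int main()
{
    using clock = std::chrono::steady_clock;
    const auto period = std::chrono::milliseconds(20);   // fixed 50 Hz loop
    auto next = clock::now();

    for (int i = 0; i < 250; ++i)                         // bounded run for the example
    {
        DriverInputs in = g_latest;                       // whatever arrived most recently
        std::printf("tick %d throttle=%.2f steer=%.2f\n", i, in.throttle, in.steer);
        next += period;
        std::this_thread::sleep_until(next);              // timing independent of packets
    }
    return 0;
}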

In addition, both the camera traffic, and the driver station packets are using TCP which can be very unfair when it comes to sharing bandwidth. A greedy application can ramp up its utilization of the bandwidth, causing starvation of others. And then there's retransmissions, etc.

Is it possible to saturate the network? You betcha. Is it service impacting? Yes to everyone, including you. Is there anything that can be done? Yes.

When I examined the network at the MAR championship, I saw a number of teams that were having problems associating with the field. There were repeated attempts by the robots' DLINKs to associate with the field access point. I also saw many corrupt frames.

Our DLINK at Mt. Olive simply gave up completely, rebooting during several of our matches. It had been fine, and then it just started rebooting when we hit another robot or a field element. Of course, to our drive team it looked like we lost communication to the field (which we did), but it was the DLINK that was rebooting. And no, there weren't any loose wires, except maybe inside the DLINK housing. I heard from a team that they had a DLINK that only worked when it was standing on edge; lay it flat and it didn't work. I think they are cheaply made and are really not meant for the hostile environment of a FIRST robotics competition. A ruggedized access point/bridge would be a beautiful thing.

I don't know why FIRST chose not to have 6 separate access points to provide a channel for each robot. Maybe they just figured that 150 Mbits/6 = 25, and who’d need more than that. I don't know if they are configuring QoS to ensure a fair share of the network. I will be looking at the network in our next off season competition, and try to come to some conclusion about what exactly is really going on.

Joe Ross
23-04-2012, 21:12
I originally believed that each robot was going to be assigned a separate channel and that they were using 802.11n @ 5 GHz. The network at the MAR championship was indeed using 802.11n @ 5 GHz, and they were using a wide channel; however, all robots were sharing the same channel and as such were sharing a total theoretical bandwidth of 300 Mbit/s.

Here's the answer from the GDC on bandwidth.

Game - The Arena » The Arena » The Player Stations
Q. Are there any bandwidth limits enforced by the field/FMS on the ports listed in section 2.2.9? Is each team able to access the full 802.11n bandwidth, or is it divided by the 6 teams on the field? FRC0330 2012-02-23
A. There are currently no bandwidth limits in place in the field network. In theory, each team has 50Mbits/second (300Mbits/6) available, but that’s not actually realistic. In reality, each team is likely to have ~10-12Mbits/s available. This rate will vary depending on the location of the radio on the Robot and the amount of wireless traffic present in the venue at 5GHz. While this information may help give teams an idea of what to expect, note that there is no guaranteed level of bandwidth on the playing field.

techhelpbb
24-04-2012, 09:11
Back on page 3 I mentioned that the networks at the MAR Mount Olive district event all appeared to be on channel 6 (of course I couldn't look at the 5GHz networks with my phone). RufflesRidge assured me shortly afterward that they were not.

I never got an answer to my question about whether the 2.4GHz sections of the D-Link are turned off when the fields configure the D-Link APs (post #48 on page 4). If they are turned off, then in theory the channel overlap in 2.4GHz should not be a problem. However, if the second band remains on, the radio will attempt to interact with the 2.4GHz networks as well as the 5GHz+ networks.

I got confirmation of my information from people who helped set up the field at the MAR Mount Olive district; I'm not sure where RufflesRidge got his information.

mjcoss
24-04-2012, 11:03
I had a separate source who assured me that at Mt Olive they were on channel 6. I did not check whether that was true or not. But to me, having no controls on the apportioning of bandwidth and having robots share bandwidth means that which robots are on the field really can impact the communications between the robots and their driver stations. And steps could be taken to mitigate these issues so that there's a level playing field. ;)

techhelpbb
25-04-2012, 11:05
I wrote a several page document on how to address what I suspect is Al's concern regarding my previous troubleshooting flow described on page 10, post 149 of this topic.

This is about checking the power quality to the D-Link AP.

I suspect Al is concerned about a 'blind spot' (see my post #144, page 10) in my troubleshooting flow which I can resolve with:

A. At least one troubleshooting D-Link AP provided in spare parts at a competition, with the barrel connector removed and a tail of stranded wire soldered to the PCB. That wire should have a good mechanical strain relief and come with a ferrite RF choke to remove noise. This has been asked for at past events and in past years as well.

B. At least one low temperature coefficient resistive load (a physically large resistor, probably with a fan/heatsink) that could be provided in spare parts at a competition, with a maximum and minimum current monitor in series with it. Its value would be chosen to draw, at all times, the same current as the D-Link AP can draw at maximum. The current monitor limits are set using Ohm's Law (E = I x R) and the maximum and minimum voltage limits we need to characterize from the DC-DC converters we are using to power the D-Link AP. (A rough sizing sketch follows this list.)
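
As a rough sizing sketch for item B (the 5 V figure reflects the converter feeding the AP; the 1 A maximum draw is only an assumed placeholder, since pinning that number down is exactly the characterization being asked for):

// Sketch only: sizing a dummy load to emulate an assumed worst-case AP draw.
#include <cstdio>

int main()
{
    const double railVolts = 5.0;   // DC-DC converter output feeding the AP
    const double maxAmps   = 1.0;   // assumed worst-case draw to emulate

    const double loadOhms  = railVolts / maxAmps;   // R = E / I
    const double loadWatts = railVolts * maxAmps;   // P = E x I

    std::printf("dummy load: %.1f ohm, dissipating %.1f W continuously\n",
                loadOhms, loadWatts);
    std::printf("pick a low-tempco power resistor rated well above %.1f W, with heatsink/fan\n",
                loadWatts);
    return 0;
}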

I've asked 2 times to have this topic split (because I can't) so that this matter would not distract from the other valuable observations and suggestions but since it hasn't been done I'll post the detailed version elsewhere.

Suffice it to say I think this resolves Al's concerns as best as I understand his communications at this time. I'll post the link later when I have time. I think anyone that's really interested will figure my solution out from the minimum of detail I just provided.

techhelpbb
26-04-2012, 09:09
Interesting observation, as of right now the questions asked after the one about monitoring the D-Link have been answered in the official Q&A forum but this question, now days old, has not.

So much effort to avoid checking a simple voltage with already approved hardware. The amount of energy expended to avoid removing all doubt now greatly exceeds that of just doing it.

tickspe15
26-04-2012, 21:06
We had the same problem in Spokane... try setting your camera to 6 fps.

DonRotolo
27-04-2012, 21:46
I think anyone that's really interested will figure my solution out from the minimum of detail I just provided.
Before we move to "solutions" the step of identifying the problem has yet to be completed. PLEASE stop chasing ghosts publicly, the symptoms so far do not support a power issue with the radios. If you want to test your robot at home, feel free, but stop throwing assertions and opinions out there as facts because those of us who are trying to find a real problem are not happy with waters getting muddied. Bad info is worse than no info, since unsuspecting teams may rely on bad info.

That being said, our comms issue went away in our last 2 matches at MAR CMP and in 8 matches here in STL has not returned. Nothing was changed, so we don't know why. Yet.

techhelpbb
28-04-2012, 17:13
Before we move to "solutions" the step of identifying the problem has yet to be completed. PLEASE stop chasing ghosts publicly, the symptoms so far do not support a power issue with the radios. If you want to test your robot at home, feel free, but stop throwing assertions and opinions out there as facts because those of us who are trying to find a real problem are not happy with waters getting muddied. Bad info is worse than no info, since unsuspecting teams may rely on bad info.

That being said, our comms issue went away in our last 2 matches at MAR CMP and in 8 matches here in STL has not returned. Nothing was changed, so we don't know why. Yet.

I haven't posted to this topic in a bit, and yesterday when you posted this I was on my way to FedEx to deliver a backup shipment of boards to Linuxboy. Sorry I didn't see this sooner, but I'm recovering from pneumonia.

First off, I offered merely a solution to remove power quality issues for the D-Link AP, not a guaranteed solution to everything that affects the communications. Please quote me in context.

Secondly:

Team 11 certainly did have power issues with the radio systems.
Other teams certainly did have power issues with the radio systems.

Telling me there is no power issue with the radio denies Team 11 had a power issue with the radio subsystem and so did others. It denies that I have here a bad DC-DC converter that took 3 matches worth of dysfunction on the field in Philly to find, and others have seen the problem go away when the PDB was replaced.

I'm sure it's a real problem if our team is stuck on the field for 3 matches just like anyone who has problems besides the power quality issues. Just as I am sure it's a real problem if your team is stuck because of a software problem.

The fact is there is obviously more than one thing not quite right with the radios and the systems that power them. That was my entire point. Outright denying that a power problem can and sometimes does exist, either because of the parts or because of the wiring, refutes the evidence that even those who deny it present. It also misdirects the people who actually do have power problems affecting them.

A loose connector on the D-Link is a power quality issue. Why are we intent on saying that doesn't happen when in point of fact even Al notes it often does? Not all power quality issues are about the components; sometimes it's just the wiring in the power system.

So far as how I troubleshoot the problem: how would you like me to troubleshoot a problem that, per the posts in this topic, seems to exist most often only on a competition field... when I don't have a competition field to work with? Even when I worked spare parts at MAR Mount Olive, it's not like I can put things on that competition field that FIRST won't allow. Furthermore, thanks to my efforts we can no longer dispute that FIRST will not allow it; they officially said no (as I originally stated they would, before everyone led me to believe I was incorrect... again, why is this my problem when I was correct?).

We do have a field but we certainly don't have the competition field parts.

If the issue was about me distracting this topic I openly offered to take this conversation out of this topic. Such an argument is a straw man.

I do respect you and the others, Don. However, expending so much effort on scapegoating me to cover for power quality issues that sometimes do exist is not a reasonable thing to do. The reasonable thing to do, if you all feel it distracts, is to offer me another place to discuss this issue, like another topic (I can't create topics myself).

If you tell people to troubleshoot for a software issue and they have a power quality issue, then like Team 11 they'll spin their wheels rifling through the software looking for what might just be a bad power supply to the D-Link. Then what? Whose fault is it then? The guy offering the solution to help rule power quality issues out quickly? I think not.

Furthermore, I asked repeatedly in this topic if anyone load tested the D-Link power supply and no one replied. So if you wonder how the bad wiring or bad component (no matter the rarity) might slip through you need look no further than that.

Additionally, it's clear that despite the general assurance we'll find all the problems if I just stop communicating, Championships are over and obviously these issues remain. So what have we solved? How does the situation today differ from the situation at the start of March?

RyanN
28-04-2012, 18:15
I haven't posted to this topic in a bit, and yesterday when you posted this I was on my way to FedEx to deliver a backup shipment of boards to Linuxboy. Sorry I didn't see this sooner, but I'm recovering from pneumonia.

First off, I offered merely a solution to remove power quality issues for the D-Link AP, not a guaranteed solution to everything that affects the communications.

Secondly:

Team 11 certainly did have power issues with the radio systems.
Other teams certainly did have power issues with the radio systems.

Telling me there is no power issue with the radio denies Team 11 had a power issue with the radio subsystem and so did others. It denies that I have here a bad DC-DC converter that took 3 matches worth of dysfunction on the field in Philly to find, and others have seen the problem go away when the PDB was replaced.

I'm sure it's a real problem if our team is stuck on the field for 3 matches just like anyone who has problems besides the power quality issues. Just as I am sure it's a real problem if your team is stuck because of a software problem.

The fact is there is obviously more than one thing not quite right with the radios and the systems that power them. That was my entire point. Outright denying that a power problem can and sometimes does exist, either because of the parts or because of the wiring, refutes the evidence that even those who deny it present. It also misdirects the people who actually do have power problems affecting them.

A loose connector on the D-Link is a power quality issue. Why are we intent on saying that doesn't happen when in point of fact even Al notes it often does? Not all power quality issues are about the components; sometimes it's just the wiring in the power system.

So far as how I troubleshoot the problem: how would you like me to troubleshoot a problem that, per the posts in this topic, seems to exist most often only on a competition field... when I don't have a competition field to work with? Even when I worked spare parts at MAR Mount Olive, it's not like I can put things on that competition field that FIRST won't allow.

We do have a field but we certainly don't have the competition field parts.

If the issue was about me distracting this topic I openly offered to take this conversation out of this topic.

I do respect you and the others, Don. However, expending so much effort on scapegoating me to cover for power quality issues that sometimes do exist is not a reasonable thing to do.

If you tell people to troubleshoot for a software issue and they have a power quality issue then like Team 11 they'll spin their wheels rifling through the software looking for what might just be a bad power supply to the D-Link.

Our issue at the Bayou regional was most certainly not a power issue. We ruled that out early enough... Why is it that a team goes to one regional and doesn't run a single match without dropping field communication, then goes to a different regional, with a different field, different staff, and a different environment, and runs perfectly the entire time with no fixes applied? We literally packed up a 'broken' robot (as the FTA deemed ours to be at Bayou), pulled it out at the LSR, and had a highly competitive robot that worked 'out-of-the-box'.

There's more going on here than just a power issue, but that's not to say that no one's communication problems are due to power related issues. I'll admit that we did have power issues early on at Bayou, or at least it seemed that way: when we smacked the Dlink, the power would reset. We corrected it by using the OEM power cable and hot gluing the outside of the plug to the radio to avoid it moving during vibration and hits.

Bandwidth is another explanation I don't believe. While we had a couple of advanced teams at Bayou, it was nothing compared to what we saw at the LSR. Only a handful of robots at the Bayou used live streaming video from the robot to the DS; at LSR, I would say that number doubled or tripled.

Interference? Maybe or maybe not. I'm not an expert in radio interference, so I cannot say whether or not that was the issue. I did notice at least half a dozen APs at Bayou and the LSR, so yea... not sure...

Driver station overloading? Yes, it will cause issues if you are saturating the CPU on the DS or the cRIO, but that wasn't our issue. We tried 3 different driver stations (Classmate, Dell M5040, and 2010 MacBook Pro), all plenty capable of running the DS software. Our cRIO ran at around 65% usage on average, and for a real time system, that's good.

So yea, my point is that no one can say "this is the problem everyone is having" unless NI or FIRST comes out with a statement saying what the issue is. We have an open ticket with FIRST and they will be investigating the logs from Bayou and LSR to determine what was going on, but as I understand it, the logging features of the field software, the driver station software, and the cRIO software are still pretty underdeveloped, so the cause may never be determined.

There are things that we...
should have done,
could have done,
wish we would have done,
etc...

but we didn't.

Reports of these issues were just coming up on CD from the Florida regional, where the Bayou field had been the week earlier. I wish I had checked for things such as other routers with our ID running in the pits, and I wish Team Fusion had kept to an engineering approach and changed one thing at a time when we were trying to solve the issues.

The issues with our field connection put all of our volunteering mentors on edge. Our mentors fired off emails to FIRST, furious over the handling of our issues at Bayou.

I feel that we proved to the FTA that the issue was not us. While the same FTA that kept blaming us for the issues was packing up the practice field, Team Fusion was running on it, wirelessly, using the hardware they provided to us. We ran our robot for a full 20 minutes without a single drop of communication. I talked with the FTA and asked what he thought the issue was since it runs perfectly everywhere but on his field. He was straight up rude (I could call him a different name...) and said that the issue is not the field, but is still with our robot. If he KNOWS that the problem was with our robot, he should have told us what the problem was, because I spent the three days of the regional ruling out everything but their system. But I believe he said that to cover his butt. That's what he's supposed to do, right? We packed the same broken robot up into the bag and brought it to LSR, and experienced no issues whatsoever. How is that an issue with our robot and not the field?

And the extent of our debugging went down to taking everything off our robot. We put back the given arcade drive code to just drive our base, with the default dashboard software on the classmate PC, with no camera feedback, and still couldn't run. CJ, our CTA at the event spent a ton of time in our pit going over things with us, and I believe he came to the same conclusion as us that the issue was not related to us.

I know that some of this is offensive to the FTAs from the Bayou, but I have said nothing that was untrue from my point of view. I'm extremely upset with FIRST over this problem, and so are all of the other mentors for Team Fusion. For many of the students on the team, this was their first event to attend. I can tell you that none of them were impressed with what they saw.

-Ryan Nazaretian

techhelpbb
28-04-2012, 20:10
As long as the methods used to troubleshoot are anecdotal rather than quantifiable, there's going to be increasing tension between everyone.

There's too many possible issues, combinations of issues, and things we can't test off a competition field, and that makes sense because the robots are all different and there are different fields. It also makes sense because as we move the robots and the fields around we risk disturbing things.

The field guys are told it's not the field it's the robots.

The robot guys have exhausted all the tools they usually have, including extensive testing, and the problems continue.

This problem seems to continue into situations impacting the performance of the best of the best by elimination. It's no longer about team reputation, veteran status and whose word can be trusted (in a level playing field it should not be about that at all anyway).

There still seems to me only one fix to this problem. To find ways to test and get quantifiable evidence about each system. The field. The robots. Each and every time the problems appear while they appear. FIRST has made great strides in field and robot monitoring since Team 11 first saw communications issues last year at an off-season event and I sincerely do appreciate their efforts.

I hope that FIRST will continue to seek quantifiable information of all kinds to ensure that these events move quickly, move cleanly, and move with the sort of direction that can only do credit to everyone. It serves no purpose to point fingers at anyone. This is the world of science and engineering; it's about the numbers and the evidence.

I still intend to let FIRST evaluate my voltage monitors and even if they only fix a small percentage of the problems by volume...that's a small percentage closer to the goal.

mjcoss
29-04-2012, 10:59
I agree. There are a lot of pieces of the system that can go wrong, and the only way to get a stable competition environment is to have a way to get solid data on a real field. I intend to bring some network tools to view the field at the off season competitions that my team is going to, so that I can take some measurements and get data. I *believe* the network configuration is part of the problem with the field. Stories of robots turning off their cameras and having connectivity issues go away aren't proof, just more anecdotal evidence. I also believe that the DLINK is a weak link.

It would be nice to get to the point where the doubts about the field network can be laid to rest. It *may* just be a robot issue, but as long as we don't have data to back that assertion, it's just opinion.

Al Skierkiewicz
30-04-2012, 08:19
The fact is there is obviously more than one thing not quite right with the radios and the systems that power them.

A loose connector on the D-Link is a power quality issue. Why are we intent on saying that doesn't happen when in point of fact even Al notes it often does?

We do have a field but we certainly don't have the competition field parts.

Championships are over and obviously these issues remain. So what have we solved? How does the situation today differ from the situation at the start of March?

Brian,
We still seem to have a communication issue so I will repeat in as clear a series of sentences as I can.

There is no "obvious" power problem with the radios. There are issues that teams should be aware of that compromise the power delivered to a radio used on an FRC robot. Namely, the power connector can become noisy or intermittent due to team handling. There are rare occasions where the 5 volt regulator is defective, again in most cases due to team handling. And on still rarer occasions (fewer than ten that I know of in the whole time we have used the PD) the PD power supply may not function correctly. With 3500 teams, 60+ events, hundreds of matches, and thousands of practice runs at both events and home fields, these radios have worked as intended, powered as designed. I personally witnessed failures over this past weekend that were not radio related even though many observers may think so. They were attributable to faults on the robot, either in software or hardware, that while intermittent were eventually corrected.

We have a good and cost effective competition system that works. There are occasional issues that are as yet unexplained but the FIRST staff is working hard on solutions. They are as committed as I am to insure everyone has the ability to run when they are scheduled to play. Again, there is no indication these issues had anything related to power on the robot radio.

You can continue to look at your own robot and spend your time chasing a power issue that you believe is the only problem that exists. I am convinced (more now that I observed from the scorer's table thousands of matches) that radio power is not the panacea that you believe it is. While I do not wish to stifle teams who are trying to expand our understanding I am firmly committed to preventing misleading lines of thought so that teams will keep their minds open to other possible failure modes.

Gdeaver
30-04-2012, 08:55
The CMP proved that there is a problem. Robots go dead during a match. The big problem as I see it is that there are many points of failure. Power and the power connector are just one. Isn't FIRST about the engineering mind set? Let's Give the students a real world example of how to kill a technical problem.
I would like to suggest the people working on this problem need a lot more info than they have now. First they need a list of robots and matches where there was a failure. They need specific info about the robots on the field: everything from radio mounting, language, video stream details, DS configuration, and DS laptop details, to a failure description (did the robot come back to life) and many more robot specific configuration details, plus any solutions they tried. This could be sent out to teams as a questionnaire and the responses tabulated. An analysis may yield points to explore. The dropped coms issue has been around for some time. It needs to be attacked and killed.
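
As a sketch of what one record of such a questionnaire might look like (field names are illustrative only, not an official FIRST format):

// Hypothetical questionnaire record covering the items listed above.
#include <string>
#include <vector>

struct CommLossReport
{
    int         teamNumber;
    std::string eventAndMatch;        // which event and match the failure occurred in
    std::string radioMounting;        // height, nearby metal, distance to motors
    std::string language;             // LabVIEW, C++, Java
    std::string videoStreamDetails;   // resolution, fps, compression, camera count
    std::string dsConfiguration;      // laptop model, OS, dashboard in use
    std::string failureDescription;   // symptoms, duration
    bool        robotCameBackToLife;  // did it recover during the match?
    std::string solutionsTried;       // anything attempted, and the outcome
};

// A tabulated survey is then just a collection of these records.
using CommLossSurvey = std::vector<CommLossReport>;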

techhelpbb
30-04-2012, 09:36
You can continue to look at your own robot and spend your time chasing a power issue that you believe is the only problem that exists.


Again with the straw man arguments. I *never said it was the only problem* and everyone can go back and read my posts to prove it.

Your argument is the equivalent of 'we have it handled and there's nothing to see here.' I think a national broadcast of that was something to see.

No disrespect to you intended, but you are in point of fact overwhelmed. It was clearly apparent on the face of each and every senior person on that field.

It was the direct result of doing everything that could be done to sandbox any argument that differs from your own. In point of fact you continue to do exactly that. It is literally unbelievable. From your own post:


We have a good and cost effective competition system that works. There are occasional issues that are as yet unexplained but the FIRST staff is working hard on solutions. They are as committed as I am to insure everyone has the ability to run when they are scheduled to play. Again, there is no indication these issues had anything related to power on the robot radio.

Except on the robots where it was a problem. Now clearly it's unlikely these were the robots during Einstein but your argument ignores your own points about the issues you know exist with the D-Link (namely that power connector). Worse it was you who argued about large relay contacts making noise and having variable resistance in the topic about reverse voltage protecting the Jaguars. The power connector on the D-Link is the functional equivalent of a giant normally closed spring tension relay contact.


I am convinced (more now that I observed from the scorer's table thousands of matches) that radio power is not the panacea that you believe it is. While I do not wish to stifle teams who are trying to expand our understanding I am firmly committed to preventing misleading lines of thought so that teams will keep their minds open to other possible failure modes.

Every troubleshooting process for TCP/IP (in fact, for most things) starts at the bottom, at the electronics and therefore the power level. Each and every time you fail to start at the bottom and work your way up, you undermine your troubleshooting process.

Again, even if a small number of teams are impacted by power quality issues who is anyone to claim a fair and level playing field when you deny them the opportunity to find their problems quickly?

RoboSquad
30-04-2012, 09:49
Good Morning All,

Being new to FRC this year, I heard about the Bayou issue from a lead NI engineer. So FIRST did send them to LSR, I think for free? Knowing this, there is clearly an issue somewhere that needs attention.

I agree we shouldn't chase some things; however, I would have figured that those running the contests would have some type of monitoring software. If not, maybe this is where it should start. Have "HARD" evidence for the FTA staff showing the connection rates, radio strength, etc. to prove or support that technical difficulties occurred during matches (document! Document! DOCUMENT!). As a mentor/coach I would have a hard time selling my kiddos on their mistakes if I didn't have something to back it up with, when it seems to work fine at home and not at the contest.

Also, I watched the World Championship matches and was really wondering what happened with 1114 & 2056 in their last matches. They averaged 40+ pts., and then to see a sudden drop makes me think of another issue, and then there's the e-mail from headquarters this morning.

I know we try to instill GP across the board and this should be a lesson for all parties involved on how to improve.

techhelpbb
30-04-2012, 09:50
When a team shows up at an event and the only people welcome to troubleshoot on the competition field are you and your group. When your word is gospel. When it can't be questioned.

When representatives of your organization tell a team over and over to go find problems in their robot, and it's clear that your organization's field personnel include people with relationships to certain teams. Worse, it's getting increasingly clear that the field personnel don't have any more resolutions than you do as a team.

Exactly how do you think your argument does not lead directly to the feeling that certain teams have an advantage by having an 'in' with your organization?

Marginalizing teams and people with troubleshooting knowledge is entirely incompatible with the mission statements of FIRST.

Furthermore Alan spent a decent portion of this topic harping on essentially the idea that Team 11 doesn't know how to measure a battery or build a properly functional robot. It didn't matter what I wrote. The evidence is there for all to see.

So you tell me: why should I not interpret the continual actions to silence me and the continual actions to downplay our team's ability as a direct example of a very unlevel playing field?

Just how far is FIRST as an organization willing to go to avoid troubleshooting from the bottom up?

Are you going to demand next that I be removed as a mentor because I stood up and pointed out that this troubleshooting process is tragically flawed, when the tragedy has already claimed its victims?

How far is FIRST willing to go to protect this failure?

I will not go back in my little box Al. Expending all this effort to put me back in that box only demonstrates that there's more wrong in FIRST than a mere failure to troubleshoot some WiFi and field issues.

techhelpbb
30-04-2012, 10:28
The CMP proved that there is a problem: robots go dead during a match. The big problem, as I see it, is that there are many points of failure; power and the power connector are just one. Isn't FIRST about the engineering mindset? Let's give the students a real-world example of how to kill a technical problem.
I would like to suggest that the people working on this problem need a lot more info than they have now. First they need a list of robots and matches where there was a failure. They need specific info about the robots on the field: everything from radio mounting, language, video stream details, DS configuration, and DS laptop details, to a failure description (did the robot come back to life?) and many more robot-specific configuration details, plus any solutions they tried. This could be sent out to teams as a questionnaire and the responses tabulated. An analysis may yield points to explore. The dropped-comms issue has been around for some time. It needs to be attacked and killed.

I tried to start a community effort towards this goal already. I was told to slow down.

It does not appear that FIRST values the concept of a community effort. They want a few loud voices to shout over the rest.

All the excuses about my contributions to this topic being distracting are also straw-man arguments, because I asked to be given another place to discuss the matter. They refused.

They don't want to prevent distraction. They want my message of troubleshooting from the foundation up, and me as the messenger, silenced publicly.

Al Skierkiewicz
30-04-2012, 10:31
Okay Brian,
Since you don't know me or my skill set, let me explain. My real job is troubleshooting electronic failures down to the component level, a position at which I excel. I know that power can be a definitive issue with many problems that exist, and I know both how to diagnose those issues and how to correct them when they fail or are designed improperly. Power supply noise in digital systems is nothing compared to the noise generated in analog audio systems, where microphone levels are in the -60 to -90 dBm range (that's referenced to a milliwatt, for those who are wondering), can require as little as 4 microamps from the power supply, and where noise is considered bad when it is only 20 dB above the theoretical noise floor. I have been telling you for weeks that power is not the issue you have deemed it to be, but you have failed to believe me or the others in this forum who are trying to get you to accept that fact. While there are problems (and they have yet to be diagnosed), power is not the greatest of these. How can you even think that those people who are researching every possible failure point have not considered looking at the power supply first? So to borrow from others, stop muddying the waters, perform your tests, and bring us real data that can be duplicated in the lab or field and actually correlates to failures of the robot wireless link. Until that time I will only respond to others seeking real answers.

Edit:
Now that I have seen your most recent post, let me assure you that Team 11 is a long-time friend and a team I certainly respect. When at competitions, I would work as hard helping them compete as I would my own team, and I expect all my inspectors to do the same.

techhelpbb
30-04-2012, 10:39
Okay Brian,
Since you don't know me or my skill set, let me explain. My real job is troubleshooting electronic failures down to the component level, a position at which I excel. I know that power can be a definitive issue with many problems that exist, and I know both how to diagnose those issues and how to correct them when they fail or are designed improperly. Power supply noise in digital systems is nothing compared to the noise generated in analog audio systems, where microphone levels are in the -60 to -90 dBm range (that's referenced to a milliwatt, for those who are wondering), can require as little as 4 microamps from the power supply, and where noise is considered bad when it is only 20 dB above the theoretical noise floor. I have been telling you for weeks that power is not the issue you have deemed it to be, but you have failed to believe me or the others in this forum who are trying to get you to accept that fact. While there are problems (and they have yet to be diagnosed), power is not the greatest of these. How can you even think that those people who are researching every possible failure point have not considered looking at the power supply first? So to borrow from others, stop muddying the waters, perform your tests, and bring us real data that can be duplicated in the lab or field and actually correlates to failures of the robot wireless link. Until that time I will only respond to others seeking real answers.

Your professional credentials have nothing to do with this Al. Nothing at all. You are not testing each and every robot personally.

You know very well that I can't test a competition field environment without a competition field. So you're telling people that I should basically test something you are ensuring cannot be tested.

Furthermore, each and every robot on that field is wired differently. Your argument that you've tested this out fully simply cannot be true. You have only tested the ones you've tested. So your sample is smaller, and the results are always rushed, because you most often do those tests on teams' robots when you get access to them during a competition, and you only test the ones that do not work. Worse, you don't have a complete tool set to test on the field during the matches.

The point again, for all to read, is not that all issues are power quality issues. The point is that when you do have power quality issues, you are effectively putting those teams at a disadvantage. I fully admit...again and again in this topic...that there are more problems than merely power quality.

The only thing I offered was a way to help identify and remove those power quality issues when they exist; I left the rest to you to figure out.

Note: this next part was posted before Al edited his post.

So far as this not being a real issue...you prove once again that Team 11 and other teams finding bad power supply components feeding the D-Link AP is not considered a real problem. Why is it not a real problem? Perhaps because Team 11 and these other teams weren't on the Einstein field? It's going to be hard to get to that field when our not-so-'real' problems get in the way.

RyanN
30-04-2012, 11:19
Okay Brian,
Since you don't know me or my skill set, let me explain. My real job is troubleshooting electronic failures down to the component level, a position at which I excel. I know that power can be a definitive issue with many problems that exist, and I know both how to diagnose those issues and how to correct them when they fail or are designed improperly. Power supply noise in digital systems is nothing compared to the noise generated in analog audio systems, where microphone levels are in the -60 to -90 dBm range (that's referenced to a milliwatt, for those who are wondering), can require as little as 4 microamps from the power supply, and where noise is considered bad when it is only 20 dB above the theoretical noise floor. I have been telling you for weeks that power is not the issue you have deemed it to be, but you have failed to believe me or the others in this forum who are trying to get you to accept that fact. While there are problems (and they have yet to be diagnosed), power is not the greatest of these. How can you even think that those people who are researching every possible failure point have not considered looking at the power supply first? So to borrow from others, stop muddying the waters, perform your tests, and bring us real data that can be duplicated in the lab or field and actually correlates to failures of the robot wireless link. Until that time I will only respond to others seeking real answers.

Edit:
Now that I have seen your most recent post, let me assure you that Team 11 is a long-time friend and a team I certainly respect. When at competitions, I would work as hard helping them compete as I would my own team, and I expect all my inspectors to do the same.

I agree with Al here, but I also agree with power being a contributing factor to some teams, but again, as Al pointed out, most teams do start from the ground up in the debugging process.

As I've mentioned many times before...
Team Fusion, at the Bayou Regional, could not run a single full match due to communication problems. Our first suspect was power, and sure enough, we did have a power issue.

The student who purchased the connector for the router bought one that 'fit' but didn't fit properly, leaving us with a potentially bad connection. We could pick up, drop, and shake the whole robot and the link wouldn't drop, but if we focused the abuse on the router itself, and picked it up, dropped it, shook it, or slammed it, then we could cause a reset.

That was our first step. We had a spare router with a spare power cable, so we took the OEM plug and used that. After replacing and gluing the plug in place, we could not brown out the router. We did stress testing on the other power components feeding the router as well, including the 12-to-5 regulator and the PD board. We replaced the 12-to-5 regulator for comfort, and we banged on the PD board a bit, but could not replicate the issue.

And again, as the title of this thread mentions, the teams with this issue had an "Intermittent connection on (the) field only." We worked great in the pits, on the practice field, during kickoff, and practicing at home with our router: everywhere but the dang Bayou field.

The only thing the FTA would tell us is that it was a robot problem. No specifics on what is failing, or why, just that it was an individual problem with us.

Here are the two things I cannot get around:

We were the only one at Bayou with this issue.

CRyptonite had a match with us where they dropped out, but after Bayou, I was told that they were experiencing the CAN Autonomous-to-Teleop transition bug.

Prometheus had an issue late Friday, but I think they determined it was an issue with their camera or something...

Combustion had lag, but that was related to high CPU usage on the cRIO.


We packed up our broken robot, kept the same code, etc., brought it to the Lone Star Regional, and pulled out a fully working, functional robot.

The only thing we changed was reconfiguring our router to work with the LSR field. Greg McKaskle worked with us at LSR, but unfortunately for him and fortunately for us, we never experienced a communication problem at Lone Star.



There's something external to the robot in the issue we experienced. From what I saw on Einstein, the issues experienced by all of those teams were the same issue we experienced at the Bayou Regional, a Week 3 regional.

As far as router placement, our original location isn't the best place, and is within a foot of some noisy shooter motors, but we relocated it during Bayou to a better location, obviously not fixing our issue.

As far as code goes... besides the fact that our code didn't change between Bayou and LSR, we went back to the basic framework code: no camera (unplugged), no CAN, no PID, nothing. We simply had solenoid and motor outputs, no sensor inputs at all.

Then comes the driver station. We had been using a Dell purchased this season as the DS, but during all the debugging, we switched to my laptop, a 2010 MacBook Pro (Core i5, 8GB RAM, SSD... a pretty powerful machine), and we kept on having the issues. Finally, when trying out the basic framework code, we switched to the Classmate PC, configured for this year and updated with the latest FRC updates, and no Windows Updates.

Basically, at the end of Bayou, we had a kitbot control system, but still couldn't maintain a connection, even sitting still.

All of this proves to me that, in our case, this isn't an issue with Team Fusion 364's 2012 robot, Aiminite. Our issue was not a robot issue. That's all I have concluded from the work I have done diagnosing the issues.

I'm not sure if we were the only one during the regionals to pack up a broken robot and pull out a working robot, but FIRST should look at our case for some good information. Team Fusion has concluded that the issue is not our robot.

Now to the big question that everyone would like to know... what is the issue? No one knows at this point. Everyone is just speculating.

Interference
I've been speculating that the issue is interference from the audience carrying hundreds of smartphones with WiFi enabled... but even that hypothesis has problems.

Lone Star had a much larger crowd than Bayou; it's a much larger event. The Lone Star Regional is also held in a bigger city (and actually in the city). Bayou was right next to the river, with just a small road leading up to it, not part of a major city artery. It seems like LSR had the potential to be worse as far as interference goes, so interference seems to be a non-issue. And why would interference just target us? Things don't add up for this case. What about the WiFi channel used? What happens with the D-Link if the 2.4 GHz radio is left enabled?

Field Hardware Issues
Again, if it truly was FMS messing up at Bayou, it would have affected everyone, not just Team Fusion. We had been placed in multiple spots on the field, but we had problems in all of them.

I'll be down on the Coast next week, and can test some things, but I need a list of things to test.

First off, I want to break communication. What can we do to break communication? We'll need to have two routers and use the same system FIRST uses.

I'm curious as to what will happen if we power up two of our robots with the same IP address and everything. How will the system handle this? If it works, how do our logs look?

Then what happens if we run some noisy equipment next to the 'field' router? Can we run a Skil saw next to it and still maintain connection? How about running an FP motor at full speed next to it? What happens? How do the logs look with this?

Did Bayou have wireless lighting controls? Were their frequencies within the range of the channel we were provided? I can't tell.

Anyway, I want to help figure out this issue because it really messed us up in Bayou, and we have no explanation as to what happened.

techhelpbb
30-04-2012, 11:28
Now that I have seen your most recent post, let me assure you that Team 11 is a long-time friend and a team I certainly respect. When at competitions, I would work as hard helping them compete as I would my own team, and I expect all my inspectors to do the same.

Again, Al, I do respect you. However, you are not personally at every competition. Team 11 being friends with you doesn't reduce my concerns.

A level playing field is just that. It should not matter if Team 11 is friends with you. It should not matter if Team xxxx who just walked on that field you've never met is friends with you.

Troubleshooting is troubleshooting. We must ensure that guidance for it is uniform. We also must ensure that if a relatively unknown team does in fact eliminate all other problems, we respect the effort and investigate.

efoote868
30-04-2012, 11:33
I originally believed that each robot was going to be assigned a separate channel and that they were using 802.11n @ 5 GHz. The network at the MAR championship was indeed using 802.11n @ 5 GHz, and they were using a wide channel; however, all robots were sharing the same channel, and as such were sharing a total theoretical bandwidth of 300 Mbit/s.

I don't know of any wireless product that comes close to its theoretical rating.

This year, I was running throughput testing for a demo board. At 10-30 feet in a quiet environment (as shown on a spectrum analyzer), we were achieving about 120 Mb/s out of a theoretical 300 Mb/s using IXChariot software.

If all robots are sharing one channel, that is pretty unnerving, especially given the throughput requirements for a single video feed.
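
To put rough numbers on why a single shared channel is unnerving, here is a small back-of-the-envelope sketch. The 120 Mb/s aggregate figure is the measurement quoted above; the six-robots-per-channel count and the per-camera bitrate are assumptions for illustration only, not measured field data.

// BandwidthShare.java
// Back-of-the-envelope estimate of per-robot wireless capacity when every
// robot on the field shares one 802.11n channel.  The measured aggregate
// throughput (120 Mb/s) comes from the post above; the robot count and the
// per-camera video bitrate are illustrative assumptions, not field data.
public class BandwidthShare {
    public static void main(String[] args) {
        double measuredAggregateMbps = 120.0; // quiet-room IxChariot measurement quoted above
        int robotsPerMatch = 6;               // assumption: 3 red + 3 blue sharing the channel
        double cameraStreamMbps = 4.0;        // assumption: one camera feed at moderate size/rate

        double perRobotMbps = measuredAggregateMbps / robotsPerMatch;
        double headroomMbps = perRobotMbps - cameraStreamMbps;

        System.out.printf("Fair share per robot:      %.1f Mb/s%n", perRobotMbps);
        System.out.printf("Headroom after one camera: %.1f Mb/s%n", headroomMbps);
        // In a noisy arena the aggregate number will be lower than the
        // quiet-room measurement, so the real headroom shrinks further.
    }
}

Even under these generous assumptions the margin is not huge, and it degrades further if the arena is noisier than the quiet test environment.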

techhelpbb
30-04-2012, 11:39
RyanN:

I deeply respect that power quality issues are not the only issue on the field. That's why I asked to separate the issue repeatedly.

Obviously, when a component in the power supply to the D-Link AP goes bad, it can do so at any time.

Team 11 drove our robot extensively before the competition in Philly, even at other competitions. All of a sudden the DC-DC converter decided to be a problem. We, like apparently most people, were not instructed to load test the D-Link AP supply; no process was provided, nor even basic specifications. I asked repeatedly in this topic who else had load tested that supply, with no responses.

This was a power quality issue that was seemingly intermittent on the field only, but only because the timing of the failure coincided with the punishing match schedule.

Not all power quality issues are wiring problems. Not all power quality issues are component issues. As others have said, in the absence of procedure and tools it's very hard to walk a troubleshooting process to work that out especially under pressure and especially when there are real WiFi communications issues that keep undermining the basic troubleshooting process.

It's entirely conceivable in the current test environment for a team to check their wiring, replace the D-Link AP. Fail on the field. Replace their DC-DC converter and fail again on the field. Replace their PDB (which can easily consume all the time between matches) and fail on the field.

Then after all of that diligence, they fail on the field for other reasons which may, or may not have been there all along as well.

So in effect that's about 4 matches of effort right there where a robot might be totally or mostly non-functional. That's a great deal of ground to lose. Compound that with the fact that your robot's mechanisms might not be top tier, and those 4 or 5 matches might put you out of the competition entirely.

I know we are here, above all, to have fun. How much fun is it for a rookie team to possibly lose 4 matches due to an unclear troubleshooting path, have to go home and find more sponsorship to come back next year, and, while they are at it, very likely be entirely unable to repair the issue once they leave that environment? That doesn't sound like fun to me, and please be aware that I was one of the people who founded Team 11.

Given the opportunity to treat this issue separately, that's what should be done. I respect that your team expended all possible tools and efforts to eliminate your concerns. However, until someone helps me to separate these two topics properly, I am left little choice but to respond within the limited venue I've been offered. If people were trying to respect this basic foundational concern, they'd help me move out of the way instead of demanding my silence.

Perhaps more importantly, demanding my silence here is symptomatic of the fact that we have had communications issues with this system for years and in the past the finger was pointed at the robots. The teams had no reason to believe that the problem could be anything but them because the authorities on the subject insisted that it could not be the field. People should stop insisting, open the floor, divide the topics and if you want me to shut up and test myself let me get onto a competition field and do what I need to do.

Dad1279
30-04-2012, 15:19
..........
CRyptonite had a match with us where they dropped out, but after Bayou, I was told that they were experiencing the CAN Autonomous-to-Teleop transition bug.
........

What is the CAN Autonomous-to-Teleop transition bug?

I believe we were hit with this last year, but have not seen it documented or mentioned elsewhere. We had 2 unexplained no-coms at NYC regional last year, FTA said it was a bad battery. On a whim we removed CAN and switched to PWM for the next regional, and did not have a problem.

Has anyone else been hit with the 'Joysticks disconnect' problem this year? It would happen mid-match, and the fix was swapping the joysticks in the DS and swapping them back...

Mark McLeod
30-04-2012, 15:37
FRC 2012 - Known Issues List (https://decibel.ni.com/content/docs/DOC-21809)


Issue: If CAN is used in the Autonomous VI, the robot cannot transition to the Disable or Teleop VIs.

Workaround: Terminate the autonomous routine just before the VI would be aborted, to stop calling CAN updates before the end of the autonomous period; or move the CAN updates to Periodic Tasks and have autonomous simply update setpoints.

mjcoss
30-04-2012, 15:45
Unfortunately, which robots are on the field is part of the "environment", and so working in one regional and not in another could simply be the mix of robots. I believe that the DLINK is certainly one point of failure, from the battery to the power connection on the box, and internal to the DLINK. I also believe that there are issues with how the network is configured.

Maybe at one of the off-season competitions (our first one is Monty Madness) we can get a group together and poke and prod the field setup and network with a mix of different robots. If we get enough robots together, we might be able to map out some of these issues on a real field.

I think that we would all like to have a level playing field for all robots. To do that we need data on real fields, and I would like to be involved in the analysis, not simply told that it's "fixed", because otherwise I'll be back to "there doesn't seem to be any issue with my robot in the pit or at our build site, but there is at competition", and once again left to tell the kids that I can only theorize about what is wrong because the conversation is closed when it comes to discussing details about the field with FIRST.

MaxMax161
30-04-2012, 16:07
Has anyone else been hit with the 'Joysticks disconnect' problem this year? It would happen mid-match, and the fix was swapping the joysticks in the DS and swapping them back...

Yup, we've had that three times this year and a few times last year. Still no ideas on how to prevent it.

Dad1279
30-04-2012, 16:42
@Mark - So that's an issue with Labview and not Java? (We were using Java)

Dad1279
30-04-2012, 16:45
Yup, we've had that three times this year and a few times last year. Still no ideas on how to prevent it.

Ah, a common factor.... It only hits teams from NJ! That should make it easier to find.....

Mark McLeod
30-04-2012, 16:53
@Mark - So that's an issue with Labview and not Java? (We were using Java)
It's a LabVIEW issue, although it's possible to write Java/C++ code that has the same problem (i.e., killing the CAN sequence mid-command); you pretty much have to go out of your way to do it.

The LabVIEW default framework is constructed to kill the Autonomous thread at the end of the mode. The good part of this is you can't easily get stuck in Auto like you can in the other languages with sloppy code. The bad part is, with sloppy LabVIEW CAN code, you can end up with corrupted CAN.
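
For the text-based languages, one hedged sketch of what "have autonomous simply update setpoints" can look like: a single background loop owns every CAN transaction, and autonomous/teleop code only writes shared setpoint variables, so ending a mode can never abort a CAN command mid-transfer. The CanDriveOutput interface below is a stand-in for whatever CAN motor-controller API a team actually uses; it is not a WPILib class, and the update period is illustrative.

// Sketch of the "autonomous only writes setpoints" pattern.  A single
// background loop owns every CAN transaction; autonomous and teleop code
// never touch the bus directly, so ending a mode can't abort a CAN command
// mid-transfer.  CanDriveOutput is a hypothetical stand-in for the team's
// real CAN motor controller wrapper -- it is NOT a WPILib class.
public class SetpointDrive {

    /** Hypothetical wrapper around a CAN speed controller. */
    public interface CanDriveOutput {
        void send(double value);   // pushes one setpoint onto the CAN bus
    }

    private final CanDriveOutput left;
    private final CanDriveOutput right;

    // Written by autonomous/teleop, read by the CAN loop.
    private volatile double leftSetpoint = 0.0;
    private volatile double rightSetpoint = 0.0;
    private volatile boolean running = true;

    public SetpointDrive(CanDriveOutput leftOutput, CanDriveOutput rightOutput) {
        this.left = leftOutput;
        this.right = rightOutput;

        // The only code that ever talks CAN.  Runs at a fixed period whether
        // the robot is in auto, teleop, or disabled.
        Thread canLoop = new Thread(new Runnable() {
            public void run() {
                while (running) {
                    left.send(leftSetpoint);
                    right.send(rightSetpoint);
                    try {
                        Thread.sleep(20);   // ~50 Hz update rate (illustrative)
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        canLoop.setDaemon(true);
        canLoop.start();
    }

    /** Called from autonomous or teleop; never blocks on the CAN bus. */
    public void setSetpoints(double leftValue, double rightValue) {
        leftSetpoint = leftValue;
        rightSetpoint = rightValue;
    }

    public void stop() {
        running = false;
    }
}

The design point is simply that mode transitions only change which code writes the setpoints, never which code is in the middle of a bus transaction.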

Al Skierkiewicz
01-05-2012, 09:24
You know very well that I can't test a competition field environment without a competition field. So you're telling people that I should basically test something you are ensuring cannot be tested.

The point again, for all to read, is not that all issues are power quality issues. The point is that when you do have power quality issues, you are effectively putting those teams at a disadvantage. I fully admit...again and again in this topic...that there are more problems than merely power quality.

Note: this next part was posted before Al edited his post.

So far as this not being a real issue...you prove once again that Team 11 and other teams finding bad power supply components feeding the D-Link AP is not considered a real problem. Why is it not a real problem? Perhaps because Team 11 and these other teams weren't on the Einstein field? It's going to be hard to get to that field when our not-so-'real' problems get in the way.

Brian,
I suggest you look back and reread what I have said to you and to others. You believe your team obviously has a power issue, but you refuse to perform the tests that would point to a power issue on your own robot. Do so and bring us the results. You do not need a competition field to do that (try using FMS Lite). Your robot remains a stand-alone component; if power is an issue, the problem should show up on your home field. If you find something that others have not, then and only then can FIRST actually decide to pursue an investigation using other methods. You must admit that what you are asking would require a rather large expense for FIRST to accomplish. Bring valid data to support your claim, and FIRST can weigh your data against the need to expend further money and resources in pursuit of a radio power supply issue.

As for myself and all other robot inspectors, we take our job very seriously. When we don the shirt and hat, we take on every team as our own, bar none. When they fail on the field, we fail. When they go home with a loss, so too do we. We feel strongly that the success of an event lies in our hands and that the happiness of the competitors is our highest measure of success. We cheer for everyone because that is what we are required to do in this competition. When we see something wrong, we try to fix it or we find someone who can. We do this at every competition for every team, all weekend, without regard to team number or qualifying slot. Should an inspector or Lead not meet these goals, I want to know about it. Teams need to be able to trust and believe in the assistance of their inspectors. Our stated goal is to ensure you play (within the rules), not to prevent you from playing. I can tell you that robots not running is a common discussion topic during volunteer meals for inspectors. We pass along what we find so that all teams can fully participate.

As to your finding a bad power component at one of your competitions, that is a known issue with a very small number of regulators (a very few have been found defective from the factory). When questioned, most often the root cause is a team wiring issue that was later corrected by the team. For the record, that can take the form of the input and output of the regulator being swapped, the input or output wiring reversed, the input to the radio reversed, or some of the wiring not fully insulated and shorting to the frame. As to other issues with the robot radio, we have found the use of the wrong size coaxial plug (either the inside socket or the outside barrel or both), a team fix on a broken plug, a broken wire internal to the cable due to stress or bending, and improperly applied hot glue that forced the connector open or un-soldered the power jack. More than half (~100 this year) of the power problems we find are teams wiring the radio directly to an unregulated 12 volt output on the PD. Other issues include blocking all the cooling holes in the radio, using damaged Ethernet cables, or hot-gluing a broken connector into an Ethernet port on the radio. Mounting the radio upside down against a metal plate or mounting the radio on top of two CIMs in a transmission are also common. And finally, teams mounting the radio deep inside the robot to "protect" it from harm.

So to repeat for everyone's homework, if you suspect a power issue with your robot radio, test and document the conditions under which the link fails. i.e. a power supply droop, noise (how much and what frequency), radio reboot or not, disabling of certain inputs or outputs, do you have a camera and does it stop working, does the robot stop after auto or after a specific length of time coming out of auto, etc. Make your findings public or send them to me, so others can duplicate the problem.

As far as your last paragraph above, I never said this was a non-issue. I said that, in my experience, I have not found any evidence that the power supply issues (that you suggest are occurring) have caused link issues. So far you haven't shown any evidence of this either. Your team found a defective regulator that, for you and me, is a smoking gun in your robot. That is a known problem, stated multiple times over the past two years by me (and many others) in several posts here on CD and in the LRI forum for all inspectors. For the record, any problem is an issue for me. If you have one, the chances of someone else having it are good. However, I can only pursue problems if you can tell me what the problem is and repeat it. If you were closer, I most certainly would suggest spending a Saturday at your facility pursuing this.
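
For anyone attempting that homework, here is a minimal logging sketch, assuming the 2012-era WPILib Java AnalogChannel class and a spare cRIO analog input wired (through an appropriate divider and protection) across the radio's 5 V feed. This is a practice-field experiment only, and the channel number and droop threshold are illustrative assumptions.

import edu.wpi.first.wpilibj.AnalogChannel;

// Minimal radio-supply droop logger.  Assumes a spare cRIO analog input is
// wired (through an appropriate divider/protection) across the 5 V feed to
// the D-Link -- a practice-field experiment only, not something approved for
// competition use.  Channel number and threshold are illustrative.
public class RadioSupplyLogger {

    private final AnalogChannel radioSupply;
    private final double droopThresholdVolts;
    private double worstVoltage = Double.MAX_VALUE;

    public RadioSupplyLogger(int analogChannel, double droopThresholdVolts) {
        this.radioSupply = new AnalogChannel(analogChannel);
        this.droopThresholdVolts = droopThresholdVolts;
    }

    /** Call this from a fast periodic loop (teleop/disabled periodic, etc.). */
    public void check() {
        double volts = radioSupply.getVoltage();
        if (volts < worstVoltage) {
            worstVoltage = volts;
        }
        if (volts < droopThresholdVolts) {
            // Timestamped console record; match this against the DS/FMS logs.
            System.out.println(System.currentTimeMillis()
                    + " ms: radio supply droop, " + volts + " V");
        }
    }

    public double getWorstVoltage() {
        return worstVoltage;
    }
}

Polling from a robot loop every 20 ms or so will miss sub-millisecond transients, so a scope or a much faster dedicated sampler remains the better tool for the short spikes discussed elsewhere in this thread; this only catches sustained droops and gives you a timestamp to line up against the field logs.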

techhelpbb
01-05-2012, 10:04
Brian,
I suggest you look back and reread what I have said to you and to others. You believe your team obviously has a power issue, but you refuse to perform the tests that would point to a power issue on your own robot. Do so and bring us the results. You do not need a competition field to do that (try using FMS Lite). Your robot remains a stand-alone component; if power is an issue, the problem should show up on your home field. If you find something that others have not, then and only then can FIRST actually decide to pursue an investigation using other methods. You must admit that what you are asking would require a rather large expense for FIRST to accomplish. Bring valid data to support your claim, and FIRST can weigh your data against the need to expend further money and resources in pursuit of a radio power supply issue.

Al, Team 11 did have a power issue. Replacing the DC-DC converter at the competition did fix it. You want this DC-DC converter? I ask because, when this topic and the direction it was following wouldn't stop, I offered to remove *myself* from this issue and let someone you were seemingly more comfortable with perform the analysis to your satisfaction.

The solution to the problem is past tense. It was found.

Other teams did have a power issue. They replaced the PDB and it did fix it.

Past tense and in the next part I accept that you publicly acknowledge this.


As for myself and all other robot inspectors, we take our job very seriously. When we don the shirt and hat, we take on every team as our own, bar none. When they fail on the field, we fail. When they go home with a loss, so too do we. We feel strongly that the success of an event lies in our hands and that the happiness of the competitors is our highest measure of success. We cheer for everyone because that is what we are required to do in this competition. When we see something wrong, we try to fix it or we find someone who can. We do this at every competition for every team, all weekend, without regard to team number or qualifying slot. Should an inspector or Lead not meet these goals, I want to know about it. Teams need to be able to trust and believe in the assistance of their inspectors. Our stated goal is to ensure you play (within the rules), not to prevent you from playing. I can tell you that robots not running is a common discussion topic during volunteer meals for inspectors. We pass along what we find so that all teams can fully participate.


Sir, with respect, I am not accusing you of active favoritism, other than the mere fact that when you don't like something other people tell you, they don't get facts and figures; they get told to go figure it out on their own and basically stop bothering you (when in point of fact I've offered repeatedly to do just that, with a topic split I can't perform myself). Realize it or not, that exclusivity creates a situation where someone gets marginalized. I should hope no one intends that result, and you can see that I am obviously not shy and will call them out on it.

Furthermore, some of those students you have on that field crew are students I mentored. So let me be clear. I am extremely unhappy when I see students in my chosen profession, a profession in which I served as an apprentice and clawed my way up through vast nonsense, and I am extremely unhappy when I watch them get reduced to tears. I am tired of FIRST using them as a reason why I should watch what I say.

I should like to point out that the very first e-mail I ever got in private from anyone on ChiefDelphi was Eric telling me to watch what I say because 40,000 students are watching. Yes they are and they need leadership and answers.

Some of you are their leaders. You dictate the limits of their tools (you certainly shot down my offer of a tool to which you have little alternative). You dictate the information and facts they are presented. I feel, given the sheer lack of students in this topic (especially among those telling me to go away), that the leadership on this matter is using them as a shield. Not that you, Al, are doing so...but I see it all over the place, where we tell people not to judge them so poorly. I am not judging those students, sometimes my own students, poorly. I am demanding they be given an environment that is never allowed to come to this end!


As to your finding a bad power component at one of your competitions, that is a known issue with a very small number of regulators (a very few have been found defective from the factory). When questioned, most often the root cause is a team wiring issue that was later corrected by the team. For the record, that can take the form of the input and output of the regulator being swapped, the input or output wiring reversed, the input to the radio reversed, or some of the wiring not fully insulated and shorting to the frame. As to other issues with the robot radio, we have found the use of the wrong size coaxial plug (either the inside socket or the outside barrel or both), a team fix on a broken plug, a broken wire internal to the cable due to stress or bending, and improperly applied hot glue that forced the connector open or un-soldered the power jack. More than half (~100 this year) of the power problems we find are teams wiring the radio directly to an unregulated 12 volt output on the PD. Other issues include blocking all the cooling holes in the radio, using damaged Ethernet cables, or hot-gluing a broken connector into an Ethernet port on the radio. Mounting the radio upside down against a metal plate or mounting the radio on top of two CIMs in a transmission are also common. And finally, teams mounting the radio deep inside the robot to "protect" it from harm.

So to repeat for everyone's homework, if you suspect a power issue with your robot radio, test and document the conditions under which the link fails. i.e. a power supply droop, noise (how much and what frequency), radio reboot or not, disabling of certain inputs or outputs, do you have a camera and does it stop working, does the robot stop after auto or after a specific length of time coming out of auto, etc. Make your findings public or send them to me, so others can duplicate the problem.


I would love to know, in a competition where I was one of the few examples of someone who brought an oscilloscope to a competition (a CRT unit at that).

In a competition where I had students who did not even know what the oscilloscope was, nor its application on the robot.

Considering that I offered test equipment for mounting on a moving robot, or even to open the door to using the cRIO, which was shot down in the official FIRST Q&A forum after an unusually long delay.

How do you figure that so many of the people you ask have the tools to deliver on your request?

Especially, again, when I point out that they do not have the tools to eliminate this issue on the active field, where it is the most disastrous. They can't even use the cRIO on the competition field as a data logger on that power supply.

Even if I show you what we already know: that power issues are the bottom of the troubleshooting tree and that we all admit they affect people. The point was and remains that what a robot does on my test field sometimes means nothing once I toss it in a bag, ship it all over creation, smack it into walls, have pieces torn out of it accidentally by other robots, and expose it to general wear and tear.

The entire point of the monitors I created, or even of using the cRIO as a data logger on the D-Link AP power supply, was to prevent incidental damage from wear and tear from blindsiding teams that have otherwise done everything right and just don't realize the cumulative damage they are taking. We at Team 11 had no reason nor any approved method to track that damage. That's not unique to Team 11 at all. There's no approved method to get at that point while the robot is moving, and certainly no example I can find of anyone even load testing that supply; again, I did ask openly in this very topic.


As far as your last paragraph above, I never said this was a non-issue. I said that, in my experience, I have not found any evidence that the power supply issues (that you suggest are occurring) have caused link issues. So far you haven't shown any evidence of this either. Your team found a defective regulator that, for you and me, is a smoking gun in your robot. That is a known problem, stated multiple times over the past two years by me (and many others) in several posts here on CD and in the LRI forum for all inspectors. For the record, any problem is an issue for me. If you have one, the chances of someone else having it are good. However, I can only pursue problems if you can tell me what the problem is and repeat it. If you were closer, I most certainly would suggest spending a Saturday at your facility pursuing this.

Here's the thing, Al. Unless someone tells me otherwise, I'm going to Monty Madness with the tools I made to reduce the conflict on this issue...which, perversely, people have used to increase the conflict on these issues. I expect, given the initial excitement FIRST had when I mentioned what I created and was offering, that FIRST will give these monitors, the community electronic motor control, and most especially LinuxBoy's projects a fair shake. I also expect that when teams ask FIRST to use a piece of approved hardware to monitor a system in a way that is inconsequential to its function, but hugely consequential to the performance of the robot, they'll let them do it, even on a competition field.

I have every reason to believe, based on the pattern I've personally witnessed over the last week, that I need to say the paragraph above publicly. Over a year ago I offered to provide nearly the exact same thing LinuxBoy has been working on, and the feeling I got was that I should go away. Do not do that to LinuxBoy; do not involve him in what seems to be an issue some of you have with me.

Al Skierkiewicz
01-05-2012, 10:32
Brian,
Your last post is confusing, what are you trying to say? Are you saying I act with favoritism because I don't agree with your suggestions? With your methods or with your premise?
What are you referring to in the paragraphs about students and field teams?
I am suggesting that you use your monitors on your own robot on your practice field to give First and the community some data to digest and duplicate. Are you against that?
As to you having something on your robot during Monty Madness, that is up to you and the event staff.
As to First's reaction to your hardware, I have no knowledge of what you are relating to us. For the record I am not FIRST staff, I am not a member of the GDC, I am a key volunteer.
As to adding monitoring to a robot, may I refer you to StangSense. We used the control system in conjunction with a custom circuit, to monitor current/voltage to critical systems, mostly motors, using a method that did not interrupt the power pathways to electrical components on our robot. We also used that system in a mobile design to aid other teams in the pursuit of problems on their robots for several years. We also described the data we collected and the subsequent issues we observed to anyone who wanted them and that data aided in the design of the current PD power circuitry. FIRST added several test bed monitors to operating robots (including ours) during the design phase as well. I assisted with at least some of the interpretation of that collected data during the collection at IRI. The boost/buck regulators on the PD are a direct result (I believe) of that research.
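
As a rough illustration of that style of non-intrusive monitoring (this is not StangSense itself, whose internals are not described here): a Hall-effect current sensor clamped around a motor lead can feed a spare analog input, so no power wiring is interrupted. The WPILib AnalogChannel class and the scaling constants below are assumptions for the sketch; the constants depend entirely on the sensor chosen.

import edu.wpi.first.wpilibj.AnalogChannel;

// Illustration of non-intrusive current monitoring: a Hall-effect current
// sensor clipped around a motor lead feeds a spare analog input, so no power
// wiring is interrupted.  This is NOT StangSense -- just a generic sketch;
// the offset and volts-per-amp scaling are placeholder values that depend on
// the actual sensor used.
public class MotorCurrentMonitor {

    private final AnalogChannel sensor;
    private final double zeroCurrentVolts;   // sensor output at 0 A (placeholder)
    private final double voltsPerAmp;        // sensor sensitivity (placeholder)
    private double peakAmps = 0.0;

    public MotorCurrentMonitor(int analogChannel,
                               double zeroCurrentVolts,
                               double voltsPerAmp) {
        this.sensor = new AnalogChannel(analogChannel);
        this.zeroCurrentVolts = zeroCurrentVolts;
        this.voltsPerAmp = voltsPerAmp;
    }

    /** Call periodically; returns the instantaneous current estimate in amps. */
    public double read() {
        double amps = (sensor.getVoltage() - zeroCurrentVolts) / voltsPerAmp;
        if (Math.abs(amps) > peakAmps) {
            peakAmps = Math.abs(amps);
        }
        return amps;
    }

    public double getPeakAmps() {
        return peakAmps;
    }
}

Logged over a few practice matches, peak and average readings like these are the kind of data that let a discussion move from anecdotes to numbers.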

Gdeaver
01-05-2012, 10:44
FIRST has a serious problem with dead robots on the field. The radio power supply and cabling have known issues. Given the severity of this problem, and since there are most likely many points of failure, all points need to be looked at. The power supply issue needs to be looked at closely to see if there are problems other than cabling. I mentioned that there are potential problems with a switching power supply feeding a switching power supply feeding a switching power supply. I have never heard of anybody who has characterized the load dump and switching transients on the PD. Does this affect the power supply to the radio? One transient that gets through is all that is needed to potentially lock up the radio electronics. It would be a very good idea to get some real on-the-field data. This needs to be done at a competition. A competition like Monty Madness should be RF-noisy enough to put the radio into a high power state (gain scheduling). Team 1640 will be at Monty. We would be willing to have data logging put onto our robot. We can't load code on our cRIO; it's busy enough. It has to be stand-alone. We have a swerve drive using 8 motors, plus 4 more motors for ball handling and shooting. That should be a good heavy-load environment for some testing. The testing should be able to see short transients. So if this wants to be done, our robot is available. Contact me if our robot is needed.

Jon236
01-05-2012, 11:09
Then comes the driver station. We had been using a Dell purchased this season as the DS, but during all the debugging, we switched to my laptop, a 2010 MacBook Pro (Core i5, 8GB RAM, SSD... a pretty powerful machine), and we kept on having the issues. Finally, when trying out the basic framework code, we switched to the Classmate PC, configured for this year and updated with the latest FRC updates, and no Windows Updates.





Just curious, was your wifi adapter disabled on your laptop when you connected to FMS?

techhelpbb
01-05-2012, 11:11
Brian,
Your last post is confusing, what are you trying to say? Are you saying I act with favoritism because I don't agree with your suggestions? With your methods or with your premise?

No, I am merely stating that power supply issues, on the whole robot in point of fact, are often a great big grab bag of problems, many of which I feel are beyond the resources of some teams.

We don't expect teams to solder. If they can, great. We don't expect teams to engineer the cRIO. If they can, great. We expect teams to follow directions, but what directions can I hand them to follow?

Where are the directions and parameters to load test that power supply? What tools do they need? What do the tests in the directions tell, or not tell about the robot?

I am not the first to ask these questions. When we ignore that some teams are far better equipped than others, and that off the competition field these teams may be far more isolated, we create a situation that, while I agree it seems so simple to you, is to them a disaster looking for a place to happen.


What are you referring to in the paragraphs about students and field teams?
I am suggesting that you use your monitors on your own robot on your practice field to give First and the community some data to digest and duplicate. Are you against that?


I'll be at Monty Madness at the request of field personnel. I am not against it at all.


As to you having something on your robot during Monty Madness, that is up to you and the event staff.


Exactly. I was willing to let FIRST shut down the idea of monitoring the D-Link power supply with my monitors without too much fuss, because it was made clear to me that off-season events are the place to beta test. I agree with the KOP team at FIRST that they should have every opportunity to review a piece of test hardware they've not seen before.

I was surprised when FIRST said the cRIO couldn't monitor the power supply of the D-Link AP. In effect, that means we can't look while the robots are running on the competition field, even with hardware FIRST should by now know quite well.


As to First's reaction to your hardware, I have no knowledge of what you are relating to us. For the record I am not FIRST staff, I am not a member of the GDC, I am a key volunteer.


Fair enough, Al. It's just that this is a FIRST-sponsored forum. It's hard to ignore the authority with which some of you speak.


As to adding monitoring to a robot, may I refer you to StangSense. We used the control system in conjunction with a custom circuit, to monitor current/voltage to critical systems, mostly motors, using a method that did not interrupt the power pathways to electrical components on our robot. We also used that system in a mobile design to aid other teams in the pursuit of problems on their robots for several years. We also described the data we collected and the subsequent issues we observed to anyone who wanted them and that data aided in the design of the current PD power circuitry. FIRST added several test bed monitors to operating robots (including ours) during the design phase as well. I assisted with at least some of the interpretation of that collected data during the collection at IRI. The boost/buck regulators on the PD are a direct result (I believe) of that research.

I have no doubt that StangSense and your work are well considered, Al. The problem for me remains that you do not personally build all of our robots. The need for teams to collect data doesn't really end with a beta, or with just a few robots, or even in the pits at competition.

Data collection against issues, assuming proper respect for doing it properly, should not be discouraged in any way. In fact, it should be an extremely common thing for FIRST teams, beyond just reading the voltage of the battery. I don't feel, based on the current rules and the current status of things, that such collection is widely encouraged. Worse, because of a lack of simple instructions, I feel that it's beyond the reach of some people who really need it.

Al Skierkiewicz
01-05-2012, 11:39
Fair enough, Al. It's just that this is a FIRST-sponsored forum. It's hard to ignore the authority with which some of you speak.

This is not a FIRST-sponsored forum; this is a team forum and always has been. Only the FIRST forum is sponsored by FIRST. For myself, as for others who may or may not be FIRST staff, we write here for the teams that need advice. As part of that responsibility, we (the collective CD community) feel strongly about correcting occasional errors in posts to ensure that teams get the answers they need. I write personal opinions on issues related to the community in general from my collective experience (17 years on WildStang and 43 years as a Broadcast Engineer).

I still have to ask what your reference to students on field teams means.

techhelpbb
01-05-2012, 12:15
This is not a FIRST-sponsored forum; this is a team forum and always has been. Only the FIRST forum is sponsored by FIRST. For myself, as for others who may or may not be FIRST staff, we write here for the teams that need advice. As part of that responsibility, we (the collective CD community) feel strongly about correcting occasional errors in posts to ensure that teams get the answers they need. I write personal opinions on issues related to the community in general from my collective experience (17 years on WildStang and 43 years as a Broadcast Engineer).


Fair enough if I am mistaken. Your forum's position is well established in FIRST circles, and frankly your forum often comes up before the FIRST forums in a variety of situations. My apologies for misinterpreting the extent of your influence.

I just want to make myself very clear again, that nothing I've written in this topic is intended to devalue your views or your experience.


I still have to ask what your reference to students on field teams means.


It all comes down to how leadership decides to handle situations.

I see people all over the place in these topics suggesting that some of us have lost confidence in the field personnel because we speculate and try to solve the problems (funny to be asked not to solve problems in situations where it's the task of the day normally...especially difficult considering the enthusiasm that FIRST inspires in regard to troubleshooting regardless of whom you are or what you know). That somehow we have lost confidence in FIRST's ability to troubleshoot this issue on the field.

I have lost no confidence in any field personnel. None at all. Not one field person I've seen recently seemed to me to be intent on making the playing field anything other than fair; circumstances beyond their control may influence the situation otherwise, however. It was certainly clear to me on several occasions that they were not fully equipped or fully prepared with quick, direct, and specific resolutions to a variety of problems. Again, there's a limit to what they can bring to the table, having never seen your robot before, having no access to valuable tools, and sometimes not having information about things that could and probably should have been given to them before these events. As my example in this topic, I was expressly assured by field personnel that the field was on 2.4 GHz...not 5 GHz. That's a problem that needs to be addressed; it was a mere simple error, I'm sure, but it's an error that masks other communications issues. Collecting anecdotal information before events is no real solution to the problem of getting these field folks and the teams the information they need, either. Continuous, specific, directed data collection is a good solution.

I also brought that up because some people were genuinely thrilled that I had actually bothered to bring an oscilloscope to an event, as the guy who didn't even know he was the competition field's spare-parts supply until the evening before. People in management at that event made it very clear they intended to recommend that an electronics table, or at least the relevant basic tools, be available to teams at events. They were thrilled that I sat there making tails for 2GO PCs with broken Ethernet ports. I even made a bunch of cables for the field and for teams. It was extremely clear to me that some of these teams had no resources to do this for themselves; I wonder what would have become of them had I not stepped up.

When I wander through forums, often reading but not posting, and I see people suggesting that the field personnel are somehow supposed to deal with this, or that we should be quiet because the field personnel might feel we are preying on them, it makes me very unhappy. The field personnel have done all they could, in many cases much more than they should have been able to do...

However, they are restrained by the tools, by the quality of their instructions, and most importantly by restrictions on information that need to be removed.

I don't find any fault with the people running the fields; in the end, FIRST is the central control canopy for all that happens. If the information doesn't get there, if the instructions were not given, if the tools were not there, FIRST needs to do something about it, not have people running around topics suggesting that others are not patient with the field personnel or that the field personnel are the issue. There's no need for people to be silenced, but we can all see the unprecedented and absolute need that this problem...in fact all problems that drop robots dead on the field...be meticulously managed and removed.

Data collection needs to be done. It needs to be done to analyze this mess at the Championship. It needs to be done within the organization. It needs to be a tool handed to every nook and cranny of the people involved and it does need to involve the people in the community. It needs to be continuous and it shouldn't be disappearing behind the curtains now. Blame is irrelevant and again purely anecdotal.

Hugh Meyer
01-05-2012, 13:36
Out of band interference could also be part of the problem.

I remember at the 2007 DARPA Grand Challenge they had a huge video display set up near the starting line. The three robots closest to the screen couldn't move until they shut down the giant display. It was about the same distance as the Einstein field was from the display in the dome.

Another possible culprit is the high concentration of video and other equipment near to the FMS access point. I often see the access point sitting adjacent to a television video monitor, which I would consider as bad as putting the robot radio on top of a CIM motor.

-Hugh

Al Skierkiewicz
01-05-2012, 14:21
Brian,
These students???...


Furthermore, some of those students you have on that field crew are students I mentored. So let me be clear. I am extremely unhappy when I see students in my chosen profession, a profession in which I served as an apprentice and clawed my way up through vast nonsense, and I am extremely unhappy when I watch them get reduced to tears. I am tired of FIRST using them as a reason why I should watch what I say.

techhelpbb
01-05-2012, 14:24
Brian,
These students???...

A fair number of your field crews are college students. They come back and help out.
No specific event was implied.
Further, I'm including all the personnel at an event in the crew.

Additionally (post 139):
http://www.chiefdelphi.com/forums/showthread.php?t=106042&page=10
"This could have been prevented long before Einstein, but that is a discussion for other threads which we have all been following all season. I saw one FTA in tears afterwards, and I know everyone involved was doing all they could."

Al Skierkiewicz
01-05-2012, 14:59
Hugh,
The LED wall does give off a fair amount of noise. It is mostly in the form of buzz and wreaks havoc with dynamic microphones; that's why you don't see anyone speaking from behind the scorer's table on the big stage, I would suspect. I have some experience with RF mics and wireless in-ear monitors and have used a spectrum analyzer when checking for problems in several shows here. Most are in the UHF band now that the FCC has been involved in unlicensed wireless devices in unused TV channels. Almost everyone is using Wi-Fi to interconnect various audio interfaces, and I haven't had any issues. I have not seen anything in the bands I am looking at, so that leads me to believe that harmonics beyond 700 MHz do not exist. Each little block connects to the one next to it to assemble a large screen; then you tell the master controller how many blocks wide and high it is and whether you want it to portion the screen or simply display the entire video signal. There is a CRT monitor on the scorer's table, but it is an analog device as I remember. There are a variety of LCD monitors as well. Most of that stuff has a fair amount of internal shielding, so I don't expect anything in the 5 GHz range from any of it.

RyanN
09-05-2012, 21:24
I've been messing around with things today and yesterday trying to break the robot.

Power and brownouts should be a non-issue. If the teams use the OEM cable, the connection remains consistent and has no drop outs.

We also ran with a dead battery. The first thing to give out was the digital sidecar. It resulted in the robot quickly flashing our LEDs on and off as the sidecar browned out, causing the relays to turn on and off rapidly. The voltage went down to about 4 V before I shut off the light show. The cRIO (4-slot) and the D-Link remained powered and connected the whole time with no dropouts.

We have also been testing CAN out a bit, and found some unsettling results.

If you have an intermittent connection where a CAN device does not appear on the CAN network, the whole CAN network will drop out, and the digital sidecar will not work. The thing is that the robot remains connected the entire time.

This was not the cause of our problem as we went back to PWM during Bayou, but I can see teams complaining about no communication due to this.

We originally had 4 CAN-enabled Jaguars, IDs 2, 3, 4, and 5. ID 5 was the last in the chain, and had the terminator resistor installed. We wanted to test 8 Jaguars over the RS-232 to see how well it worked. Well, we assigned IDs 6, 7, 8, and 9 to the drive motors. When we plugged it all back in, nothing worked. We found out that when we extended the network, that the Jaguar with ID 5 had a bad RJ-45 port. To work temporarily, we rearranged the cables to make that Jaguar the last in line, and it all worked again.

So that's another idea that I had as far as troubleshooting.

We can't use the 8 Jags though... I'm not sure if we reached the limit of RS-232 driving CAN, but we were getting tons of CAN timeouts and "Drive motors not being called fast enough" warnings (in teleop). I analyzed the timing of Teleop and found that it ran every 10ms on average, but 60ms to 500ms spikes were not uncommon (which is where we saw the timeouts). When this occurred, we would also glitch/jerk.

I think it has to do with calling CAN devices at different times. We have Periodic Tasks running a shooter control loop every 50ms, and then Teleop is called every 50ms. I think there is a resource conflict and timing issues, and we occasionally get a bus lockup for a split second. Going back to PWM, the Teleop task runs in a consistent 3 to 4ms.
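
For anyone who wants to do the same timing check in C++ rather than LabVIEW, here's a rough sketch of what I mean; Timer::GetFPGATimestamp() is my assumption about the era's WPILib call, and the thresholds are just examples.

#include <cstdio>
#include "WPILib.h"

// Call at the top of the teleop routine each iteration to log period spikes.
void LogTeleopPeriod()
{
    static double lastStart = 0.0;
    double now = Timer::GetFPGATimestamp();        // FPGA time in seconds
    if (lastStart > 0.0)
    {
        double periodMs = (now - lastStart) * 1000.0;
        if (periodMs > 60.0)                       // flag the 60 ms+ spikes
            printf("Teleop period spike: %.1f ms\n", periodMs);
    }
    lastStart = now;
}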

Mark McLeod
09-05-2012, 22:08
Teleop shouldn't be showing a loop time any faster than 19-20ms, because it's triggered by the DS packets arriving every 20ms.

You've got something else wrong going on with Teleop.

RyanN
10-05-2012, 12:36
Teleop shouldn't be showing a loop time any faster than 19-20ms, because it's triggered by the DS packets arriving every 20ms.

You've got something else wrong going on with Teleop.

I'm speaking of the actual time it takes to finish the teleop VI, not how often it is called. It is still called every 50ms as it should be. The actual processing of the teleop VI is what takes only 3-4ms, and it should be pretty low and consistent for all teams.

linuxboy
10-05-2012, 19:45
I imagine the Digital Sidecar not working is not a result of the CAN device dropping out, but rather that the error handling when a CAN device drops out is expensive, and therefore watchdogs start tripping because loops aren't running fast enough and packets aren't being processed quickly enough.

This was not the cause of our problem as we went back to PWM during Bayou, but I can see teams complaining about no communication due to this.

Again, expensive error handling means the cRIO stops processing communication packets with the DS, and therefore it "loses comms".
We originally had 4 CAN-enabled Jaguars, IDs 2, 3, 4, and 5. ID 5 was the last in the chain, and had the terminator resistor installed. We wanted to test 8 Jaguars over the RS-232 to see how well it worked. Well, we assigned IDs 6, 7, 8, and 9 to the drive motors. When we plugged it all back in, nothing worked. We found out that when we extended the network, that the Jaguar with ID 5 had a bad RJ-45 port. To work temporarily, we rearranged the cables to make that Jaguar the last in line, and it all worked again.

So that's another idea that I had as far as troubleshooting.

We can't use the 8 jags though... I'm not sure if we reached the limit of RS-232 driving CAN, but we were getting tons of CAN timeouts, and Drive Motors not being called fast enough warning (in teleop). I analyzed the timing of Telop, and found that it ran every 10ms on average, but 60ms to 500ms spikes were not uncommon (where we saw the timeouts). When this occurred, we would also glitch/jerk.
So, since the other RJ-12 port on the last Jag in line was bad (RJ-45 is 8P8C I think, whereas the CAN plugs on the Jaguars are 6P6C and 6P4C), the terminator couldn't terminate, so it was like having no terminator at all. This wouldn't be as much of an issue in a shorter chain, since there isn't as much need for a terminator (I don't know why), but in a longer chain it would cause timeout errors, and again, the error handling is expensive, so it would slow down the loop.

mjcoss
10-05-2012, 20:18
I know that in the C++ case, TeleopPeriodic is triggered, by default, on the arrival of DS packets. This is bad if the network has a lot of jitter or latency issues. You can set the period, but then of course you need to handle the case where you have stale data.
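
Something like this minimal sketch is what I mean, assuming the 2012-era IterativeRobot API; SetPeriod() and DriverStation::IsNewControlData() are assumptions on my part, so check the exact names in the WPILib version you actually have.

#include "WPILib.h"

class FixedPeriodRobot : public IterativeRobot
{
    Joystick stick;
    Jaguar drive;
    float lastThrottle;

public:
    FixedPeriodRobot() : stick(1), drive(1), lastThrottle(0.0)
    {
        SetPeriod(0.02);   // run the periodic methods every 20 ms, not on packet arrival
    }

    void TeleopPeriodic()
    {
        // Only trust the joystick when fresh control data has actually arrived;
        // otherwise hold the last commanded value (or decay it toward zero).
        if (DriverStation::GetInstance()->IsNewControlData())
            lastThrottle = stick.GetY();
        drive.Set(lastThrottle);
    }
};

START_ROBOT_CLASS(FixedPeriodRobot);

The point is just that the loop keeps running at a fixed rate and you decide what to do when no fresh packet has arrived, instead of stalling with the packet stream.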

RyanN
11-05-2012, 01:33
I imagine the Digital Sidecar not working is not a result of the CAN device dropping out, but rather that the error handling when a CAN device drops out is expensive, and therefore watchdogs start tripping because loops aren't running fast enough and packets aren't being processed quickly enough.


Again, expensive error handling means the cRIO stops processing communication packets with the DS, and therefore it "loses comms".

So, since the other RJ-12 port on the last Jag in line was bad (RJ-45 is 8P8C I think, whereas the CAN plugs on the Jaguars are 6P6C and 6P4C), the terminator couldn't terminate, so it was like having no terminator at all. This wouldn't be as much of an issue in a shorter chain, since there isn't as much need for a terminator (I don't know why), but in a longer chain it would cause timeout errors, and again, the error handling is expensive, so it would slow down the loop.

The compressor is controlled in its own loop in Periodic Tasks and called every second. There is no reason CAN errors should cause it to drop out the way they do.

Alan Anderson
11-05-2012, 07:28
The compressor is controlled in its own loop in Periodic Tasks and called every second. There is no reason CAN errors should cause it to drop out the way they do.

There actually is a possible reason for CAN errors to shut off the compressor. It's a second-order effect. The error processing and reporting uses up a lot of system resources, delaying other actions, including the communication task. The system watchdog times out because the data from the Driver Station didn't get processed in time. Every PWM and Relay output on the Digital Sidecar gets shut off. Since the compressor is controlled by a Spike relay, it stops.

Al Skierkiewicz
11-05-2012, 07:35
Oliver,
From what I understand, the resistors are an integral part of CAN signaling in addition to terminating the bus. With an intermittent resistor, CAN devices cannot signal when they want to transmit data to the bus.
Has anyone used a CAN splitter to any advantage?

Jon236
11-05-2012, 09:48
There actually is a possible reason for CAN errors to shut off the compressor. It's a second-order effect. The error processing and reporting uses up a lot of system resources, delaying other actions, including the communication task. The system watchdog times out because the data from the Driver Station didn't get processed in time. Every PWM and Relay output on the Digital Sidecar gets shut off. Since the compressor is controlled by a Spike relay, it stops.


Alan,

Great analysis. What I'm trying to figure out is how we've advanced: when we used C, we were careful about interrupt priorities so that critical functions (i.e. packet transmission) were not delayed. When I use LabVIEW, I run sensors and actuators in separate VIs, not in the main case structure. I know that if I'm not careful, a separate process can eat up resources as well, but doesn't LabVIEW allow you to prioritize specific processes? Couldn't we then protect comms? Of course, careless programming might make the robot stop because of the causes you outlined, but the comms would stay up.
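
To make the idea concrete outside of LabVIEW, here's a generic POSIX-threads sketch of giving the comms loop a higher fixed priority than the heavy sensor/actuator loop so it can't be starved. It's illustrative only; it isn't how LabVIEW schedules VIs, and the cRIO's VxWorks scheduler has its own task API.

#include <pthread.h>
#include <sched.h>
#include <unistd.h>

void *CommsLoop(void *)  { for (;;) { /* process DS packets */       usleep(20000); } return 0; }
void *SensorLoop(void *) { for (;;) { /* expensive CAN/sensor work */ usleep(50000); } return 0; }

// Spawn a thread with an explicit fixed priority (SCHED_FIFO may need privileges).
static void StartThread(void *(*fn)(void *), int priority)
{
    pthread_attr_t attr;
    sched_param param;
    pthread_attr_init(&attr);
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    param.sched_priority = priority;
    pthread_attr_setschedparam(&attr, &param);
    pthread_t t;
    pthread_create(&t, &attr, fn, 0);
}

int main()
{
    StartThread(CommsLoop, 80);    // comms gets the higher priority
    StartThread(SensorLoop, 40);   // heavy work can be preempted by comms
    pause();                       // keep the process alive
}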

Racer26
11-05-2012, 09:57
While all this discussion of CAN is great, I know for a fact it is NOT the problem that 1114/2056 had on Einstein. They both exclusively use Victors.

RyanN
11-05-2012, 11:08
I imagine the Digital Sidecar not working is not a result of the CAN device dropping out, but rather that the error handling when a CAN device drops out is expensive, and therefore watchdogs start tripping because loops aren't running fast enough and packets aren't being processed quickly enough.


Again, expensive error handling means the cRIO stops processing communication packets with the DS, and therefore it "loses comms".

So, since the other RJ-12 port on the last Jag in line was bad (RJ-45 is 8P8C I think, whereas the CAN plugs on the Jaguars are 6P6C and 6P4C), the terminator couldn't terminate, so it was like having no terminator at all. This wouldn't be as much of an issue in a shorter chain, since there isn't as much need for a terminator (I don't know why), but in a longer chain it would cause timeout errors, and again, the error handling is expensive, so it would slow down the loop.

While all this discussion of CAN is great, I know for a fact it is NOT the problem that 1114/2056 had on Einstein. They both exclusively use Victors.

Trust me, I know that wasn't the issue. It also wasn't our issue at Bayou, because we got rid of everything on our robot besides motors and solenoids, all using PWM, relay, and solenoid outputs, and no camera.

It's just a potential problem that can come up that isn't easily diagnosable. CAN errors shouldn't cause everything to shut down.

linuxboy
11-05-2012, 11:48
Oliver,
From what I understand, the resistors are an integral part of CAN signaling in addition to terminating the bus. With an intermittent resistor, CAN devices cannot signal when they want to transmit data to the bus.
Has anyone used a CAN splitter to any advantage?

The terminator is required by the CAN spec; it is there to prevent "signal reflection," which messes things up. The bus will still operate without it, albeit with a smaller number of devices, but my understanding is that a missing terminator at the end of the bus leads to issues with the Jaguars at the beginning of the bus, and a missing terminating resistor at the beginning of the bus leads to issues at the end of it.

I have seen pictures of CAN splitters (there is a discussion about them on here somewhere), but they need terminators on each branch, and I believe the more branches you have, the shorter each cable must be. As for CAN not being something that should shut everything down: when it comes to error handling and timeouts, it can, even if it shouldn't.

Trust me, I know that wasn't the issue. It also wasn't our issue at Bayou, because we got rid of everything on our robot besides motors and solenoids, all using PWM, relay, and solenoid outputs, and no camera.

It's just a potential problem that can come up that isn't easily diagnosable. CAN errors shouldn't cause everything to shut down.
I wasn't saying that CAN could have caused the problem at Bayou for you guys; I was just responding to the testing you were doing with it. I really do hope the root cause of what happened to you at Bayou is found. I'd love to hear it.

Al Skierkiewicz
11-05-2012, 12:43
Oliver,
The nature of the signaling on the CAN bus uses switching in the transceivers to cause a device to enter a 'dominant' state for transmit and a 'recessive' state for receive. It is my understanding this is accomplished by driving the bus in a way that causes current flow in the terminating resistors. The CAN terminating resistors also provide the transmission line termination to allow 1 Mb/s traffic on a bus 40 meters long. Microchip has a nice application note, AN228, describing one of their products and the CAN signaling.

EricVanWyk
11-05-2012, 12:56
Mikets did some interesting work on CAN backboning and posted about it here: http://www.chiefdelphi.com/forums/showthread.php?t=102108

Al Skierkiewicz
11-05-2012, 14:05
Thanks,
I remember that discussion now.

linuxboy
11-05-2012, 14:06
Oliver,
The nature of the signaling on the CAN bus uses switching in the transceivers to cause a device to enter a 'dominant' state for transmit and a 'recessive' state for receive. It is my understanding this is accomplished by driving the bus in a way that causes current flow in the terminating resistors. The CAN terminating resistors also provide the transmission line termination to allow 1 Mb/s traffic on a bus 40 meters long. Microchip has a nice application note, AN228, describing one of their products and the CAN signaling.

I will take a look at the info about that chip, thanks! My comments are based purely on observation of the behavior in certain fault conditions with the FRC control system. I don't know exactly how the internals of CAN work (yet; this is what the offseason is for!).

- Oliver

Al Skierkiewicz
11-05-2012, 14:12
Oliver,
Still learning myself. Considering the actual implementation was for cars, I would suspect that the protocol would allow for devices to go offline without killing the whole thing. So I am wondering what actually would cause the effect described above with the loss of a single device. Having seen the internal destruction on some Jaguars, it is always possible that an internal failure could have taken out the transceiver and shut down (shorted) the bus. In that condition, I would guess it is possible for the cRIO to keep attempting to transmit data and do nothing else. I do not have any data to suggest that is what happened, just that it is possible. I bet Jim Zondag would know or at least could render an opinion.

techhelpbb
11-05-2012, 18:04
http://www.ti.com/lit/ds/symlink/sn65hvd1050.pdf

The datasheet for the CAN bus transceiver indicates that it has a dominant-time-out circuit that should stop a software error from simply holding the open-collector CAN transmission circuit down to ground and stopping all communication on the CAN bus (i.e., the microcontroller crashes, stops doing anything of value, and locks up the entire CAN bus).

However, this protection circuit has a gap. It only stops the Jaguar from stomping on the CAN bus if the TXD pin from the microcontroller goes to a logic low and stays there. It does not stop the Jaguar from going crazy and randomly stomping on the bus by changing the logic state of the TXD pin at bad moments (i.e., the microcontroller gets overwhelmed by, say, interrupts from the encoder and starts speaking gibberish on the CAN bus).

A quote from that datasheet:
"A dominant-time-out circuit in the SN65HVD1050 prevents the driver from blocking network communication with a hardware or software failure. The time-out circuit is triggered by a falling edge on TXD (pin 1). If no rising edge is seen before the time-out constant of the circuit expires, the driver is disabled. The circuit is then reset by the next rising edge on TXD."

If something can overwhelm the Jaguar's internal processing (like the encoder can) and cause it to misunderstand its timing on the CAN bus, it could still stomp on the communications of other Jaguars occasionally, or even enough to lock the entire bus, if the Jaguar's microcontroller keeps picking the wrong moments to fail to stay silent on the bus.

Dropping even a few bits here and there randomly from the protocol should be sufficient to render the expected functions of the entire CAN bus essentially erratic.

More advanced CAN solutions eliminate this possibility because they are not merely drivers. They are themselves peripherals designed not to fail in this application. If your processor were to crash while driving a controller like this (and I've done it plenty of times) the CAN bus would continue to operate regardless of that failure.

For example:
http://ww1.microchip.com/downloads/en/devicedoc/21801d.pdf

The CAN solution in the Jaguar per the datasheet:
http://www.ti.com/lit/ds/spms047g/spms047g.pdf

Does implement a controller, but it's interrupt-based. So if the processing unit is overwhelmed, it may not realize that it has failed to meet its timing requirements, and then bad news (from page 569):

"15.3.14 Bit Timing Configuration Error Considerations
Even if minor errors in the configuration of the CAN bit timing do not result in immediate failure, the
performance of a CAN network can be reduced significantly. In many cases, the CAN bit
synchronization amends a faulty configuration of the CAN bit timing to such a degree that only
occasionally an error frame is generated. In the case of arbitration, however, when two or more
CAN nodes simultaneously try to transmit a frame, a misplaced sample point may cause one of the
transmitters to become error passive. The analysis of such sporadic errors requires a detailed
knowledge of the CAN bit synchronization inside a CAN node and of the CAN nodes' interaction on
the CAN bus."

Also, the CAN solution in the Jaguar is adaptive for timing, so if you start spewing badly timed bits onto the bus, it can cause you issues (page 559):
"Once the CAN module is initialized and the INIT bit in the CANCTL register is cleared, the CAN module synchronizes itself to the CAN bus and starts the message transfer. As each message is received, it goes through the message handler's filtering process, and if it passes through the filter, is stored in the message object specified by the MNUM bit in the CAN IFn Command Request (CANIFnCRQ) register."

mjcoss
11-05-2012, 19:24
This seems like the exact situation that we had. We had one of the Jaguars on our shooter motor flood the CAN bus at the Rutgers competition, and since we were/are an all-CAN-bus robot, we were dead in the water. Why and how it got into that state, and stayed there, is not clear. Power cycling the robot did not clear it. We replaced the offending Jaguar, after we had replaced the 2CAN, because as far as we could tell the whole bus was dead.

linuxboy
12-05-2012, 00:55
Oliver,
Still learning myself. Considering the actual implementation was for cars, I would suspect that the protocol would allow for devices to go offline without killing the whole thing. So I am wondering what actually would cause the effect described above with the loss of a single device. Having seen the internal destruction on some Jaguars, it is always possible that an internal failure could have taken out the transceiver and shut down (shorted) the bus. In that condition, I would guess it is possible for the cRIO to keep attempting to transmit data and do nothing else. I do not have any data to suggest that is what happened, just that it is possible. I bet Jim Zondag would know or at least could render an opinion.

It isn't a CAN protocol problem; it is a cRIO problem that causes the problems on the CAN bus to affect the entire robot. If you take a look at the traces on the Jaguar PCB, you should see that the two RJ12 ports have traces connecting the pins on both (pass-through), so that even with an unpowered Jaguar the CAN signal should still be passed through (although you will still get timeout errors).

The reason the cRIO causes problems is that the error handling is expensive processor-time-wise, especially if the errors are not handled correctly (printing lots of errors to the console and such).
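
As a rough sketch of what "handled correctly" could look like on the user-code side (illustrative only, not how WPILib itself reports errors; the once-per-second window is an arbitrary choice), you can rate-limit repeated prints so a flood of CAN timeouts can't eat console and CPU time:

#include <cstdio>
#include <ctime>

// Report a CAN timeout, but print to the console at most once per second.
void ReportCanTimeout(int deviceId)
{
    static time_t lastPrint = 0;
    static int suppressed = 0;

    time_t now = time(NULL);
    if (now - lastPrint >= 1)
    {
        if (suppressed > 0)
            printf("(%d similar CAN timeouts suppressed)\n", suppressed);
        printf("CAN timeout on device %d\n", deviceId);
        lastPrint = now;
        suppressed = 0;
    }
    else
    {
        ++suppressed;   // count it, but stay off the console
    }
}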

Al Skierkiewicz
12-05-2012, 08:38
Oliver,
If we go on the assumption that the CAN is echoed through the serial port, it is not hard to imagine that something is confusing the traffic flow there. I can't tell you the priority the cRIO gives to handling serial port issues, but I would guess it would be one of the more important tasks. A big flaw in the Jaguar, in my opinion, is the CAN connectors facing up, presenting a rather large target for all kinds of trash. I have seen more than a few with aluminum fluff inside, and I usually advise the team to clean out the Jag and tape over the connector. It is of course possible for stray conducting material to simulate a dominant condition from a Jag if it gets into the right places.
If I were to have any input on a rebuild of the Jag, I would add a DIP switch or jumper to turn on a terminating resistor. The current method leaves a lot to be desired for such a critical item as bus termination. I have seen far more scary terminators than good ones.

Levansic
12-05-2012, 16:21
I have seen pictures of CAN splitters (there is a discussion about them on here somewhere), but they need terminators on each branch, and I believe the more branches you have the shorter each cable must be.

Whoa a second. The splitters change the physical topology from a daisy chain to a star. This doesn't change the party line nature, nor the end of run requirement of the terminating resistors. You still need only two 100 ohm resistors. Adding more would degrade the signal.

The resistors are there to separate the potential of CAN high and CAN low signal lines. The signal is carried by the differential of these two lines. Add too many terminators, and the signal voltage bleeds off too quickly. Remember how resistors in parallel work. 100 ohm is lower than the 120 ohm CAN spec. This helps to artificially limit cable length. If you add a terminator on every split out jaguar, your cables would have to be very short.

If you use a splitter for a more robust connection (we did), only terminate at the end of the backbone (as Mikets did) or the other side of the jaguar with the longest cable. We did the latter, because our shooter cable was longer than the rest of our backbone. The other end of the bus will be terminated by the serial adapter or a 2CAN.

-- Len

linuxboy
12-05-2012, 23:05
Oliver,
If we go on the assumption that the CAN is echoed through the serial port, it is not hard to imagine that something is confusing the traffic flow there. I can't tell you the priority the cRIO gives to handling serial port issues, but I would guess it would be one of the more important tasks. A big flaw in the Jaguar, in my opinion, is the CAN connectors facing up, presenting a rather large target for all kinds of trash. I have seen more than a few with aluminum fluff inside, and I usually advise the team to clean out the Jag and tape over the connector. It is of course possible for stray conducting material to simulate a dominant condition from a Jag if it gets into the right places.
If I were to have any input on a rebuild of the Jag, I would add a DIP switch or jumper to turn on a terminating resistor. The current method leaves a lot to be desired for such a critical item as bus termination. I have seen far more scary terminators than good ones.

CAN is not echoed on the serial port. The way the serial bridge works is that the cRIO communicates with the first Jaguar in the chain using the RS-232 protocol (so the first one in the chain must be powered to do the conversion). The following Jaguars converse using CAN, but all the data that goes between the cRIO and the first Jaguar is in fact transferred using RS-232. The flow control lines are not connected, only the TX, RX, and GND lines, but since there are separate TX and RX lines I imagine flow control isn't too much of a problem.

I agree that the vertical ports can be a pain but I can't think of a better way to do it.

I would love to see a switch for termination. I'm doing some work with DMX, and all the DMX parts I've looked at have switches for termination. There are a lot of ways for an external terminator to fail, and I'd like to avoid seeing that happen.

Whoa a second. The splitters change the physical topology from a daisy chain to a star. This doesn't change the party line nature, nor the end of run requirement of the terminating resistors. You still need only two 100 ohm resistors. Adding more would degrade the signal.

The resistors are there to separate the potential of CAN high and CAN low signal lines. The signal is carried by the differential of these two lines. Add too many terminators, and the signal voltage bleeds off too quickly. Remember how resistors in parallel work. 100 ohm is lower than the 120 ohm CAN spec. This helps to artificially limit cable length. If you add a terminator on every split out jaguar, your cables would have to be very short.

If you use a splitter for a more robust connection (we did), only terminate at the end of the backbone (as Mikets did) or the other side of the jaguar with the longest cable. We did the latter, because our shooter cable was longer than the rest of our backbone. The other end of the bus will be terminated by the serial adapter or a 2CAN.

-- Len

Good info, thanks! As you can probably tell I don't know too much about using a CAN backbone, so this was an excellent read.

- Oliver

Al Skierkiewicz
13-05-2012, 09:23
Oliver,
The data is still in one string from CAN to serial and back again; they don't act independently, which is what I was getting at. The DMX termination is as much for noise immunity as for correct operation of the serial data. We have DMX running throughout our big studio (85 x 105) using self-raising battens. Each batten has DMX running through it to the next batten in line. The DMX terminates in several places around the studio so that we can connect controllers where needed. We have 92 battens.

DonRotolo
13-05-2012, 12:44
Oliver,
From what I understand, the resistors are an integral part of CAN signaling in addition to terminating the bus. With an intermittent resistor, CAN devices cannot signal when they want to transmit data to the bus.
The terminating resistors are essential to address signal reflections, just like you'd get if you had an unterminated antenna wire.

An open circuit (=high impedance) allows the signal wave to reflect back onto the wires. These reflections can interfere with the actual data. As data rates increase, the interference has a greater impact on the data. Without the terminating resistors, eventually all the echoes would increase the bit error rate beyond acceptable.
Whoa a second. The splitters change the physical topology from a daisy chain to a star. This doesn't change the party line nature, nor the end of run requirement of the terminating resistors. You still need only two 100 ohm resistors. Adding more would degrade the signal.
The natural topology of CAN is a star, where everyone hears everything. For arbitration to work, each transmitter must be able to hear itself as it transmits, so these are full duplex transceivers. Putting them into a daisy-chain topology is just a poor implementation method.

The resistors need to add up to about 60 Ohms. Two 120's in parallel, four 240's in parallel, the bus doesn't care. In cars, there is a central "star" node which carries a single 60 Ohm resistor in it.
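
Spelling out the arithmetic for anyone following along (illustrative numbers only):

\[
\frac{1}{R_{\text{bus}}} = \frac{1}{120\,\Omega} + \frac{1}{120\,\Omega}
\;\Rightarrow\; R_{\text{bus}} = 60\,\Omega,
\qquad
\frac{240\,\Omega}{4} = 60\,\Omega
\]

Every extra terminator sits in parallel with the ones already there and pulls that total below 60 ohms, which is why adding one per branch on a splitter loads the bus down.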

If you go with a class A CAN implementation, which is very slow, you don't even need terminations, but a Class C (e.g., 1 Mb/s) is very sensitive to them.

EricVanWyk
13-05-2012, 13:06
The natural topology of CAN is a star, where everyone hears everything. For arbitration to work, each transmitter must be able to hear itself as it transmits, so these are full duplex transceivers. Putting them into a daisy-chain topology is just a poor implementation method.

The resistors need to add up to about 60 Ohms. Two 120's in parallel, four 240's in parallel, the bus doesn't care. In cars, there is a central "star" node which carries a single 60 Ohm resistor in it.

This doesn't match my understanding of CAN topology or signal lines. Can you provide a link to a description of the star topologies in cars? All of the systems I've used CAN in were daisy chained or vampire tapped.

Al Skierkiewicz
13-05-2012, 16:47
Don,
I was suggesting that the resistors have a dual purpose, both termination and signalling. Ironically, a standard twisted pair cable has a characteristic impedance of about 120 ohms. Assuming a transmission line model, the transmitter then would see two lines (in parallel) terminated in 120 ohms in most cases. There is a famous AES paper by Jim Brown for audio twisted pairs that describes this.

EricVanWyk
13-05-2012, 18:27
Don,
I was suggesting that the resistors have a dual purpose, both termination and signalling. Ironically, a standard twisted pair cable has a characteristic impedance of about 120 ohms. Assuming a transmission line model, the transmitter then would see two lines (in parallel) terminated in 120 ohms in most cases. There is a famous AES paper by Jim Brown for audio twisted pairs that describes this.

Not ironic at all! You are exactly correct. The termination resistor needs to match the characteristic impedance of the line.

If there is interest, I could write a small white paper on CAN bus reflections. I think I even have access to the right software for it now that NI has acquired AWR and their excellent simulators.
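
In the meantime, the standard transmission-line relation gives the flavor of it: the fraction of an incident wave reflected at the end of the line depends on how well the termination matches the line's characteristic impedance,

\[
\Gamma = \frac{Z_L - Z_0}{Z_L + Z_0}
\]

so a matched 120-ohm terminator on 120-ohm cable gives \( \Gamma = 0 \) (no reflection), while an open end (\( Z_L \to \infty \)) gives \( \Gamma \to 1 \) and the whole wave comes back.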

linuxboy
13-05-2012, 19:33
Not ironic at all! You are exactly correct. The termination resistor needs to match the characteristic impedance of the line.

If there is interest, I could write a small white paper on CAN bus reflections. I think I even have access to the right software for it now that NI has acquired AWR and their excellent simulators.

I'd love to see a white paper on CAN bus reflections, although I'm sure it would mostly be over my head.

Al Skierkiewicz
14-05-2012, 08:38
Oliver,
What we are discussing are basic transmission lines. Lines do tend to behave in certain ways dependent on the frequency in use and the characteristic of the line. When a line has a characteristic impedance (in this case 120 ohms) and the termination at the end of the line does not match the impedance of the line, some of the energy is reflected from the termination (load) back towards the source (generator). Depending on the system in use, these reflections can add and subtract with the original signal or in this case be interpreted as additional data. For our use, the length of the line is so short that reflections, when they occur, travel on the line at almost the same instant as the generator that produced them.
Where this gets complicated is that we are talking about complex impedance, which is different from resistance (although both use the common term 'ohm' as a unit). At the frequency we are using for discussion (1 Mb/s, or 1 MHz) and the distances we are using on a typical robot (a few feet at most), much of the complex calculation falls out. Things don't start to get hairy until the line length gets to about 1/4 wavelength and above, which in our case would be more than 60 feet.
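
To put rough numbers on that (taking a typical twisted-pair velocity factor of about 0.66 as an assumption):

\[
\lambda = \frac{v}{f} \approx \frac{0.66 \times 3\times 10^{8}\ \mathrm{m/s}}{1\times 10^{6}\ \mathrm{Hz}} \approx 200\ \mathrm{m},
\qquad
\frac{\lambda}{4} \approx 50\ \mathrm{m}
\]

so the few feet of CAN cable on a robot is a tiny fraction of a wavelength at the fundamental.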

DonRotolo
18-05-2012, 20:59
At the frequency we are using for discussion (1 Mb/s or 1 MHz)

Your comments are spot on, Al, but don't confuse data rate with frequency, since CAN is a bursty mode (that means CAN transmitters don't transmit all the time, but in shorter bursts of data, with some quiet time in between). Our Class C automotive CAN has a fundamental at about 1.2 MHz (meaning that's the dominant frequency) with sidebands out to 9 MHz (meaning there is enough energy at the higher harmonic (multiplied) frequencies to worry about), bringing the critical length down to well under 10 feet.

Still not as big a problem for our robots, but you can't ignore reflections even here.

DonRotolo
18-05-2012, 21:33
This doesn't match my understanding of CAN topology or signal lines. Can you provide a link to a description of the star topologies in cars? All of the systems I've used CAN in were daisy chained or vampire tapped.
Since CAN is a multicast / multi-master system with arbitration (see here (http://www.gaw.ru/data/Interface/CAN_BUS.PDF)) any network topology where everyone hears everyone is OK.

Vampire-tap is just a star network with a really long hub...

I can't give you any public documentation of how it is implemented in practice, but attached are two pieces of wiring diagram for a certain automobile with which I am very familiar that uses CAN.

In the 'white' image, the "hub" at the bottom (X30/7) is physically about 2 inches long with 10 'taps' for twisted pairs that run out to each node (I have included a small photo). The hub is used only on the low-speed* bus (125 kb/s).

For the high speed bus (500 kb/s) illustrated in the 'black' image, the 'hubs' are actually solder splices(Z37/x), which physically are a bunch of wires with their ends twisted together, crimped with a ring, and soldered.

Some implementations have termination resistors integrated in the hub, but most have them in two control units somewhere on the Bus.

*I call it Low-speed, but it is Class B, therefore "Mid-speed".

Don

Al Skierkiewicz
19-05-2012, 18:19
Don,
Doesn't the cable have a significant roll-off at 9 MHz? So even with the sidebands present, the cabling attenuates those signals.

Levansic
20-05-2012, 13:44
I wish I knew how to split off threads. The last few pages have been really great for CAN stuff, but don't really match the original topic. Maybe we could re-post some of this in the CAN subforum, so it is easier to find next year. Especially for rookie teams, or others who want to dive into CAN, but have no experience with it.

-- Len

DonRotolo
20-05-2012, 13:52
Actually, we're getting way too deep into CAN for FRC teams here.

On the original topic: FIRST has asked one of our mentors to write a paper describing what we experienced this year, including actions we took and the results. So I have to believe they're doing the best analysis they can of issues reported.