Chief Delphi

Chief Delphi (http://www.chiefdelphi.com/forums/index.php)
-   General Forum (http://www.chiefdelphi.com/forums/forumdisplay.php?f=16)
-   -   The communication tides are shifting... (http://www.chiefdelphi.com/forums/showthread.php?t=106050)

kmusa 29-04-2012 21:50

Re: The communication tides are shifting...
 
MAldridge - thanks for the explanation.

We certainly have had our share of programming errors and loose wires, but have also experienced unexplainable field errors.

From a network perspective, the artificial means of protecting available capacity, by trying to enforce no WiFi access in the arena and the pits, doesn't scale and isn't enforceable - different venues have different levels of control over access points, use of wireless in the audience is increasing (via mobile hotspots and the like), and teams are encouraged to transfer more and more data between the DS and the robot.

I would love to see a transition to another frequency band, but the additional cost is probably not realistic. Regardless, the FMS needs to be robust, both now, and in the future.

MAldridge 29-04-2012 22:11

Re: The communication tides are shifting...
 
The policing of WiFi traffic is what doesn't make sense to me. It is possible to MAC address lock the D-Links to a specific AP, so why aren't they? As near as it was explained to me, the master transmitter doesn't have a single MAC, and there is no way to ensure the proper MAC is assigned to the right VLAN every time.

As for a sensible language, I base sensible on what the guy running it can debug. Since most FTAs do LabVIEW, it is sensible that they could troubleshoot it. Also, since the rest of the system is already written in LabVIEW, it makes sense to have it all done in one language.

marshall 29-04-2012 22:22

Re: The communication tides are shifting...
 
Quote:

Originally Posted by MarkoRamius1086 (Post 1163911)
The one team that was initially causing lag had three cameras streaming at an extremely high refresh rate. They toned it down to an acceptable rate as soon as they were informed it was causing lag not only with their bot, but with other robots on the field.

So I'm a mentor for Team 900. We were at both the Virginia and Raleigh regionals. You guys did an awesome job keeping things in line. It was truly top notch work.

However, I do feel the need to point out that the issue with our robot and the three cameras causing lag for the other robots was not an issue we actually solved in any meaningful way. We were watching our own bandwidth and it remained the same throughout the matches despite us fiddling with settings. I'm not sure what changed but it wasn't our robot despite our best efforts.

I do have a bigger concern with being told that a single robot is somehow taking all the bandwidth away from the other 5 robots on the field. This is a problem that should not be occurring. Quality of Service (QoS) should enable the bandwidth to be shared evenly amongst the teams.
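To make the "shared evenly" idea concrete, here is a minimal sketch of max-min fair sharing, one common way a QoS scheduler can divide a link. It is not how the FMS is known to work; the team names, the 24 Mbps capacity, and the per-team demands are all made-up numbers for illustration.

```python
def max_min_fair_share(capacity, demands):
    """Split capacity so that no flow gets more than it asks for and
    leftover capacity is divided evenly among flows that want more."""
    allocation = {}
    remaining = float(capacity)
    pending = dict(demands)
    while pending:
        share = remaining / len(pending)
        # Flows asking for no more than the equal share are fully satisfied.
        small = [name for name, demand in pending.items() if demand <= share]
        if not small:
            # Everyone left wants more than the equal share: cap them all.
            for name in pending:
                allocation[name] = share
            break
        for name in small:
            allocation[name] = float(pending.pop(name))
            remaining -= allocation[name]
    return allocation


# Hypothetical demands in Mbps: one camera-heavy robot and five light ones.
demands = {"heavy": 20, "a": 2, "b": 2, "c": 2, "d": 2, "e": 2}
shares = max_min_fair_share(24, demands)
```

Under a scheme like this the heavy robot is throttled to whatever is left over, and the other five still get everything they asked for.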

There needs to be more transparency into how the FMS system operates. A complete breakdown of what parts and software are used will enable more eyes for troubleshooting and will hopefully enable teams to host better off season events.

We saw FMS problems at both Virginia and Raleigh. I'm not surprised other teams are having similar issues. I'd also like to echo the sentiment that a better transceiver system needs to be worked out. I'm not sure what that system should look like but the COTS D-Link and Linksys stuff isn't cutting it.

apalrd 29-04-2012 22:46

Re: The communication tides are shifting...
 
I was trying really hard not to step into any of these threads, but there is some information that needs to be included in this thread, and I feel compelled to post it.


-Everything between the robots and DS's uses UDP/IP which has none of the retransmission problems TCP has. As packets are sent every 20ms and include a CRC, a missed packet will cause a 20ms lag (not bad at all) and it will ignore the packet if the CRC is bad. The FMS also talks UDP to the DS, and the DS talks UDP to the Dashboard. Only the camera uses TCP for the video stream, which is generally unrelated to completely dropped comm (camera problems will show up as lag caused by late packets, and would be seen by a team from the beginning).

-Prior to one of the matches, I opened a WiFi scanner on my phone. The field uses one 5 GHz channel for all 6 networks, and nothing else was using 5 GHz for 802.11a/n (b/g only run on 2.4 GHz). Cellular networks don't run anywhere near 5 GHz (they use 700-800 MHz, "PCS" 1900 MHz, and a funny "AWS" band that splits uplink/downlink on 1700/2100 MHz) and are not the cause of interference (if it is wireless interference, which I don't think it is).

-The field-end AP is a Cisco commercial AP, the same one that has been used since the control system was introduced in 2009. It's operating exactly in its intended environment (fixed AP for multiple virtual networks), and any problems with it would show up with all robots.

- The weak link is certainly the D-Link AP. Many teams this season have seen random communication loss at their later events. I'm sure a consumer-grade wireless AP designed to sit on a desk or shelf doesn't like being thrown over a bump at 10 ft/sec hundreds of times.

-I believe that any part that is not industrial or automotive grade, or designed specifically for FIRST will not be robust enough for a FRC robot, especially if it is a single point of failure such as the radio.

-I did download FMS light, and the 2011 version of FMS Delta/Full (Full won't run without the PLC, so my play stopped there), and found that it was clearly written in a .NET language and was not terribly stable. I was able to crash FMS Full quite repeatably simply by not having a PLC for it to talk to.

--At an off-season event in 2010, the circuit powering the field tripped a breaker and shut down the FMS server mid-match. Upon reboot, the FTAs found that it would crash when opening because it had entered a date in the MSSQL table that was invalid, and it would not read the tables at all. As the tables were encrypted, the FTAs literally had to guess the password to manually open the tables in MS SQL and fix the date, and found that the password was a string related to FIRST (I don't remember what it was).
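The first point above (UDP packets every 20 ms, each carrying a CRC, with bad packets simply ignored) can be sketched roughly as follows. The packet layout here (a 2-byte sequence number, a payload, then a 4-byte CRC-32) is an assumption made up for illustration and is not the actual DS/robot protocol.

```python
import struct
import zlib

PERIOD_S = 0.020  # control packets go out every 20 ms, per the post above


def build_packet(seq, payload):
    """Hypothetical layout: 2-byte sequence number, payload, 4-byte CRC-32."""
    body = struct.pack("!H", seq & 0xFFFF) + payload
    return body + struct.pack("!I", zlib.crc32(body))


def parse_packet(packet):
    """Return (seq, payload), or None if the packet is short or fails its CRC."""
    if len(packet) < 6:
        return None
    body = packet[:-4]
    (crc,) = struct.unpack("!I", packet[-4:])
    if zlib.crc32(body) != crc:
        return None  # corrupted: drop it and wait for the next one
    (seq,) = struct.unpack("!H", body[:2])
    return seq, body[2:]


good = build_packet(7, b"joystick-state")
damaged = bytearray(good)
damaged[3] ^= 0xFF  # flip bits in one payload byte
```

Because nothing is retransmitted, a dropped or rejected packet costs one period of stale data rather than a stalled connection.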

techhelpbb 29-04-2012 23:09

Re: The communication tides are shifting...
 
Quote:

Originally Posted by apalrd (Post 1164227)
--If anyone wants more information on FMS, I can post what I know, but it will be a long post.

I'm interested to read more.

Ether 29-04-2012 23:26

Re: The communication tides are shifting...
 
Quote:

Originally Posted by techhelpbb (Post 1164241)
I'm interested to read more.

I second that.



wireties 29-04-2012 23:28

Re: The communication tides are shifting...
 
Quote:

Originally Posted by apalrd (Post 1164227)
... and found that it was clearly written in a .NET language and was not terribly stable.

this must be the problem ;o) (conclusion by a prejudiced VxWorks & Linux guy)

dsherw00d 29-04-2012 23:30

Re: The communication tides are shifting...
 
We had one round where we were dead the entire round. A FRC tech person came over and viewed the DS logs and tested the robot. He found nothing unusual. We had a second round where we were dead after autonomous, then disabled the Ethernet port, re-enabled, and it all came back. It took around a minute to do this. Since we were not doing well, it was not a big issue.

It would be great to see a detailed network diagram with configuration files of the devices. I understand they need to keep the system secure and not sharing details helps. If there are 9 VLANs (why 9 with 6 robots?), it should be easy to guarantee every team 100 Mbps of bandwidth. The cRIO is 100 Mbps plus maybe another 1 Mbps for the camera stream. What do the 6 bridged radios connect to? I looked once and it wasn't a D-Link, but can't remember what it is. This should be a business/enterprise class device. There could also be an issue with the queues on the FMS Ethernet hardware, but if it can't handle max 600 Mbps of traffic non-blocking, it must be outdated or mis-configured. Depending on frame sizes, a GigE port can start dropping traffic at 600 Mbps depending on the quality of the device. Another possibility would be to use secure tunnels instead of VLANs. An IPSec tunnel with a 100 Mbps guarantee between the robot and the wireless router with the IP traffic receiving the highest priority marking. If they are only dealing with the Layer 2 VLANs, they have no way of guaranteeing the Layer 3 IP traffic. All traffic should also be mirrored to another port so Wireshark or the like can run an offline trace in case there is an issue or challenge. They should be able to determine what every dropped frame/packet is and where it came from. If dropping radios is an issue, I bet it can be traced back to the same issue in all cases.
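The "depending on frame sizes" remark is easier to see with numbers. This is a back-of-the-envelope sketch only: it treats the 6 x 100 Mbps worst case above as the on-wire rate and adds the standard 20 bytes of per-frame Ethernet overhead (preamble plus inter-frame gap). The two frame sizes are just the usual minimum and maximum, not measured FMS traffic.

```python
ETHERNET_OVERHEAD_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap


def frames_per_second(rate_bps, frame_bytes):
    """Frames per second needed to fill rate_bps with frames of one size."""
    return rate_bps / ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)


ROBOTS = 6
PER_ROBOT_BPS = 100e6  # the 100 Mbps per-team figure from the post
aggregate_bps = ROBOTS * PER_ROBOT_BPS

for size in (64, 1518):
    rate = frames_per_second(aggregate_bps, size)
    print(f"{size:>5}-byte frames: {rate:,.0f} frames/s")
```

The same bit rate means well over ten times as many frames per second when the frames are small, which is where cheaper switching hardware tends to fall behind.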

I agree with a previous poster that the hundreds if not 1000s of WiFi capable smart phones/iPads/Kindles/laptops are always searching for networks. Maybe this is impacting the radios. At Finger Lakes (RIT), they actually swapped out the robot radios with a business class D-Link due to discovered issues with large amounts of wireless traffic and the way the D-Link discovered/connected to these networks. I believe it was related to how many MAC addresses the cheaper D-Link can store and how long it took it to clear MACs. We do need better radios.
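For anyone unfamiliar with the failure mode being guessed at here, this is a toy model of a fixed-size MAC table with an aging timer. It is purely hypothetical: the capacity and timeout are invented, and nothing here is taken from actual D-Link firmware.

```python
class MacTable:
    """Toy model: a bounded table of MAC addresses with an aging timeout."""

    def __init__(self, capacity, max_age_s):
        self.capacity = capacity
        self.max_age_s = max_age_s
        self._last_seen = {}

    def expire(self, now):
        """Forget entries that have not been seen for longer than max_age_s."""
        stale = [mac for mac, seen in self._last_seen.items()
                 if now - seen > self.max_age_s]
        for mac in stale:
            del self._last_seen[mac]

    def learn(self, mac, now):
        """Record mac; return False if the table is full and it cannot be stored."""
        self.expire(now)
        if mac not in self._last_seen and len(self._last_seen) >= self.capacity:
            return False
        self._last_seen[mac] = now
        return True


# Hypothetical numbers: room for 3 entries, each kept for 300 seconds.
table = MacTable(capacity=3, max_age_s=300)
phones_learned = [table.learn(f"phone-{i}", now=0) for i in range(3)]
field_ap_at_1s = table.learn("field-ap", now=1)
field_ap_at_301s = table.learn("field-ap", now=301)
```

In this model a burst of audience devices fills the table, and the address that actually matters cannot be stored until the old entries age out.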

taichichuan 29-04-2012 23:33

Re: The communication tides are shifting...
 
Hmm... OK, I've read through the majority of the posts in this thread and I'm surprised that we haven't had any RF folks like the hams chime in. So, since I'm a ham operator of almost 40 years experience with antennas, I figure that I might as well open my big mouth and put in my $.02. Essentially, I think that there are several RF-related issues that could/should be taken into account.

The D-Link is a commercial, consumer wireless router designed for static mounting in a relatively benign RF environment. If we could characterize the D-Link's RF transmission pattern, I suspect that we'd find that the D-Link is not a true omni-directional antenna. And, we're running it in a mobile, hostile RF environment. High-power motors, weird mounting angles, mounting near the cRIO with lots of stray RF fields (unshielded PWM cables, the PDB, digital sidecar and more) and even mounting inside of signal-blocking structures like aluminum pillars all have the potential for interfering with RF signals. The only requirement for mounting the radio is that the inspector can see the status LEDs. But, the unit can be mounted in such a way that the RF transmission capabilities are seriously degraded while still having the LEDs visible.

There is a reason that most industrial WiFi units have small antennas protruding from them. It's because the little antennas that stick out have a higher RF gain and are more omni-directional than the small patch antennas found in the D-Link unit. In addition, IEEE 802.11n is capable of multi-in/multi-out (MIMO) operation that allows for antenna diversity so the antenna with the strongest signal can be selected and can support switching antennas on an as-needed basis during the match as the robot moves on the field.

So, I think that if you switch routers with ones that support multiple omni-directional antennas like the DIR-615 or DIR-655 as well as having Quality of Service (QoS) support (the ability to prioritize the DS packets and de-emphasize the video if bandwidth is squeezed), perhaps even to support the remote mounting of the antenna on the center of the robot, that you might be able to address many of the loss of signal issues with the FMS. Additionally, mounting the receiver antennas *over* the center of the field instead of to the sides could also help.
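The "prioritize the DS packets and de-emphasize the video" idea amounts to strict-priority queuing. A minimal sketch follows; the class names, the two traffic classes, and the per-tick budget are assumptions for illustration and are not features of any particular router.

```python
import heapq
import itertools

PRIORITY = {"control": 0, "video": 1}  # lower number is sent first


class PriorityLink:
    """Toy strict-priority queue: control always goes out before video."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # keeps FIFO order within a class

    def enqueue(self, kind, packet):
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._order), packet))

    def send(self, budget):
        """Pop up to budget packets, control first, oldest first within a class."""
        sent = []
        while self._heap and len(sent) < budget:
            sent.append(heapq.heappop(self._heap)[2])
        return sent


link = PriorityLink()
for kind, packet in [("video", "v1"), ("video", "v2"), ("control", "c1"),
                     ("video", "v3"), ("control", "c2")]:
    link.enqueue(kind, packet)
first_burst = link.send(budget=3)  # only room for three packets this tick
```

When the link is squeezed, the control packets still get through on time and the video frames are the ones left waiting.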

As to licensing a section of the radio spectrum, that's not likely to be economically feasible. Companies like Verizon, Google and others spend literally billions of dollars to lease bandwidth in big government auctions run by the FCC. Even with 3000 new teams from the Boys & Girls clubs paying $5K a pop for registration, it's highly unlikely that FIRST could afford to license a chunk of the bandwidth for use -- let alone convince a radio manufacturer to create radios cheap enough for such a small market.

Of course, the lower the frequency, the further the distance it will transmit. There are several open ISM bands that do not require licensing. The 2.4 GHz and 5 GHz WiFi bands are examples. The early serial modems used several years ago were 900 MHz (another ISM band) modems, I think. But, the field isn't so big that it justifies dropping to a lower frequency. And, we have to think about countries other than the US as well. The US 2.4 GHz WiFi band overlaps an Israeli military frequency if memory serves me correctly.
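The frequency/distance trade-off can be put in numbers with the standard free-space path loss formula, $\mathrm{FSPL}_{\mathrm{dB}} = 20\log_{10}(d_{\mathrm{km}}) + 20\log_{10}(f_{\mathrm{MHz}}) + 32.44$. This is a sketch under idealized assumptions (clear line of sight, antennas of equal gain at every frequency), and the 15 m distance is just a rough stand-in for a field-sized link, not a measurement.

```python
import math


def free_space_path_loss_db(distance_m, freq_mhz):
    """Free-space path loss in dB between isotropic antennas."""
    distance_km = distance_m / 1000.0
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44


DISTANCE_M = 15  # assumed robot-to-AP distance, for illustration only
for freq_mhz in (900, 2400, 5000):
    loss = free_space_path_loss_db(DISTANCE_M, freq_mhz)
    print(f"{freq_mhz} MHz over {DISTANCE_M} m: {loss:.1f} dB")
```

Each doubling of frequency costs about 6 dB, so moving from 5 GHz down to 900 MHz would buy roughly 15 dB. At field distances that margin is rarely the limiting factor, which fits the point that the field isn't big enough to justify a lower band.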

So, a dual-band (2.4 GHz/5 GHz) radio with MIMO support and multiple omni-antennas with antenna diversity support and some guidelines for mounting the radio would likely solve most of our FMS connection issues. These types of routers might cost an extra $100 or so more, but the ability to play the game without communications issues would be priceless, IMHO.

My $.02,

Mike

Slix 29-04-2012 23:34

Re: The communication tides are shifting...
 
Quote:

Originally Posted by MAldridge (Post 1164155)
The Dashboard on the other hand, is full of holes. When teams use the smart dashboard, they are asking for trouble, the same goes for all the other dynamic dashboards. These programs are too broad in scope and attempt to fit too wide an application. The only way to write a truly stable dashboard is to write it yourself for your own robot.

Why is that?

From what I understand, the current smart dashboard is just sending keys and values over the network to the laptop.

MAldridge 29-04-2012 23:46

Re: The communication tides are shifting...
 
I consider it to be full of holes because it was not written for a single application. If you think about it for a while, you always come back to the idea that however talented some people are, the software guys on FRC teams are high school students, and the odds that their code will not screw something up on the dashboard, and that they will configure it properly, are pretty low. We played around with several of the 'commercially available' solutions, such as smart dashboard and zomb, but couldn't get either one to run reliably the whole time. We coded our own and that worked the best.

As for the wireless with the omni antenna, that would be great if it could be acquired cheaply. You are correct about frequency overlap, but I think you have the country wrong (my ARRL handbook is not handy, or I would check). I think though that the problem that you would encounter with an antenna is that then you would have to explain to teams proper cable routing and making sure the feedline was built right. Good luck with that one.

A good radio is a part of the solution, the question is what makes a good radio.

techhelpbb 29-04-2012 23:59

Re: The communication tides are shifting...
 
Quote:

Originally Posted by MAldridge (Post 1164264)
As for the wireless with the omni antenna, that would be great if it could be acquired cheaply. You are correct about frequency overlap, but I think you have the country wrong (my ARRL handbook is not handy, or I would check). I think though that the problem that you would encounter with an antenna is that then you would have to explain to teams proper cable routing and making sure the feedline was built right. Good luck with that one.

A good radio is a part of the solution, the question is what makes a good radio.

Just to point out, obviously several companies sell the prerequisite reverse polarity SMA connectorized, precabled, and ready to attach antennas typically found on WiFi hardware. I have quite the selection of them I use for network evaluation and wardriving.

Course the problem is that if you vertical mount an omni and the field AP is directly overhead, you're probably in the weakest lobe of the field strength.

Perhaps a biquad mounted flat on the top of the robot on a copper clad PCB, parallel to the surface of the field with the APs directly overhead might be an option.
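The "weakest lobe" point can be checked against the textbook pattern of an ideal half-wave dipole, $F(\theta) = \cos\!\left(\tfrac{\pi}{2}\cos\theta\right)/\sin\theta$, where $\theta$ is measured from the antenna's axis. Real WiFi "rubber duck" antennas only approximate this, so treat the sketch as an idealized assumption rather than a measurement of any particular antenna.

```python
import math


def half_wave_dipole_pattern(theta_deg):
    """Normalized field strength at theta degrees from the dipole's axis."""
    theta = math.radians(theta_deg)
    s = math.sin(theta)
    if abs(s) < 1e-9:
        return 0.0  # the null along the axis, i.e. straight up for a vertical antenna
    return abs(math.cos(math.pi / 2 * math.cos(theta)) / s)


# For a vertically mounted antenna, 0 degrees is straight overhead
# and 90 degrees is out toward the horizon.
for angle in (0, 30, 60, 90):
    print(f"{angle:>2} deg from vertical: {half_wave_dipole_pattern(angle):.2f}")
```

The pattern is strongest toward the horizon and falls to zero straight up, which is the geometry an overhead AP would put the robot in.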

Deetman 30-04-2012 00:37

Re: The communication tides are shifting...
 
I don't want to point fingers at the D-Link directly with this post, merely point out a few common thoughts that have been thrown around.

1) Perhaps the biggest problem anyone has in pinning any root cause down is that no two robots are alike. For that matter, not every team is running the same D-Link AP. There are multiple hardware revisions floating around out there and multiple firmware versions as well. Thus is the nature of not having full control of a commercial product. Even the same "version" of a D-Link AP may vary with quality control issues (bad solder joints, loose connections, DOA components) and parts are often substituted in due to obsolescence, cost, and availability issues. One of the first things you learn as an engineer when troubleshooting is to control the variables and only change one thing at a time. FIRST cannot do this with the D-Link as teams can get them from anywhere as long as they are the DAP-1522. Some vendors have old stock with old revisions, some with new stock, etc. Revisions to hardware/software are not made "just because". Usually there is a distinct defect that is being addressed in a revision. Theoretically a DAP-1522 is a DAP-1522 but even a small change could have a disastrous effect in obscure use cases.

2) After watching the webcast yesterday I went off in search of some more obscure info on the D-Link DAP-1522. I had considered this before as the issues throughout the season piqued my interest. The most interesting things I found were some nice pictures of the D-Link's internals. There were three things I noticed:
  • Save bad or weak solder joints, nothing about the design or mounting screamed out to me as being horrible with respect to the harsh mechanical environment we subject the D-Link to.
  • There are multiple revisions of the hardware out in the wild. See the pictures below. One version does not have the RF connectors glued, another does.
  • The u.fl connectors to the antenna seem to be the most likely point of mechanical shock/vibration failures. It appears that this concern would be addressed with the glued hardware. I've never used a u.fl connector, but it does seem like a pretty unreliable physical connection given what these APs may be subject to on the robot. These things are only 2mm tall!




pavie 30-04-2012 00:38

Re: The communication tides are shifting...
 
Also tried VERY hard not to jump into this, but some worrisome ideas are being tossed around...

Our team has observed and experienced multiple FMS failures throughout this season, very similar to what occurred in St. Louis (not just during the finals). It's highly likely that some component(s) of FMS other than robot hardware or software are causing it to fail.

The general technical requirements of FMS (explained in multiple posts before, so I won't repeat them) are WELL WITHIN what WiFi + basic TCP/IP networking can handle, with all necessary access controls and bandwidth allocation limits. There is certainly no need to even consider switching to alternative radio technologies (we'll all be wishing to be back on WiFi in no time...:-).

There are two key issues:

1. As far as networking setups go FMS is fairly simple (not trivial, but for a truly competent network architect it is pretty pedestrian). The amount of traffic is also relatively trivial (compared to a commercial environment, not a home router with an XBOX or two). All FMS traffic should be 100% logged for post-mortem analysis. Again, it is well within technical capabilities to do so. The real challenge lies in the availability of people/tools capable of analyzing that data on the spot and diagnosing the problem.

2. Consumer-grade wireless networking devices have NOTORIOUSLY bad firmware. They are built for "easy setup" and "check box features" for marketing purposes, but as anybody who's ever tried any of the more sophisticated features in consumer routers (e.g. web filtering or QoS, not to mention throwing larger numbers of devices at them) knows, these features are often buggy or simply don't work at all. The usage pattern of FMS pushes the D-Link devices into areas they are not really built for and do not experience in normal consumer usage (and are therefore very unlikely to be tested/debugged/fixed - not because they're "bad" products, but because this is not their normal operating regime and their target consumers simply don't care).
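As a small example of the kind of post-mortem analysis point 1 calls for: given the arrival times of one team's control packets from a full traffic log, flag every stretch where packets stopped arriving. The 20 ms period is the figure quoted earlier in this thread; the log format, the 1.5x tolerance, and the sample timestamps are assumptions for illustration.

```python
def find_gaps(timestamps_s, period_s=0.020, tolerance=1.5):
    """Return (start_time, duration) for each interval between consecutive
    packets that is longer than tolerance * period_s."""
    gaps = []
    for earlier, later in zip(timestamps_s, timestamps_s[1:]):
        delta = later - earlier
        if delta > tolerance * period_s:
            gaps.append((earlier, delta))
    return gaps


# Hypothetical arrival times in seconds for one robot's control packets.
arrivals = [0.00, 0.02, 0.04, 0.10, 0.12]
gaps = find_gaps(arrivals)
```

Running the same check per team across one match would show whether dropouts line up in time across robots (pointing at the field side) or hit one robot alone (pointing at that robot or its radio).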

It appears that FIRST will have to allocate more resources to both #1 (understanding what's going on through careful trace analysis) and #2 (creating an environment for effectively simulating worst-case game conditions and figuring out what brand of equipment satisfies those requirements or how to configure it to avoid pitfalls). The solution is unlikely to be "build custom hardware or firmware", because statistically it's much more likely to be even buggier than any widely-used commercial solution, not to mention economic (non)viability of that approach.

I suspect that there may be vendors willing to help out with both #1 and #2, but the biggest challenge may not be technical but human (relinquishing a bit of control and allowing another party to step in).
Broken code is usually easier to fix than bruised egos :-).

dsherw00d 30-04-2012 00:56

Re: The communication tides are shifting...
 
I like the ham's post:) - using remote antennas mounted away from the other electrical components is probably a good idea. FIRST could provide the same model with the same gain to teams. I use a Hawking antenna at home. Normally, this is a more expensive radio with the appropriate BNC connectors, but may help. At the RIT event, they used the FIRST approved backup radios with the dual external antennas and I don't remember hearing about any problems. I know we had issues in Cleveland, but not to the extent of St Louis - fewer people/devices in CLE.


All times are GMT -5.
