23-04-2012, 18:23
mjcoss
Registered User
FRC #0303

Join Date: Jan 2009
Location: Bridgewater, NJ
Posts: 70
Re: Intermittent connection on field only

Well, let me add some more hay. This thread seems like a good place to report some information.

I've been concerned about the field network for the past few years because it *seemed* that *which* robots were on the field was a contributing factor to the overall responsiveness, jitter, and lag of the network. For example, the number of robots on the field that had onboard IP cameras seemed to matter. So this year, after our first district event, I started asking questions, and then at the MAR championship I brought a hardware packet sniffer to take a look at the field network. I feel I should share what I discovered.

I originally believed that each robot would be assigned a separate channel and that they were using 802.11n @ 5 GHz. The network at the MAR championship was indeed 802.11n @ 5 GHz, and it was using a wide channel; however, all robots were sharing the same channel, and as such were sharing a total theoretical bandwidth of 300 Mbit/s.

At the Mt. Olive competition, I was told that the robots were running on channel 6. If true, that would mean they were running in the 2.4 GHz range alongside over a dozen other 802.11g networks, which would have considerably reduced the theoretical bandwidth.

In general, due to a number of different factors, if you get half the theoretical bandwidth on a wireless network, you're doing well. So let's assume that 150 Mbit/s is our expected available bandwidth on the field if you're using a wide 802.11n channel @ 5 GHz; much less if you're using 802.11n @ 2.4 GHz, where the interference will be awful.

I've asked, but haven't heard, whether they have established any QoS or bandwidth limits per SSID on the Cisco access point they are using. Without any controls, it's a free-for-all for the available bandwidth.

This bandwidth is used by the robots in a number of different ways; here I'm just talking about communication between the robot and the driver station laptop, as I'm not sure what the FMS is using:
VIDEO streaming
If you have an onboard camera and send a video stream from robot to driver station at 640x480, 24-bit color, 30 frames per second, that's ~220 Mbit/s of raw, uncompressed bits. MotionJPEG will compress that down, on average, to around 10 - 15 Mbit/s; H.264 will do better. I've heard that some teams have two cameras onboard. The Axis camera supports a number of options to reduce the bit rate, but there is no rule about how you configure the cameras, and they would work fine in your local build environment, and even on the practice fields at competitions, where you're the only user of that wireless channel.
Dashboard data
There is the "normal" dashboard that is part of the default code, and if I remember correctly, the default dashboard sends data about 10 times a second. For reasons I can't recall at this point, we are actually sending the data 40 times a second from our robot. This is a relatively small amount of data, but it doesn't have to be: with the addition of SmartDashboard and other custom dashboards, and no guiding principle on the volume of data, this could be a significant amount of data or just a dribble. In our case, we're sending ~1 Kbit per update, or 40 Kbit/s.
Driver station data
This is the data packaged by the driver station application provided by FIRST; it sends the values of the input devices attached to your driver station to the robot. I've never looked into how much data is sent, or how often, but it's not a lot of data, probably on the order of 40 - 50 Kbit/s.
Other network traffic
There are several network ports that are open for teams to use for communication between the robot and the driver station. In our case, we ran a UDP server on our robot to collect the results of vision processing performed on our driver station. We sent the results of the calculations back to the robot 10 times a second. The data is small (72 bits of payload), so we're sending only 720 bit/s.
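To make that last item concrete, here is a minimal sketch of that kind of UDP reporting loop in Python (our actual robot code is C++, and the payload layout here, two 32-bit floats plus a status byte to make up the 72 bits, is purely hypothetical):

```python
import socket
import struct

# Hypothetical layout for a 72-bit vision result: two 32-bit floats
# (say, target azimuth and distance) plus one status byte = 9 bytes.
# "!" = network byte order, no padding.
VISION_FMT = "!ffB"
PAYLOAD_BITS = struct.calcsize(VISION_FMT) * 8   # 72 bits
UPDATES_PER_SEC = 10

def encode_result(azimuth, distance, ok):
    """Pack one vision result into the 9-byte wire format."""
    return struct.pack(VISION_FMT, azimuth, distance, 1 if ok else 0)

def send_result(sock, robot_addr, azimuth, distance, ok=True):
    """Fire one UDP datagram at the robot; called ~10 times a second."""
    sock.sendto(encode_result(azimuth, distance, ok), robot_addr)

# Application-layer rate: 72 bits x 10 Hz = 720 bit/s (UDP/IP headers
# add more on the wire, but the payload itself is tiny).
app_bits_per_sec = PAYLOAD_BITS * UPDATES_PER_SEC
```

The robot-side server would just `recvfrom` on the agreed port and unpack with the same format string.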

So for us, our network utilization was small: ~1 Mbit/s for the camera (we're using grayscale, 320x240, 30 frames per second, with MotionJPEG compression) and at most another 1 Mbit/s for the remaining traffic. But that is due to choices we made. I could easily imagine making other choices, and given that I was operating under the belief that we had a full channel to ourselves, I might have gone down a totally different path.
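Putting the rough numbers from this post side by side (these are the estimates quoted above, not measurements):

```python
# Back-of-the-envelope per-robot budget, using only figures quoted in
# this post. All rates in bit/s.
MBIT = 1_000_000
KBIT = 1_000

raw_video = 640 * 480 * 24 * 30          # ~221 Mbit/s uncompressed

careless_robot = {
    "camera, 640x480 color MJPEG": 15 * MBIT,  # upper end of 10-15 Mbit/s
    "dashboard":                    40 * KBIT,
    "driver station":               50 * KBIT,
    "custom UDP":                   720,
}

field_budget = 150 * MBIT                # half of the 300 Mbit/s theoretical
per_robot_share = field_budget // 6      # six robots per match

careless_total = sum(careless_robot.values())   # ~15.1 Mbit/s
six_careless = 6 * careless_total               # ~90.5 Mbit/s of the 150
```

Six robots each streaming one heavily used color camera already consume well over half of the optimistic 150 Mbit/s budget, and a second camera per robot would push the total past saturation.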

The thing is, while there is the catastrophic failure mode where the field network crashes outright, there are many other situations where the latency and jitter can spike badly. VxWorks' IP stack is not particularly robust, and for some teams that stack has to handle all of the time-sensitive CAN bus traffic as well as the driver station, dashboard, and custom traffic.

Further, unless you change the default Iterative robot code (at least in C++), your periodic functions are synchronized with the arrival rate of the packets from the driver station. Now, on a well-behaved network the arrival rate should be pretty stable. But if your code assumes stable packet arrivals, you can run into all sorts of timing issues.
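To illustrate the difference (in Python rather than the actual WPILib C++ API, with hypothetical function names), compare a control loop clocked by packet arrival with one clocked by a wall-clock timer:

```python
import time

def packet_driven_loop(get_packet, control_step, iterations):
    """One control step per driver-station packet: the control period
    inherits whatever jitter the network has."""
    for _ in range(iterations):
        pkt = get_packet()          # blocks until the next DS packet
        control_step(pkt)

def timer_driven_loop(latest_packet, control_step, period_s, iterations):
    """Run the control step on a fixed wall-clock period and just read
    the most recently received packet, so network jitter does not
    change the control timing (stale data is handled, not waited for)."""
    next_deadline = time.monotonic()
    for _ in range(iterations):
        control_step(latest_packet())   # use freshest data, don't block
        next_deadline += period_s
        time.sleep(max(0.0, next_deadline - time.monotonic()))
```

The timer-driven version still has to decide what to do when the "latest" packet is stale (e.g. stop the drivetrain), but its timing no longer tracks the network's behavior.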

In addition, both the camera traffic and the driver station packets use TCP, which can be very unfair when it comes to sharing bandwidth: a greedy application can ramp up its utilization, causing starvation of the others. And then there are retransmissions, etc.

Is it possible to saturate the network? You betcha. Is it service impacting? Yes to everyone, including you. Is there anything that can be done? Yes.

When I examined the network at the MAR championship, I saw a number of teams having problems associating with the field: there were repeated attempts by robots' DLINKs to associate with the field access point, and I also saw many corrupt frames.

Our DLINK at Mt. Olive simply gave up completely, rebooting during several of our matches. It had been fine, and then it just started rebooting whenever we hit another robot or a field element. Of course, to our drive team it looked like we lost communication with the field (which we did), but it was the DLINK that was rebooting. And no, there weren't any loose wires, except maybe inside the DLINK housing. I heard from one team that their DLINK only worked when standing on edge; lay it flat and it didn't work. I think they are cheaply made and really not meant for the hostile environment of a FIRST robotics competition. A ruggedized access point/bridge would be a beautiful thing.

I don't know why FIRST chose not to have six separate access points to provide a channel for each robot. Maybe they just figured that 150 Mbit/s / 6 = 25 Mbit/s, and who'd need more than that? I don't know whether they are configuring QoS to ensure a fair share of the network. I will be looking at the network at our next off-season competition and try to come to some conclusion about what exactly is going on.