Quote:
Originally Posted by nuttle
My guess is that the issue is triggered by more bandwidth being needed than there is available. So, robots that upload lots of telemetry, all have video feeds, etc. combined with anything that causes less available bandwidth is the magic combination. Fixing whatever is limiting the bandwidth makes the problem go away, but really doesn't solve the deeper issue. When bandwidth is tight, packets drop, TCP requests retransmissions, and this only results in more bandwidth being needed. Traffic on some critical connections gets held up for long enough that robots die and do not recover. If this is right, the old radios in 2010 turned up the same underlying issue but the problem went away when everyone was told to use the new radios.
Depending on how things are implemented, a single robot could have more than one video stream coming from a single camera -- one for the dashboard and another one for off-board target tracking. One way to simulate this would be to open extra web sessions with the camera. The 'netem' tool I mentioned before can simulate a network with various issues in the network and is a more controlled way to try to cause this type of problem.
This wouldn't be an issue when using a tether, and a single robot running over wireless would have six times the bandwidth as when a match is being played and so would not likely see this either. The exact traffic when running under the FMS might be different as well. Using tcpdump / wireshark would allow digging into the problem.
Just thought I'd throw this out since this angle hasn't come up in this thread and it is one theory that should be considered, I think. The solution would involve taking steps to make sure the critical data always gets through by limiting the bandwidth that is allowed for less critical data when bandwidth is tight. Giving each team an equal amount of bandwidth is only fair, and things like the video feed will do OK with limited bandwidth, usually by dropping frames. It would be good to have a way for teams to test with limited bandwidth as well. It would be possible to have a gauge on the driver station showing bandwidth used or something along these lines to help teams catch issues they can control that cause them to use more bandwidth than they need and generally be more aware of this.
|
This concept DOES hold some water, especially in the context of Einstein.
It's reasonable to believe that the robots that reached Einstein would be making more effective use of the bandwidth available (sending streaming video back to the DS, etc).
It's then also reasonable to believe that if such bandwidth requirements are already occupying close to the full capacity of the link, and you start adding more of these bandwidth-hungry robots to the network, problems occur because dropped packets trigger retransmits that flood the network.
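To make the retransmit argument concrete, here is a rough toy model in Python. It is only a sketch under my own assumptions: every byte that doesn't fit in a tick is simply queued for retransmission in the next tick, and TCP's congestion backoff is ignored entirely, so real behavior will differ. The function name and the numbers are made up for illustration.

```python
def simulate_backlog(offered_per_tick, capacity_per_tick, ticks):
    """Toy model: return the retransmit backlog after each tick.

    Assumption (not measured on a real field): anything that does not fit
    on the link in one tick is retried in the next tick, on top of the
    new data the robots keep producing.
    """
    backlog = 0.0
    history = []
    for _ in range(ticks):
        # New data plus everything still waiting to be retransmitted.
        demand = offered_per_tick + backlog
        delivered = min(demand, capacity_per_tick)
        backlog = demand - delivered
        history.append(backlog)
    return history


# Offered load above capacity: the backlog grows every tick.
print(simulate_backlog(12, 10, 3))
# Offered load below capacity: the backlog stays at zero.
print(simulate_backlog(8, 10, 3))
```

The point is just that once offered load exceeds capacity, a "retry everything" policy never catches up, which is consistent with critical traffic being held up long enough for a robot to die.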
Maybe things could be improved by forcing camera information over a connectionless protocol like UDP, where no retransmission occurs if there are errors (after all, retransmitting an old camera image doesn't help when there's a new image to transmit instead).
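A minimal sketch of what "never retransmit, just show the newest image" could look like over UDP, in Python. This is not how the camera or the dashboard actually works; the 4-byte sequence header, the function names and the receiver class are all hypothetical. It also assumes each frame fits in a single datagram, which a real JPEG frame may not, so a real version would need to split frames up.

```python
import socket
import struct

# Hypothetical wire format: 4-byte big-endian frame sequence number, then image bytes.
HEADER = struct.Struct("!I")


def pack_frame(seq, payload):
    """Prefix the image bytes with their sequence number."""
    return HEADER.pack(seq) + payload


def unpack_frame(datagram):
    """Split a datagram back into (sequence number, image bytes)."""
    (seq,) = HEADER.unpack_from(datagram)
    return seq, datagram[HEADER.size:]


def send_frame(sock, addr, seq, payload):
    """Fire and forget: UDP never retransmits a lost datagram."""
    sock.sendto(pack_frame(seq, payload), addr)


class LatestFrameReceiver:
    """Keeps only the newest frame; late or duplicate frames are dropped."""

    def __init__(self):
        self.last_seq = -1
        self.frame = None

    def offer(self, datagram):
        seq, payload = unpack_frame(datagram)
        if seq <= self.last_seq:
            return False  # stale image, a newer one is already on screen
        self.last_seq = seq
        self.frame = payload
        return True


# Usage sketch (address and port are placeholders):
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   send_frame(sock, ("127.0.0.1", 5005), 1, b"...jpeg bytes...")
```

A lost frame just means the dashboard shows the previous image a little longer, instead of the link spending bandwidth on resending a picture nobody wants anymore.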
I agree that it would be really nice if HQ could get 6 of the 12, or all 12 Einstein robots, with the Einstein field, and test things out at HQ.
Even then, I'm not sure things would fail without the crowd of thousands of WiFi-enabled devices sharing the same pseudo-Faraday cage.
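Coming back to nuttle's suggestion of capping less critical traffic when bandwidth is tight: one common way to do that is a token bucket in front of the video sender. The sketch below is Python and purely illustrative; the class, its parameters and the rates are my own assumptions, not anything the FMS or the driver station is known to implement.

```python
import time


class TokenBucket:
    """Simple token-bucket limiter for non-critical traffic (illustrative only)."""

    def __init__(self, rate_bytes_per_s, burst_bytes, clock=time.monotonic):
        self.rate = rate_bytes_per_s   # long-term allowed rate
        self.burst = burst_bytes       # largest burst allowed at once
        self.clock = clock             # injectable so the limiter can be tested
        self.tokens = burst_bytes      # start with a full bucket
        self.stamp = clock()

    def allow(self, nbytes):
        """Return True if nbytes may be sent now; otherwise the caller drops the frame."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.stamp) * self.rate)
        self.stamp = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

The video path would call `allow(len(frame))` before sending and simply skip the frame when it returns `False`, while control packets bypass the limiter. That matches the idea that video copes with limited bandwidth by dropping frames.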