Dealing with Field having abnormally limited bandwidth

Hello All!

I encountered a situation at the FiM Howell event I’m not quite sure how to handle in the future.

Note: The following information was told to me by a CSA, with supporting evidence in the form of graphs from FMS data. However, I am not an FTA, nor do I have all the information available to an FTA, so take my explanations with a grain of salt.

The problem was that, due to WiFi congestion and interference, the field itself was limited to ~20-25 Mb/s of throughput.
This caused a serious issue when multiple teams were nearing the 7 Mb/s per-team maximum the field supported.
The issue was that bandwidth limiting only kicks in as a team gets near that 7 Mb/s cap, not when the field as a whole is running out of available bandwidth.

So when the field ran out of available bandwidth, the entire field started having massive packet loss (read: >30%) and latencies above 500ms.
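
To put rough numbers on this, here is a minimal sketch. It assumes the field's usable throughput is one shared pool and that each robot sends at a constant rate, which are both simplifications since real WiFi throughput moves around with interference. The function name and the model are illustrative assumptions, not anything taken from FMS.

```python
import math

PER_ROBOT_CAP_MBPS = 7.0  # per-team cap quoted in this post


def robots_to_saturate(field_mbps: float, per_robot_mbps: float = PER_ROBOT_CAP_MBPS) -> int:
    """Smallest number of robots at per_robot_mbps whose combined demand exceeds field_mbps."""
    if field_mbps < 0 or per_robot_mbps <= 0:
        raise ValueError("field_mbps must be >= 0 and per_robot_mbps must be > 0")
    # floor(...) robots still fit inside the pool; one more pushes demand past it.
    return math.floor(field_mbps / per_robot_mbps) + 1


# Low and high end of the ~20-25 Mb/s range reported for this event.
for field in (20.0, 25.0):
    n = robots_to_saturate(field)
    print(f"{field:.0f} Mb/s field: {n} robots at the cap ask for {n * PER_ROBOT_CAP_MBPS:.0f} Mb/s")
```

At the low end of the reported range, three robots running at the cap already ask for 21 Mb/s, which is more than the field could deliver.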

Because 3 robots alone could use enough bandwidth to cause this condition, I asked the CSA (this was just before elims) what my alliance could do if we encountered it because the opposing alliance had multiple robots nearing the 7 Mb/s limit.

*(Fortunately, this never happened during our matches in eliminations at Howell.)*
The answer was “probably nothing”.

I am completely fine with that answer for an event. I understand the event staff only have so many options that they can go through and work out while continuing to run the show.

However, an opposing alliance being able to cause our alliance communications issues, with no ability for mitigation, is completely unacceptable going forward.

To be absolutely clear: I have no problem dealing with reduced bandwidth. I understand sometimes wireless communications are limited due to circumstances outside the event coordinators’ control.

How would you guys deal with a situation like this?

I know we had our alliance turn off any camera streams to mitigate possible issues if the opposing alliance had 2 robots with camera streams, and we also asked the opposing alliances to limit themselves to 2 camera streams and explained the issue to them.

IMO, the “graciously professional” thing to do would be for both alliances to cut back on bandwidth use so it’s fair for both.

The best option in this situation is to talk with the FTA, not a CSA. FTAs can help determine the root cause (venue APs that should be off, teams using hotspots, etc.) and find a solution. Additionally, the FTA can contact FRC Support, who can look at the WiFi conditions and make appropriate changes. Trip times that high would definitely make a good case for a replay, in my opinion.

Based on your description, this seems to be the same problem we saw during early matches at the Hawaii regional, where every robot experienced latency issues for all of teleop.
While we all complained to the FTA, they said nothing could be done and there would be no replays. Lucky for us, we won, but it was quite frustrating for everyone and we all wanted a replay.

As others have said, the best option is to bring this up with the FTA.

Generally the FTA can work with the venue staff and/or planning committee to get the venue to shut down their own WiFi networks. Doing so typically fixes these types of issues.

If there are rogue WiFi access points, the FTA can work with headquarters to move the field into a different part of the WiFi spectrum or find another solution to dealing with the rogue APs.

If all else fails, ask them to consider working with teams to minimize individual bandwidth utilization. Teams frequently have cameras configured for a higher resolution or framerate than they really need; they likely won’t notice significant quality loss if they lower it. Normally it wouldn’t matter, but in restrictive situations like this it would be GP to lower the quality to ensure a high level of gameplay for all teams.
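
As a rough illustration of how much resolution and framerate matter, here is a back-of-the-envelope sketch. It assumes bitrate scales with pixels per frame times frames per second times some average number of compressed bits per pixel. The `bits_per_pixel` value below is a made-up placeholder, not a measured figure for any camera or codec, so treat the absolute numbers as hypothetical and only the ratio as meaningful.

```python
def stream_mbps(width: int, height: int, fps: float, bits_per_pixel: float) -> float:
    """Approximate stream bitrate in Mbit/s: pixels/frame * frames/s * compressed bits/pixel."""
    return width * height * fps * bits_per_pixel / 1_000_000


# ASSUMPTION: 1.0 compressed bit per pixel is a placeholder, not a measurement.
ASSUMED_BPP = 1.0

high = stream_mbps(640, 480, 30, ASSUMED_BPP)
low = stream_mbps(320, 240, 15, ASSUMED_BPP)

print(f"640x480 @ 30 fps: {high:.2f} Mbit/s")
print(f"320x240 @ 15 fps: {low:.2f} Mbit/s")
print(f"reduction factor: {high / low:.0f}x")
```

Under this simple model, halving each dimension and halving the framerate cuts the stream to one eighth of its original size, whatever the real bits-per-pixel figure turns out to be. The driver station logs are the place to check what a stream actually uses.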

During the playoffs driver meeting at CIR, we were asked to keep our bandwidth limited to about half the specified amount because the field had WiFi issues. All alliance captains were in agreement that this wouldn’t be a problem. During quals this was an intermittent issue, and only a couple matches were replayed IIRC.

I agree that this is the best course of action. Note that R65-B specifies a maximum of 7 Mbits/s, and the blue box specifically calls out that the venue may not accommodate that.

Therefore, as always, design your systems with a reasonable amount of headroom.

We had a qualification match where both alliances were experiencing latency as high as 2000 ms (as reported by our drivers). Afterwards the FTA, Sean Messenger, came around to all of the teams and asked us to reduce our bandwidth consumption by reducing camera streams and camera resolution. After we made the reductions, we didn’t have any problems afterward.

Along with the possibility of some venues not accommodating the full 7 Mbits/s, the FMS whitepaper clearly points out that once you hit 6 Mbits/s, the trip time increases exponentially as you clog up the VLAN that is set up for your team. Teams should probably try limiting themselves to 5 Mbits/s due to “~900Kbits/sec being consumed by Robot-DS packets alone.”
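
One way to read those numbers, sketched as arithmetic: take the 6 Mbits/s point where trip time starts climbing, subtract the ~0.9 Mbits/s of Robot-DS traffic, and what is left is the budget for cameras and everything else. The function names and the 2 Mbits/s example stream rate are hypothetical, and this is only one interpretation of the figures in this post, not something the whitepaper spells out.

```python
KNEE_MBPS = 6.0         # where trip time starts climbing, per this post
DS_OVERHEAD_MBPS = 0.9  # "~900Kbits/sec" of Robot-DS packets, per the quote


def payload_budget_mbps(knee_mbps: float = KNEE_MBPS,
                        overhead_mbps: float = DS_OVERHEAD_MBPS) -> float:
    """Bandwidth left for cameras and other team traffic while staying below the knee."""
    return max(0.0, knee_mbps - overhead_mbps)


def max_streams(per_stream_mbps: float, budget_mbps: float) -> int:
    """How many equal-rate streams fit inside the budget."""
    if per_stream_mbps <= 0:
        raise ValueError("per_stream_mbps must be > 0")
    return int(budget_mbps // per_stream_mbps)


budget = payload_budget_mbps()
print(f"payload budget: {budget:.1f} Mbit/s")
# ASSUMPTION: 2 Mbit/s per stream is an example rate, not a recommendation from the whitepaper.
print(f"2 Mbit/s streams that fit: {max_streams(2.0, budget)}")
```

That works out to roughly 5 Mbits/s of payload, which lines up with the suggested limit.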

In a situation like this, I’d expect the CSA to work with teams to reduce bandwidth while not actually impacting performance. Whether for your drive team or for offboard image processing, we’ve found you can generally do just fine with pretty low resolution images and no more than 2 Mbps per stream.

Just to clear something up, the CSA we were talking to was on the radio to the FTA the entire time we were having this conversation, so the FTA was aware of this problem.

The CSA also said that this was a problem all day both Friday and Saturday.

What type of latency issues are we specifically talking about? Is it with just vision processing or overall robot performance?

We can confirm the occasional latency issue at the Hawaii Regional this past weekend, but that wasn’t the only time we experienced the same problem where all robots had control delays and trips. Unfortunately, we were on the unlucky side of 359’s match.

In our 4th qualification match at the Ventura Regional (Match 27, if it concerns you) all robots experienced major packet loss and brown-outs in the latter half of the match. This past weekend at Hawaii, we had the same issue in our first match and again in our third match (though less severe the second time). Both times we brought it up, the refs/FTAs said there would be no replays, that nothing could be done, and that it could only be traced to individual issues on each robot (which seemed kind of ridiculous). At Ventura, the FTA/CSA tried to find out more about the problem and came to our pit to look at our driver station logs, which we are grateful for. Still, they said the field was not to blame.

One trend we saw in those matches was that multiple teams had some vision processing, on or off board. I heard that one team’s co-processor failed to start up in that match.

Shouldn’t the FTAs be informed about these issues? I ask because they didn’t give this reason as the cause of the problem (instead saying it was all unknown individual issues) and we were not informed that lowering our bandwidth usage might improve the field comms. This thread is the first time I have heard the problem being traced to bandwidth (although we felt like this was the cause).

I will put in a plug for the driver station log viewer, which gives you all of the data you need to identify bandwidth constraints during a match. When I work as a CSA, I spend a fair amount of time training teams on this underappreciated resource.

Related: do you (or does anyone) know why the DS log viewer graph always shows bandwidth Mbps as a flat zero? It’s done this for two years, and the CSA I asked about it had no idea.

We lost our whole day of practice matches due to this issue: 5 matches with horrible (you have no idea) lag. I know 3996 had the same problem, so we went to talk with the FTAs, who called HQ. They didn’t find the problem at the time, so we just crossed our fingers for Friday. The problem occurred once more, but only in the first match. I hope FIRST finds a way to prevent this in the future!

I just want to make sure you have the right terminology: a brown out occurs when the robot power is insufficient to keep electronic components on and functioning. This would not be even remotely field-related and would be 100% on the team.

There are certainly still cases where field electronics can be at fault, but in general I’m inclined to side with the FTA. Teams generally don’t have access to the match-wide data that the FTA is seeing, so most of the time all the feedback they get is “hey, I see three dead robots on the field.” Oftentimes the FTA sees exactly what is happening, such as a popped main breaker, a disconnected battery, robot code that has been crashing for the last three days, etc. One of the best pieces of feedback an FTA can get that teams usually can’t is the LEDs on the radio and roboRIO; those can indicate which components lost power (if any) when a robot disconnects from the field.

Unfortunately the FTAs often don’t have the time to sit down with you and go over what happened to every dead robot in your match, so my recommendation would be to trust them as well unless you have evidence (typically in the form of driver station logs) to the contrary.

Again, this sounds like a team issue. The FMS isn’t responsible for sending out “co-processor boot” messages. In these cases I recommend talking to that team about what went wrong; they likely know what happened within a few minutes, or have a CSA working with them to identify the issue.

Yes! And they typically are. If you experience issues, let the FTA know what the symptoms are and anything unique to your setup (e.g. a vision co-processor). If it is just you, they can get a CSA to assist you. If they see a trend, they can talk to HQ to track down any potential issues.

The only issue you mentioned in your post that could be related to bandwidth limitations is the high packet loss. The rest of the issues don’t sound particularly bandwidth related, but with limited data it is very hard to say anything more than that.

Also keep in mind that lowering your bandwidth isn’t a silver bullet for improving field communications; it should only be necessary in certain situations, and for most events it will not be necessary. If you are already pretty low (around or below 3 Mbps), you really don’t need to change anything.

Sorry for all of the side points; we were just trying to relate our experiences to bandwidth and other interference. The FTA/CSA at the Hawaii Regional could have mentioned this issue when we brought it up, because we found the packet loss and trip times abnormal for all teams on the field (and the brown outs too, which may not be the field’s fault but were oddly widespread nonetheless). The replay call is totally their decision, which we respect, but maybe the individual teams in those matches weren’t entirely to blame and there was some other cause, such as rogue APs (which there were). Or they could have helped us try to reduce bandwidth usage if that was the issue.

We were also part of the first match where we experienced lag issues. Our partner experienced it and our other partner was dead throughout the match.
The third partner didn’t move because they had issues with their driver station unrelated to the field.
However, something was definitely going on, because having 4 robots all experience the same issues (not including the 2 dead robots) isn’t a coincidence.
What I would like to know is whether 1 robot’s issues could affect the rest of the teams on the field.

If the field was having trouble meeting the bandwidth requirements, it is possible that it could cause both robots to lag and have high trip times/packet losses. If one robot had a very high bandwidth utilization (BWU) and the FMS didn’t limit it, that could cause problems for other robots on the field by limiting their usable bandwidth. More likely, the field AP was misbehaving and not providing enough bandwidth regardless of how much each robot was using. The FTA has access to the BWU, trip times, and packet losses of all of the robots, so I’m sure that was taken into account when deciding whether to replay or not.

The brownouts and dead robots are almost definitely not related. I can’t speak to Hawaii, but at all of the events I was at, brownouts and rio/radio reboots were commonplace all the way through DCMP finals.

I’m not sure which field access point was running (either the new WRT1900ACS or the older Cisco Aironet 1252), but the Aironet AP has been more than capable of running the 6 VLANs during my testing, with bandwidth limits similar to those imposed on the robots.

Has this been replicated with a robot AP right next to the field AP? That would tell you whether it’s interference or a software issue on the radio/field network side.

I’ll see if I can do some stress testing over the next week on the 1252 I have.

PS. If the newer WRT1900ACS was used, then there should have been no bandwidth limitations, as the AP is capable of >1 Gbps thanks to 802.11ac (assuming the new radio). (The roboRIO only has a 10/100 port, meaning that at most it would do a tenth of that.)