29-04-2012, 21:11
MAldridge
Lead Programmer
AKA: Rube #1
FRC #0418 (LASA Robotics)
Team Role: Programmer
 
Join Date: Jan 2011
Rookie Year: 2010
Location: Austin
Posts: 117
Re: The communication tides are shifting...

Begin long and rambling explanation:

I had the chance to talk with an engineer from NI at great length about what goes on inside the field system. Over the course of a few hours, I am fairly sure I asked about every single aspect of the field and how it works (and more importantly, why it doesn't). This is what I was told:

The field itself has two PLCs that handle the scoring hardware and a PLC that handles the bridges. These PLCs talk serial over IP, using bridges to connect to a dedicated LAN. They are very resilient and are restarted maybe once a regional. The only time they have to be restarted is if the system experiences a power surge that causes one to 'disappear' while it restarts. Next in line on the field are the PanelView systems, which are just Win7 tablets with touchscreens running a simple interface to talk with the scoring computer. The scoring computer is itself a part of the FMS, but more on that in a minute.

Next in the field is the robot. The robot has a relatively new cRIO real-time controller that is very advanced compared to what FIRST had previously. This system is VERY robust, considering that student programmers are writing code that is, for the most part, poorly thought out. There are many teams that write good code, but there are many more that don't. The only place where the code on the machine can easily be blamed with 100% confidence is CAN. The CAN bus system was never designed to work like this, where the machine is expected to do a rapid start-up and perform dynamic hand-offs of the CAN master node (the switch from auto to tele). Assuming that CAN is not in use and that there is good code on the machine, the cRIO is a bulletproof system.

On the other end of the line is, of course, the DS. The DS is a simple LabVIEW program that communicates with the robot to convey the info from the hardware connected to the DS. In my 3.5 years of FRC, I have never seen a problem with the DS. The Dashboard, on the other hand, is full of holes. When teams use the SmartDashboard, they are asking for trouble, and the same goes for all the other dynamic dashboards. These programs are too broad in scope and try to fit too wide a range of applications. The only way to get a truly stable dashboard is to write it yourself for your own robot.

So now we are at the physical connections in the system. The IP network is run using Cisco managed hardware that is, for what it's worth, very good equipment. That being said, there are a few bugs. For one, the main WiFi transmitter is just one unit. It is a very special, very powerful unit, but it is still just one transmitter. There is also the issue of it just being Cisco equipment. For that one, some background is needed. When a router approaches the maximum load it can handle, it starts dropping packets. TCP, which allows for this, will simply retransmit the missing packets at a later time, when there will presumably be less network usage. But here is the fun part: how does the router decide where to drop data from? Easy: it starts at the lowest-numbered address and moves up. This easily explains why old teams (low numbers) see many more connection problems than the rookies (high numbers). This is unfortunately just the way the cookie crumbles, so there is no way around it short of coming up with a gigabit WiFi system.
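
For anyone who wants to see what that drop order would do to a match, here is a rough Python sketch. It is only a toy model of the behavior as it was described to me, not a model of actual Cisco firmware. The function names (team_ip, drop_lowest_first) and the packet counts are made up for illustration, and I am assuming the usual 10.TE.AM.2 addressing for the robot controller.

```python
import ipaddress


def team_ip(team: int) -> ipaddress.IPv4Address:
    """Map a team number to its robot address, assuming the 10.TE.AM.2 scheme."""
    return ipaddress.IPv4Address(f"10.{team // 100}.{team % 100}.2")


def drop_lowest_first(queued: dict, capacity: int):
    """Toy model of the drop order described above (hypothetical helper).

    queued maps team number -> packets waiting; capacity is how many
    packets the router can forward this tick. Anything over capacity is
    dropped, starting with the lowest address and moving up.
    Returns (forwarded, dropped), both keyed by team number.
    """
    excess = max(0, sum(queued.values()) - capacity)
    dropped = {}
    for team in sorted(queued, key=team_ip):
        take = min(queued[team], excess)  # drop as much as needed from this team
        dropped[team] = take
        excess -= take
    forwarded = {team: queued[team] - dropped[team] for team in queued}
    return forwarded, dropped


if __name__ == "__main__":
    # Three robots each send 10 packets, but the router can only pass 18.
    forwarded, dropped = drop_lowest_first({418: 10, 2468: 10, 4000: 10}, 18)
    for team in sorted(forwarded):
        print(team, team_ip(team), "forwarded", forwarded[team], "dropped", dropped[team])
```

Under this model the lowest-numbered team absorbs the whole overload before anyone else loses a packet, which is the pattern described above.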

So now for the last part, the FMS. The FMS is a fully redundant system consisting of two rack-mounted servers in the scorpion case. This system is the weak link of the entire field. The FMS itself has to handle ALL of the following hardware/software functions:
  • 9 VLANs
  • 2 PLCs
  • 3 Monitors
  • 18 dynamic data sources
  • A twitter feed (not much here, but it all adds up)
  • The sundial system
  • WPA encryption cycling
  • Match Scheduling

To top it all off, the whole thing is written in VB.NET! If you want a laugh, head over to AndyMark, where under the field rental section there is a place to download FMS light, which is the same program, just without the bits that handle the field hardware. You will quickly find that it is only barely stable, and that it suffers from huge amounts of odd quirks inherent to VB.NET apps. It is clear to me that the FMS program itself is the root of this evil. Since it handles the WiFi system, it is also to blame for the oddness of connections that work perfectly on one field but not on another, or robots that work on the practice field but not on the real field. The FMS program is in dire need of a rewrite, preferably in LabVIEW, which most FTAs can debug very quickly.

Some of you may remember from watching the Alamo regional finals that there was a very awkward pause right after one team called for their backup robot. Look it up on Blue Alliance if you haven't seen it; it is one of the better times the judges have had to stall. When I asked later what happened, the scorekeeper (who actually sits at the FMS console the majority of the time) said that the system just refused to take the team substitution. It wound up requiring the entire field to be reset and rebooted. This was an issue that had never come up before, and it was attributed to a rare access violation within the FMS program.

All in all, we are dealing with a very robust system. The only part of it that is not robust is the FMS program itself. To me, the only solution is to scrap the current VB.NET program entirely and rewrite it from the ground up in a more sensible language like LabVIEW.
__________________
'Why are you a programmer?' --Team Captain
'Because the robot isn't complicated enough!' --Me