Scouting Data Interchange Format

I think specifying a file format for scouting data would help when people start trying to share data. Since the main goal is to help with sharing, I thought I should try to include as many people as is feasible.
The goals I had in mind:

  • Flexibility: Not everyone is going to want the same data, or need the same data. If the file format can’t handle different sets of data, then it won’t get used by most teams, and there’s little point to any work that the FRC Teams collectively put into it. The only downside to flexibility is that scouting apps have to be designed to allow for data they weren’t expecting and critical data missing from a file.

  • Human Editable: If humans can’t read and change the file, then life becomes a lot more difficult when programs don’t behave as expected.

  • “Timeless”: The fewer times we have to re-invent the wheel, the better. If the file format can be generic across different years, then fewer details have to be worked out for any one year. “Timelessness” is probably the hardest with the match-specific data. A series of Goals with a certain number of Scoring Objects and a 3-D Position would work for at least Triple Play, Aim High, and Rack ‘n’ Roll.

The types of data that might need to be covered by whatever format is decided upon:

  • Team List

  • Match Schedule

  • Pit Scouting Data/Robot Info

  • Match Results

  • Competition List

Before I give my thoughts on what the file format should be for each of those, what does everyone think of the goals and data?

A ‘file format’ will require applications to be specifically tailored to read these files. What you really want is a unified CSV structure, which specifies either common variables that are acceptable as column and row headers, or a common structure which makes Row A a team list, Row B their record, and so on. Obviously the specifics need to be worked out, but by using a CSV you allow anyone to easily access this data, either with their own scouting application, or simply with Excel, OpenOffice, or Google Spreadsheets if they prefer. (I personally know that in 2006 team 291 had a purely ridiculous/amazing spreadsheet set up in Excel, with tons of data that was easily manipulated by someone familiar with the structure.)
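A rough sketch of what reading such a CSV could look like in Python, just to make the idea concrete. The column names (team, wins, losses, notes) and the team numbers are made up for illustration; nothing here has been agreed on. Columns the reader does not know about are simply ignored.

```python
import csv
import io

# Hypothetical sample file; headers and values are placeholders only.
SAMPLE_CSV = """team,wins,losses,notes
314159,7,2,strong autonomous
271828,5,4,
"""

# Columns this particular reader insists on (an assumption, not a standard).
REQUIRED = ("team", "wins", "losses")


def read_records(text):
    """Return {team number: (wins, losses)}, ignoring any extra columns."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    records = {}
    for row in reader:
        records[int(row["team"])] = (int(row["wins"]), int(row["losses"]))
    return records


if __name__ == "__main__":
    print(read_records(SAMPLE_CSV))
```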

I like the idea, but I don’t believe it is likely that a large majority of teams would ever use it if it is just a format.

I believe the only way this could ever be a success is if you develop a complete, easily customizable, but still universal tool chain. I think CSV file(s) would be one of the best ways to do that.

If you are serious about this you should start a SourceForge project.

You may also want to add autonomous plays as well. We’ve been creating a grid for the field (think Battleship) and recording the start and end points of auton… start A3, end D5. I can definitely get behind your goals.
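For anyone who wants to work with those grid cells in a program, here is a minimal Python sketch. It assumes a single letter for the column and a 1-based number for the row, which is only a guess at the convention; the function names are made up for this example.

```python
import re


def parse_cell(cell):
    """Turn a Battleship-style cell such as 'A3' into zero-based (column, row)."""
    m = re.fullmatch(r"([A-Za-z])(\d+)", cell.strip())
    if m is None:
        raise ValueError(f"not a grid cell: {cell!r}")
    col = ord(m.group(1).upper()) - ord("A")  # 'A' -> 0, 'B' -> 1, ...
    row = int(m.group(2)) - 1                 # '1' -> 0, '2' -> 1, ...
    return col, row


def auton_displacement(start, end):
    """Return how many columns and rows the robot moved during autonomous."""
    start_col, start_row = parse_cell(start)
    end_col, end_row = parse_cell(end)
    return end_col - start_col, end_row - start_row


if __name__ == "__main__":
    print(auton_displacement("A3", "D5"))  # (3, 2)
```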

What about XML? XML files are easy to use; all you need is to know how to interpret the structure and be able to parse it. And you can invent new tags just for robotics, as long as the interpreters can handle them.

just my .02

…forest

CSV does already have applications that work with it, and XML does require an application to be modified to work with the data. How much work, then, do you have to do to interpret the same data in CSV vs. XML?
As for the toolchain: Yes, a toolchain would help significantly. No, I don’t think it will be the ultimate definition of success or failure. Data can, and should, be shared without everyone using the same application. Don’t forget that teams 537 (?) and 768 both have their own applications already. On top of that, there’s STAMP. I cannot commit myself to my definition of a full-on scouting application just yet, but that’s not necessarily a ‘no’.
For those teams that really want blow-by-blow autonomous, that could potentially be stored in the match results, particularly if they are per-team.

Now that I’ve given a reaction, let’s introduce some new material to digest/make-the-thread-more-confusing-with:

I was thinking the file data should be stored in XML. We could make CSV work, but I strongly suspect that it would be more work to interpret the data, especially considering the XML parsers for just about every language. From there, data could either be in one file, or split across files. I prefer split for things such as a team list, match list, and competition list, and pit data, but all-in-one for each competition’s match data. To allow all-in-one, we would have to have tags for each data type, with the year as a property:

<ScoutingData><MatchData year="2004"></MatchData>
<PitData year="2004"></PitData></ScoutingData>

Different years in the same file could be valid, but would at the very most be useful in a limited number of cases.
A team list would be very straightforward:

<TeamList year="1970">
    <team number="314159" name="Example Team Here" />
</TeamList>
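As a rough illustration of how little code a team list like this needs, here is a Python sketch using the standard library's ElementTree parser. The function name is hypothetical, and property values are quoted because XML parsers require that. Unknown tags are skipped rather than treated as errors, in keeping with the flexibility goal.

```python
import xml.etree.ElementTree as ET

# Sample document mirroring the team list above (property values quoted).
TEAM_LIST_XML = """<TeamList year="1970">
    <team number="314159" name="Example Team Here" />
</TeamList>"""


def read_team_list(xml_text):
    """Return (year, {team number: team name}) from a TeamList document."""
    root = ET.fromstring(xml_text)
    year = int(root.get("year"))
    teams = {}
    # iter("team") only visits <team> elements, so unexpected tags are ignored.
    for team in root.iter("team"):
        teams[int(team.get("number"))] = team.get("name", "")
    return year, teams


if __name__ == "__main__":
    print(read_team_list(TEAM_LIST_XML))
```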

A competition list would follow the same idea. It’d use name=" " and possibly location=" " and date=" ".
Pit data is slightly more complex. First, a certain number of teams have to agree as to what is useful pit data. Which team the data belongs to is specified using a teamData tag with a number property. For each bit of data stored, either the property is the tag name, or it is a property of a pre-defined tag. If the value property isn’t specified, then it is assumed that the data between the opening tag and the closing tag is the content. The next sample would be valid for max speeds on various gears:


<PitData year="1970">
  <teamData number="314159">
    <maxSpeeds>
      <maxSpeed gear="1" value="4m/s" />
      <maxSpeed gear="2" value="9m/s" />
    </maxSpeeds>
  </teamData>
</PitData>
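The rule described above (use the value property if it is there, otherwise take whatever sits between the opening and closing tags) could be sketched in Python like this. The helper names are made up for the example, and the second maxSpeed entry is deliberately written without a value property to exercise the fallback.

```python
import xml.etree.ElementTree as ET

# Gear 1 uses the value property; gear 2 uses text content instead.
PIT_XML = """<PitData year="1970">
  <teamData number="314159">
    <maxSpeeds>
      <maxSpeed gear="1" value="4m/s" />
      <maxSpeed gear="2">9m/s</maxSpeed>
    </maxSpeeds>
  </teamData>
</PitData>"""


def element_value(elem):
    """Return the value property if present, else the element's text content."""
    value = elem.get("value")
    if value is not None:
        return value
    return (elem.text or "").strip()


def max_speeds(xml_text, team_number):
    """Return {gear: speed string} for one team; empty if the team is absent."""
    root = ET.fromstring(xml_text)
    speeds = {}
    for team in root.iter("teamData"):
        if team.get("number") != str(team_number):
            continue
        for entry in team.iter("maxSpeed"):
            speeds[int(entry.get("gear"))] = element_value(entry)
    return speeds


if __name__ == "__main__":
    print(max_speeds(PIT_XML, 314159))
```

The speeds are left as strings such as "4m/s" on purpose, since how to handle units is one of the details teams would still have to agree on.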

Match Data has similar problems. First, teams have to agree what is useful for match data. The basic outline would then be:


<MatchData year="1970" competition="That One Regional">
  <match number="19" type="Practice 1"> <!-- ... --> </match>
  <match number="19" type="Practice 2"> <!-- ... --> </match>
</MatchData>

In all the games I remember, points have been scored by placing objects in a certain place, or by doing something with object(s) in certain place(s). Based on that, one set of tags we can guarantee is:


<fieldState>
  <score team="1234" pos="(1, 2, 0)" type="type-id" />
</fieldState>
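Since pos in the sample above is a parenthesized triple stored as text, an application has to split it apart itself. A minimal Python sketch, assuming exactly three integer components (the sample only shows integers, so that part is a guess) and using made-up function names:

```python
import xml.etree.ElementTree as ET

FIELD_XML = """<fieldState>
  <score team="1234" pos="(1, 2, 0)" type="type-id" />
</fieldState>"""


def parse_pos(text):
    """Parse a string like '(1, 2, 0)' into a tuple of three integers."""
    inner = text.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    parts = [p.strip() for p in inner.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three components in pos: {text!r}")
    return tuple(int(p) for p in parts)


def scores_by_team(xml_text):
    """Return {team number: [positions scored]} from a fieldState element."""
    root = ET.fromstring(xml_text)
    result = {}
    for score in root.iter("score"):
        team = int(score.get("team"))
        result.setdefault(team, []).append(parse_pos(score.get("pos")))
    return result


if __name__ == "__main__":
    print(scores_by_team(FIELD_XML))
```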

The pos indicates the relative position of the thing that was scored. For example, Rack ‘n’ Roll could use the first two values to indicate which leg, and the third to indicate whether the object was closer to the center or farther away. More score-related tags:


<totalPoints alliance="red" value="125" />
<penalties alliance="red" value="20" reason="safety violation" />
<finalScore alliance="red" value="105" />

I tried to make that self-explanatory.
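One thing an application could do with those tags is sanity-check a file before trusting it. The Python sketch below assumes that the final score equals total points minus penalties, which is only inferred from the sample numbers (125 − 20 = 105), and that the three tags sit inside a match tag. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The score tags from above, wrapped in a <match> element (an assumption).
MATCH_XML = """<match number="19" type="Practice 1">
  <totalPoints alliance="red" value="125" />
  <penalties alliance="red" value="20" reason="safety violation" />
  <finalScore alliance="red" value="105" />
</match>"""


def final_score_consistent(xml_text, alliance):
    """Return True/False if finalScore can be checked, None if it is missing."""
    match = ET.fromstring(xml_text)

    def total(tag):
        # Sum every matching tag, so several penalties per alliance are fine.
        return sum(
            int(e.get("value"))
            for e in match.iter(tag)
            if e.get("alliance") == alliance
        )

    final = match.find(f"finalScore[@alliance='{alliance}']")
    if final is None:
        return None  # data missing from the file; nothing to compare
    return int(final.get("value")) == total("totalPoints") - total("penalties")


if __name__ == "__main__":
    print(final_score_consistent(MATCH_XML, "red"))
```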

Just so everything is clear: certain groups of teams will have to agree what data should be included in their files and applications. Not everyone will be able to share their data with everyone else. I’m not about to try and do anything about it.
Sorry for the length, thanks for reading the whole post through. What do the readers think?

Personally, I’m inclined to agree that XML may well be a better format for many of the reasons enumerated above. (Admittedly, it’s also just as easy to have both as export/import methods - I did write such - but I have to say that I think XML is more easily readable when there’s a problem, as opposed to having to either whip out Excel or count commas. :) )

I think that agreeing on the majority of the structure of such a file, though, is something better left for when we get into the new season. Some basic bits and pieces could be agreed upon now, such as which information is needed for a team, or for regional competitions, but admittedly the lion’s share is probably something that needs to wait until January.

Another cool thing about XML is that it is easily viewed online. All you need is some PHP, a simple server (WAMP, LAMP, XAMPP, etc. can even be run off a thumb drive), and a browser; then you can have all the stuff on a server and dynamically updated throughout the season. And at regionals, one person could run a server and then upload that data online that night to make it available to scouts at home.

…forest