A New Way to Scout

As any scout knows, data is king. More data is better (as long as you can analyze it). What I wanted to discuss in this post were the methods we use for competitive analysis. (I actually gave a presentation on this recently - the slides are linked at the bottom).

As far as I have seen, there are five methods of scouting:

  1. Paper
  2. Spreadsheet
  3. Tablet/Web/Phone Application
  4. Databases (supplementary - The Blue Alliance, CheesyScout, CowScout, etc.)
  5. Crowdscouting

Each has its own strengths and weaknesses. What I specifically wanted to talk about was crowdscouting. Crowdscouting would be a way for multiple teams that go to different events to contribute to the same dataset, which would be accessible by all members of the crowdscouting alliance (and publicly released after the season). It hasn’t been used on any large scale (that I know of), so there’s a lot of potential for expansion here.

Why?
There are two problems that we will face in the upcoming years:

  1. More data in less time - As FIRST grows, and we see more events (nationwide districts by 2017) and more teams, we’ll be interacting with more teams at more events. We’ll have less time to know each team’s strengths and weaknesses, and to notice changes between each event (a team that does horribly at their first event could improve multifold by their second event).

  2. Competition will increase - This was a major factor behind the change to districts. In the words of one of the planners of the district system, “Michigan is able to put out so many competitive robots because they have more events.” Scouting capabilities will need to increase with competition.

Crowdscouting Discussion
I wanted to start a discussion on what guidelines and practices we could use to create crowdscouting systems. I’ve thought of a few myself:

  1. Involved teams would need to decide what to keep track of early in the season. This could be done via email or a Hangout or Skype. They would need to decide on a method to collect data as well. Do they want an API to be able to collect realtime data?

  2. Quantitative data only - this means things like types of drivetrains, autonomous points scored, etc. We don’t want qualitative data because different teams want to track different things in different ways. There would be too much confusion based on judging data quality. Hard data can’t be argued with. Teams would be free to collect their own qualitative data.

  3. I’m having trouble imagining this with using an app. Data is only as helpful as how quickly you can gain access to it. On 1540, we use tablets with data connectivity to access the internet (because wifi can mess with the Field Management System). Our scout data is immediately accessible to the drive coach, who can then use data to plan our matches.

How many other teams do this? If there were enough, it would be simple to make an API to send data directly to a central server, and have any team able to access it immediately.
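To make the "decide what to track, then send it to a central server" idea a bit more concrete, here is a minimal Python sketch of what one agreed-on record could look like. Everything in it is an assumption for illustration: the field names, the `MatchRecord` class, and the `validate` and `to_payload` helpers are all hypothetical, and the real field list would be whatever the participating teams settle on early in the season.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical fields; the real list would be agreed on by the participating teams.
@dataclass(frozen=True)
class MatchRecord:
    event: str          # event code chosen by the alliance
    match: int          # match number
    team: int           # team number being scouted
    scouted_by: int     # team number of the scout who entered the record
    drivetrain: str     # drivetrain type, e.g. "tank"
    auto_points: int    # autonomous points scored
    teleop_points: int  # teleop points scored


def validate(record):
    """Return a list of problems; an empty list means the record looks sane."""
    problems = []
    if record.team <= 0 or record.scouted_by <= 0:
        problems.append("team numbers must be positive")
    if record.match <= 0:
        problems.append("match number must be positive")
    if record.auto_points < 0 or record.teleop_points < 0:
        problems.append("points cannot be negative")
    if not record.drivetrain.strip():
        problems.append("drivetrain is required")
    return problems


def to_payload(record):
    """Serialize a record to JSON, ready to send to whatever server is used."""
    return json.dumps(asdict(record), sort_keys=True)
```

Keeping every field a plain number or short string is what would let any team's app, spreadsheet, or paper-entry tool produce the same record.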

I’m already developing a crowdscouting API for this.

I look forward to the discussion.
-Hamzah

If you are interested in working with me on this, please contact me:
Email: [email protected]
Twitter: @JSandrobots1540

Slides - http://static.squarespace.com/static/521270c0e4b09a53f3486f39/t/52884c99e4b0d5b9928d5968/1384664217714/Competitive%20Analysis%20Presentation.pdf

That is some great information!

At each of our regionals, our team has partnered with a few other teams for scouting. We called it our scouting “alliance”. It works really well.

Currently, our awesome ginger programmer, Brennon (brennonbrimhall on CD), is developing this crazy effective scouting application. It’s basically completed. It combines our paper scouting with an easier method of data entry (multiple computer data entry) among other awesome things.

I can’t say I even come close to understanding it. But if you’d like, you can PM him. He is planning on probably releasing his program on CD sometime after it’s been regional-tested.

You also might wanna talk to 610, who is making their scouting data at their regional available to all the attending teams on a monitor in their awesome scouting room thing.

I don’t have anything technical to add, but I just want to say that I wish there were real statistics available to every team in every regional in real time. Then we could stop having arguments about OPR and have some really fun discussions about real scoring stats. Crowd-scouting would be a good step toward that.

Wasn’t there a team that already developed something like this for the championship? I remember somebody had a system of paper forms that anybody from any team could fill out as a scout, then that team compiled everything electronically.

I remember that thread, but I don’t think it was very active, so I don’t know how successful that was.

I am thinking of designing my application to run on the internet, through my Pi server. I am aiming at using WebCache to save all the required files, reducing the data requirements. Everything will be synchronized to the server. I will try to create a database for every team where they can store their information. Other than that, there will probably be competition databases and public databases to make sure that a team doesn’t have to share its data if it doesn’t want to. Another reason I want to limit the network bandwidth is that this server runs on a 768kbps uplink! It would hog all the resources if I did not limit the usage!

Currently, I am working on the concept of it, and brushing up on my PHP, MySQL and AJAX skills so that I will be able to create a working program in no time!

Some students on 1306 have a good system for crowdscouting at events, which only requires paper, one laptop, and a scanner (so more teams can use it; no need for lots of tablets/laptops). Making an online database to store the info would be pretty easy.

I know that this paper has a ton of statistical data and calculates OPR (and all of the other bells and whistles).

One simple way to get around the internet issues might be to have a “default” application for the hard data.

This would mean a mobile webpage that anyone can access from a smartphone. They would input hard data, which could be uploaded to the server at competitions without any internet issues.

The problem would be if a team wanted to have their own qualitative data. An option would be to publish the code, and allow changes so that the hard data go to the crowdscouting server, but the qualitative data goes to another server.
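A rough Python sketch of that split, assuming each scouting entry is a simple dictionary. The field names and the `split_record` helper are hypothetical; the only real idea here is that a fixed list of hard-data fields decides what goes to the shared server, and everything else stays with the team.

```python
# Hypothetical field names; only these hard-data fields go to the shared server.
HARD_FIELDS = {"event", "match", "team", "drivetrain", "auto_points", "teleop_points"}


def split_record(record):
    """Split one scouting entry into (shared hard data, team-private notes)."""
    shared = {k: v for k, v in record.items() if k in HARD_FIELDS}
    private = {k: v for k, v in record.items() if k not in HARD_FIELDS}
    # Copy the identifying keys so private notes can be matched to the shared row.
    for key in ("event", "match", "team"):
        if key in record:
            private[key] = record[key]
    return shared, private
```

The app would then send `shared` to the crowdscouting server and `private` to the team's own server (or just keep it on the device).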

There is another method, which, for lack of a better term, I would refer to as the “Mycroft Holmes” method (Wikipedia’s description is incomplete). It is someone who can somehow keep statistics of all the teams in his/her head, and weigh up which teams would be good partners and which to watch out for. We’ve had a couple like that, and they are much better than any other system. I told the team’s teachers to look out for a sports fanatic student for this role.

I like your idea and I think it would work, but I don’t see the reason to keep the data private during the competition season. There is no harm that I know of in the data being public. It would also help rookie teams or teams that don’t have a proper scouting team in place to collect data. Also, what is the point of the data after the season? It is more like an incentive that isn’t really an incentive for the teams that aren’t part of this but want or need the data. I don’t know. It just seems improper that you aren’t releasing the data to the public and are keeping it private until the season is over.

I made this really advanced Google Docs scouting sheet.

Here is how it worked:

  1. The scouter filled out a little form on google
  2. The form automatically sends the data to a google spreadsheet
    a. One page in the spreadsheet is for the raw data
    b. one is for the data that is sorted by teams and averaged
    c. one is for the interface that we can use for graphs
  3. The spreadsheet automatically sorts all the information and makes it easy for me or whoever is looking through it to see all the matches and trends in near real-time
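For anyone who would rather do the "sorted by teams and averaged" page outside of a spreadsheet, the same idea can be sketched in a few lines of Python. These are not the spreadsheet's actual formulas; the column names (`team`, `auto_points`, `teleop_points`) and the `average_by_team` function are assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean


def average_by_team(rows):
    """Group raw form rows by team number and average each numeric column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["team"]].append(row)

    summary = {}
    # Sort by team number so the output reads like the sorted spreadsheet page.
    for team, team_rows in sorted(grouped.items()):
        summary[team] = {
            "matches": len(team_rows),
            "avg_auto": mean(r["auto_points"] for r in team_rows),
            "avg_teleop": mean(r["teleop_points"] for r in team_rows),
        }
    return summary
```

Each raw row is one form submission, and the summary is recomputed from scratch whenever new rows come in, which is roughly what the spreadsheet does on every edit.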

The only downside to this is that you need internet (which we solved by writing down the matches and giving them to somebody who had it)

and you need people who want to scout (which we had like nobody) (it also doesn’t help that you don’t do well)

If anyone wants to know how I did it, just ask, it’s pretty simple

Here is a picture of the interface, which has been improved with graphs and individual results after that competition
http://imageshack.com/a/img834/186/k4qi.png

Please note that all the boxes had info in them at one time. The code somehow broke when I was copying it for another competition.

Team 696 used a similar Google Docs based scouting system for the 2012 and 2013 official seasons. Although it is simple and easy to set up, it has MAJOR downsides.

  1. Google’s systems have crashed at every single competition. It seems like Google Spreadsheets cannot handle the number of formulas needed for any meaningful analysis. Because of this, we were not able to access our data for our Friday night scouting meetings.

  2. It’s SLOW. At the height of its lag, it took around 30 minutes for data to be submitted. In addition, the sheets sometimes take forever to load.

I warned a team going to Champs not to use a Google Docs system, and they faced similar results (for post-day analysis).

If not for these, Google Docs would be the perfect scouting solution. This summer, we created a scouting system running off our website that would be able to input data, analyze the data, create rankings and team reports, and create a match strategy printout that people would be able to send to a printer in the pits. We successfully collaborated with 3476 and 3255 and used the system at IRI, Battle at the Border, and Fall Classic.

Our head scout for the last few years was this type of person. His skills were all enhanced by our data collection. It also made it easier for him to explain reasons to the rest of the team, which in turn allowed him to better develop strategies.

If anything, I think the “Mycroft Holmes” method should be supplemented by some form of collected hard data.

The reason I would make the data private during the season is because having data is currently a major competitive factor. Until we reach a point where there’s some third party scout that makes all data public, data collection will be a huge playing card at any event.

To clarify, “private” as I used it meant accessible to all teams who contributed to the dataset.
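In code terms, that rule is about as simple as it gets. A tiny Python sketch, assuming each record carries a `scouted_by` team number (a hypothetical field, as are the `contributors` and `can_read` names):

```python
def contributors(records):
    """Teams that have entered at least one record into the shared dataset."""
    return {r["scouted_by"] for r in records}


def can_read(team, records, season_over=False):
    """Contributing teams can read during the season; everyone can afterwards."""
    return season_over or team in contributors(records)
```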

There will be some point at which scouting data and stats will become widespread, and at that point, the challenge will change primarily to using the data effectively. (think stock market - there’s a lot of data, and lots of opinions).

While our team does have its own form of scouting, I kinda think that this method is what suits me the most. I like to look at regionals I didn’t attend and try to work out what teams had the most synergy, and what the best possible alliances would be.

That’s always something to consider (synergy). And I definitely do think that those people are valuable. It’s just that we are all human, and we make mistakes and/or forget things. Scouting data is the best way to confirm observations made throughout the day.

Additionally, explaining the rationale behind decisions always leads to better ideas, and allows for proof when you make a statement.

What effect does it have on whether your team wins or loses, though? Big teams already have their own scouting systems, and so do most medium-sized teams. Also, you could be helping rookie or new or small teams. I don’t see the point of it being secret; it’s not like it is a special robot strategy or robot function. It is merely harmless data.

Exactly. What makes scouting actually mean something is what you do with the data. You’re sharing the raw data, not how it’s manipulated to be effective for your team.

One of the ways we’ve debated doing scouting this season, as opposed to the hundreds of sheets of paper, is having a webpage which all of the scouters can connect to on their laptops, phones, tablets, etc. through a dedicated mobile hotspot. The data is collected and organized on the scouting lead’s iPad, which allows for medium-speed data review, depending on the internet speed.
We haven’t thought it through fully; we’ve just been talking about it. The scanner method might work better though, as it seems simpler and more cost-efficient.