26-03-2010, 20:04
DanL
Crusty Mentor
FRC #0097
Team Role: Mentor
 
Join Date: Jan 2002
Rookie Year: 2001
Location: Somerville, MA
Posts: 682
Why do the official published match results have only the bare minimum amount of data?

FIRST graciously publishes all match results. This is useful data, but it is extremely limited: only the final results are published. The only way to get preliminary result data (scores before penalties or bonuses like hanging are tacked on) is through manual scouting. Honestly, though, not every team has an army of freshmen they can park in the bleachers to manually record every match. I would argue FIRST should publish all of these raw numbers with the final match results -- besides making well-informed scouting more accessible by leveling the field for teams that don't have 30 freshmen to spare, there is an important educational benefit to making this data public.

FIRST has been publishing match results ever since I participated "back in the day." Now that I'm on the other side mentoring a team, a trend I've noticed over the last few years is that a lot of teams are relying more and more on this data for scouting purposes. And not just simple win/loss scouting -- a lot of teams are doing some pretty sophisticated mathematical analysis with these numbers.

I've seen some pretty cool college-level linear algebra analysis like Offensive Power Ratings / OPR (post, PDF), Defensive Power Ratings / DPR (post), and Calculated Contribution to Winning Margin / CCWM (PDF). There are tons of scouting calculators/databases people have made. Basically, the students are getting more and more excited about the math behind the numbers. Google would be proud of this new generation of statisticians (and by proud, I mean wanting-to-hire).
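For anyone who hasn't seen the math, here is a minimal sketch of the idea behind OPR as it is commonly described: treat each alliance's final score as the sum of its teams' individual contributions and solve for those contributions by least squares. The team labels, scores, and function names below are made up for illustration, and this is not necessarily how the linked posts implement it.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few matches to separate the teams")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                for k in range(col, n + 1):
                    m[row][k] -= factor * m[col][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def compute_opr(alliance_scores):
    """Least-squares estimate of each team's contribution to alliance scores.

    alliance_scores is a list of (teams, final_score) pairs, one per alliance
    per match. The normal equations (A^T A) x = A^T b are built directly,
    where row i of A has a 1 for every team on alliance i.
    """
    teams = sorted({t for members, _ in alliance_scores for t in members})
    index = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    ata = [[0.0] * n for _ in range(n)]
    atb = [0.0] * n
    for members, score in alliance_scores:
        for t in members:
            atb[index[t]] += score
            for u in members:
                ata[index[t]][index[u]] += 1.0
    return dict(zip(teams, solve_linear_system(ata, atb)))


# Toy data (not real match results): four hypothetical teams, three per alliance.
toy_results = [
    (("A", "B", "C"), 12),
    (("A", "B", "D"), 14),
    (("A", "C", "D"), 16),
    (("B", "C", "D"), 18),
]
opr = compute_opr(toy_results)
for team, rating in opr.items():
    print(team, round(rating, 2))
```

On this toy data the fit is exact, so the ratings come out as 2, 4, 6, and 8 (up to floating-point rounding). With real results there are far more alliance scores than teams, and the same normal equations give the best-fit contributions instead.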

I ask this of FIRST: why not publish all the raw data? There is no technical excuse: you guys already add everything up after every match and display it on the jumbotron... it's probably all electronic anyway. If people have come up with cool statistics like OPRs, DPRs, and CCWMs using just final numbers, imagine what other interesting analysis people will come up with once they start incorporating pre-penalty scores and bonuses into their algorithms.

Check out this 3-minute TED talk by Arthur Benjamin. He talks about how the US high school math curriculum is geared towards one thing: calculus. Don't get me wrong, as an EECS grad I have seen the divine beauty of calculus. But Benjamin makes the point that most people (unfortunately) don't become EECS grads. Many people -- everyone from doctors judging the efficacy of the latest study so that they can practice evidence-based medicine to your average Joe just trying to understand the latest political poll -- would benefit far more from understanding what a standard deviation is than from understanding derivatives. If you need more convincing, pick up a copy of Ian Ayres' NYTimes-best-selling Super Crunchers or read anything about how Google went from two grad students to one of the most powerful companies on the planet in less than a decade (the Ayres book is a really easy read and I encourage everyone to pick it up -- it's basically an extended PopSci article on various fascinating statistical analysis applications).

So FIRST: why not give us more data? Throw it out there... put it in an accessible format like CSV or XML that encourages us to play with it. Why keep it locked up and have the larger teams force their army of freshmen to sit in the stands and manually transcribe all this data like a bunch of monks in the Middle Ages? You already have awards for engineering, art, and leadership excellence... why not add a new "Interesting Application of Statistical Analysis Award"? Is there any reason NOT to give us this data?
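To show how little code it would take to start playing with such a file, here is a rough sketch. Everything about the format is an assumption: FIRST publishes no such file, so the column names (pre_penalty_score, penalty_points, final_score) and the four sample rows are invented purely for illustration.

```python
import csv
import io
import statistics

# Hypothetical CSV layout with made-up rows; not real match data.
SAMPLE_CSV = """match,alliance,pre_penalty_score,penalty_points,final_score
1,red,7,1,6
1,blue,4,0,4
2,red,9,2,7
2,blue,6,0,6
"""


def summarize(csv_text, column):
    """Return (mean, sample standard deviation) of one numeric column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [int(row[column]) for row in rows]
    return statistics.mean(values), statistics.stdev(values)


pre_mean, pre_sd = summarize(SAMPLE_CSV, "pre_penalty_score")
final_mean, final_sd = summarize(SAMPLE_CSV, "final_score")
print("pre-penalty mean/sd:", pre_mean, round(pre_sd, 2))
print("final mean/sd:", final_mean, round(final_sd, 2))
```

Comparing the spread of pre-penalty scores against final scores is exactly the kind of question that cannot be asked when only the final numbers are published.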
__________________
Dan L
Team 97 Mentor
Software Engineer, Vecna Technologies

Last edited by DanL : 26-03-2010 at 20:10. Reason: grammar