Jaci's Annual FRC Datadump 2016

Welcome to the inaugural release of my FRC datadumps :smiley:

You can download the sql database here. It contains data about every match, stat, event, team, rank, district, award… everything. The database itself is about 13MB but it’s been zipped together with some instructions and examples to bring the filesize down to about 5MB.

So let’s see what we can do with it…

How about seeing all the matches your team has had over 100 points?
http://i.imgur.com/JvbB2Mu.png


SELECT matches.key, events.short_name, match_scores.total_points, match_scores.alliance_color FROM matches
INNER JOIN events ON events.id == matches.event                 # Join the Event so we can see the Event Name
INNER JOIN match_scores ON matches.id == match_scores.match     # Join the Scores so we can filter based on score
INNER JOIN match_teams ON matches.id == match_teams.match       # Join the Teams so we can filter based on team
WHERE match_teams.alliance_color == match_scores.alliance_color # Make sure the scores we're checking are the same as our team's alliance
AND match_scores.total_points > 100                             # Make sure the score is over 100 points
AND match_teams.team == 5333                                    # Make sure we're checking matches with our team
ORDER BY match_scores.total_points DESC                         # Order by the match score, descending

Or every match in 2016 over 250 points?
http://i.imgur.com/t9P336s.png


SELECT matches.key, match_scores.total_points, match_scores.alliance_color FROM matches
INNER JOIN match_scores ON matches.id == match_scores.match   # Join the scores table, so we can see the scores
WHERE match_scores.total_points > 250                         # Filter by scores 250+
ORDER BY match_scores.total_points DESC                       # Order the scored descending

Or every match where there have been more High Goals scored in Auto than Teleop?
http://i.imgur.com/v7mTAHo.png


SELECT events.short_name, matches.key, match_scores.alliance_color, match_scores.auto_boulders_high, match_scores.teleop_boulders_high FROM match_scores
INNER JOIN matches ON matches.id == match_scores.match                  # Join the matches so we can get the match key
INNER JOIN events ON matches.event == events.id                         # Join the events so we can get the event name
WHERE match_scores.auto_boulders_high > match_scores.teleop_boulders_high   # Check the auto boulders high is larger than the teleop equivilent
ORDER BY match_scores.auto_boulders_high DESC                           # Order by auto boulders high, descending

Or something a little more advanced… every Alliance Selection where the Captain’s OPR is less than the 1st Pick’s OPR…
http://i.imgur.com/Yuf4uxo.png


SELECT events.short_name, alliances.num, teams.id FROM alliances
INNER JOIN alliance_picks ON alliances.id == alliance_picks.alliance        # Join the alliance picks for this alliance
INNER JOIN events ON events.id == alliances.event                           # Join the event for this alliance
INNER JOIN teams ON alliance_picks.team == teams.id                         # Join the teams for the picks
INNER JOIN stats ON stats.team == teams.id                                  # Join the statistics (opr) for the teams
WHERE alliance_picks.pick == 0                                              # Select the Captain team
AND stats.event == events.id                                                # Make sure the event we're checking OPRs is the same event we're checking alliances on
AND
(SELECT stats.opr FROM alliance_picks                                       # Select the OPR stat
INNER JOIN stats ON stats.team == alliance_picks.team                       # Join the stats to the alliance picks
WHERE alliance_picks.alliance == alliances.id                               # Make sure the alliance we're checking is the same alliance the captain is from
AND alliance_picks.pick == 1)                                               # Make sure we're checking the 1st Pick. At this point, the OPR stat is selected
> stats.opr                                                                 # Check the 1st picks OPR from the previous select against the captains

All these can be seen in the examples of the ZIP file.

Thanks for pulling all this together. I use R as a tool to look at data, so it will be nice to have all this available locally to mess around with.

*Very impressive! Nice work.
*
*

Thank you for this. Really great work. Your ‘more advanced’ example intrigued me because I know our team was a 1st pick for alliance 5 at the OCR (CAPL) and we had a higher OPR. When I ran your query, we did not appear in the results. I did a little testing and found an improvement to the SQL.


SELECT events.short_name, alliances.num, teams.id, stats.opr FROM alliances
INNER JOIN alliance_picks ON alliances.id == alliance_picks.alliance
INNER JOIN events ON events.id == alliances.event
INNER JOIN teams ON alliance_picks.team == teams.id 
INNER JOIN stats ON stats.team == teams.id
WHERE alliance_picks.pick == 0 
AND alliances.event == stats.event
AND
(SELECT stats.opr FROM alliance_picks
INNER JOIN stats ON stats.team == alliance_picks.team
WHERE alliance_picks.alliance == alliances.id
AND stats.event = alliances.event
AND alliance_picks.pick == 1)
> stats.opr
ORDER BY events.short_name, alliances.num

Now, we appear in the results as expected!

Thanks again for putting this together.

Ahh, keen eye, good job! I can’t wait to see what queries ya’ll can come up with using this Data Dump

*Jaci, how is your name pronounced? Does it rhyme with “classy” or “lacy”?
*
*

It rhymes with classy :slight_smile:

Thanks for making all this data easily accessible. So here’s something my team was casually tracking throughout the season: the 100% breach club. Here’s the the sql for one way to get the answer:

select substr(teams.key, 4) as team, match_counts.match_count
from teams
inner join
(select match_teams.team, count(match_teams.id) as match_count 
from match_teams 
group by match_teams.team) as match_counts 
on substr(teams.key, 4) = match_counts.team
inner join
(select match_teams.team, sum(match_scores.defenses_breached) as breach_count 
from match_teams inner join match_scores on match_teams.match = match_scores.match and match_teams.alliance_color = match_scores.alliance_color
group by match_teams.team) as breach_counts
on substr(teams.key, 4) = breach_counts.team
where match_counts.match_count = breach_counts.breach_count
order by cast(substr(teams.key, 4) as int)

and here’s the result:

Team Match Count
101 10
159 11
525 45
558 68
1014 30
1569 17
1750 11
1986 51
2062 28
2354 11
2767 63
2855 11
2882 10
3225 16
3238 65
3310 46
3488 10
3546 65
3683 50
3695 10
4476 12
4931 28
5254 45
5271 10
5618 27
5805 21
5905 11
5951 14

Just for reference: instead of doing a substr on teams.key, teams.id is the team number itself. We use the Team Number as the Primary Key since they’re unique per team.

I am not an SQL expert so I was hoping someone could help me out.

Some trivia questions I was curious in:

  1. Alliance with the greatest sum of team numbers;
  2. Alliance with the lowest sum of team numbers;
  3. Alliance with the narrowest range of team numbers;
  4. Alliance with the widest range of team numbers;
  5. Any Palindrome alliances during the year? (either full team numbers like 2052,525,2502 or spread across team numbers like 610,3130,16)

Hmm… Franks blog post claims there were 13,303 matches played, but the matches table has only 13,302 rows.

Clearly there is something going on here and someone is hiding the data for a match that they don’t want made public :smiley:

For all 3-alliance teams that actually played a match:

6138+6137+6208 = 18483 (blue)

11+56+41 = 108 (red)

Can you show your work, please? :slight_smile:

I haven’t done SQL in decades. I just cobbled up an AWK script to read the match_teams table.

(ac=="red") && ($3=="red"){red+=$4; next}

(ac=="blue") && ($3=="blue"){blue+=$4; next}

(ac=="red") && ($3!="red"){
	if(red>max){max=red; idmax=$1}
	if(red<min){min=red; idmin=$1}
	ac="blue"; blue=$4; next}

(ac=="blue") && ($3!="blue"){
	if(blue>max){max=blue; idmax=$1}
	if(blue<min){min=blue; idmin=$1}
	ac="red"; red=$4; next}

I just did:

select count(*) from matches;

which returned one less than Franks numbers.

I’ve been doing SQL forever…

*If you export the database tables, the “sqlite_sequence” table lists the number of records in each table:

event_types			7
match_levels			5
defenses			9
districts			8
events				139
alliances			1088
alliance_picks			3336
**matches				13302**
match_defenses			133020
match_teams			79812
match_scores			26604
awards				3251
stats				6194
rankings			6194
district_teams			1180
district_rankings		1179
district_event_points		2789

Here is the SQL that gives the same answer:

sqlite> select a.team, b.team, c.team, max(a.team+b.team+c.team) from match_teams a join match_teams b on a.match=b.match join match_teams c on b.match = c.match where a.team <> b.team and a.team <> c.team and b.team <> c.team and a.alliance_color = b.alliance_color and b.alliance_color = c.alliance_color ;

6138|6137|6208|18483

sqlite> select a.team, b.team, c.team, min(a.team+b.team+c.team) from match_teams a join match_teams b on a.match=b.match join match_teams c on b.match = c.match where a.team <> b.team and a.team <> c.team and b.team <> c.team and a.alliance_color = b.alliance_color and b.alliance_color = c.alliance_color ;

41|11|56|108

Widest range:

sqlite> select a.team, b.team, c.team, max(a.team - c.team) from match_teams a join match_teams b on a.match=b.match and a.team > b.team join match_teams c on b.match = c.match and b.team > c.team where a.alliance_color = b.alliance_color and b.alliance_color = c.alliance_color ;
6231|5536|11|6220

Narrowest:

sqlite> select a.team, b.team, c.team, min(a.team - c.team) from match_teams a join match_teams b on a.match=b.match and a.team > b.team join match_teams c on b.match = c.match and b.team > c.team where a.alliance_color = b.alliance_color and b.alliance_color = c.alliance_color ;

177|176|175|2

Thank you

Yay AWK, and once again showing there is more than one way to parse the data. I’m going to guess that Ether’s was a little faster, since the data only will get a single read, the other method using SQL has quite a few joins.