[TBA Blog] We <3 Big Data

Have you ever wanted to run your own SQL queries over the entire TBA database? Now you can, with our new BigQuery dataset! If you’ve managed to write any cool queries, post them here and share!

And to prove how easy - Tim Flynn set forth the challenge to find elimination alliances that have previously played together in qualifying matches which I of course took WAY too literally to mean EVER. Anywho, h̲͇̰̩͕͔͇e̢̬͖̫͔̰̪r̴͚̗̲͇̞̣e͉͔’̘̬̟̟͉ͅs̻ ̦͈͎̭̭̀t̪̲̝̰͕ͅh͈̥̳̪̼̫e̗͔̣̝̠ q̡̞ṳ̖͡e̗̳̘̼̰̠r̝͔͈̻͉̰ý͙ (it felt appropriate)

CREATE TEMPORARY FUNCTION
  sort_arr(a ARRAY<string>) AS ((
    SELECT
      ARRAY(
      SELECT
        t
      FROM
        UNNEST(a) AS t
      ORDER BY
        1) ));
WITH
  teams AS (
  SELECT
    ARRAY[JSON_EXTRACT(alliances_json,
      '$.red.teams'),
    JSON_EXTRACT(alliances_json,
      '$.blue.teams')] AS alliances,
    __key__.name AS match_id
  FROM
    `tbatv-prod-hrd.the_blue_alliance.match`
  WHERE
    comp_level = 'qm' ),
  elim_alliances AS (
  SELECT
    ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(JSON_EXTRACT(alliances_json,
            '$.red.teams'), '"(frc[0-9]+)"')), ",") AS red,
    ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(JSON_EXTRACT(alliances_json,
            '$.blue.teams'), '"(frc[0-9]+)"')), ",") AS blue,
    __key__.name AS match_id
  FROM
    `tbatv-prod-hrd.the_blue_alliance.match`
  WHERE
    comp_level != 'qm')
SELECT
  ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(alliance, '"(frc[0-9]+)"')), ",") alliance,
  EA.match_id elim,
  t.match_id
FROM
  teams AS T
CROSS JOIN
  UNNEST(T.alliances) AS alliance
JOIN
  elim_alliances EA
ON
  (ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(alliance, '"(frc[0-9]+)"')), ",") = EA.red
    OR ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(alliance, '"(frc[0-9]+)"')), ",") = EA.blue)

Thanks for this powerful learning tool and data set! You all have been hitting it out of the park with these Tech Talks, I really look forward to seeing new ones and learning what all it takes to make TBA work as well as it does!

And, for those of you who use Tableau - Big Query is a supported data source.

Might make a good way to do some pre scouting for teams.

Yay! We love Tableau and this will be a huge help.

Looking good, now you’re starting to dig into my realm ;). Good to see some more SQL around these boards

I think this will be great to pull from, and for further processing by splitting the match score breakdown into its own tables for year-specific schemas.

But is it webscale?

http://i.imgur.com/FXgRz5J.png

Sorry for the kludgy schema around the score breakdowns. This dataset is literally a mirror image of the internal TBA datastore (I created it by importing backups), so it all needs to be generic enough to work across games. It shouldn’t be that hard to create a derived table that extracts the score breakdowns though.

All good, I figured it was because of year agnostic data. Nonetheless, this should make it easy to make local mirrors to avoid an insanely high number of API requests like in the past

Challenge accepted… accept it is denying me access :frowning:

This is a really cool tool that I’ve been playing around with over the last few days, though I’ve been running into a small issue of how to query the results I want.

I’m trying to ask “How many Chairmans’ and Engineering Inspiration Awards have been won by teams that have participated in a given regional over the last 4 years?”. The SQL query I made ended up being this, using the Los Angeles Regional as a test:


#standardSQL
WITH teams AS(
  SELECT
    event_name,
    team_number
  FROM (
   SELECT 
    team.name as team_number,
    event.name as event_name
  FROM `tbatv-prod-hrd.the_blue_alliance.eventTeam`
  WHERE year >= 2014 AND event.name LIKE '%calb'
  )
),
awards AS(
  SELECT
    team_number,
    award_name
  FROM(
    SELECT
      name_str AS award_name,
      team_list[OFFSET(0)].name AS team_number
    FROM `tbatv-prod-hrd.the_blue_alliance.award`
    WHERE name_str LIKE '%Chairman%' OR name_str LIKE '%Engineering Inspiration%'
  )
  GROUP BY award_name, team_number
)
SELECT
  award_name,
  COUNT(teams.team_number) as award_count
FROM teams
INNER JOIN awards
ON
  teams.team_number = awards.team_number
GROUP BY award_name

I was wondering, since I am not proficient with SQL, is there a better query that merges similarly named results (e.g. Chairman’s showing up multiple times because of small name changes to the award name over the years) and that show all teams that have won a given award in a separate column? (simply trying to list the team numbers after award count seems to break the query)

The ‘award_type_enum’ colum contains an integer constant that directly correlates to the type of the award listed. You can find the reference for the column here: the-blue-alliance/consts/award_type.py at master · the-blue-alliance/the-blue-alliance · GitHub