Have you ever wanted to run your own SQL queries over the entire TBA database? Now you can, with our new BigQuery dataset! If you’ve managed to write any cool queries, post them here and share!
And to prove how easy - Tim Flynn set forth the challenge to find elimination alliances that have previously played together in qualifying matches which I of course took WAY too literally to mean EVER. Anywho, h̲͇̰̩͕͔͇e̢̬͖̫͔̰̪r̴͚̗̲͇̞̣e͉͔’̘̬̟̟͉ͅs̻ ̦͈͎̭̭̀t̪̲̝̰͕ͅh͈̥̳̪̼̫e̗͔̣̝̠ q̡̞ṳ̖͡e̗̳̘̼̰̠r̝͔͈̻͉̰ý͙ (it felt appropriate)
CREATE TEMPORARY FUNCTION
sort_arr(a ARRAY<string>) AS ((
SELECT
ARRAY(
SELECT
t
FROM
UNNEST(a) AS t
ORDER BY
1) ));
WITH
teams AS (
SELECT
ARRAY[JSON_EXTRACT(alliances_json,
'$.red.teams'),
JSON_EXTRACT(alliances_json,
'$.blue.teams')] AS alliances,
__key__.name AS match_id
FROM
`tbatv-prod-hrd.the_blue_alliance.match`
WHERE
comp_level = 'qm' ),
elim_alliances AS (
SELECT
ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(JSON_EXTRACT(alliances_json,
'$.red.teams'), '"(frc[0-9]+)"')), ",") AS red,
ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(JSON_EXTRACT(alliances_json,
'$.blue.teams'), '"(frc[0-9]+)"')), ",") AS blue,
__key__.name AS match_id
FROM
`tbatv-prod-hrd.the_blue_alliance.match`
WHERE
comp_level != 'qm')
SELECT
ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(alliance, '"(frc[0-9]+)"')), ",") alliance,
EA.match_id elim,
t.match_id
FROM
teams AS T
CROSS JOIN
UNNEST(T.alliances) AS alliance
JOIN
elim_alliances EA
ON
(ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(alliance, '"(frc[0-9]+)"')), ",") = EA.red
OR ARRAY_TO_STRING(sort_arr(REGEXP_EXTRACT_ALL(alliance, '"(frc[0-9]+)"')), ",") = EA.blue)
Thanks for this powerful learning tool and data set! You all have been hitting it out of the park with these Tech Talks, I really look forward to seeing new ones and learning what all it takes to make TBA work as well as it does!
And, for those of you who use Tableau - Big Query is a supported data source.
Might make a good way to do some pre scouting for teams.
Yay! We love Tableau and this will be a huge help.
Looking good, now you’re starting to dig into my realm ;). Good to see some more SQL around these boards
I think this will be great to pull from, and for further processing by splitting the match score breakdown into its own tables for year-specific schemas.
But is it webscale?
Sorry for the kludgy schema around the score breakdowns. This dataset is literally a mirror image of the internal TBA datastore (I created it by importing backups), so it all needs to be generic enough to work across games. It shouldn’t be that hard to create a derived table that extracts the score breakdowns though.
All good, I figured it was because of year agnostic data. Nonetheless, this should make it easy to make local mirrors to avoid an insanely high number of API requests like in the past
Challenge accepted… accept it is denying me access
This is a really cool tool that I’ve been playing around with over the last few days, though I’ve been running into a small issue of how to query the results I want.
I’m trying to ask “How many Chairmans’ and Engineering Inspiration Awards have been won by teams that have participated in a given regional over the last 4 years?”. The SQL query I made ended up being this, using the Los Angeles Regional as a test:
#standardSQL
WITH teams AS(
SELECT
event_name,
team_number
FROM (
SELECT
team.name as team_number,
event.name as event_name
FROM `tbatv-prod-hrd.the_blue_alliance.eventTeam`
WHERE year >= 2014 AND event.name LIKE '%calb'
)
),
awards AS(
SELECT
team_number,
award_name
FROM(
SELECT
name_str AS award_name,
team_list[OFFSET(0)].name AS team_number
FROM `tbatv-prod-hrd.the_blue_alliance.award`
WHERE name_str LIKE '%Chairman%' OR name_str LIKE '%Engineering Inspiration%'
)
GROUP BY award_name, team_number
)
SELECT
award_name,
COUNT(teams.team_number) as award_count
FROM teams
INNER JOIN awards
ON
teams.team_number = awards.team_number
GROUP BY award_name
I was wondering, since I am not proficient with SQL, is there a better query that merges similarly named results (e.g. Chairman’s showing up multiple times because of small name changes to the award name over the years) and that show all teams that have won a given award in a separate column? (simply trying to list the team numbers after award count seems to break the query)
The ‘award_type_enum’ colum contains an integer constant that directly correlates to the type of the award listed. You can find the reference for the column here: the-blue-alliance/consts/award_type.py at master · the-blue-alliance/the-blue-alliance · GitHub