The Blue Alliance - Data Loss
UPDATE: We experienced some data loss on The Blue Alliance. I'm working to restore the data, and it should be fully recovered by the end of April 4th, 2012.
Those curious about technical details can read more in this Google Doc, where I am recording what I am doing to fix it. Most of 2012's data is recovered already. Not all events have been recovered yet, so some pages will continue to generate errors. Whatever doesn't heal itself with cronjobs will get fixed tomorrow night. Thanks, Greg |
Re: The Blue Alliance - Data Loss
my.usfirst.org appears to be down, which is preventing The Blue Alliance from rescraping data from FIRST.
http://www.usfirst.org/whatsgoingon fails to load its iframe, for example. |
Re: The Blue Alliance - Data Loss
At this point, nearly all data has been recovered. 2012 Events and Matches are partially missing. 2011 Events and Matches are entirely missing. This data will be recovered when FIRST's pages come back online.
Some teams that have not competed in 2012 have lost details such as their nickname. These will be restored when FIRST's pages come back online. I'll write more in the document about backup measures we should take in the future, but not tonight. |
Re: The Blue Alliance - Data Loss
FIRST's servers are rejecting our scraping attempts from our production server. I've emailed frcteams@usfirst.org to attempt to resolve the issue. Does anyone know anyone else I can get in touch with?
"Google access to this page has been blocked due to repeated failure to respect robots.txt" |
Re: The Blue Alliance - Data Loss
1 Attachment(s)
What are you violating? The 10 sec page request delay or something else?
You'll want to assure them you'll respect current settings and future changes in robots.txt. I've talked to their IT dept before, but normal staff turnover almost guarantees that it's different people by now. New employees are publicized in the FIRST Newsletter. I've attached a list, but beware: the earlier people may be gone or may have moved to other positions within the organization since then. |
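For anyone writing a scraper, here is a minimal sketch of checking robots.txt rules and a crawl delay with Python's standard `urllib.robotparser`. The robots.txt content, the `tba-scraper` user agent, and the example.org URLs are made up for illustration; FIRST's actual file and TBA's actual scraper may look different.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (assumption): a 10 second crawl delay
# and one disallowed path. The real file on FIRST's servers may differ.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a policy object without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def seconds_between_requests(parser: RobotFileParser, user_agent: str,
                             default: float = 1.0) -> float:
    """Return the crawl delay for this user agent, or a default if none is set."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


policy = build_policy(SAMPLE_ROBOTS_TXT)
delay = seconds_between_requests(policy, "tba-scraper")
allowed = policy.can_fetch("tba-scraper", "https://example.org/public/page")
blocked = policy.can_fetch("tba-scraper", "https://example.org/private/page")
```

A scraper would call `can_fetch` before each request and sleep for `delay` seconds between requests, re-reading robots.txt periodically so that future changes are picked up.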
Re: The Blue Alliance - Data Loss
Thank you for keeping such a website up and running. It's a great tool to use and FIRST wouldn't be the same without it.
|
Re: The Blue Alliance - Data Loss
Thanks Greg for all your hard work. The Blue Alliance is a true asset to the community.
|
Re: The Blue Alliance - Data Loss
I believe all data is restored now. We lost some metadata that had been manually edited, but the Events and Matches should be back. We'll monitor our logs for errors in the next few days, and fix anything else that crops up.
We're now investigating backup options :-) |
Re: The Blue Alliance - Data Loss
Thanks for your hard work Greg! I needed this back up ASAP to do some scouting.
|
Re: The Blue Alliance - Data Loss
What event is 2012oj? We don't seem to have it, and I can't figure out what it is. People are trying to get to it though, and it's throwing errors. |
Re: The Blue Alliance - Data Loss
Many team names have been lost and replaced with just the number.
|
Re: The Blue Alliance - Data Loss
Can you expand a little more on this portion? |
Re: The Blue Alliance - Data Loss
For pre-2012 teams, I've opened an issue to fix this (we've never done it right): https://github.com/gregmarra/the-blu...nce/issues/108
|
Re: The Blue Alliance - Data Loss
Team 801 is missing name and sponsors.
|
Re: The Blue Alliance - Data Loss
Will dig in later this week, thanks. |
Re: The Blue Alliance - Data Loss
Oddly enough, it looks like you only have team info for defunct team numbers like 40, 47 & 65. I wonder if that is stale backup data that survives because failed scrapes are not overwriting it.
All the team info TBA uses is available in one (easy to parse) tab-delimited page: https://my.usfirst.org/frc/scoring/i...?page=teamlist It would be easier to just scrape that page. Plus, that is only 1 page request instead of thousands. Alternatively, you could ask 358 for its database (especially if you want to fill in defunct team data).
Great job getting it back up, Greg! |
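As a rough illustration of why a single tab-delimited page is easier to handle than thousands of HTML pages, here is a minimal Python sketch using the standard `csv` module. The column names and sample rows are placeholders (assumptions); the thread does not describe the real page's columns.

```python
import csv
import io

# Placeholder data (assumption): the real team list's columns and values
# are not given in the thread, so these names are purely illustrative.
SAMPLE_TEAMLIST = (
    "Team Number\tNickname\tCity\n"
    "40\tExample Team A\tTown A\n"
    "47\tExample Team B\tTown B\n"
)


def parse_team_list(text: str) -> dict[int, dict[str, str]]:
    """Parse tab-delimited text with a header row into {team_number: row}."""
    reader = csv.DictReader(
        io.StringIO(text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters in names as plain text
    )
    return {int(row["Team Number"]): row for row in reader}


teams = parse_team_list(SAMPLE_TEAMLIST)
```

With the real page, the text would come from one HTTP request, and the key column name would need to match whatever header that page actually uses.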
Re: The Blue Alliance - Data Loss
This is amazing... |
Re: The Blue Alliance - Data Loss
I think we should go ahead with implementing it and use it when we need data for the current season. |
Re: The Blue Alliance - Data Loss
Other FMS pages you might find useful: 2012 Event List and Sample Event Team List ('12 CMP).
You can get to the event team list by using index.lasso?page=event_teamlist&ID_event=<ID on the Event List Page>
I have been using these pages for scouting/stat purposes since they parse more easily in Excel than the /myarea/ ones linked from the FRC regional page (if you copy and paste a team list from there, each team's town shows up in its own row, and the list may span multiple pages). |
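The query string above can be built programmatically. This is a minimal Python sketch; the base URL is passed in as a parameter because the full path is not spelled out in the thread, and both `https://example.org/index.lasso` and the event ID `1234` are placeholders (assumptions).

```python
from urllib.parse import urlencode


def event_team_list_url(base_url: str, event_id: str) -> str:
    """Build an event team list URL from a base index.lasso URL and an event ID.

    base_url is supplied by the caller; event_id is the ID shown on the
    Event List page.
    """
    query = urlencode({"page": "event_teamlist", "ID_event": event_id})
    return f"{base_url}?{query}"


# Placeholder base URL and event ID, for illustration only.
url = event_team_list_url("https://example.org/index.lasso", "1234")
```

Using `urlencode` rather than string concatenation keeps the URL valid if an ID ever contains characters that need escaping.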
Re: The Blue Alliance - Data Loss
This page is amazing!!! Issue opened to switch our scrapers to this - it's so much simpler! I made a FIRST Wiki page to trade notes on how to scrape FIRST pages. Borrowing Pat Fairbank's TPID scraper made scraping Team pages possible before, but I don't think that's a widely known technique. It needs more love, but it's an OK start. |
Re: The Blue Alliance - Data Loss
I'm currently super confused about the event code for the Michigan State Championship. http://frclinks.frclinks.com/ and https://my.usfirst.org/myarea/index....=2012&event=gl both suggest it should be "GL"
However, http://www2.usfirst.org/2012comp/eve...edulequal.html and http://www2.usfirst.org/2012comp/eve...edulequal.html show similar but conflicting data, both for Gull Lake District. I'm guessing Gull Lake (MIGL) accidentally started posting to GL, and they changed halfway through, with GL being the Michigan State Championship? Can anyone shed some light on this discrepancy? Thanks. |
Re: The Blue Alliance - Data Loss
Perhaps this is the reason why MIGL is so odd among event codes. It is the only four-character event code and doesn't follow FiM's usual standard of basing the event code on the Michigan county where the event is held (it would probably be KZ for Kalamazoo County). |
Copyright © Chief Delphi