paper: 4536 scouting database BETA
Thread created automatically to discuss a document in CD-Media.
4536 scouting database BETA by Caleb Sykes |
Re: paper: 4536 scouting database BETA
This is a beta test of a scouting database that computes component calculated contributions (component OPRs) using data from the FIRST API. As this project is still in its infancy, please report any bugs or potential improvements to Caleb Sykes (calebsyk@gmail.com). Each sheet currently contains data from a distinct week 1 event. Starting on 3/21, a new database will be published weekly, containing data from all events up to that date.
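For readers new to calculated contributions, here is a minimal sketch of the usual least-squares idea. It is not the spreadsheet's actual formula. The team labels, scores, two-team alliances, and the function name `solve_contributions` are all made up for illustration. Each alliance total in a category is modelled as the sum of its teams' contributions, and the function returns `None` when there are too few observations for a unique answer.

```python
from fractions import Fraction

def solve_contributions(teams, alliances, totals):
    """Least-squares contributions via the normal equations (A^T A) x = A^T b.

    Returns None when A^T A is singular (too few observations)."""
    n = len(teams)
    index = {t: i for i, t in enumerate(teams)}
    # Augmented matrix [A^T A | A^T b], built one alliance at a time
    m = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for alliance, total in zip(alliances, totals):
        for a in alliance:
            for b in alliance:
                m[index[a]][index[b]] += 1
            m[index[a]][n] += total
    # Gauss-Jordan elimination with exact fractions
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None  # no unique solution
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return {t: float(m[index[t]][n]) for t in teams}

# Hypothetical data: three teams, two-team alliances for brevity
teams = ["A", "B", "C"]
print(solve_contributions(teams, [("A", "B"), ("A", "C"), ("B", "C")], [10, 8, 6]))
print(solve_contributions(teams, [("A", "B"), ("A", "C")], [10, 8]))
```

The second call has fewer observations than teams, so the system has no unique solution and the sketch reports that instead of returning numbers.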
Be extremely careful when using the individual defense crossings (columns J-Q on each sheet). At a given event, if a defense is chosen fewer times than there are teams at the event, a #NUM! error will appear. If a defense is chosen fewer than twice as many times as there are teams at the event, place limited faith in the numbers. See the "instructions" sheet for more detailed information on what each category represents. |
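The two thresholds above can be written as a small check. This is only an illustrative sketch; the function name and the returned labels are hypothetical, and the counts in the usage lines are made up.

```python
def crossing_data_quality(times_chosen, team_count):
    """Classify a defense's crossing data using the thresholds from the post."""
    if times_chosen < team_count:
        return "error"    # the spreadsheet shows #NUM! in this case
    if times_chosen < 2 * team_count:
        return "limited"  # numbers exist, but place limited faith in them
    return "usable"

# Hypothetical event with 40 teams
for chosen in (30, 60, 80):
    print(chosen, crossing_data_quality(chosen, 40))
```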
Re: paper: 4536 scouting database BETA
I have just uploaded version 1 of this database. It is populated with all week 1-3 events.
Besides the additional data, there are two main changes since the BETA.
First, an alternative calculation of eOPR (elimination OPR) is now included for each team, so there are now two eOPR calculations, which I have dubbed eOPR1 and eOPR2. Details on how these are calculated can be found in the "instructions" sheet. Although I have not verified this, I expect eOPR1 to provide better elimination predictions at weaker events where captures are less frequent, and eOPR2 to provide better elimination predictions at stronger events where captures are more frequent.
Second, a new "world results" sheet has been added, which allows component comparisons for every team at every event in which they have competed. Be aware that this list will have duplicates for teams that competed at 2+ events. Also, don't compare individual defense crossing data unless you know what you are doing. For example, team 5114 has a drawbridge contribution of 1722968039259170.00. 5114 is not that good at crossing the drawbridge; this just means that the drawbridge was not chosen frequently enough at Midland for the drawbridge contributions to be meaningful. As a rule of thumb, you can almost always trust the rock wall, sally port, and cheval de frise contributions, but be wary of the others.
Remember, this project is still quite young, and there are very likely errors in places (especially since I have not yet automated everything and have to do some copying by hand). If you see any errors, please let me know and I will look into them. |
Re: paper: 4536 scouting database BETA
First: very cool spreadsheet! I'm glad to have a resource that looks at the component OPR for pretty much every possible condition! It has all the usual OPR caveats, but it does seem useful for establishing some trends and making some comparisons.
As a sidenote, thank you to FRC HQ for making this data more available for capture. The API certainly provides much better data than the twitter feed over recent years. I have some questions about the "units" of some columns... I'm pretty sure they're my initial guess for most of them, but I wanted to double-check. For columns H and I (teleop Capture or Breach), I presume a "1" would indicate a successful Capture/Breach? For columns J - V and AM (defense crossings), is "1" a single defense traversal (5 pts) or a weakened defense (10 pts, 2 traversals)? Also, how are eOPR 1 and eOPR 2 calculated? What's the difference? They differ dramatically from the OPRs based solely on match scores. |
Re: paper: 4536 scouting database BETA
"teleop Tower Captured" and "teleop Defenses Breached" both have units of ranking points. A 1 in either of these would indicate that the given team contributes an average of 1 ranking point each match.
All categories that have "crossings" in their name have units of crossings, not weakenings. That is, a 2 in any of these categories would indicate that the given team contributed 2 scored CROSSINGS over this defense each match.
eOPR1 and eOPR2 are my rough attempts to compensate for different scoring methods in quals and elims. Since breaches and captures provide points in elims, but not in quals, "normal" OPR probably does a poor job predicting elimination match scores (although this is as of yet unverified). eOPR1 essentially makes boulders and crosses scored in quals worth more, and eOPR2 takes breaching/capturing contributions and assigns them point values, and then adds those to the "normal" OPR. |
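To make the eOPR2 description concrete, here is a rough sketch. It is not the formula from the "instructions" sheet. It assumes the 2016 playoff values of 20 points per breach and 25 points per capture, and it treats the breach and capture contributions (in ranking points per match) as the expected number of breaches and captures per match. The function name and the sample numbers are hypothetical.

```python
# Assumed 2016 playoff values; check the game manual before relying on them.
BREACH_POINTS = 20
CAPTURE_POINTS = 25

def eopr2_sketch(opr, breach_contribution, capture_contribution):
    """Add assumed point values for breach/capture contributions to the "normal" OPR."""
    return (opr
            + BREACH_POINTS * breach_contribution
            + CAPTURE_POINTS * capture_contribution)

# Hypothetical team: 40 OPR, 0.5 breach RP and 0.2 capture RP per match
print(eopr2_sketch(40.0, 0.5, 0.2))
```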
Re: paper: 4536 scouting database BETA
Week 4 data has been added.
Additionally, I deleted the unnecessary whitespace that was beneath most of the event sheets' data. This will allow sorting to make much more sense and cause the scroll bar to be more appropriately sized. Also, I hadn't realized that Excel saves the position of the last cell selected, which is why seemingly random positions on each page were previously selected upon entering them for the first time. I have now selected the top-left corner cell on each sheet. As always, I appreciate feedback and/or error reports. |
Re: paper: 4536 scouting database BETA
Week 5 data has been added.
I will include the data for the Western Canada regional in the week 6 update. |
Re: paper: 4536 scouting database BETA
Thanks for producing this every week! It is very interesting how closely the results from this data align with scoring averages by type in our scouting data (not a perfect match, but very close). We'll definitely be using it for Championships scouting.
|
Re: paper: 4536 scouting database BETA
I believe there is a good chance that I am currently calculating "tech foul count" and/or "tech fouls drawn" improperly. I will be investigating more tonight, but for the time being assume that these metrics are erroneous.
|
Re: paper: 4536 scouting database BETA
Caleb, thank you for pulling this together every week! Our team has been using it as a "pre-assessment" of teams before each event. We will for sure be using it for CMP!
Thanks again! |
Re: paper: 4536 scouting database BETA
Thank you so much for putting this together. It's been a great tool so far.
|
Re: paper: 4536 scouting database BETA
The reason I thought these were wrong was primarily a result of me forgetting how exactly tech fouls are scored. |
Re: paper: 4536 scouting database BETA
Caleb, this is great stuff, very helpful. The world results sheet makes it great to use for Worlds scouting.
I do have a question on how you've calculated these numbers, though. I'm assuming for a given event you're taking averages, but how do we end up with negative numbers for things like teleop high boulder points, etc.? There are several fields with values like this that I don't understand as the minimum value should really be zero. Can you explain this? Thanks. Also, I'm wondering if you'll be producing a sheet that contains only the teams going to Worlds in St. Louis? That would be helpful for those who are going. Thanks for doing this work! /mike |
Re: paper: 4536 scouting database BETA
Does this data just reflect qualification rounds or also playoff rounds? I assume the latter, but wanted to verify. Also, how are you calculating total points? This seems to be a really low number... /mike |
Re: paper: 4536 scouting database BETA
As to why negative values arise, there are two main reasons. First, recognize that these values represent a given team's contribution to a given category, which is generally not the same as what we conventionally think of as scoring. For example, a team that never takes shots but transports boulders into the courtyard could have a positive value in "teleop Boulders High." Although scouts would never say that this team scored high boulders, if the alliances it is part of tend to score more high goals, its "teleop Boulders High" value might be positive. In the same way, if a team plays in a way that hinders partners from scoring high boulders (by taking balls from them, taking their desired shooting position, running into them, etc.), then this team will have a justifiably lower value in "teleop Boulders High" than the average number of boulders it scored high itself.
The other reason a team could have a negative value in a category boils down to our assumption that every team contributes the same amount every match. This is very clearly false, but it is a reasonable enough approximation that we can still arrive at reasonably good results when making it. If team A never scores in the high goal, but happens to be on the same alliance as a very good shooter in the match where that shooter breaks down, team A will likely receive a small negative value in "teleop Boulders High."
Personally, when I interpret these values, I generally round all negative values up to 0, but YMMV.
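The breakdown scenario can be worked through with a toy example. This is a sketch under simplifying assumptions: two-team alliances, made-up boulder counts, and just enough matches that the system can be solved by substitution rather than least squares. The negative value is exaggerated because team A plays only one match here; with more matches the effect would typically be diluted.

```python
# Hypothetical alliance totals of high boulders.
# S is a strong shooter, B and C never shoot, A never shoots either.
s_plus_b = 4   # S scores 4 alongside B
s_plus_c = 4   # S scores 4 alongside C
b_plus_c = 0   # B and C score nothing together
s_plus_a = 0   # S breaks down in its match with A

# Solve the four equations for the four "constant" contributions
s = (s_plus_b + s_plus_c - b_plus_c) / 2
b = s_plus_b - s
c = s_plus_c - s
a = s_plus_a - s   # A absorbs the shortfall from S's breakdown

def clamp_to_zero(value):
    """Round negative contributions up to 0, as suggested in the post."""
    return max(0.0, value)

print("raw:", {"S": s, "A": a, "B": b, "C": c})
print("A after clamping:", clamp_to_zero(a))
```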
Remember that these numbers represent only the given teams' contribution, not their average alliance's score. Also, remember that playoff scores are calculated differently than qual scores. If you want to approximate a playoff alliance's score, you will likely get better results using my eOPR1 or eOPR2 metrics. |
Re: paper: 4536 scouting database BETA
Looking forward to the next update, thanks. /mike |
Re: paper: 4536 scouting database BETA
Week 7 data has been added.
Per request, I have also added a "championship preview" sheet which contains data on the best event (by OPR) of every team registered for championships as of 5PM CST on 4/18/2016. There is no new information on this sheet, all data are copied directly from the "world results" sheet. I am not planning to release updates if/when the championship team list changes, so you will have to update this sheet yourself. If someone could check the data from the Michigan State Championship against scouting data to see that they roughly correlate, I would appreciate it. When I originally made this database, all of my calculations assumed that no event would have more than 100 teams or more than 200 matches. Thus, I had to modify a few things to accommodate MSC, which makes me nervous that I may have introduced one or more small errors somewhere. Unless someone notices an error, I will not be releasing another update until after championships. |
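Since the championship team list may change, here is a minimal sketch of how the "best event (by OPR)" selection could be redone from rows of the "world results" sheet. The row layout (dictionaries with `team`, `event`, and `opr` keys), the event names, and the OPR values are assumptions for illustration, not the spreadsheet's actual structure.

```python
def best_event_per_team(rows):
    """Keep, for each team, the row from the event where its OPR was highest."""
    best = {}
    for row in rows:
        team = row["team"]
        if team not in best or row["opr"] > best[team]["opr"]:
            best[team] = row
    return best

# Hypothetical rows; a team that attended two events appears twice
world_results = [
    {"team": 1024, "event": "Event A", "opr": 52.1},
    {"team": 1024, "event": "Event B", "opr": 60.4},
    {"team": 5712, "event": "Event C", "opr": 31.0},
]
registered = {1024, 5712}  # hypothetical championship team list

preview = {team: row for team, row in best_event_per_team(world_results).items()
           if team in registered}
print(preview)
```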
Re: paper: 4536 scouting database BETA
By request, I have decided to release an update on Friday night with division preview tabs. I might also do match/ranking predictions using components, but no guarantees. |
Re: paper: 4536 scouting database BETA
I would just like to remind everyone that the changed tower strength at champs means that some of these metrics lose value if they are applied to the championship events. Specifically, teleop Tower Captured, eOPR1, and eOPR2 should not be directly applied to the championship event. I will create a new metric ceOPR (championship elimination OPR) in my Friday update which will be calculated in the same way eOPR1 is currently calculated, but modified to account for the change in tower strength.
|
Re: paper: 4536 scouting database BETA
Hi, I was looking at your scouting info and didn't see my team, 5712, in the championship preview. I also wanted to say how great this is and how useful it will be when checking out alliance partners and opponents.
|
Re: paper: 4536 scouting database BETA
I have uploaded a new championships preview database. It contains an updated teams list in the "championship preview" sheet, as well as divisional preview sheets. All results from these 9 sheets are taken directly from the "world results" sheet.
Additionally, I have created a new metric ceOPR (championship eliminations OPR), which can be used to predict elimination scores at championships. This value is equivalent to (total points) + 2.5*(subtracted tower strength) + 2.5*(cross defense count). This value is only in the world results, championship preview, and divisional preview sheets, not in the previous events. Let me know if you have any questions or concerns. |
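The ceOPR formula stated above translates directly into code. The function and parameter names are illustrative, and the sample contributions are made up.

```python
def ceopr(total_points, subtracted_tower_strength, cross_defense_count):
    """ceOPR = total points + 2.5 * subtracted tower strength + 2.5 * cross defense count."""
    return (total_points
            + 2.5 * subtracted_tower_strength
            + 2.5 * cross_defense_count)

# Hypothetical team: 40 total points, 2 tower strength subtracted, 4 crossings per match
print(ceopr(40.0, 2.0, 4.0))  # 40 + 5 + 10
```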
Re: paper: 4536 scouting database BETA
Caleb, you're awesome for doing this, thanks!
One suggestion: On the championship preview tab, it would be great to have an additional column with what division each team is in. This would allow us to just ingest the data once into any analytics platform and very quickly group the teams into divisions by that field rather than having to load them each separately and join them. Thanks again! /mike |
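As a sketch of what the suggested column would enable: with a division field on each row, the whole preview can be loaded once and grouped in a few lines. The row layout and the team numbers 9001 and 9002 are hypothetical, as is the Newton assignment; only 1024 in Carson comes from this thread.

```python
from collections import defaultdict

def group_by_division(rows):
    """Group team numbers by the value of each row's division field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["division"]].append(row["team"])
    return dict(groups)

# Hypothetical preview rows with the extra division column
preview_rows = [
    {"team": 1024, "division": "Carson"},
    {"team": 9001, "division": "Newton"},
    {"team": 9002, "division": "Carson"},
]
print(group_by_division(preview_rows))
```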
Re: paper: 4536 scouting database BETA
Thanks for the excellent work!
Please check the Carson preview tab. It appears to be blank. |
Re: paper: 4536 scouting database BETA
I have run the calculations for championships, but there seems to be a bit of a discrepancy between my data and what is posted on TBA. Could someone independently run OPR calculations for Carson to see if the error is more likely on my side or on TBA's side? I will be investigating more on my own; the discrepancy seems like it might be related to qualification match #1.
Here is Carson's top 15 OPR according to TBA:
Code:
1024 67.95
Code:
1024 67.75374586 |
Re: paper: 4536 scouting database BETA
Code:
1024 67.94699488 |
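The three values for team 1024 can be compared with a quick check. This sketch assumes that TBA's figure is rounded to two decimals, that 67.75374586 is the database's own figure, and that 67.94699488 is the independently computed one. The function name is hypothetical.

```python
tba = 67.95                # TBA's published value
database = 67.75374586     # assumed: value from this database
independent = 67.94699488  # assumed: independently computed value

def agrees_with_published(value, published, places=2):
    """True if value is within half a unit in the last published decimal place."""
    return abs(value - published) <= 0.5 * 10 ** -places

print(agrees_with_published(independent, tba))
print(agrees_with_published(database, tba))
print(round(independent - database, 8))  # size of the gap between the two calculations
```

Under these assumptions the independent figure is consistent with TBA's rounding while the database figure is not, which would point toward the database side of the discrepancy.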
Re: paper: 4536 scouting database BETA
Is a post-CMP update coming? :-)
|
Re: paper: 4536 scouting database BETA
If the 2834 database isn't updated by Friday, I'll rework my database and publish an update no later than Saturday. |
Re: paper: 4536 scouting database BETA
I have uploaded a final update to this database. This update has tabs for each of the championship divisions. I have also removed all of the championships preview tabs.
I hope everyone who downloaded it found it useful. I am planning to maintain this effort in the upcoming years. I am also planning to spend some more time developing the interface to reach the level of the 1114 and 2834 databases in future versions. I will also be looking to develop new metrics next year, depending on what the game is and what data the API provides. Keep an eye out for a thread near the end of build season next year where I will be asking for feedback on what everyone would like to see calculated. Thanks to teams 1114 and 2834 for providing my inspiration for creating this. Special thanks to Ether for providing the CSV files on which my entire database is founded. None of this would have been possible without him. |