Chief Delphi


Caleb Sykes 18-03-2016 12:41

paper: 4536 scouting database BETA
 
Thread created automatically to discuss a document in CD-Media.

4536 scouting database BETA by Caleb Sykes

Caleb Sykes 18-03-2016 12:42

Re: paper: 4536 scouting database BETA
 
This is a beta test of a scouting database that calculates component calculated contributions (component OPRs) using data from the FIRST API. As this project is still in its infancy, please report any bugs or potential improvements to Caleb Sykes (calebsyk@gmail.com). Each sheet currently contains data from a distinct week 1 event. Starting on 3/21, a new database will be published weekly, containing data from all events up to that date.

Be extremely careful when using the individual defense crossings (columns J-Q on each sheet). At a given event, if a defense is chosen fewer times than there are teams at the event, a #NUM! error will appear. If a defense is chosen fewer than twice as many times as there are teams at the event, place limited faith in the numbers.
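The two thresholds above can be written as a small check. This is only a sketch: the function name and the `times_chosen` and `team_count` parameters are hypothetical, and the spreadsheet itself does not contain this code.

```python
def crossing_reliability(times_chosen: int, team_count: int) -> str:
    """Classify a defense-crossing column using the thresholds stated above.

    times_chosen: how many times the defense was chosen at the event.
    team_count:   how many teams attended the event.
    """
    if times_chosen < team_count:
        # Too few data points: the spreadsheet shows #NUM! in this case.
        return "unusable"
    if times_chosen < 2 * team_count:
        # Solvable, but place limited faith in the numbers.
        return "low"
    return "ok"


# Hypothetical 40-team event.
for chosen in (30, 60, 80):
    print(chosen, crossing_reliability(chosen, 40))
```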

See the "instructions" sheet for more detailed information on what each category represents.

Caleb Sykes 21-03-2016 18:27

Re: paper: 4536 scouting database BETA
 
I have just uploaded version 1 of this database. It is populated with all week 1-3 events.

Besides the additional data, the two main changes since the BETA are:
An alternative calculation of eOPR (elimination OPR) is now included for each team. So there are now two eOPR calculations, which I have dubbed eOPR1 and eOPR2. Details on how these are calculated can be found in the "instructions" sheet. Although I have not verified this, I expect eOPR1 to provide better elimination predictions at weaker events where captures are more infrequent, and eOPR2 to provide better elimination predictions at stronger events where captures are more frequent.

A new "world results" sheet has been added, which allows for component comparisons for every team at every event in which they have competed. Be aware that this list will have duplicates for teams that competed at 2+ events. Also, don't compare individual defense crossing data unless you know what you are doing. For example, team 5114 has a drawbridge contribution of 1722968039259170.00. 5114 is not that good at crossing the drawbridge; this just means that the drawbridge was not chosen frequently enough at Midland for the drawbridge contributions to be meaningful. As a rule of thumb, you can almost always trust the rock wall, sally port, and cheval de frise contributions, but be wary of the others.



Remember, this project is still quite young, and there are very likely errors in places (especially since I have not yet automated everything, and have to do some copying by hand). If you see any errors, please let me know and I will look into it.

Nathan Streeter 22-03-2016 11:38

Re: paper: 4536 scouting database BETA
 
First: very cool spreadsheet! I'm glad to have a resource that looks at the component OPR for pretty much every possible condition! It has all the usual OPR caveats, but it does seem useful for establishing some trends and making some comparisons.

As a sidenote, thank you to FRC HQ for making this data more available for capture. The API certainly provides much better data than the twitter feed over recent years.

I have some questions about the "units" of some columns... I'm pretty sure they're my initial guess for most of them, but I wanted to double-check.

For columns H and I (teleop Capture or Breach), I presume a "1" would indicate a successful Capture/Breach?

For columns J - V and AM (defense crossings), is "1" a single defense traversal (5 pts) or a weakened defense (10 pts, 2 traversals)?

Also, how are eOPR 1 and eOPR 2 calculated? What's the difference? They differ dramatically from the OPRs based solely on match scores.

Caleb Sykes 22-03-2016 11:53

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Nathan Streeter (Post 1561054)
First: very cool spreadsheet! I'm glad to have a resource that looks at the component OPR for pretty much every possible condition! It has all the usual OPR caveats, but it does seem useful for establishing some trends and making some comparisons.

Thank you. I'm glad to see people are benefitting from it.

Quote:

As a sidenote, thank you to FRC HQ for making this data more available for capture. The API certainly provides much better data than the twitter feed over recent years.
Agreed. Thanks also to Ether and team 2834; I have built on the work of both, and this spreadsheet would not be possible without them.

Quote:

I have some questions about the "units" of some columns... I'm pretty sure they're my initial guess for most of them, but I wanted to double-check.

For columns H and I (teleop Capture or Breach), I presume a "1" would indicate a successful Capture/Breach?

For columns J - V and AM (defense crossings), is "1" a single defense traversal (5 pts) or a weakened defense (10 pts, 2 traversals)?

Also, how are eOPR 1 and eOPR 2 calculated? What's the difference? They differ dramatically from the OPRs based solely on match scores.
The "instructions" tab has more detailed descriptions of each category; let me know if anything there is unclear and I can revise it. I'll summarize here anyway.

"teleop Tower Captured" and "teleop Defenses Breached" both have units of ranking points. A 1 in either of these would indicate that the given team contributes an average of 1 ranking point each match.

All categories that have "crossings" in their name have units of crossings, not weakenings. That is, a 2 in any of these categories would indicate that the given team contributed 2 scored CROSSINGS over this defense each match.

eOPR1 and eOPR2 are my rough attempts to compensate for different scoring methods in quals and elims. Since breaches and captures provide points in elims, but not in quals, "normal" OPR probably does a poor job predicting elimination match scores (although this is as yet unverified). eOPR1 essentially makes boulders and crosses scored in quals worth more, and eOPR2 takes breaching/capturing contributions and assigns them point values, and then adds those to the "normal" OPR.
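As a rough illustration of the eOPR2 idea only: the sketch below adds point values for the breach and capture contributions (which are in ranking points) to the normal OPR. The exact weights used in the spreadsheet are documented in its "instructions" sheet and are not reproduced in this thread, so the 20 and 25 point defaults here are an assumption based on the 2016 playoff scoring rules, and the function name is hypothetical.

```python
# ASSUMPTION: playoff point values from the 2016 game rules, not taken
# from the spreadsheet. Pass different weights if the real ones differ.
ASSUMED_BREACH_POINTS = 20.0
ASSUMED_CAPTURE_POINTS = 25.0


def eopr2_sketch(opr, breach_rp, capture_rp,
                 breach_points=ASSUMED_BREACH_POINTS,
                 capture_points=ASSUMED_CAPTURE_POINTS):
    """Normal OPR plus point values for breach/capture contributions.

    breach_rp and capture_rp are a team's calculated contributions to
    "teleop Defenses Breached" and "teleop Tower Captured", in ranking
    points per match.
    """
    return opr + breach_points * breach_rp + capture_points * capture_rp


# Hypothetical team: 40 OPR, 0.5 breach RP and 0.2 capture RP per match.
print(eopr2_sketch(40.0, 0.5, 0.2))
```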

Caleb Sykes 30-03-2016 12:23

Re: paper: 4536 scouting database BETA
 
Week 4 data has been added.

Additionally, I deleted the unnecessary whitespace that was beneath most of the event sheets' data. This will allow sorting to make much more sense and cause the scroll bar to be more appropriately sized.

Also, I hadn't realized that Excel saved the position of the last cell selected, which is why seemingly random positions on each page were previously selected upon entering them for the first time. I have now selected the top-left corner cell on each sheet.

As always, I appreciate feedback and/or error reports.

Caleb Sykes 05-04-2016 13:07

Re: paper: 4536 scouting database BETA
 
Week 5 data has been added.

I will include the data for the Western Canada regional in the week 6 update.

Ben Martin 12-04-2016 10:13

Re: paper: 4536 scouting database BETA
 
Thanks for producing this every week! It is very interesting how the results from this data align very closely with scoring averages by type in our scouting data (not a perfect match, but very close)--we'll definitely be using it for Championships scouting.

Caleb Sykes 13-04-2016 14:58

Re: paper: 4536 scouting database BETA
 
I believe there is a good chance that I am currently calculating "tech foul count" and/or "tech fouls drawn" improperly. I will be investigating more tonight, but for the time being assume that these metrics are erroneous.

Dancin103 13-04-2016 15:38

Re: paper: 4536 scouting database BETA
 
Caleb, thank you for pulling this together every week! Our team has been using it as a "pre-assessment" of teams before each event. We will for sure be using it for CMP!

Thanks again!

hutchMN 13-04-2016 16:13

Re: paper: 4536 scouting database BETA
 
Thank you so much for putting this together. It's been a great tool so far.

Caleb Sykes 13-04-2016 17:19

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Ben Martin (Post 1571753)
Thanks for producing this every week! It is very interesting how the results from this data align very closely with scoring averages by type in our scouting data (not a perfect match, but very close)--we'll definitely be using it for Championships scouting.

The results align reasonably well with our data as well. We got the most similar results when rounding all negative values up to 0 and rounding all positive values down to the nearest 0.1 or 0.2.

Quote:

Originally Posted by Dancin103 (Post 1572598)
Caleb, thank you for pulling this together every week! Our team has been using it as a "pre-assessment" of teams before each event. We will for sure be using it for CMP!

Thanks again!

Quote:

Originally Posted by hutchMN (Post 1572616)
Thank you so much for putting this together. It's been a great tool so far.

Thanks for the compliments. I'm glad to hear it is getting some good use.

Caleb Sykes 13-04-2016 17:21

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Caleb Sykes (Post 1572578)
I believe there is a good chance that I am currently calculating "tech foul count" and/or "tech fouls drawn" improperly. I will be investigating more tonight, but for the time being assume that these metrics are erroneous.

Never mind, I think these are still being calculated properly.

The reason I thought these were wrong was primarily a result of me forgetting how exactly tech fouls are scored.

mitchellzone 17-04-2016 13:08

Re: paper: 4536 scouting database BETA
 
Caleb, this is great stuff, very helpful. The world results sheet makes it great to use for Worlds scouting.

I do have a question on how you've calculated these numbers, though. I'm assuming for a given event you're taking averages, but how do we end up with negative numbers for things like teleop high boulder points, etc.? There are several fields with values like this that I don't understand as the minimum value should really be zero.

Can you explain this? Thanks.

Also, I'm wondering if you'll be producing a sheet that contains only the teams going to Worlds in St. Louis? That would be helpful for those who are going. Thanks for doing this work!

/mike

mitchellzone 17-04-2016 13:51

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by mitchellzone (Post 1574261)
I do have a question on how you've calculated these numbers, though.

Thought of another couple:

Does this data just reflect qualification rounds or also playoff rounds? I assume the latter, but wanted to verify.

Also, how are you calculating total points? This seems to be a really low number...

/mike

Caleb Sykes 17-04-2016 14:40

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by mitchellzone (Post 1574261)
I do have a question on how you've calculated these numbers, though. I'm assuming for a given event you're taking averages, but how do we end up with negative numbers for things like teleop high boulder points, etc.? There are several fields with values like this that I don't understand as the minimum value should really be zero.

Can you explain this? Thanks.

These numbers are calculated using a least-squares approximation on qualification scores assuming that every team contributes the same amount to the selected category in every match. This value is each team's calculated contribution (or OPR) in that category. The only inputs to the algorithm are the category scoring breakdown per match and the match schedule. For more detail on how OPR is calculated, see the first link on this page titled "Presentation to explain new scouting database."
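To make the least-squares step concrete, here is a minimal sketch in plain Python. It is not the spreadsheet's implementation: the function name, the team labels, and the scores are made up for illustration. Each row is one alliance in one match, and the solver finds per-team contributions whose sums best match the alliance scores (in practice a library routine such as NumPy's `lstsq` would normally be used instead of hand-written elimination).

```python
def solve_contributions(alliance_rows, teams):
    """Least-squares team contributions from alliance scores.

    alliance_rows: list of (teams_on_alliance, alliance_score), one entry
                   per alliance per qualification match.
    Builds the normal equations (A^T A) x = A^T b, where A has a 1 for
    each team on the alliance, and solves them by Gauss-Jordan elimination.
    """
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    ata = [[0.0] * n for _ in range(n)]
    atb = [0.0] * n
    for alliance, score in alliance_rows:
        for t in alliance:
            atb[index[t]] += score
            for u in alliance:
                ata[index[t]][index[u]] += 1.0

    # Augmented matrix [A^T A | A^T b], reduced with partial pivoting.
    aug = [row + [rhs] for row, rhs in zip(ata, atb)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-9:
            raise ValueError("schedule does not determine every contribution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return {team: aug[index[team]][n] / aug[index[team]][index[team]]
            for team in teams}


# Made-up example: four teams, every three-team combination plays once.
teams = ["A", "B", "C", "D"]
rows = [(["A", "B", "C"], 60.0), (["A", "B", "D"], 70.0),
        (["A", "C", "D"], 80.0), (["B", "C", "D"], 90.0)]
print(solve_contributions(rows, teams))
```

In this toy schedule the scores are exactly consistent with contributions of 10, 20, 30 and 40, so the fit recovers them. Real match data is noisy, so the fitted values are only approximations.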

As to why negative values arise, there are two main reasons this could occur. First, recognize that these values represent a given team's contribution to a given category, which is generally not equivalent to what we conventionally think of as scoring. For example, a team which never takes shots, but transports boulders into the courtyard, could have a positive value in "teleop Boulders High." Although scouts would never say that they scored boulders high, if alliances which they are a part of tend to score more high goals, their "teleop Boulders High" value might be positive. In the same way, if a team plays the game in a way that hinders partners from scoring high boulders (by taking balls from them, taking their desired shooting position, running into them, etc...) then this team will have a justifiably lower score in "teleop Boulders High" than just the average number of boulders they themselves scored high.

The other reason a team could have a negative value in a category boils down to our assumption that every team contributes the same amount every match. This is very clearly false, but it is a reasonable enough approximation that we can still arrive at reasonably good results when making it. If team A never scores in the high goal, but happens to be on the same alliance as a very good shooter in the same match that the shooter breaks down, team A will likely receive a small negative value in "teleop Boulders High."

Personally, when I interpret these values, I generally round all negative values up to 0, but YMMV.
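For anyone who wants to apply that same rounding to a whole column, a one-line helper is enough. The function name and the sample values below are hypothetical.

```python
def clamp_negative_contributions(contributions):
    """Round every negative calculated contribution up to 0."""
    return {team: max(0.0, value) for team, value in contributions.items()}


# Made-up "teleop Boulders High" contributions for three teams.
sample = {"A": 2.4, "B": -0.3, "C": 0.0}
print(clamp_negative_contributions(sample))
```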

Quote:

Also, I'm wondering if you'll be producing a sheet that contains only the teams going to Worlds in St. Louis? That would be helpful for those who are going.
Good idea. I will include a sheet like this in my next update.

Quote:

Originally Posted by mitchellzone (Post 1574294)
Thought of another couple:

Does this data just reflect qualification rounds or also playoff rounds? I assume the latter, but wanted to verify.

I factor in only qualification rounds. There are a number of reasons for this, many of which are described by Ed Law here. The reasons there are important, but the largest reason for me is that using qualification matches only has become the de facto standard for calculations like these, and it is important to me that my scores are equivalent to those listed on TBA, the 2834 database, and the 1114 database.

Quote:

Also, how are you calculating total points? This seems to be a really low number...
Total points is actually equivalent to OPR. This number represents the calculated contribution to the match scores of a team throughout their qualification matches.

Remember that these numbers represent only the given team's contribution, not their alliance's average score. Also, remember that playoff scores are calculated differently than qual scores. If you want to approximate a playoff alliance's score, you will likely get better results using my eOPR1 or eOPR2 metrics.

mitchellzone 18-04-2016 09:12

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Caleb Sykes (Post 1574321)
These numbers are calculated using a least-squares approximation on qualification scores assuming that every team contributes the same amount to the selected category in every match. This value is each team's calculated contribution (or OPR) in that category.

Okay, knowing these are all calculated similarly to OPR makes sense and explains the numbers, probably including the next one I was going to ask about which was why challenge/scale likelihood was often >1.0.

Looking forward to the next update, thanks.

/mike

Caleb Sykes 18-04-2016 18:48

Re: paper: 4536 scouting database BETA
 
Week 7 data has been added.

Per request, I have also added a "championship preview" sheet which contains data on the best event (by OPR) of every team registered for championships as of 5PM CST on 4/18/2016. There is no new information on this sheet; all data are copied directly from the "world results" sheet. I am not planning to release updates if/when the championship team list changes, so you will have to update this sheet yourself.
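If you do need to rebuild the sheet yourself, picking each team's best event by OPR from rows shaped like the "world results" sheet might look like the sketch below. The field names (`team`, `event`, `opr`) and all of the sample values are assumptions for illustration, not the spreadsheet's actual column names or data.

```python
def best_event_per_team(rows):
    """Keep, for each team, the row from the event with the highest OPR."""
    best = {}
    for row in rows:
        team = row["team"]
        if team not in best or row["opr"] > best[team]["opr"]:
            best[team] = row
    return best


# Made-up rows: team 4536 appears twice because it attended two events.
sample_rows = [
    {"team": 4536, "event": "event_a", "opr": 31.2},
    {"team": 4536, "event": "event_b", "opr": 38.7},
    {"team": 2834, "event": "event_a", "opr": 45.0},
]
for team, row in best_event_per_team(sample_rows).items():
    print(team, row["event"], row["opr"])
```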

If someone could check the data from the Michigan State Championship against scouting data to see that they roughly correlate, I would appreciate it. When I originally made this database, all of my calculations assumed that no event would have more than 100 teams or more than 200 matches. Thus, I had to modify a few things to accommodate MSC, which makes me nervous that I may have introduced one or more small errors somewhere.

Unless someone notices an error, I will not be releasing another update until after championships.

Caleb Sykes 19-04-2016 11:32

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Caleb Sykes (Post 1575060)
If someone could check the data from the Michigan State Championship against scouting data to see that they roughly correlate, I would appreciate it. When I originally made this database, all of my calculations assumed that no event would have more than 100 teams or more than 200 matches. Thus, I had to modify a few things to accommodate MSC, which makes me nervous that I may have introduced one or more small errors somewhere.

These values have been verified to be correct. Ether ran calculations independently that provided results which matched the results in this database.

Caleb Sykes 19-04-2016 13:04

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Caleb Sykes (Post 1575060)
Unless someone notices an error, I will not be releasing another update until after championships.

Update on updates.
By request, I have decided to release an update on Friday night with division preview tabs. I might also do match/ranking predictions using components, but no guarantees.

Caleb Sykes 20-04-2016 12:45

Re: paper: 4536 scouting database BETA
 
I would just like to remind everyone that the changed tower strength at champs means that some of these metrics lose value if they are applied to the championship events. Specifically, teleop Tower Captured, eOPR1, and eOPR2 should not be directly applied to the championship event. I will create a new metric ceOPR (championship elimination OPR) in my Friday update which will be calculated in the same way eOPR1 is currently calculated, but modified to account for the change in tower strength.

joeojazz 20-04-2016 21:43

Re: paper: 4536 scouting database BETA
 
Hi, I was looking at your scouting info and didn't see my team, 5712, in the championship preview. I also wanted to say how great this is and how useful it will be when checking out alliance partners and opponents.

Caleb Sykes 20-04-2016 22:11

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by joeojazz (Post 1576355)
Hi, I was looking at your scouting info and didn't see my team, 5712, in the championship preview.

Quote:

Originally Posted by Caleb Sykes (Post 1575060)
Per request, I have also added a "championship preview" sheet which contains data on the best event (by OPR) of every team registered for championships as of 5PM CST on 4/18/2016. There is no new information on this sheet; all data are copied directly from the "world results" sheet. I am not planning to release updates if/when the championship team list changes, so you will have to update this sheet yourself.

Your team will be included in this tab in the Friday update.

Caleb Sykes 22-04-2016 21:51

Re: paper: 4536 scouting database BETA
 
I have uploaded a new championships preview database. It contains an updated teams list in the "championship preview" sheet, as well as divisional preview sheets. All results from these 9 sheets are taken directly from the "world results" sheet.

Additionally, I have created a new metric ceOPR (championship eliminations OPR), which can be used to predict elimination scores at championships. This value is equivalent to (total points) + 2.5*(subtracted tower strength) + 2.5*(cross defense count). This value is only in the world results, championship preview, and divisional preview sheets, not in the previous events.
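Written out as code, the formula above is a single weighted sum. The function and parameter names are just labels for the three spreadsheet categories named in the formula, and the sample inputs are made up.

```python
def ceopr(total_points, subtracted_tower_strength, cross_defense_count):
    """ceOPR as stated above:
    (total points) + 2.5*(subtracted tower strength) + 2.5*(cross defense count)
    """
    return (total_points
            + 2.5 * subtracted_tower_strength
            + 2.5 * cross_defense_count)


# Hypothetical team: 40 total points, 4 tower strength subtracted and
# 6 defense crossings contributed per match.
print(ceopr(40.0, 4.0, 6.0))
```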

Let me know if you have any questions or concerns.

jajabinx124 22-04-2016 22:04

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Caleb Sykes (Post 1577520)
I have uploaded a new championships preview database. It contains an updated teams list in the "championship preview" sheet, as well as divisional preview sheets. All results from these 9 sheets are taken directly from the "world results" sheet.

Additionally, I have created a new metric ceOPR (championship eliminations OPR), which can be used to predict elimination scores at championships. This value is equivalent to (total points) + 2.5*(subtracted tower strength) + 2.5*(cross defense count). This value is only in the world results, championship preview, and divisional preview sheets, not in the previous events.

Let me know if you have any questions or concerns.

Just wanna say thanks for making this scouting database and making a championship preview version of it! This makes pre-scouting much easier and these stats will be useful.

Travis Hoffman 22-04-2016 22:22

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Caleb Sykes (Post 1577520)
I have uploaded a new championships preview database. It contains an updated teams list in the "championship preview" sheet, as well as divisional preview sheets. All results from these 9 sheets are taken directly from the "world results" sheet.

Additionally, I have created a new metric ceOPR (championship eliminations OPR), which can be used to predict elimination scores at championships. This value is equivalent to (total points) + 2.5*(subtracted tower strength) + 2.5*(cross defense count). This value is only in the world results, championship preview, and divisional preview sheets, not in the previous events.

Let me know if you have any questions or concerns.

You are quite the awesome person for doing this. Thank you kindly.

mitchellzone 23-04-2016 08:21

Re: paper: 4536 scouting database BETA
 
Caleb, you're awesome for doing this, thanks!

One suggestion: On the championship preview tab, it would be great to have an additional column with what division each team is in. This would allow us to just ingest the data once into any analytics platform and very quickly group the teams into divisions by that field rather than having to load them each separately and join them.

Thanks again!

/mike

Wayne TenBrink 24-04-2016 07:58

Re: paper: 4536 scouting database BETA
 
Thanks for the excellent work!

Please check the Carson preview tab. It appears to be blank.

Caleb Sykes 24-04-2016 11:49

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Wayne TenBrink (Post 1578091)
Thanks for the excellent work!

Please check the Carson preview tab. It appears to be blank.

It displays fine for me. Do all of the other division preview sheets display properly for you?

Wayne TenBrink 24-04-2016 22:46

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Caleb Sykes (Post 1578122)
It displays fine for me. Do all of the other division preview sheets display properly for you?

It's all there now. Third download was the charm. Thanks again.

Caleb Sykes 01-05-2016 19:29

Re: paper: 4536 scouting database BETA
 
I have run the calculations for championships, but there seems to be a bit of a discrepancy between my data and what is posted on TBA. Could someone independently run OPR calculations for Carson to see if the error is more likely on my side or on TBA's side? I will be investigating more on my own; the discrepancy seems like it might be related to qualification match #1.

Here is Carson's top 15 OPR according to TBA:

Code:

1024        67.95
868        64.40
973        59.73
225        57.22
2052        57.07
610        56.95
2122        56.15
2590        54.44
5895        54.13
41        52.82
2067        51.82
3824        50.56
2137        49.56
3538        47.72
2474        47.07

Here is the top 15 OPR according to my calculations:

Code:

1024        67.75374586
868        63.84195886
2122        59.32732553
973        59.02368093
225        57.49356224
610        56.71770781
2052        56.52062215
2590        54.57781003
5895        53.93177915
41        52.86710115
2067        51.99822564
3824        50.64961634
2137        49.26312626
3538        47.7814427
2474        47.15846249
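One quick way to see where two such lists disagree is to diff them team by team. The sketch below uses the two top-15 lists posted above; the function name is hypothetical.

```python
# OPR values copied from the two lists above (TBA vs. this database).
tba = {1024: 67.95, 868: 64.40, 973: 59.73, 225: 57.22, 2052: 57.07,
       610: 56.95, 2122: 56.15, 2590: 54.44, 5895: 54.13, 41: 52.82,
       2067: 51.82, 3824: 50.56, 2137: 49.56, 3538: 47.72, 2474: 47.07}
mine = {1024: 67.75374586, 868: 63.84195886, 2122: 59.32732553,
        973: 59.02368093, 225: 57.49356224, 610: 56.71770781,
        2052: 56.52062215, 2590: 54.57781003, 5895: 53.93177915,
        41: 52.86710115, 2067: 51.99822564, 3824: 50.64961634,
        2137: 49.26312626, 3538: 47.7814427, 2474: 47.15846249}


def largest_differences(a, b, count=3):
    """Return (team, b - a) for teams in both lists, largest gap first."""
    shared = a.keys() & b.keys()
    diffs = [(team, b[team] - a[team]) for team in shared]
    return sorted(diffs, key=lambda pair: abs(pair[1]), reverse=True)[:count]


for team, diff in largest_differences(tba, mine):
    print(f"{team}: {diff:+.2f}")
```

With these values, team 2122 shows by far the largest gap (about 3.2 points), while every other team in the top 15 differs by less than 1 point.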


Ether 01-05-2016 21:01

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Caleb Sykes (Post 1581251)
Could someone independently run OPR calculations for Carson

Code:

1024        67.94699488
868        64.39937013
973        59.72708602
225        57.21656682
2052        57.06823258
610        56.94982187
2122        56.15330503
2590        54.44241751
5895        54.13186804
41        52.82497403
2067        51.82418186
3824        50.55946533
2137        49.55923262
3538        47.71879089
2474        47.06784238
1918        47.06371556
4028        45.59624045
904        45.41944435
1718        45.29367738
1625        43.90561045
4362        43.58606725
2996        43.53157842
2655        43.28570403
525        43.20450209
2403        43.1175526
2771        42.92284815
3970        42.72891288
135        42.57181239
5907        42.13118592
1987        41.79576813
4264        41.54625885
2486        41.10297812
3688        41.04522675
5167        40.89402704
319        40.27903932
4131        40.17573866
1619        39.56191031
2485        37.35922029
1533        36.90397632
1137        36.73476528
6098        35.99548457
233        34.91407358
6144        34.34974789
5663        32.20977958
5913        31.95709232
60        31.8850688
1156        31.25624534
1258        30.90491415
5084        30.58091357
5332        28.5437442
5454        28.51064153
1126        28.40910065
2761        28.31012272
2445        28.30861263
1159        26.93097019
2202        26.2979391
5879        25.36524886
5712        25.16147429
6025        24.98551622
2978        24.36624945
3352        22.86652503
4592        22.59269553
11        22.58569118
4121        22.17481602
296        22.09641455
1939        21.41832877
4026        20.96318342
4135        20.9368346
5572        20.35016881
3021        20.24024128
2526        19.82585398
5897        19.74217565
746        18.84443104
51        17.41725192
1369        7.854537579


Nathan Streeter 03-05-2016 11:38

Re: paper: 4536 scouting database BETA
 
Is a post-CMP update coming? :-)

Caleb Sykes 03-05-2016 14:40

Re: paper: 4536 scouting database BETA
 
Quote:

Originally Posted by Nathan Streeter (Post 1582428)
Is a post-CMP update coming? :-)

I normally update within a day of the 2834 scouting database being updated, because the "world results" page uses event information from it. So if the answer to this question is yes, I will update within a day after the 2834 update. If there will be no update to the 2834 database, I will have to rework some things, so it will take me a bit longer.

If the 2834 database isn't updated by Friday, I'll rework my database and publish an update no later than Saturday.

Caleb Sykes 04-05-2016 19:18

Re: paper: 4536 scouting database BETA
 
I have uploaded a final update to this database. This update has tabs for each of the championship divisions. I have also removed all of the championships preview tabs.

I hope everyone who downloaded it found it useful. I am planning to maintain this effort in the upcoming years. I am also planning to spend some more time developing the interface to reach the level of the 1114 and 2834 databases in future versions. I will also be looking to develop new metrics next year, depending on what the game is and what data the API provides. Keep an eye out for a thread near the end of build season next year where I will be asking for feedback on what everyone would like to see calculated.

Thanks to teams 1114 and 2834 for providing my inspiration for creating this. Special thanks to Ether for providing the CSV files on which my entire database is founded. None of this would have been possible without him.


All times are GMT -5.
