Teams scoring vast majority of points

I was joking with some students that Recycle Rush is a good model of the US economic system, where a small percentage of participants controls the vast majority of the wealth… but maybe it's not a joke.

Has anyone looked at what % of teams score 80% of the points? Is there a way to easily look at how many total points team XXXX scored at an event compared to the total points scored at that event?

At the LA Regional, Team 330, Team 1197, and Team 973 seemed to overshadow the scoring of the other 63 teams. Maybe they didn’t score 80% of all points scored, but it was quite a bit.

Challenge: calculate, for each regional, what percent of ALL points scored were scored by the top 3 SCORING teams. (This is not just the top three ranked by average score, but the top three individual scorers.)
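If team-level scoring data were available, the calculation itself would be trivial. Here's a minimal sketch, with a purely hypothetical point breakdown (the numbers below are made up, not real event data):

```python
# Minimal sketch of the challenge above, assuming per-team points-scored totals
# were available for an event. The team_points dict is purely hypothetical.

def top_n_share(team_points, n=3):
    """Fraction of all points at an event scored by the top n individual scorers."""
    totals = sorted(team_points.values(), reverse=True)
    return sum(totals[:n]) / sum(totals)

# Made-up per-team point totals for one event (not real data)
team_points = {"330": 950, "1197": 820, "973": 780, "AAAA": 240,
               "BBBB": 130, "CCCC": 95, "DDDD": 60}
print(f"Top 3 scorers: {top_n_share(team_points):.1%} of all points scored")
```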

I would absolutely love to see this (and I have a hunch that you’re right), but this requires data beyond what’s available through FIRST. We’d need to find at least one team per event (preferably more) that has team-level data for every match, then go from there.

Actually, now that I think about it, maybe the good people behind frcscout.com could make this happen - they’ve got a lot of data in a standardized format.

You could try it with OPR. Not perfect, but might be interesting anyway. Insert usual caveats about OPR.

I would suggest OPR as well. It's the best way we have so far to tease out each team's individual contribution without actually watching every match.
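For context, OPR is typically computed as a least-squares fit of per-team contributions to alliance scores: one row per alliance per match, with 1s for the teams on that alliance and the alliance score on the right-hand side. A rough sketch, with made-up match data:

```python
# Rough sketch of a standard OPR calculation: least-squares fit of per-team
# contributions to alliance scores. All match data below is made up.
import numpy as np

def compute_opr(matches, teams):
    """matches: list of (alliance_teams, alliance_score); one entry per alliance per match."""
    idx = {t: i for i, t in enumerate(teams)}
    A = np.zeros((len(matches), len(teams)))
    b = np.zeros(len(matches))
    for row, (alliance, score) in enumerate(matches):
        for t in alliance:
            A[row, idx[t]] = 1.0
        b[row] = score
    # Least-squares solution; nothing stops it from going negative for weak teams
    opr, *_ = np.linalg.lstsq(A, b, rcond=None)
    return dict(zip(teams, opr))

teams = ["A", "B", "C", "D", "E", "F"]   # hypothetical field
matches = [(["A", "B", "C"], 120), (["D", "E", "F"], 55),
           (["A", "D", "E"], 90), (["B", "C", "F"], 85)]
print(compute_opr(matches, teams))
```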

I decided to do some quick analysis with QA data from weeks 1 - 3 since I had it (and didn’t have OPR):

  • The top 10 teams have a higher total QA than the bottom 62 teams

  • The top 0.5% of teams have a higher total QA than the bottom 3% of teams

  • The top 20 teams have a higher total QA than the bottom 104 teams

  • The top 1% of teams have a higher total QA than the bottom 5% of teams

  • The top 241 teams have 80% of the total QA

  • The top 11.81% of teams have 80% of the total QA

  • 1114's QA at Waterloo right now (197.87) is higher than the sum of the 13 lowest QAs from weeks 1 - 3.

It's actually not as unevenly distributed as I expected. I'm guessing the averaging in QA plays a role: top teams get pulled down by lower-scoring partners, and vice versa. It'll definitely be interesting to see how OPR works out.

Notes:
“top” and “bottom” just refer to the highest and lowest QAs, not to any other aspects of those teams.
Percentages for the first two statistics are rounded to the nearest 0.5%.
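For anyone who wants to reproduce these cumulative stats from their own list of QAs (or OPRs), here's a rough sketch; the values in it are placeholders, not the actual week 1 - 3 data:

```python
# Sketch of the cumulative statistics above for an arbitrary list of QAs.
# The qas list below is a placeholder, not the real week 1 - 3 data.

def share_of_top(values, top_n):
    """Fraction of the grand total held by the top_n highest values."""
    ordered = sorted(values, reverse=True)
    return sum(ordered[:top_n]) / sum(ordered)

def teams_for_share(values, target=0.80):
    """How many of the highest values it takes to cover `target` of the total."""
    ordered = sorted(values, reverse=True)
    total, running = sum(ordered), 0.0
    for count, v in enumerate(ordered, start=1):
        running += v
        if running >= target * total:
            return count
    return len(ordered)

qas = [197.87, 150, 122, 96, 74, 58, 41, 33, 20, 12]  # placeholder values
print(f"Top 2 hold {share_of_top(qas, 2):.1%} of the total QA")
print(f"{teams_for_share(qas, 0.80)} teams cover 80% of the total QA")
```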

Yeah, looking at QA or OPR is pretty much all that can be done, unless someone is taking perfect scouting data at an event.

Looking at FRCScout.com, there are some errors (330 has zero points listed for some matches that definitely weren't zero). Maybe I'll try to get this hooked up for the new Ventura Regional next week. There should be some heavy hitters outscoring the majority of teams.

It's kinda this way every year…

% of OPR sum vs % of Teams for weeks 1 thru 3 Match Results data

[graph]
Maybe I’m not “getting” this graph. It seems to show that as you include all teams in the OPR sum, you get the total OPR sum…

OPR Histogram

[graph]
You asked for:

… and that’s what the plot shows (based on OPR, as suggested by previous posters)

On the Y axis, look for 80%. Trace horizontally from there until you hit the red line, then go straight down. The answer: 80% of the total OPR comes from 47% of the teams.

It’s interesting to see a cumulative distribution go above 100% then decline!

I understand why that's happening (negative OPRs), but it does underscore that OPR can be an odd beast.

Is the reason the % of OPR sum crosses 100% at ~80% of teams due to interpolation error? Is it true that 80% of teams account for 100% of OPR (in other words, that 1 in 5 teams never score anything)?

Not sure OPR is necessarily the best metric to use here, given that every team on an alliance benefits from an individual's good performance. Alliances make the individual harder to judge (a team only capable of "capping" stacks with bins but not producing its own stacks will only score well when teamed with a robot that can build large stacks, for instance). OPR is also negative for some teams, while points scored cannot be negative (except for penalties).

In other words, I feel the actual scoring of the best teams (the 1%) is more lopsided relative to the 99% than Ether's graph suggests. But without more specific data this is hard to know.

No. It’s because some teams have negative OPR.

Is it true that 80% of teams account for 100% of OPR

The sum of the OPRs of the top (highest-OPR) ~78% of the teams equals 100% of the sum of the OPRs of all the teams.

(in other words, 1 in 5 teams never score anything)

You can’t draw that conclusion from the graph. That’s not how OPR works.
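A tiny made-up example of why the cumulative curve overshoots 100% once negative OPRs are in the mix:

```python
# Made-up illustration: once some OPRs are negative, the running share of the
# highest OPRs overshoots 100% before the negative values pull it back down.
oprs = [60, 45, 30, 20, 5, -4, -6]        # already sorted high to low
total = sum(oprs)                         # 150
running = 0
for count, opr in enumerate(oprs, start=1):
    running += opr
    print(f"top {count} of {len(oprs)} teams: {running / total:.0%} of the total OPR")
# The top 5 teams already sum to 160, i.e. about 107% of the 150 total.
```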

Not sure OPR is necessarily the best metric to use here

This was discussed in earlier posts in this thread. Can you suggest something better, for which the data is available?

That's really cool, thank you! The negative OPR threw me off for a while; it's strange how the top 90% of teams have a higher total OPR than all of the teams combined.

Just some statistics I got from OPR data (I used column G on the “OPR results” tab from Ed Law’s spreadsheet):

  • The top 10 teams have a higher total OPR than the bottom 516 teams

  • The top 0.5% of teams have a higher total OPR than the bottom 28.5% of teams

  • The top 20 teams have a higher total OPR than the bottom 603 teams

  • The top 1% of teams have a higher total OPR than the bottom 33% of teams

  • The top 845 teams (47%) have 80% of the total OPR

  • The top 388 teams (21%) have 50% of the total OPR

  • The top 143 teams (8%) have 25% of the total OPR

What's really interesting is that, comparing the stats above to the ones I had for QA, OPR seems to fall off more quickly at the top but have a larger "middle" section. I feel like part of the issue is that the OPR data I had only took the higher value for teams that have competed twice, while the QA data had everything.

Graphing OPR vs QA gave me this:

[graph]

Which seems to have a steeper curve in OPR, but not by as much as the stats imply. If I have some time I'll redo the OPR calculations with full data and check what I did for the QA ones.





Could try this if you can point me to the raw data (I only found the Team 2834 OPR generation scouting database):

Calculate an “effective QA” for each team by:

  • For each match, sum the final QA of all teams in the alliance
  • For each team in the alliance, estimate their personal contribution as the alliance's score split in proportion to each team's original QA
  • Calculate each team's effective individual QA by averaging over all the matches they played at their competition (to normalize for the different number of matches played at different events)

For example:

Team 1 QA = 95
Team 2 QA = 38
Team 3 QA = 56

Sum is 189
Match 1 Score = 87

Match 1, Team 1 “effective individual QA” = 95/189 * 87 = 43.7
Match 1, Team 2 “effective individual QA” = 38/189 * 87 = 17.5
Match 1, Team 3 “effective individual QA” = 56/189 * 87 = 25.8

In this case, teams with higher QAs get more credit for points in matches where they played with normally underperforming robots. Also, the sum over all teams represents the actual (normalized per regional) number of points scored at regionals, which more directly answers the OP's question.
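Here's a minimal Python sketch of that "effective QA" calculation, reproducing the worked example above (the team names and single-match schedule are just the example data):

```python
# Minimal sketch of the proposed "effective QA", reproducing the worked example.
# qa maps each team to its final qualification average; matches lists each
# alliance with the score it posted.
from collections import defaultdict

def effective_qa(qa, matches):
    credited = defaultdict(list)
    for alliance, score in matches:
        qa_sum = sum(qa[t] for t in alliance)
        for t in alliance:
            # Split the alliance score in proportion to each team's original QA
            credited[t].append(qa[t] / qa_sum * score)
    # Average over all matches a team played to normalize for schedule length
    return {t: sum(shares) / len(shares) for t, shares in credited.items()}

qa = {"Team 1": 95, "Team 2": 38, "Team 3": 56}
matches = [(["Team 1", "Team 2", "Team 3"], 87)]
print(effective_qa(qa, matches))   # ≈ {'Team 1': 43.7, 'Team 2': 17.5, 'Team 3': 25.8}
```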

Did you call?

It was a bit harder than I thought it would be to make this visualization because I wanted to make it automatic and customizable.

Here is an interactive visualization (drag the slider to see the contributions from top nth teams)

https://public.tableau.com/profile/enzoman34#!/vizhome/TopnTeamContributions/TopTeamContributions

And here is a picture for those who have slower internet connections or just want to see a pretty graph.

http://imgur.com/gallery/GcfEz80/

Note: I filtered out any event that had fewer than 30 matches scouted in it. I could put them back in, but I trust the data for larger events more.

This was actually super fun to make. PLEASE tell your friends to use this app. If we can get more regionals in the database, frcscout.com could be a census of FRC. If anyone else is as big of a data nerd as I am, that would be a VERY exciting new opportunity for some awesome stats.

Your "effective QA" is essentially a simplified version of OPR.

My team has an online OPR calculator that does something very similar to this already. We show what percentage a team contributed to their qual average (an interesting thing we noticed is that at most regionals, two-thirds of teams contribute <33% of their totals). It should be easy to add exactly what you're asking for.

Auto points are probably even more concentrated at the top than total points.

After SCH District, I took a quick look at the 144 qual auto points (sum of Ranking page auto points / 3) scored there. If you take out the matches involving three robots: 225 (whose stacker scored the majority of the points), 486 (consistent tote & can shove), and 365 (occasionally got 2 step cans into the auto zone), there are only 28 points left. That's the top ~9% of teams involved in ~80% of the auto points. Of course, that is just one small event.
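A quick back-of-the-envelope check on those numbers (the 144 and 28 come from the observation above; the ~34-team field size is inferred from the ~9% figure, not taken from the official team list):

```python
# Quick arithmetic check of the figures above. The 144 total and 28 remaining
# auto points come from the post; the ~34-team field is inferred from the ~9%
# figure (3 / 34 ≈ 8.8%), not taken from the official event data.
total_auto = 144
left_without_top3 = 28
print(f"Auto points involving the top 3 robots: {(total_auto - left_without_top3) / total_auto:.0%}")
print(f"3 robots out of ~34 teams: {3 / 34:.0%} of the field")
```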