# paper: How Accurate is OPR?

Thread created automatically to discuss a document in CD-Media.

How Accurate is OPR?
by: brennonbrimhall

By taking Team 20’s scouting data and comparing it to OPRs, we are able to quantify how accurate OPR really was for 2013. The results are very informative, and it is concluded that OPR should not be used to predict a team’s scoring output at an event in lieu of real scouting.

How Accurate is OPR.pdf (438 KB)
Dataset.xlsx (14.3 KB)

How did you validate the accuracy of your scouting data? Why did you choose to use percent error, rather than absolute error?

A few more things that would be interesting to look at:

Do the results change if you only look at the top 24 or 30 teams at an event (the teams you would be considering when forming a pick list)?
Is there an OPR at which it becomes more accurate? Looking at chart 13, OPRs above 15 seem much better than those below 15 (obviously game dependent).
Can you quantify the percent chance that Team A is better than Team B, given a specific OPR difference? (For example, Team A has an OPR that is 1 higher than Team B’s and has a 55% chance of being better than Team B, but Team C has an OPR that is 10 higher than Team B’s and has a 90% chance of being better than Team B.)
Does the percent error histogram still look normal if you discard the outliers and put more bins between -100% and +200%?

These are all good things to think about. Particularly, you may want to reconsider your use of percent error as opposed to absolute error. I’m estimating here, but it looks like absolute error would’ve been pretty consistent regardless of true scoring. Take a look at the correlation of those; I’d bet very little of the variation in absolute error is explained by variation in true scoring average.

If there were such a correlation, you would have noticed it in the residual plot for your OPR-True Average linear regression. I didn’t notice a residual plot in your pdf; they are essential for determining if your model is a good fit for the data.
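To make the percent-versus-absolute-error comparison concrete, here is a rough Python sketch of the check suggested above. The team numbers are made up for illustration (they are not from the paper's dataset), and defining error as OPR minus true average is an assumption about the sign convention.

```python
import math

def errors(opr, true_avg):
    """Return (absolute_error, percent_error) for one team.

    Assumes error = OPR - true average; true_avg must be nonzero.
    """
    absolute = opr - true_avg
    percent = 100.0 * absolute / true_avg
    return absolute, percent

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical (OPR, true average) pairs, NOT real scouting data.
teams = [(12.0, 8.0), (22.0, 18.0), (33.0, 30.0), (47.0, 44.0), (64.0, 60.0)]

true_avgs = [true_avg for _, true_avg in teams]
abs_errors = [errors(opr, true_avg)[0] for opr, true_avg in teams]

# r^2 is the share of variation in absolute error explained by true average.
r = pearson_r(true_avgs, abs_errors)
print(f"r^2 between true average and absolute error: {r * r:.3f}")
```

A small r^2 here would support the idea that absolute error is roughly constant across scoring levels, in which case percent error mostly reflects the size of the denominator.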

Merry Christmas!

At our events, we scouted collaboratively with other teams. Accuracy actually became a prime concern of ours, so we had every entry into our scouting database checked over by our head scout before entry (who had been watching each match as a whole immediately before validating). Additionally, we recorded matches for further verification. There were a few instances I can recall where errors were detected in entries, and we used our footage to re-scout that robot for that match.

This was an attempt to make the results of our paper more applicable to other events in the 2013 season. Archimedes, for instance, is hardly indicative of the average regional or district event, and yet it forms more than half of our data (51%, to be precise; out of 196 event/team combinations sampled, 100 were from Archimedes).

Let me get back to you on answering your questions backed up with relevant diagrams and calculations. I’d also like to look at posting our dataset. However, here are my suspicions:

• If you see my response to Basel A’s question below, sampling only the top 30 or so teams from each of our competitions should decrease the percent error, which would change the percent/tolerance table for the better. OPR should become more accurate.

• I have no idea. Really intriguing question, though.

• Since we do have the averages and standard deviations for each team, you could let Team A and Team B each be represented by a normal variable with mu equal to their Average Score and sigma equal to the Standard Deviation in their score. If you subtract the two means and add the two variances, you should be able to find the number you’re looking for by integrating from -infinity to 0 and from 0 to infinity. We used this method to predict match outcomes, but it was not very accurate. I don’t know of a simple way to extend that to OPR, though.

• I’ll get back to you.
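Written out, the method in the third bullet amounts to the following, assuming each team's score is normally distributed and the two teams score independently. The symbols (mu_A, sigma_A, D, Phi for the standard normal CDF) are my notation, not the paper's.

```latex
A \sim \mathcal{N}(\mu_A, \sigma_A^2), \qquad B \sim \mathcal{N}(\mu_B, \sigma_B^2)

D = A - B \sim \mathcal{N}\!\left(\mu_A - \mu_B,\; \sigma_A^2 + \sigma_B^2\right)

P(A > B) = P(D > 0) = \int_0^{\infty} f_D(x)\,dx
         = \Phi\!\left(\frac{\mu_A - \mu_B}{\sqrt{\sigma_A^2 + \sigma_B^2}}\right)

P(A < B) = \int_{-\infty}^{0} f_D(x)\,dx = 1 - P(A > B)
```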

If you look at the scatterplot on slide 9 and compare the least squares line to the scatterplot, you’ll see that absolute error decreases as Average Points Scored increases (I did have a residuals plot in here previously, but it looks like I accidentally removed it before I posted it). The decrease in percent error as Average Points Scored increases is not simply a function of the denominator of the percent error calculation getting larger.

I did make a residual plot and included it previously; it must have been accidentally removed during my revising. I’ll make sure to fix that to back up my previous claim (see answer to previous question).

Here’s the Perl script I used. It reads match data from standard input, one match per line: a match number (just sequential is fine), the three red teams, the three blue teams, and the result (“RED”, “BLUE”, or “TIE”, referring to who won).

``````
#!/usr/bin/perl
use strict;
use warnings;

my %rating = ();

my @data = grep { /\S/ } <STDIN>;   # read all matches, ignoring blank lines
my $K = 32;
my $defaultElo = 1200;

# Make sure every team number has an entry, starting at the default rating of 1200.
foreach (@data) {
    my ($match, $red1, $red2, $red3, $blue1, $blue2, $blue3, $res) = split;
    foreach my $team ($red1, $red2, $red3, $blue1, $blue2, $blue3) {
        $rating{$team} = $defaultElo unless exists $rating{$team};
    }
}

# Actual score for the red alliance under each possible result.
my %redScore = (RED => 1, BLUE => 0, TIE => 0.5);

# Process matches in input order.
foreach (@data) {
    my ($match, $red1, $red2, $red3, $blue1, $blue2, $blue3, $res) = split;
    next unless exists $redScore{$res};   # unrecognized result: leave ratings unchanged

    # Alliance strength is the mean rating of its three teams.
    my $redAvg  = ($rating{$red1} + $rating{$red2} + $rating{$red3}) / 3;
    my $blueAvg = ($rating{$blue1} + $rating{$blue2} + $rating{$blue3}) / 3;

    # Expected score of each alliance from the Elo logistic curve.
    my $Ered  = 1 / (1 + 10**(($blueAvg - $redAvg) / 400));
    my $Eblue = 1 / (1 + 10**(($redAvg - $blueAvg) / 400));

    # Every team on an alliance moves by the same amount: K * (actual - expected).
    my $Sred  = $redScore{$res};
    my $Sblue = 1 - $Sred;
    $rating{$_} += $K * ($Sred  - $Ered)  for ($red1, $red2, $red3);
    $rating{$_} += $K * ($Sblue - $Eblue) for ($blue1, $blue2, $blue3);
}

# Print one "team<TAB>rating" line per team (hash order, so unsorted).
while (my ($key, $value) = each %rating) {
    print $key . "\t" . $value . "\n";
}

``````
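For reference, the update rule the script applies can be written as follows. The bar notation for an alliance's mean rating and the symbol S for the actual result are my own shorthand; K = 32 and the 400-point scale are taken from the script.

```latex
E_{\text{red}} = \frac{1}{1 + 10^{(\bar{R}_{\text{blue}} - \bar{R}_{\text{red}})/400}},
\qquad
E_{\text{blue}} = \frac{1}{1 + 10^{(\bar{R}_{\text{red}} - \bar{R}_{\text{blue}})/400}} = 1 - E_{\text{red}}

R_i' = R_i + K\,(S - E), \qquad K = 32, \qquad
S = \begin{cases} 1 & \text{win} \\ 0.5 & \text{tie} \\ 0 & \text{loss} \end{cases}
```

As a quick check: in a first match where all six teams sit at the default 1200, both expected scores are 0.5, so each winning team gains 32 × (1 − 0.5) = 16 points and each losing team drops 16.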

Not sure how useful the Elo ratings are, but there is some statistical significance with having teams like 469, 67, 1114 and 2056 near the top. I’m nowhere near good enough with statistics to determine if this is enough data to work with (my instinct says no), but it’s just another interesting way to look at wins/losses.

(Edit: Had mistake in my calculation)

I’ve added the dataset used for these calculations to the paper. Feel free to use it to do any more analysis that you’d like.

Yes. By selecting the top 30 teams (in terms of Average Score), the least squares line becomes

``````
y = 1.0486x - 0.7729

``````

with an R^2 of 87.32%. The model actually moves further away from the line we’re expecting (y = x) when compared with the overall combined model, though it has a much higher R^2 value (a change of 7.02%).
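For anyone who wants to redo this kind of fit on the attached dataset, here is a minimal sketch of an ordinary least-squares line and its R^2 in plain Python. The sample points are made up for illustration (they are not from Dataset.xlsx), and treating True Average as x and OPR as y is an assumption on my part.

```python
def least_squares(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared). Requires at least two distinct
    x values and non-constant y values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical true averages (x) and OPRs (y), for illustration only.
true_avg = [10.0, 20.0, 30.0, 40.0, 50.0]
opr = [11.0, 19.0, 32.0, 41.0, 49.0]

slope, intercept, r2 = least_squares(true_avg, opr)
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {100 * r2:.2f}%")
```

If OPR tracked true scoring perfectly, the fit would come out with a slope of 1, an intercept of 0, and an R^2 of 100%.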

In terms of the percent error model, the new mu is 1.11% with a sigma of 45.10%. The new table for the probability a team will fall within a given percent error is as follows:

``````
Tolerance	Probability
10%	17.543%
20%	34.248%
30%	49.396%
40%	62.475%
50%	73.230%
60%	81.649%
70%	87.927%
80%	92.383%
90%	95.396%
100%	97.336%
110%	98.525%
120%	99.219%
130%	99.605%
140%	99.809%
150%	99.912%
160%	99.961%
170%	99.984%
180%	99.993%
190%	99.997%
200%	99.999%

``````
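The table above looks like what you get from treating percent error as normally distributed with the mu and sigma just quoted, then taking the probability of landing within plus or minus each tolerance. Here is a short Python sketch under that assumption; the function names are mine, and I have only spot-checked a couple of rows by hand.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Normal(mu, sigma) variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_within(tolerance, mu, sigma):
    """P(-tolerance < X < +tolerance) for X ~ Normal(mu, sigma)."""
    return normal_cdf(tolerance, mu, sigma) - normal_cdf(-tolerance, mu, sigma)

# mu and sigma of percent error for the top-30 sample, in percent.
MU, SIGMA = 1.11, 45.10

for tol in range(10, 210, 10):
    print(f"{tol}%\t{100 * prob_within(tol, MU, SIGMA):.3f}%")
```

Swapping in a different mu and sigma (for example, the values for a given OPR cutoff) gives the corresponding table for that group of teams.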

Here’s a table of the different averages and standard deviations of percent error for OPRs greater than or equal to the OPR listed. I see large jumps in standard deviation between OPRs of 10 and 20 (as you observed), and between 30 and 40.

``````
OPR	mu	sigma
80	6.88%	15.60%
70	13.02%	13.92%
60	12.57%	21.20%
50	18.15%	25.74%
40	20.00%	25.31%
30	23.33%	55.35%
20	20.56%	51.77%
10	24.36%	92.58%
0	20.76%	100.66%
-10	3.08%	131.97%

``````

This table has the mu and sigma for all teams within the OPR bin.

``````
OPR	mu	sigma
80	6.88%	13.51%
70	17.12%	11.27%
60	12.12%	26.05%
50	26.12%	29.06%
40	24.18%	23.79%
30	28.43%	81.58%
20	13.57%	40.66%
10	36.60%	165.01%
0	4.63%	129.47%
-10	-287.83%	216.42%

``````

Let Team A be represented by a Normal model with mu equal to its OPR and sigma equal to the sigma in the table above multiplied by the team’s OPR. Follow the same pattern for Team B. Subtract the two normal models (subtract the two averages, A-B, and add the variances, then take the square root to find the new sigma). Integrate underneath this curve from 0 to infinity to find the probability A would score more than B.
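Here is a rough Python sketch of that calculation. It assumes the two teams' scores are independent and normal, and that the sigma column of the binned table is used as a fraction of OPR; the function name and the example teams are hypothetical.

```python
import math

def prob_a_outscores_b(opr_a, rel_sigma_a, opr_b, rel_sigma_b):
    """Estimate P(A scores more than B) from OPRs and relative sigmas.

    rel_sigma_* is the table's sigma as a fraction (29.06% -> 0.2906).
    Not defined when both resulting sigmas are zero (e.g. both OPRs are 0).
    """
    sigma_a = rel_sigma_a * opr_a
    sigma_b = rel_sigma_b * opr_b
    mean_diff = opr_a - opr_b                       # mean of A - B
    sd_diff = math.sqrt(sigma_a ** 2 + sigma_b ** 2)  # variances add
    # Area under Normal(mean_diff, sd_diff) from 0 to infinity.
    return 0.5 * (1.0 + math.erf(mean_diff / (sd_diff * math.sqrt(2.0))))

# Hypothetical teams: A with an OPR of 50 (bin sigma 29.06%),
# B with an OPR of 40 (bin sigma 23.79%).
p = prob_a_outscores_b(50.0, 0.2906, 40.0, 0.2379)
print(f"P(A outscores B) is roughly {100 * p:.1f}%")
```

Because the sigmas scale with OPR, the same 10-point gap means less between two high-OPR teams than between two low-OPR teams in the same bin.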

This is a method to approximate a prediction strategy I detailed in this post: http://www.chiefdelphi.com/forums/showpost.php?p=1277423&postcount=1

We’re not creating a model for the data we have, which is why I removed it; this wasn’t about creating a regression that was supposed to model the data. Instead, this was about checking one of the properties of OPR: ideally, it should have a 100% correlation with True Average score, with an intercept of 0 and slope of 1.

That being said, here are links to the residual plots:

As a function of OPR:

As a function of True Average Score: