wgardner - 18-05-2015, 07:24
Re: "standard error" of OPR values

Scilab code is in the attachment.

Note that there is a very real chance that there's a bug in the code, so please check it over before you trust anything I say below.

----------------------

Findings:

stdev(M) = 47 (var = 2209). Match scores have a standard deviation of 47 points.

stdev(M - A*O) = 32.5 (var = 1060). OPR prediction residuals have a standard deviation of about 32.5 points.

So OPR linear prediction can account for about half of the variance in match outcomes (1 - 1060/2209 ≈ 0.52).
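If you want to sanity-check that arithmetic yourself, here's a quick Python sketch using the rounded variances quoted above (not the raw Scilab data):

```python
# Fraction of match-score variance explained by the OPR model,
# using the rounded variances quoted above.
total_var = 2209.0      # var(M), i.e. stdev(M) = 47
residual_var = 1060.0   # var(M - A*O), i.e. stdev ~ 32.5
frac_explained = 1.0 - residual_var / total_var
print(round(frac_explained, 2))   # -> 0.52, i.e. about half the variance
```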


What is the standard error for the OPR estimates (assuming the modeled distribution is valid) after the full tournament?

About 11.4 points per team. Some teams have a bit more or a bit less, but the standard deviation across teams was only 0.1, so all teams were pretty close to 11.4.

To be as clear as I can about this: if we

1. compute the OPRs from the full data set,
2. compute the match prediction residuals from the full data set,
3. run lots of different tournaments with match results generated by adding the OPRs of the teams in each match plus random match noise with that same residual variance, and
4. compute the OPR estimates for each of these randomly generated tournaments,

then we would expect the OPR estimates themselves to have a standard deviation of around 11.4.

If you choose to accept these assumptions, you might be willing to say that the OPR estimates have a 1-standard-deviation confidence interval of +/- 11.4 points.
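For anyone who wants to play with the resimulation idea without Scilab, here's a hedged Python sketch of the same procedure on a made-up toy event. The team count (4), 2-team alliances, schedule, "true" OPRs, and noise stdev of 30 are all invented for illustration; they are NOT the event data above:

```python
import random

def solve(A, b):
    """Least squares via the normal equations (A^T A) x = A^T b,
    solved by Gauss-Jordan elimination with partial pivoting."""
    n = len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(len(A))) for i in range(n)]
    M = [AtA[i][:] + [Atb[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

random.seed(0)
true_opr = [20.0, 35.0, 50.0, 65.0]   # hypothetical "true" OPRs
# Invented schedule: every pair of teams allied 3 times (18 alliance scores)
schedule = [(0, 1), (2, 3), (0, 2), (1, 3), (0, 3), (1, 2)] * 3
A = [[1.0 if t in pair else 0.0 for t in range(4)] for pair in schedule]
sigma = 30.0   # assumed per-match noise stdev, playing the role of sig above

estimates = []
for _ in range(2000):
    # regenerate the tournament: alliance score = sum of OPRs + random noise
    scores = [sum(true_opr[t] for t in pair) + random.gauss(0, sigma)
              for pair in schedule]
    estimates.append(solve(A, scores))

# spread of the re-estimated OPRs = the "StdErr of OPR"
for t in range(4):
    vals = [e[t] for e in estimates]
    mean = sum(vals) / len(vals)
    stderr = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5
    print("team %d: StdErr of OPR ~ %.1f" % (t, stderr))
```

With this particular toy schedule the per-team StdErr comes out around 11, but that's a coincidence of the invented numbers, not a reproduction of the event result.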


How does the standard error of the OPR (assuming the modeled distribution is valid) decrease as the number of matches increases?

I ran the analysis using only the first 3 full matches per team, then 4, and so on up to 10 full matches per team, i.e., with match totals of:
76, 102, 128, 152, 178, 204, 228, 254


sig (the standard deviation of the per-match residual prediction error) from 3 matches per team to 10 matches per team was
0.0, 19.7, 26.1, 29.3, 30.2, 30.8, 32.5, 32.5

(With only 3 matches played per team, the "least squares" solution can perfectly fit the data, as we have 76 equations and 76 unknowns. With 4 or 5 matches per team, the model is still a bit "overfit", as we have only 102 or 128 alliance scores being fit by 76 parameters.)
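To see the "perfect fit" effect concretely, here's a toy Python illustration (a hypothetical 4-team event with 2-team alliances and invented scores, much smaller than the real 76-team case): with exactly as many alliance-score equations as team parameters and an invertible design, least squares reproduces the scores exactly, so the residual is forced to 0.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

random.seed(1)
# 4 teams, 2-team alliances, exactly 4 alliance scores -> square 4x4 system
schedule = [(0, 1), (1, 2), (2, 3), (0, 2)]
A = [[1.0 if t in pair else 0.0 for t in range(4)] for pair in schedule]
scores = [random.uniform(20.0, 120.0) for _ in schedule]

opr = solve(A, scores)
residuals = [s - sum(o * a for o, a in zip(opr, row))
             for row, s in zip(A, scores)]
print(max(abs(r) for r in residuals))   # essentially 0: the fit is exact
```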


mean (StdErr of OPR) from 3 matches per team to 10 is
0.0, 16.5, 16.3, 15.2, 13.6, 12.8, 12.2, 11.4

(so the uncertainty in the OPR estimates decreases as the number of matches increases, as expected)

stdev (StdErr of OPR) from 3 matches per team to 10 is
0.0, 1.3, 0.6, 0.4, 0.3, 0.2, 0.1, 0.1

(so there isn't much team-to-team variability in the uncertainty of the OPR estimates, though that variability does drop as the number of matches increases. There could be more variability if we stopped at a point where, say, half the teams had played 5 matches and half had played 4.)


And for the record, sqrt(sig^2/matchesPerTeam) was
0.0, 9.9, 12.0, 12.0, 11.4, 11.1, 10.8, 10.3

(compare this with "mean (StdErr of OPR)" above. As the number of matches per team grows, each team's OPR estimate will eventually approach that team's average alliance score divided by 3, at which point these two values should approach each other. They're in the same general range but still differ by 1.1 (11.4 - 10.3) with 10 matches played per team.)
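Here's a quick Python check of that comparison, using the sig and mean-StdErr values quoted above (any small mismatches against the listed sqrt(sig^2/matchesPerTeam) row come from rounding of sig to one decimal):

```python
import math

matches = list(range(3, 11))
# values quoted above: sig = per-match residual stdev, and mean StdErr of OPR
sig = [0.0, 19.7, 26.1, 29.3, 30.2, 30.8, 32.5, 32.5]
mean_stderr = [0.0, 16.5, 16.3, 15.2, 13.6, 12.8, 12.2, 11.4]

naive = [s / math.sqrt(m) for s, m in zip(sig, matches)]
for m, a, b in zip(matches, mean_stderr, naive):
    print("%2d matches: StdErr %5.1f vs sig/sqrt(m) %5.1f (gap %4.1f)"
          % (m, a, b, a - b))
```

The gap between the two columns shrinks steadily as matches accumulate, ending at about 1.1 with 10 matches per team, matching the note above.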
Attached Files
File Type: txt FrcOprStdErrorScilab.txt (1.0 KB, 6 views)

Last edited by wgardner : 18-05-2015 at 07:28.