Quote:
Because there's no reason whatsoever to believe there's virtually no variation in consistency of performance from team to team.
Manual scouting data would surely confirm this.
[Edit: darn it! I tried to reply and mistakenly edited my previous post. I'll try to reconstruct it here.]
Your model might well be valid, and my derivation explicitly does not cover that case.
The derivation assumes a model in which the OPRs are computed once, multiple tournaments are then generated from those OPRs with the same amount of noise added to each match, and the standard error of the resulting OPR estimates is measured across those tournaments.
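To make that model concrete, here is a minimal Monte Carlo sketch of it. The schedule, team count, noise level, and function names are all made up for illustration. The comparison value is the textbook OLS standard-error formula, which is an assumption on my part about what the simulation should agree with, not a restatement of the derivation itself.

```python
import numpy as np

def make_schedule(n_teams, n_rows, rng, alliance_size=3):
    """Random 0/1 design matrix: one row per alliance score, one column per team."""
    A = np.zeros((n_rows, n_teams))
    for row in range(n_rows):
        members = rng.choice(n_teams, size=alliance_size, replace=False)
        A[row, members] = 1.0
    return A

def simulated_standard_errors(A, true_oprs, noise_sd, n_tournaments, rng):
    """Std. dev. of the OLS OPR estimates across simulated tournaments."""
    estimates = np.empty((n_tournaments, A.shape[1]))
    for t in range(n_tournaments):
        # Same noise level for every alliance score (the equal-variance assumption).
        scores = A @ true_oprs + rng.normal(0.0, noise_sd, size=A.shape[0])
        estimates[t] = np.linalg.lstsq(A, scores, rcond=None)[0]
    return estimates.std(axis=0, ddof=1)

def theoretical_standard_errors(A, noise_sd):
    """Textbook OLS result: noise_sd * sqrt(diag((A^T A)^-1))."""
    return noise_sd * np.sqrt(np.diag(np.linalg.inv(A.T @ A)))

# Hypothetical small event: 12 teams, 60 alliance scores, noise sd of 10 points.
rng = np.random.default_rng(0)
A = make_schedule(n_teams=12, n_rows=60, rng=rng)
true_oprs = rng.uniform(5.0, 40.0, size=12)
empirical = simulated_standard_errors(A, true_oprs, 10.0, 2000, rng)
theoretical = theoretical_standard_errors(A, 10.0)
print(np.round(empirical, 2))
print(np.round(theoretical, 2))
```

With enough simulated tournaments the two printed rows should roughly agree, but only because every score was given the same noise level.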
If you know that the variances of the teams' score contributions differ, then that model fails. For that matter, the least-squares solution used to compute the OPRs in the first place is also the wrong model in this case. If you knew the variances of the teams' contributions, you should use weighted least squares to get a better estimate of the OPRs.
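A rough sketch of what weighted least squares would look like here, assuming the per-team variances are known and that team contributions are independent (so an alliance score's variance is the sum of its members' variances). The function name and the tiny 4-team example are hypothetical.

```python
import numpy as np

def wls_oprs(A, scores, team_variances):
    """Weighted least-squares OPRs given each team's (assumed known) variance."""
    # Assumption: contributions are independent, so the variance of an
    # alliance score is the sum of the variances of the teams on it.
    row_variances = A @ team_variances
    w = 1.0 / np.sqrt(row_variances)
    # Scaling each row by 1/sd turns WLS into an ordinary least-squares problem.
    return np.linalg.lstsq(A * w[:, None], scores * w, rcond=None)[0]

# Tiny made-up example: 4 teams, 2-team alliances, 6 alliance scores.
A = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
], dtype=float)
true_oprs = np.array([10.0, 20.0, 30.0, 40.0])
team_variances = np.array([1.0, 4.0, 9.0, 25.0])
scores = A @ true_oprs  # noise-free, so WLS should recover true_oprs
print(wls_oprs(A, scores, team_variances))
```

Rows containing high-variance teams get less weight, so the consistent teams end up driving the estimate more than they would under plain least squares.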
I wonder if some iterative approach might work: First compute OPRs assuming all teams have equal variance of contribution, then estimate the actual variances of contributions for each team, then recompute the OPRs via weighted-least-squares taking this into account, then repeat the variance estimates, etc., etc., etc. Would it converge?
[Edit: 2nd part of post, added here a day later]
http://en.wikipedia.org/wiki/Generalized_least_squares
OPRs are computed with an ordinary-least-squares (OLS) analysis.
If we knew ahead of time the variances we expected for each team's scoring contribution, we could use weighted-least-squares (WLS) to get a better estimate of the OPRs.
The link also describes something like what I was suggesting above, called "Feasible generalized least squares (FGLS)". In FGLS, you use OLS to get your initial OPRs, then estimate the variances, then compute WLS to improve the OPR estimate. It also discusses iterating this approach.
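Here is a minimal sketch of that OLS, estimate variances, WLS loop. The variance step is my own assumption, not something taken from the link: it regresses the squared residuals on the schedule matrix, treating each alliance's residual variance as the sum of its teams' variances, and clips negative estimates to a small floor. The simulated data and all names are hypothetical.

```python
import numpy as np

def fgls_oprs(A, scores, n_iterations=5, variance_floor=1e-6):
    """Iterated FGLS sketch: OLS first, then alternate variance estimates and WLS."""
    oprs = np.linalg.lstsq(A, scores, rcond=None)[0]  # initial OLS estimate
    team_variances = np.ones(A.shape[1])
    for _ in range(n_iterations):
        residuals = scores - A @ oprs
        # Assumption: squared residuals are modeled as the sum of the member
        # teams' variances, so regressing them on A gives per-team variances.
        team_variances = np.linalg.lstsq(A, residuals**2, rcond=None)[0]
        # Estimates can come out negative in small samples; clip them.
        team_variances = np.maximum(team_variances, variance_floor)
        w = 1.0 / np.sqrt(A @ team_variances)
        oprs = np.linalg.lstsq(A * w[:, None], scores * w, rcond=None)[0]
    return oprs, team_variances

# Made-up event: 10 teams, 80 alliance scores, each team with its own noise level.
rng = np.random.default_rng(1)
n_teams, n_rows = 10, 80
A = np.zeros((n_rows, n_teams))
for row in range(n_rows):
    A[row, rng.choice(n_teams, size=3, replace=False)] = 1.0
true_oprs = rng.uniform(5.0, 40.0, size=n_teams)
true_sds = rng.uniform(2.0, 15.0, size=n_teams)
noise = (A * rng.normal(0.0, 1.0, size=A.shape) * true_sds).sum(axis=1)
scores = A @ true_oprs + noise

oprs, variances = fgls_oprs(A, scores)
print(np.round(oprs, 1))
print(np.round(variances, 1))
```

A fixed iteration count is used rather than a convergence test, since nothing here shows that the loop converges. The clipping step is a sign of how noisy the variance estimates can be.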
But the link also includes this comment: "For finite samples, FGLS may be even less efficient than OLS in some cases. Thus, while (FGLS) can be made feasible, it is not always wise to apply this method when the sample is small."
If we have 254 match results and we're trying to estimate 76 OPRs and 76 OPR variances (152 parameters total), that is fewer than two observations per parameter, which is a pretty small sample. So this approach would probably suffer from too small a sample.