Quote:
Originally Posted by AGPapa
I've calculated the errors in match score predictions for every 10th split from 50 to 150 matches.
The X axis in the attached chart is the number of matches per division used as the training set. The Y axis is the root-mean-square error (RMSE) per match on the testing set (the remaining matches).
You can see that adding a priori estimations makes the predictions better across the board.
Interestingly, the MMSE (Oavg prior) estimation starts out pretty good at 50 matches (a little under a third of the way through). I wonder why Will's simulated results are different? You can see that the gradual improvements still exist, just as in the simulation (except for the last data point; I suspect teams played differently on the last day than they had before).
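For reference, the error metric described in the quote (the square root of the average squared prediction error over the held-out matches) is just a per-match RMSE. A minimal sketch, with made-up score arrays standing in for the real predictions:

```python
import numpy as np

def per_match_rmse(predicted, actual):
    """Root of the mean squared prediction error over the held-out matches."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.sqrt(np.mean((predicted - actual) ** 2))

# Hypothetical example: predicted vs. actual scores for two testing-set matches.
print(per_match_rmse([95.0, 110.0], [100.0, 104.0]))
```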
Thanks AGPapa! As always, I have a few questions:
Can you elaborate on exactly what you mean by Oavg prior and World OPR prior? Does Oavg prior mean that you're setting them all to the same average a priori guess? Does World OPR prior mean that you're setting the a priori guesses to their final OPRs (as if you knew ahead of time what the OPRs actually were)?
I might expect that using the actual Worlds OPRs as priors would be uniformly good, as you're telling the algorithm ahead of time what the final guess would be (if I'm understanding you correctly, which I might not be). But I'm surprised that just setting the priors to be the overall average gives such uniform results.
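If I'm picturing the setup correctly, the prior enters the OPR least-squares as a ridge-style pull toward the a priori guesses: solve (AᵀA + λI)x = Aᵀb + λx₀, where x₀ is either the uniform average guess or some external per-team estimate. A minimal sketch under that assumption (the matrix A, scores b, and weight λ here are stand-ins; I'm guessing at the exact weighting AGPapa used):

```python
import numpy as np

def opr_with_prior(A, b, x0, lam=1.0):
    """Ridge-style OPR: solve (A^T A + lam*I) x = A^T b + lam*x0.

    A   : alliances-by-teams incidence matrix (1 if the team was on that alliance)
    b   : alliance scores
    x0  : a priori per-team guesses (e.g., all equal to an overall average)
    lam : how strongly to pull toward the prior; lam = 0 gives plain OPR
    """
    AtA = A.T @ A + lam * np.eye(A.shape[1])
    Atb = A.T @ b + lam * x0
    return np.linalg.solve(AtA, Atb)

# Tiny made-up example: 3 teams, 3 two-team alliances.
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)
b = np.array([10.0, 14.0, 12.0])
x0 = np.full(3, np.mean(b) / 2)   # "average" prior: each team gets half an alliance score
print(opr_with_prior(A, b, x0, lam=0.5))
```

With λ = 0 this reduces to the ordinary OPR solve; as λ grows, the estimates are pinned ever closer to x₀, which would explain why a good prior helps most when few matches have been played.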
Since your sim is probably easy to modify now (?), could you start it all off at 10 matches instead of 50?
Also, it doesn't surprise me that things get a little noisy at the very end, because at that point you're averaging over only the few matches left. Those matches could just happen to be easy or hard to predict, and with so few of them, it seems entirely reasonable that the points would drift a bit from the trendline.