It confuses me when you talk about multiple possible definitions of OPR. OPR has a single clear definition: it is the linear least squares solution to the set of equations derived from a set of match scores and the teams in those matches. Within this definition, there is no uncertainty and there are no error bars, as the solution is unique (provided the system is overdetermined and the schedule matrix has full column rank, which will always be the case for reasonable schedules). Put another way, OPR does not intrinsically have any uncertainty, just like 4+5=9 does not have any uncertainty. The only way it could intrinsically have uncertainty would be if there were uncertainty in the schedule or in the match scores. If either of these had uncertainty, then it would make sense to calculate the uncertainty of OPR, and I would make sure to publish that uncertainty in all of my work.
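To make the definition concrete, here is a minimal sketch of that least squares solve in plain Python. The function name compute_opr, the toy schedule, and the team labels are all made up for illustration. The sketch solves the normal equations by Gauss-Jordan elimination, which is only one of several ways to get the same unique answer.

```python
def compute_opr(matches, teams):
    """Least squares OPR from (alliance, score) pairs.

    matches: list of (tuple_of_team_ids, alliance_score)  [hypothetical format]
    teams:   list of all team ids
    """
    index = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    # Normal equations (A^T A) x = A^T b, where each row of A has a 1
    # for every team on that alliance and b holds the alliance scores.
    ata = [[0.0] * n for _ in range(n)]
    atb = [0.0] * n
    for alliance, score in matches:
        for t in alliance:
            atb[index[t]] += score
            for u in alliance:
                ata[index[t]][index[u]] += 1.0
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    aug = [row + [rhs] for row, rhs in zip(ata, atb)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("schedule does not determine a unique OPR")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return {t: aug[index[t]][n] / aug[index[t]][index[t]] for t in teams}


# Toy schedule (invented): four teams, five alliance scores.
teams = ["A", "B", "C", "D"]
matches = [
    (("A", "B", "C"), 60.0),
    (("A", "B", "D"), 70.0),
    (("A", "C", "D"), 80.0),
    (("B", "C", "D"), 90.0),
    (("A", "B", "C"), 60.0),
]
opr = compute_opr(matches, teams)
print({t: round(v, 1) for t, v in opr.items()})
```

Given the same schedule and the same scores, this always returns the same numbers, which is the sense in which OPR itself has no uncertainty.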
With that said, there are plenty of different meanings you can try to assign to this mathematical construct we call OPR, some of which are probably more reasonable meanings than others, and each of which will have its own uncertainty/error bars. I don’t think any of these can reasonably be interpreted as the uncertainty or the error bars to OPR though, as there is no single usage case for OPR. An OPR value means different things to different people, and just as there’s no inherently right or wrong meaning, so too is there no inherent uncertainty in the meaning of OPR. I guess if there was some kind of “true ability” for teams then we could compare OPR against that, but I don’t think such a thing could even be represented by a single number. Even if it could, we have no way of knowing what it is, so it’s all moot.
That said, let’s look at the forward-looking score prediction case and the backward-looking historical results comparison that you mention, and find those uncertainties.
Case 1 - Backwards looking match result differences from in-event OPR:
Question: After an event has completed, what uncertainty on post-event OPRs would be required to make 95% of in-event match scores fall within the uncertainty range of the alliance’s post-event OPR sum?
Results: Here are the final match scores, pre-event OPR sums (score predictions), and post-event OPR sums (Drive link because dataset is too large to put in post).
The residuals between the post-event OPR sums and the actual scores have an average of 0.0 (by definition of OPR) and a stdev of 9.7. If we assume independence of teams and equal variance of teams, we divide this by sqrt(3) to get a team’s OPR stdev of 5.6. Multiplying the stdev by 2 gives us a 95% confidence interval for each team’s OPR according to the above question at ±11.2. Since looking backwards is always far easier than looking forwards, I think this is about as low as any reasonable definition of “OPR uncertainty” could go. Comparing post-event OPRs to scores is really just “fitting” the OPRs to the scores and finding how good the best possible fit is.
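The arithmetic in this step can be sketched as a small helper. The function name team_interval is my own, and the helper bakes in the same assumptions stated above: teams are independent with equal variance, and 2 standard deviations stands in for a 95% interval.

```python
import math


def team_interval(alliance_residual_stdev, teams_per_alliance=3, z=2.0):
    """Convert an alliance-level residual stdev into a per-team stdev
    and a +/- half-width, assuming independent, equal-variance teams."""
    team_stdev = alliance_residual_stdev / math.sqrt(teams_per_alliance)
    return team_stdev, z * team_stdev


team_stdev, half_width = team_interval(9.7)
print(round(team_stdev, 1), round(half_width, 1))  # prints 5.6 11.2
```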
Case 2 - Forward looking match predictions:
Question: Before an event has started, what uncertainty on pre-event OPRs would be required to make 95% of future in-event match scores fall within the uncertainty range of the alliance’s pre-event OPR sum?
Result: We’ll use the same dataset as above, but this time we’re going to sum up each team’s maximum pre-event OPR (you could also look at average or most recent event, but I have found max to be the best predictor). The residuals from the predicted scores to the actual scores have an average of 3.5, meaning that the predicted scores were on average 3.5 points higher than the actual scores (the positive value is unsurprising considering we predicted using max OPRs). The residuals also have a standard deviation of 13.6. Multiplying by 2 and dividing by sqrt(3) as above gives us a 95% confidence interval on OPR, in the sense of the Case 2 question, of ±15.7 (plus a constant per-team offset of 3.5/3 = 1.2). This is probably about the worst-case scenario for any definition of “OPR uncertainty”, as we are combining OPRs from a whole season’s worth of different events and have no in-event information to go off of.
Case 3 - Match W/L predictions using predicted contributions:
I currently use “predicted contributions” to generate W/L match predictions in my event simulator. Predicted contributions are a hybrid of max pre-event OPR and in-event match results, which are used to make live-updating match predictions. These predictions are generated according to the formula:
Red WP = 1/(1+10^((blue_PC-red_PC)/(2*stdev2019)))
where:
Red WP = red alliance win probability
blue_PC = the sum of the blue alliance’s predicted contributions
red_PC = the sum of the red alliance’s predicted contributions
stdev2019 = The standard deviation of 2019 week 1 qual and playoff scores = 17.1.
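As a quick illustration, the formula above translates directly into Python. The function and parameter names here are my own choices for the sketch, not names from the event simulator.

```python
def red_win_probability(red_pc, blue_pc, score_stdev=17.1):
    """Red alliance win probability from summed predicted contributions.

    red_pc / blue_pc: sum of each alliance's predicted contributions
    score_stdev:      week 1 score stdev (17.1 for 2019, per the text)
    """
    return 1.0 / (1.0 + 10.0 ** ((blue_pc - red_pc) / (2.0 * score_stdev)))


# Evenly matched alliances come out as a coin flip.
print(red_win_probability(50.0, 50.0))  # prints 0.5
```

A lead of 2*stdev2019 = 34.2 points in summed PCs makes the exponent -1, so the favored alliance wins with probability 1/(1+0.1), or about 91%.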
As this is a logistic function, there’s no exact mapping to a normal distribution although they are very similar. The closest Gaussian fit gives a stdev for each team’s PC of 7.2, which implies a 95% confidence interval on PCs of ±14.4. Fittingly, this value is right between the values derived from case 1 and case 2, which makes sense as PCs are a hybrid between pre-event and post-event OPRs. I was a little worried before doing this analysis that this value would be way outside of those bounds, but it actually fits in nicely, which means I probably at least vaguely understand what I’m doing.
Generalizing for the future:
Depending on what your usage case is for OPR, I think you could reasonably say the uncertainty on OPR is anywhere from ±11.2 to ±15.7 points in 2019. Again, depending on the usage case, these uncertainties should generally scale proportionally to score standard deviations each year, so if we normalize these ranges to the 2019 week 1 score stdev of 17.1, we can obtain general uncertainties to use in future years. If we call a year’s week 1 score stdev s, this gives us 95% confidence intervals of ±0.65s on the lower end to ±0.92s on the higher end. Here’s what those ranges would have looked like for the past few years:
Estimated 'OPR uncertainties'
|week 1 score stdev
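The scaling rule described above can be sketched as follows. The function name is invented, and the s = 20.0 passed in at the end is a hypothetical week 1 stdev chosen only to show usage, not a real season's value.

```python
# Ratios from the 2019 figures in the text: 11.2/17.1 and 15.7/17.1.
LOW_RATIO = 11.2 / 17.1   # about 0.65
HIGH_RATIO = 15.7 / 17.1  # about 0.92


def opr_uncertainty_range(week1_score_stdev):
    """Return (low, high) +/- 95% half-widths scaled to a year's week 1 stdev."""
    return LOW_RATIO * week1_score_stdev, HIGH_RATIO * week1_score_stdev


# Hypothetical stdev, for illustration only.
low, high = opr_uncertainty_range(20.0)
print(round(low, 1), round(high, 1))
```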