Quote:
Originally Posted by GeeTwo
My takeaway on this thread is that it would be good and useful information to know the rms (root-mean-square) of the residuals for an OPR/DPR data set (tournament or season). This would provide some understanding as to how much difference really is a difference, and a clue as to when the statistics mean about as much as the scouting.
Yes. In the paper in the other thread that I just posted about, the appendices show the percentage reduction in the mean-squared residual achieved by each of the different metrics (OPR, CCWM, WMPR, etc.). An interesting thing to note is that the metrics are often much worse at predicting matches that weren't included in their computation, which indicates overfitting in many cases.
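To make the train/test residual gap concrete, here's a minimal sketch (not from the paper) using synthetic data: fit OPR by least squares on one set of matches, then compare the RMS residual on those matches against matches held out of the fit. The team count, match count, and noise level are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_teams, n_matches = 30, 80
true_opr = rng.normal(30, 10, n_teams)          # assumed "true" team contributions

def make_matches(n):
    # Each alliance is 3 random teams; alliance score = sum of contributions + noise.
    A = np.zeros((n, n_teams))
    for i in range(n):
        A[i, rng.choice(n_teams, 3, replace=False)] = 1
    scores = A @ true_opr + rng.normal(0, 12, n)
    return A, scores

A_train, y_train = make_matches(n_matches)
A_test, y_test = make_matches(n_matches)

# Traditional OPR: least-squares solution of A x = y on the training matches.
opr_ls, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

def rms(residuals):
    return np.sqrt(np.mean(residuals ** 2))

print("train RMS residual:", rms(y_train - A_train @ opr_ls))
print("test  RMS residual:", rms(y_test - A_test @ opr_ls))  # typically larger -> overfitting
```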
The paper also discusses MMSE-based estimation of the metrics (as opposed to the traditional least-squares method), which reduces overfitting, does better at predicting previously unseen matches (as measured by the squared prediction residual on "testing set" matches), and is better at recovering the actual underlying metric values in tournaments simulated using the metric models themselves.
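Continuing the sketch above, here's one way an MMSE-style estimate can be formed under an assumed Gaussian prior on team contributions (mean mu0, variance tau2) and Gaussian score noise (variance sigma2); this reduces to a ridge-regularized solve. The paper's exact formulation may differ, and the mu0/tau2/sigma2 values here are illustrative guesses, not calibrated numbers.

```python
mu0, tau2, sigma2 = 30.0, 100.0, 144.0

# Posterior mean: (A'A/sigma2 + I/tau2)^(-1) (A'y/sigma2 + mu0/tau2)
lhs = A_train.T @ A_train / sigma2 + np.eye(n_teams) / tau2
rhs = A_train.T @ y_train / sigma2 + np.full(n_teams, mu0) / tau2
opr_mmse = np.linalg.solve(lhs, rhs)

print("LS   test RMS:", rms(y_test - A_test @ opr_ls))
print("MMSE test RMS:", rms(y_test - A_test @ opr_mmse))    # usually smaller on held-out matches
print("LS   error vs true OPR:", rms(opr_ls - true_opr))
print("MMSE error vs true OPR:", rms(opr_mmse - true_opr))  # shrinkage toward the prior mean helps
```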