I've tried the L1 optimization problem, but l1-magic is giving me fits. For some reason, it blows up after 10 iterations.
However, I have done the analysis that I wanted. I calculated all the rating systems using data from events prior to CMP, then used those ratings to predict the CMP matches.
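For anyone who wants to try the same kind of check, here is a rough sketch of how a "percent correct" number can be computed. It is not my actual spreadsheet logic, just an illustration under some assumptions: each alliance's strength is taken as the sum of its teams' ratings (natural for OPR, a simplification for Elo and TrueSkill), the alliance with the higher total is the predicted winner, and matches that actually ended in a tie are skipped. The function names, team labels, ratings, and scores are all made up for the example.

```python
def predict_winner(ratings, red, blue, default=0.0):
    """Return 'red', 'blue', or 'tie' by comparing summed alliance ratings.

    Assumption: alliance strength is the plain sum of team ratings.
    Teams with no rating get `default`.
    """
    red_total = sum(ratings.get(team, default) for team in red)
    blue_total = sum(ratings.get(team, default) for team in blue)
    if red_total > blue_total:
        return "red"
    if blue_total > red_total:
        return "blue"
    return "tie"


def percent_correct(ratings, matches):
    """Percentage of non-tied matches whose actual winner was predicted.

    Each match is (red_teams, blue_teams, red_score, blue_score).
    Assumption: matches that ended in a tie are left out of the count,
    and a predicted 'tie' counts as a miss.
    """
    correct = 0
    counted = 0
    for red, blue, red_score, blue_score in matches:
        if red_score == blue_score:
            continue
        actual = "red" if red_score > blue_score else "blue"
        counted += 1
        if predict_winner(ratings, red, blue) == actual:
            correct += 1
    return 100.0 * correct / counted if counted else 0.0


# Hypothetical ratings computed from earlier events (made-up numbers).
example_ratings = {"A": 50, "B": 30, "C": 20, "D": 40, "E": 35, "F": 10}

# Hypothetical later matches to predict (made-up scores).
example_matches = [
    (("A", "B", "C"), ("D", "E", "F"), 120, 90),
    (("A", "E", "F"), ("B", "C", "D"), 60, 80),
    (("D", "E", "A"), ("B", "C", "F"), 100, 100),  # actual tie, skipped
    (("B", "C", "F"), ("A", "D", "E"), 70, 110),
]

print(f"{percent_correct(example_ratings, example_matches):.2f}% Correct")
```

The important part is that the ratings come only from earlier events, so the matches being predicted never feed into the ratings used to predict them.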
In summary:
OPR: 72.46% Correct
TrueSkill: 69.31% Correct
Elo: 72.90% Correct
Elo Mod: 71.71% Correct
TL;DR: We're okay, but not great, at predicting matches. OPR is okay at it, but Elo edges it out (72.90% vs. 72.46%).
I'm still somewhat surprised that Elo is slightly better.
Updated Spreadsheet:
https://dl.dropboxusercontent.com/u/...skill%202.xlsx