This post is primarily of interest to stat-nerds. If you don’t know or care what OPR and CCWM are and how they’re computed, you probably want to ignore this thread.
It has bothered me for a while that the CCWM calculation does not incorporate any knowledge of which teams are on the opposing alliance, which would seem to be important for a calculation involving the winning margin between the alliances.
The standard calculation is performed as follows:
Let’s say that in the first match, the red alliance is teams 1, 2, and 3 and the blue alliance is teams 4, 5, and 6. Call the red score in this match R and the blue score B. We’re trying to find each team’s expected contribution to the winning margin. Let’s call these contributions C1, C2, … for teams 1, 2, …
The standard CCWM calculation models the winning margin of each match twice (!), first as
R-B = C1 + C2 + C3 (ignoring that teams 4, 5, and 6 are involved!)
and then again as
B-R = C4 + C5 + C6 (ignoring that teams 1, 2, and 3 are involved!)
It then finds the least squares solution for the Ci values, that is, the Ci values that minimize the sum of the squared prediction errors over all of the matches.
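In matrix form, this means solving the normal equations
(A’ A) C = A’ M
where A has one row per alliance per match (a 1 in column j if team j was on that alliance and 0 otherwise), M is the vector of the corresponding winning margins, and A’ is the transpose of A. (A’ M) ends up being the vector of the sum of the winning margins for each team’s matches, and
(A’ A) is a matrix with diagonal elements equal to the number of matches each team plays and off-diagonal elements equal to the number of times teams i and j were on the same alliance.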
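As a rough sketch of the standard calculation described above: the toy schedule, the scores, and the function name are all made up for illustration, and teams are numbered from 0 to suit array indexing. It builds A and M with two rows per match and checks what (A’ A) and (A’ M) contain.

```python
import numpy as np

# Hypothetical toy schedule: (red teams, blue teams, red score, blue score).
matches = [
    ((0, 1, 2), (3, 4, 5), 50, 30),
    ((0, 3, 4), (1, 2, 5), 20, 45),
    ((0, 1, 5), (2, 3, 4), 60, 60),
]
n_teams = 6

def standard_ccwm_system(matches, n_teams):
    """Build A and M with two rows per match, one for each alliance."""
    A = np.zeros((2 * len(matches), n_teams))
    M = np.zeros(2 * len(matches))
    for k, (red, blue, r, b) in enumerate(matches):
        A[2 * k, list(red)] = 1.0       # red row: R-B = sum of red contributions
        M[2 * k] = r - b
        A[2 * k + 1, list(blue)] = 1.0  # blue row: B-R = sum of blue contributions
        M[2 * k + 1] = b - r
    return A, M

A, M = standard_ccwm_system(matches, n_teams)
AtA = A.T @ A   # diagonal: matches played; off-diagonal: times on the same alliance
AtM = A.T @ M   # per-team sum of winning margins

# lstsq solves the same least squares problem as (A' A) C = A' M,
# and also copes with a toy schedule too small to make A' A invertible.
C, *_ = np.linalg.lstsq(A, M, rcond=None)
print(AtA)
print(AtM)
print(C)
```

Note that the row for the red alliance has zeros in the blue teams’ columns, which is exactly why the opposing alliance never shows up in (A’ A).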
Note again that nowhere does this factor in whether teams were on opposing alliances (!). If a particular team on the blue alliance always scores 1000 points, that will make the winning margin for the red alliance awful, and IMHO, that should be taken into account.
So, here’s my proposal.
Instead of modeling each match outcome twice as above, do it only once as follows:
R-B = (C1 + C2 + C3) - (C4 + C5 + C6)
(the left set is all the teams on the red alliance and the right set is the blue teams).
Now, we’re factoring in both your alliance partners’ abilities AND your opponents’ abilities.
If you go through the entire derivation, you end up with a similar set of equations, but the new A matrix has a 1 in the (i,j)th spot if the jth team was on the red alliance in match i, a -1 if the jth team was on the blue alliance in match i, and 0 otherwise.
The solution has the same format, i.e. solving the following formula
(A’ A) C = A’ M
(A’ M) ends up being exactly the same as before even though the A and M are a little different: (A’ M) is just the sum of the winning margins for each team’s matches.
But now (A’ A) is a little different. The diagonal elements are the same, but the off diagonal elements are equal to the number of times teams i and j are on the same alliance minus the number of times they’re on opposing alliances (!). So now opposing alliance contributions are included.
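Here is a minimal sketch of the proposed formulation, again using a made-up toy schedule with teams numbered from 0 and a hypothetical function name. Each match gets a single row with +1 for red teams and -1 for blue teams.

```python
import numpy as np

# Hypothetical toy schedule: (red teams, blue teams, red score, blue score).
matches = [
    ((0, 1, 2), (3, 4, 5), 50, 30),
    ((0, 3, 4), (1, 2, 5), 20, 45),
    ((0, 1, 5), (2, 3, 4), 60, 60),
]
n_teams = 6

def signed_ccwm_system(matches, n_teams):
    """Build A and M with one row per match: +1 for red, -1 for blue."""
    A = np.zeros((len(matches), n_teams))
    M = np.zeros(len(matches))
    for k, (red, blue, r, b) in enumerate(matches):
        A[k, list(red)] = 1.0
        A[k, list(blue)] = -1.0
        M[k] = r - b   # winning margin from red's point of view
    return A, M

A, M = signed_ccwm_system(matches, n_teams)
AtA = A.T @ A   # off-diagonal: (times same alliance) - (times opposing alliances)
AtM = A.T @ M   # per-team sum of that team's own winning margins
print(AtA)
print(AtM)
```

In this toy schedule teams 0 and 1 are partners twice and opponents once, so their off-diagonal entry is 2 - 1 = 1, while (A’ M) is unchanged from the two-rows-per-match version built on the same schedule.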
One oddity emerges from this formulation: the new (A’ A) is not invertible (!). This is because if you add any constant to all of the teams’ contributions, the predicted winning margins stay the same. For example, if you think the red teams contributed 10, 20, and 30 points and the blue teams contributed 40, 50, and 60 points, you’d get exactly the same winning margin if the contributions were 110, 120, and 130 for red and 140, 150, and 160 for blue, or even 1010, 1020, and 1030 for red and 1040, 1050, and 1060 for blue.
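A quick numerical check of that example, as a sketch for a single match row with teams indexed 0 to 5 (the variable names are just for illustration):

```python
import numpy as np

# One match row: teams 0-2 on red (+1), teams 3-5 on blue (-1).
a = np.array([1, 1, 1, -1, -1, -1], dtype=float)
C = np.array([10, 20, 30, 40, 50, 60], dtype=float)

# Adding the same constant to every contribution leaves the prediction alone.
for shift in (0, 100, 1000):
    print(shift, a @ (C + shift))

# The all-ones vector is mapped to zero by every such row, so A' A has a
# zero eigenvalue and cannot be inverted.
print(a @ np.ones(6))
```

Every shift gives the same predicted red margin of -90, since the three +1 entries and the three -1 entries cancel the added constant.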
But the easy way around this is to just find the minimum norm solution (one of the many solutions) using, say, the singular value decomposition (SVD), and then subtract off the mean from all of the values. The resulting combined contributions to winning margin values represent how much a team will contribute to its alliance’s winning margin compared to the average team’s contribution (which will be 0, of course).
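One way this could look in practice, as a sketch only: the schedule, the scores, the function name, and the `rcond` cutoff below are all assumptions for illustration. `np.linalg.pinv` computes the pseudoinverse via the SVD, which gives the minimum norm least squares solution.

```python
import numpy as np

# Hypothetical toy schedule: (red teams, blue teams, red score, blue score).
# Team 0 is always on red here, which is unrealistic but keeps the example short.
matches = [
    ((0, 1, 2), (3, 4, 5), 50, 30),
    ((0, 3, 4), (1, 2, 5), 20, 45),
    ((0, 1, 5), (2, 3, 4), 60, 60),
    ((0, 2, 4), (1, 3, 5), 35, 40),
    ((0, 2, 3), (1, 4, 5), 55, 25),
    ((0, 1, 3), (2, 4, 5), 30, 50),
]
n_teams = 6

def ccwm_with_opponents(matches, n_teams):
    """Minimum norm least squares fit of R-B = (red sum) - (blue sum)."""
    A = np.zeros((len(matches), n_teams))
    M = np.zeros(len(matches))
    for k, (red, blue, r, b) in enumerate(matches):
        A[k, list(red)] = 1.0
        A[k, list(blue)] = -1.0
        M[k] = r - b
    # SVD-based pseudoinverse; rcond=1e-10 is an assumed cutoff for treating
    # tiny singular values as zero.
    C = np.linalg.pinv(A, rcond=1e-10) @ M
    # Express each value relative to the average team.
    return C - C.mean()

C = ccwm_with_opponents(matches, n_teams)
print(C)
print(C.mean())
```

Since the all-ones vector is in the null space of A, the minimum norm solution should already have a mean of essentially zero, so the subtraction here mostly cleans up round-off.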
Thoughts? This seems like an improvement to me, but I’d be curious to hear what other stat-nerds like me have to say on the matter. And if somebody else has already looked into all of this, accept my apologies and please help educate me.