#1
Incorporating Opposing Alliance Information in CCWM Calculations
This post is primarily of interest to stat-nerds. If you don't know or care what OPR and CCWM are and how they're computed, you probably want to ignore this thread.
It has bothered me for a while that the CCWM calculation does not incorporate any knowledge of which teams are on the opposing alliance, which would seem to be important for a calculation involving the winning margin between the alliances.

The standard calculation is performed as follows. Let's say that in the first match, the red alliance is teams 1, 2, and 3 and the blue alliance is teams 4, 5, and 6. Let the red score in this match be R and the blue score be B. We're trying to find the expected contribution to the winning margin for each team. Let's call these contributions C1, C2, ... for teams 1, 2, ...

The standard CCWM calculation models the winning margin of each match twice (!), first as

R-B = C1 + C2 + C3 (ignoring that teams 4, 5, and 6 are involved!)

and then again as

B-R = C4 + C5 + C6 (ignoring that teams 1, 2, and 3 are involved!)

It finds the least squares solution for the Ci values, i.e. the values of the Ci numbers that minimize the squared prediction error over all of the matches. In matrix form, this means solving

(A' A) C = A' M

where (A' M) ends up being the vector of the sum of the winning margins for each team's matches, and (A' A) is a matrix with diagonal elements equal to the number of matches each team plays and off-diagonal elements equal to the number of times teams i and j were on the same alliance.

Note again that nowhere does this factor in whether teams were on opposing alliances (!). If a particular team on the blue alliance always scores 1000 points, that will make the winning margin for the red alliance awful, and IMHO, that should be taken into account.

So, here's my proposal. Instead of modeling each match outcome twice as above, do it only once as follows:

R-B = (C1 + C2 + C3) - (C4 + C5 + C6)

(the left set is all the teams on the red alliance and the right set is the blue teams). Now we're factoring in both your alliance partners' abilities AND your opponents' abilities.

If you go through the entire derivation, you end up with a similar set of equations, but the new A matrix has a 1 in the i,jth spot if the jth team was on the red alliance in match i, a -1 if the jth team was on the blue alliance in match i, and 0 otherwise. The solution has the same format, i.e. solving

(A' A) C = A' M

(A' M) ends up being exactly the same as before even though the A and M are a little different: (A' M) is just the sum of the winning margins for each team's matches. But now (A' A) is a little different. The diagonal elements are the same, but the off-diagonal elements are equal to the number of times teams i and j are on the same alliance minus the number of times they're on opposing alliances (!). So now opposing alliance contributions are included.

One oddity emerges from this formulation: the new (A' A) is not invertible (!). This is because if you add any constant to all of the teams' contributions, the winning margins are the same. For example, if you think the red teams contributed 10, 20, and 30 points each and the blue teams contributed 40, 50, and 60 points each, you'd get exactly the same winning margins if the teams' contributions were 110, 120, 130, and 140, 150, and 160, or even 1010, 1020, 1030, and 1040, 1050, and 1060.

But the easy way around this is to just find the minimum norm solution (one of the many solutions) using, say, the singular value decomposition (SVD), and then subtract off the mean from all of the values. The resulting combined contributions to winning margin values represent how much a team will contribute to its winning margin compared to the average team's contribution (which will be 0, of course).

Thoughts? This seems like an improvement to me, but I'd be curious to hear what other stat-nerds like me have to say on the matter. And if somebody else has already looked into all of this, accept my apologies and please help educate me.

Last edited by wgardner : 05-25-2015 at 09:36 AM.
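For anyone who wants to experiment, here is a minimal NumPy sketch of the two formulations described in this post: the standard CCWM (two rows per match) and the proposed opponent-aware version (one row per match, called WMPR later in the thread). The toy schedule, the 0-based team indices, and the function names `ccwm`, `wmpr_system`, and `wmpr` are all made up for illustration. `np.linalg.lstsq` is used because it returns a minimum-norm least-squares solution, which sidesteps the non-invertible (A' A).

```python
import numpy as np

# Hypothetical toy schedule: (red teams, blue teams, red score, blue score).
MATCHES = [
    ((0, 1, 2), (3, 4, 5), 120, 90),
    ((0, 3, 4), (1, 2, 5), 80, 110),
    ((1, 3, 5), (0, 2, 4), 95, 100),
    ((2, 4, 5), (0, 1, 3), 70, 130),
    ((0, 1, 5), (2, 3, 4), 105, 85),
]
N_TEAMS = 6


def ccwm(matches, n_teams):
    """Standard CCWM: two rows per match, each ignoring the opposing alliance."""
    rows, margins = [], []
    for red, blue, r, b in matches:
        for alliance, margin in ((red, r - b), (blue, b - r)):
            row = np.zeros(n_teams)
            row[list(alliance)] = 1.0
            rows.append(row)
            margins.append(float(margin))
    return np.linalg.lstsq(np.array(rows), np.array(margins), rcond=None)[0]


def wmpr_system(matches, n_teams):
    """Proposed model: one row per match, +1 for red teams, -1 for blue teams."""
    A = np.zeros((len(matches), n_teams))
    M = np.zeros(len(matches))
    for i, (red, blue, r, b) in enumerate(matches):
        A[i, list(red)] = 1.0
        A[i, list(blue)] = -1.0
        M[i] = r - b
    return A, M


def wmpr(matches, n_teams):
    A, M = wmpr_system(matches, n_teams)
    # A' A is singular, so take the minimum-norm least-squares solution.
    c = np.linalg.lstsq(A, M, rcond=None)[0]
    # Report contributions relative to the average team (mean of zero).
    return c - c.mean()


print("CCWM:", np.round(ccwm(MATCHES, N_TEAMS), 1))
print("WMPR:", np.round(wmpr(MATCHES, N_TEAMS), 1))
```

Adding any constant to the `wmpr` output leaves `A @ c` unchanged, which is the non-uniqueness described in the post.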
#2
Re: Incorporating Opposing Alliance Information in CCWM Calculations
I'd love to see somebody actually quantitatively compare the predictive power of all of these different metrics across the various games. For a given year, take the matches from the first half of every competition's qualifying rounds, compute a stat for every team, and measure its ability to predict the outcome of the second half of the qualifying matches. Each of the last 4 years should give a sample size of ~7,500 matches and ~2,500 teams. EDIT: these counts are for FRC. An analysis for FTC would be interesting as well.
#3
Re: Incorporating Opposing Alliance Information in CCWM Calculations
And also look at the opposing alliance's prior record leading up to this match. How did they get their OPR?
Wait, Dean is a big data guy and he wants us to dig into all the past matches? This entire robot thing is just a ruse to get us interested in Big Data? (I worked at one time for an MLB stat company, and all of the stats are important: weather, prior events, crowd sizes, etc. We have just touched the surface of stat possibilities.)
#4
Re: Incorporating Opposing Alliance Information in CCWM Calculations
I'm having trouble understanding this sentence. Could you please clarify?
#5
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Quote:
More useful than predicting the outcome of the second HALF of the matches is taking the Day 1 matches and seeing how they predict the Day 2 matches. This is more than half the matches, so it's bound to be a *better* predictor than half the matches because of the increased sample size too.
Quote:
I don't suspect defensive ability and WLT have a very strong correlation though. I'd like to see that correlation proved before I try to "normalize" a team's OPR with this metric.
#6
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Quote:
But I spent an hour and did some "what if" runs through the data, and the result is pretty low. While low-scoring opposing alliances do make a difference, around matches 8, 9, and 10 things swing the other way. So while we all hate the "random" selections, it seems to work out in the end. I only did a small segment; with the full season available, feel free to run your own numbers.
#7
Re: Incorporating Opposing Alliance Information in CCWM Calculations
I ran the numbers on the 2014 St. Joseph event. I checked that my calculations for 2015 St. Joseph match Ether's, so I'm fairly confident that everything is correct.
Here's how each stat did at "predicting" the winner of each match.
OPR: 87.2%
CCWM: 83.3%
WMPR: 91.0%
I've attached my analysis, WMPR values, A and b matrices, along with the qual schedules for both the 2014 and 2015 St. Joe event.
#8
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Quote:
On the same data doing the "remove one match from the training data, model based on the rest of the data, use the removed match as testing data, and repeat the process for all matches" method, I got the following results:

Stdev of winning margin prediction residual
OPR : 63.8
CCWM: 72.8
WMPR: 66.3

When I looked at scaling down each of the metrics to improve their prediction performance on testing data not in the training set, the best Stdevs I get for each were:
OPR*0.9: 63.3
CCWM*0.6: 66.2
WMPR*0.7: 60.8

Match prediction outcomes
OPR : 60 of 78 (76.9 %)
CCWM: 57 of 78 (73.1 %)
WMPR: 62 of 78 (79.5 %)

Yeah! Even with testing data not used in the training set, WMPR seems to be outperforming CCWM in predicting the winning margins and the match outcomes in this single 2014 tournament (which again is a game with substantial defense).

I'm hoping to get the match results (b with red and blue scores separately) for other 2014 tournaments to see if this is a general result.

[Edit: found a bug in the OPR code. Fixed it. Updated comments. Also included the scaled down OPR, CCWM, and WMPR prediction residuals to address overfitting.]

Last edited by wgardner : 05-27-2015 at 08:37 AM.
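The "remove one match, fit on the rest, test on the removed match" procedure can be sketched roughly as below for the winning-margin model. This is only an illustration under assumptions: the toy schedule and the names `margin_system` and `leave_one_out` are hypothetical, `scale` stands in for the scale-down factors quoted above (0.9, 0.6, 0.7), and the standard deviation uses NumPy's default population form, which may differ from what was used for the numbers in this post.

```python
import numpy as np

# Hypothetical toy schedule: (red teams, blue teams, red score, blue score).
MATCHES = [
    ((0, 1, 2), (3, 4, 5), 120, 90),
    ((0, 3, 4), (1, 2, 5), 80, 110),
    ((1, 3, 5), (0, 2, 4), 95, 100),
    ((2, 4, 5), (0, 1, 3), 70, 130),
    ((0, 1, 5), (2, 3, 4), 105, 85),
    ((0, 2, 3), (1, 4, 5), 115, 75),
]
N_TEAMS = 6


def margin_system(matches, n_teams):
    """One row per match: +1 for red teams, -1 for blue teams, target R-B."""
    A = np.zeros((len(matches), n_teams))
    M = np.zeros(len(matches))
    for i, (red, blue, r, b) in enumerate(matches):
        A[i, list(red)] = 1.0
        A[i, list(blue)] = -1.0
        M[i] = r - b
    return A, M


def leave_one_out(matches, n_teams, scale=1.0):
    """Return (stdev of held-out margin residuals, fraction of winners called)."""
    A, M = margin_system(matches, n_teams)
    residuals, correct = [], 0
    for i in range(len(matches)):
        keep = np.arange(len(matches)) != i           # drop match i from training
        c = np.linalg.lstsq(A[keep], M[keep], rcond=None)[0]
        predicted = scale * (A[i] @ c)                # scaled-down prediction
        residuals.append(M[i] - predicted)
        correct += int(np.sign(predicted) == np.sign(M[i]))
    return float(np.std(residuals)), correct / len(matches)


for s in (1.0, 0.7):
    print(s, leave_one_out(MATCHES, N_TEAMS, scale=s))
```

Sweeping `scale` over a grid and keeping the value with the smallest residual stdev is one way to look for the kind of scale-down factors reported above.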
#9
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Quote:
#10
|
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Quote:
#11
|
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Iterative Interpretations of OPR and WMPR
(I found this interesting; some other folks might, and some might not.)

Say you want to estimate a team's offensive contribution to their alliance scores. A simple approach is to just compute the team's average match score/3. Let's call this estimate O(0), a vector of the average match score/3 for all teams at step 0. (/3 because there are 3 teams per alliance. This would be /2 for FTC.)

But then you want to take into account the fact that a team's alliance partners may be better or worse than average. The best estimate you have of the contribution of a team's partners at this point is the average of their O(0) estimates. So let the improved estimate be

O(1) = team's average match score - 2*average ( O(0) for a team's alliance partners )

(2*average because there are 2 partners contributing per match. This would be 1*average for FTC.)

This is better, but now we have an improved estimate for all teams, so we can just iterate this:

O(2) = team's average match score - 2*average ( O(1) for a team's alliance partners )
O(3) = team's average match score - 2*average ( O(2) for a team's alliance partners )
etc. etc.

This sequence of O(i) converges to the OPR values, so this is just another way of explaining what OPRs are.

WMPR can be iteratively computed in a similar way.

W(0) = team's average match winning margin
W(1) = team's average match winning margin - 2*average ( W(0) for a team's alliance partners ) + 3*average ( W(0) for a team's opponents )
W(2) = team's average match winning margin - 2*average ( W(1) for a team's alliance partners ) + 3*average ( W(1) for a team's opponents )
etc. etc.

This sequence of W(i) converges to the WMPR values, so this is just another way of explaining what WMPRs are.
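To make the link between the update rule and the least-squares OPR concrete, here is a small sketch, assuming a made-up 3-vs-3 schedule and hypothetical helper names (`opr_least_squares`, `opr_update`). It only checks that the least-squares OPR is left unchanged by one pass of the O(k+1) update, i.e. that it is a fixed point; it does not test how quickly, or whether, repeated passes settle for any particular schedule.

```python
import numpy as np

# Hypothetical toy schedule: (red teams, blue teams, red score, blue score).
MATCHES = [
    ((0, 1, 2), (3, 4, 5), 120, 90),
    ((0, 3, 4), (1, 2, 5), 80, 110),
    ((1, 3, 5), (0, 2, 4), 95, 100),
    ((2, 4, 5), (0, 1, 3), 70, 130),
    ((0, 1, 5), (2, 3, 4), 105, 85),
    ((0, 2, 3), (1, 4, 5), 115, 75),
]
N_TEAMS = 6


def opr_least_squares(matches, n_teams):
    """OPR as the least-squares fit of alliance score = sum of member OPRs."""
    rows, scores = [], []
    for red, blue, r, b in matches:
        for alliance, score in ((red, r), (blue, b)):
            row = np.zeros(n_teams)
            row[list(alliance)] = 1.0
            rows.append(row)
            scores.append(float(score))
    return np.linalg.lstsq(np.array(rows), np.array(scores), rcond=None)[0]


def opr_update(matches, n_teams, estimate):
    """One pass of O(k+1) = average match score - 2*average(partner O(k))."""
    score_sum = np.zeros(n_teams)
    partner_sum = np.zeros(n_teams)
    played = np.zeros(n_teams)
    for red, blue, r, b in matches:
        for alliance, score in ((red, r), (blue, b)):
            alliance_total = sum(estimate[t] for t in alliance)
            for t in alliance:
                score_sum[t] += score
                partner_sum[t] += alliance_total - estimate[t]  # the two partners
                played[t] += 1
    # partner_sum / played equals 2 * (average partner estimate) for 3-team alliances.
    return (score_sum - partner_sum) / played


opr = opr_least_squares(MATCHES, N_TEAMS)
print(np.allclose(opr_update(MATCHES, N_TEAMS, opr), opr))
```

The WMPR update in the post has the same shape, with an extra `+ 3*average` term accumulated over the three opponents in each match.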
#12
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Currently we've mostly been seeing how WMPR does at a small district event with a lot of matches per team (a best-case scenario for these stats). I wanted to see how it would do in a worse case. Here's how each stat performed at "predicting" the winner of each match in the 2014 Archimedes Division (100 teams, 10 matches/team).
OPR: 85.6%
CCWM: 87.4%
WMPR: 92.2%
EPR: 89.2%

WMPR holds up surprisingly well in this situation and outperforms the other stats. EPR does better than OPR, but worse than WMPR. I don't really like EPR, as it seems difficult to interpret. The whole idea behind using the winning margin is that the red robots can influence the blue score. Yet EPR also models bf = b1 + b2 + b3, which is counter to this.

Quote:

On another note: I've also found that it's difficult to compare WMPRs across events (whereas OPRs are easy to compare). This is because a match that ends 210-200 looks the same as one that ends 30-20. At very competitive events this becomes a huge problem. Here's an example from team 33's 2014 season.

WMPRs at each event:
MISOU: 78.9
MIMID: 37.0
MITRY: 77.8
MICMP: 29.4
ARCHI: 40.8

Anyone who watched 33 at their second district event would tell you that they didn't do as well as their first, and these numbers show that. But these numbers also show that 33 did better at their second event than at the State Championship. This is clearly incorrect: 33 won the State Championship but got knocked out in the semis at their second district event. You can see pretty clearly that the more competitive events (MSC, Archimedes) result in lower WMPRs, which makes it very difficult to compare this stat across events.

This occurs because the least-norm solution has an average of zero for every event. It treats all events as equal, when they're not. I propose that instead of having the average be zero, the average should be how many points the average robot scored at that event. (So we should add the average event score / 3 to every team's WMPR.) This will smooth out the differences between each event. Using this method, here are 33's new WMPRs.

MISOU: 106.3
MIMID: 71.7
MITRY: 112.7
MICMP: 86.0
ARCHI: 93.5

Now these numbers correctly reflect how 33 did at each event. MIMID has the lowest WMPR, and that's where 33 did the worst. Their stats at MICMP and ARCHI are now comparable to their district events.

OPR has proliferated because it's easy to understand (this robot scores X points per match). With this change, WMPR also becomes easier to understand (this robot scores and defends their opponents by X points per match). Since this adds the same constant to everybody's WMPR, it'll still predict the match winner and margin of victory with the same accuracy.

Thoughts?

Last edited by AGPapa : 05-27-2015 at 03:41 PM.
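A short sketch of the proposed shift, assuming "average event score" means the mean alliance score over all matches at the event. The team labels, ratings, scores, and the function names `add_event_offset` and `predicted_margin` are hypothetical; none of them come from the attached data.

```python
# Hypothetical zero-mean WMPRs for six teams at one event.
RATINGS = {"A": 25.0, "B": 10.0, "C": 5.0, "D": -5.0, "E": -15.0, "F": -20.0}

# Hypothetical matches: (red teams, blue teams, red score, blue score).
MATCHES = [
    (("A", "B", "C"), ("D", "E", "F"), 100, 80),
    (("A", "D", "E"), ("B", "C", "F"), 110, 70),
]


def add_event_offset(wmpr, matches, alliance_size=3):
    """Shift every WMPR by (mean alliance score at the event) / alliance_size."""
    scores = [s for _, _, red_score, blue_score in matches
              for s in (red_score, blue_score)]
    offset = sum(scores) / len(scores) / alliance_size
    return {team: value + offset for team, value in wmpr.items()}


def predicted_margin(ratings, red, blue):
    """Predicted red-minus-blue margin from per-team ratings."""
    return sum(ratings[t] for t in red) - sum(ratings[t] for t in blue)


shifted = add_event_offset(RATINGS, MATCHES)
red, blue = ("A", "B", "C"), ("D", "E", "F")
# The same constant is added to all six teams, so the margin is unchanged.
print(predicted_margin(RATINGS, red, blue), predicted_margin(shifted, red, blue))
```

Because each alliance has the same number of teams, the offset cancels in every predicted margin, which is why match predictions are unaffected.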
#13
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Quote:
I'll try to get to the verification on testing data in the next day or so.

I personally like this normalized WMPR (nWMPR?) better than EPR as the interpretation is cleaner: we're just trying to predict the winning margin. EPR is trying to predict the individual scores and the winning margin, and it weights all of the residuals the same. It's a bit more ad hoc. On the other hand, one could look into which weightings give the best overall result in terms of whatever measure folks care about.

I am still most interested in how well a metric predicts the winning margin of a match. (In my FTC Android apps I also hope to include an estimate of "probability of victory" from this, which incorporates the expected winning margin and the standard deviation of that expectation, along with the assumption of a normally distributed residual.) I'm also interested in using these as possible scouting/alliance selection aids (especially for lower picks). But other folks may be interested in using them for other things.
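The "probability of victory" idea can be written down in a few lines, assuming the actual margin equals the expected margin plus a zero-mean normal residual with a known standard deviation. The function name `win_probability` and the example numbers are hypothetical; this is a sketch of the stated assumption, not the code from any app.

```python
import math


def win_probability(expected_margin, residual_stdev):
    """P(actual margin > 0) when the residual is normal with the given stdev."""
    if residual_stdev <= 0:
        raise ValueError("residual_stdev must be positive")
    z = expected_margin / residual_stdev
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example: a 20-point expected margin with a 60-point residual stdev.
print(round(win_probability(20.0, 60.0), 3))
```

An expected margin of zero gives exactly 0.5, and swapping the alliances (negating the margin) gives the complementary probability.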
#14
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Here's a generalized perspective.
Let's say you pick r1, r2, r3, b1, b2, b3 to minimize the following error:

E(w) = w*[ (R-B) - ( (r1+r2+r3)-(b1+b2+b3) ) ]^2 + (1-w) * [ (R-(r1+r2+r3))^2 + (B-(b1+b2+b3))^2 ]

If w=1, you're computing the WMPR solution (or any of the set of WMPR solutions with unspecified mean).
If w=0, you're computing the OPR solution.
If w=1-small epsilon, you're computing the nWMPR solution (as the relative values will be the WMPR but the mean will be selected to minimize the second part of the error, which will be the mean score in the tournament).
If w=0.5, you're computing the EPR solution.

I wonder how the various predictions of winning margin, score, and match outcomes change as w goes from 0 to 1?
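One way to explore this numerically is to turn E(w), summed over matches, into an ordinary least-squares problem by stacking three rows per match, weighted by sqrt(w) and sqrt(1-w). The sketch below assumes a made-up schedule and a hypothetical function name `blended_ratings`; at w=1 the system is rank-deficient, so `np.linalg.lstsq` returns the minimum-norm member of the solution set.

```python
import numpy as np

# Hypothetical toy schedule: (red teams, blue teams, red score, blue score).
MATCHES = [
    ((0, 1, 2), (3, 4, 5), 120, 90),
    ((0, 3, 4), (1, 2, 5), 80, 110),
    ((1, 3, 5), (0, 2, 4), 95, 100),
    ((2, 4, 5), (0, 1, 3), 70, 130),
    ((0, 1, 5), (2, 3, 4), 105, 85),
    ((0, 2, 3), (1, 4, 5), 115, 75),
]
N_TEAMS = 6


def blended_ratings(matches, n_teams, w):
    """Minimise the sum of E(w) over matches via sqrt-weighted stacked rows."""
    w_margin, w_score = np.sqrt(w), np.sqrt(1.0 - w)
    rows, targets = [], []
    for red, blue, red_score, blue_score in matches:
        red_row = np.zeros(n_teams)
        red_row[list(red)] = 1.0
        blue_row = np.zeros(n_teams)
        blue_row[list(blue)] = 1.0
        # Winning-margin term, weight w.
        rows.append(w_margin * (red_row - blue_row))
        targets.append(w_margin * (red_score - blue_score))
        # Individual alliance-score terms, weight (1 - w) each.
        rows.append(w_score * red_row)
        targets.append(w_score * red_score)
        rows.append(w_score * blue_row)
        targets.append(w_score * blue_score)
    return np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)[0]


for w in (0.0, 0.5, 1.0):
    print(w, np.round(blended_ratings(MATCHES, N_TEAMS, w), 1))
```

Feeding each set of ratings into a held-out prediction test for several values of w would be one way to answer the question posed above.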
#15
Re: Incorporating Opposing Alliance Information in CCWM Calculations
Quote:
Again, I like it because it is one number instead of two numbers. I like it because it has a better chance of predicting the outcome regardless of the game, rather than OPR being good for some games and WMPR being good for some other games.