OPR vs Record at championship
MrRiedemanJACC
27-04-2013, 07:53
One thing that has amazed me so far is the discrepancy between OPR and teams' records. At the Michigan State Championship it was a shoot-out in most matches, and OPR matched your ranking fairly well. From what I've seen so far, that's not the case at St. Louis. But then again, I am not there.
I can think of two reasons:
1) With 100 teams per division, the field is not as deep as Michigan's.
2) There are only 8 matches per team, so match difficulty is not equal; a lot is left to the "luck of the draw."
So my questions are:
1) do you agree with these observations?
2) is it that way every year because the issues are inherent to the way the system is set up?
3) Am I biased because I am in Michigan? :)
8 matches to sort 100 teams is a much coarser function than 12 matches sorting 64.
Abhishek R
27-04-2013, 11:16
Well, several teams have stepped up their game since their last regional, so sometimes it's not "technically" a discrepancy, just problems getting fixed.
Lil' Lavery
27-04-2013, 11:31
How well does one metric poor at evaluating robot performance correlate to another metric poor at evaluating robot performance?
How well does one metric poor at evaluating robot performance correlate to another metric poor at evaluating robot performance?
Poorly?
Ivan Malik
27-04-2013, 13:02
I forgot just how bad 8 matches is at sorting... add in the 40ish extra teams and it's even worse... Smaller divisions are a must in the future.
For those who are way better at statistics than I am, would doubling the number of divisions, but halving the teams in each division result in better sorting? Same number of matches (more would be helpful) but a smaller group to sort...
You could keep the same number of fields; heck there could be two "pools" per field and have the winners of the pool elims play each other to determine the field champion. The field champion would then represent the field on Einstein.
The qualification ranking used in FRC is pretty poor. Even if you had some very good mathematical evaluation of robot performance (which in certain years, including this year, OPR is), it wouldn't correlate well because the standings are very often not a good evaluation of robot performance.
So really, there's a corollary to Lil Lavery's post:
-A bad performance indicator (OPR in his opinion) doesn't correlate well with another bad performance indicator (standard qual rankings with 8 matches played)
-Also, a good performance indicator (OPR in others' opinions) doesn't correlate well with a bad performance indicator.
Given the definition of OPR: "the average number of points a robot's presence adds to its Alliance's score", the real question about OPR is not whether it predicts final standings, but rather whether it accurately predicts an upcoming match score given the input of matches played before.
I forgot just how bad 8 matches is at sorting... add in the 40ish extra teams and it's even worse... Smaller divisions are a must in the future.
For those who are way better at statistics than I am, would doubling the number of divisions, but halving the teams in each division result in better sorting? Same number of matches (more would be helpful) but a smaller group to sort...
You could keep the same number of fields; heck there could be two "pools" per field and have the winners of the pools play each other to determine the field champion. The field champion would then represent the field on Einstein.
The answer to this is to consider what would happen if you took it to the extreme: what if you had pools of 6 robots that played 8 matches against each other, and they had a single round playoff of one alliance against another to determine the pool winner, then gradually played against the other pool winners, tournament-style?
The answer: as you reduce the pool size, you reduce the chance of the objectively "best" robot actually winning the entire competition and increase the effect of luck - the best robot might have 4-5 mediocre robots in their pool to pick from, and so can't win the whole thing against an alliance of 3 medium robots from another pool. Put another way: the pool results would be very accurate (the best robot would tend to win all the time), but the overall results would be less accurate since the best robot might not have good robots to take to inter-pool play with it.
The main way to make championships more "accurate" is to play more matches, which really comes down to field-hours: you either need more fields for the same hours, or to simply use the 4 fields you've got for longer hours in the days you've got. Or you need to use Einstein to play more matches.
artdutra04
27-04-2013, 13:54
IMHO, the 2013 ranking system is the least problematic ranking system ever devised by FIRST. Very straightforward, entirely based upon robot performance, no weirdness from coopertition or losing scores.
The only way this system can be improved is with more matches. The more matches a team has, the better the odds are that luck will filter out and skill will filter in.
I forgot just how bad 8 matches is at sorting... add in the 40ish extra teams and it's even worse... Smaller divisions are a must in the future.
For those who are way better at statistics than I am, would doubling the number of divisions, but halving the teams in each division result in better sorting? Same number of matches (more would be helpful) but a smaller group to sort...
You could keep the same number of fields; heck there could be two "pools" per field and have the winners of the pool elims play each other to determine the field champion. The field champion would then represent the field on Einstein.
Hmmm... this sounds like it could be one of those statistical traps, but I've thought it over a couple of different ways and I can't think of any way it would improve the variability. Not to mention that, even if it did, it would still weaken the elimination alliances...
When I was in the TBA chat, we talked about what I thought was a really cool idea: Have 6 divisions, with the same number of teams (I think there's room in the dome for 2 more fields, don't you?) That would dramatically increase the number of matches, and make for a more regional-like environment. But what do you do with the 6 division winners at Einstein, you ask? This was the really good idea someone suggested: have a round-robin to decide the 2 alliances for the finals. Yes, it would take more time - 13-15 matches before the Einstein finals, instead of 4-6. But if you cut down on the hour before Einstein and all the speeches, you could almost fit them without expanding the schedule at all... not to mention that, with the smaller divisions, you could start eliminations earlier. Any feedback? I think a round-robin would be really fun...
shrkeatsman
27-04-2013, 18:01
I think that if there is an OPR, there should also be a DPR (defensive power rating),
because my team was almost always playing defense, and we were put up against teams that we had a hard time playing defense against. So our ranking definitely didn't show our potential.
I think that if there is an OPR, there should also be a DPR (defensive power rating)
If you want a number for DPR, there already is one. The DPR analogy to OPR is given by the following simple relation:
DPR = OPR - CCWM
But it's not very useful, for reasons discussed elsewhere (http://bit.ly/187BLBf). fixed broken link
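For anyone curious how these numbers come about in practice, here is a minimal sketch (my own illustration, not the exact code any of the published calculators use) of computing OPR, CCWM (calculated contribution to winning margin), and DPR from qualification results with least squares. The match data structure is an assumption made up for the example.

import numpy as np

def opr_ccwm_dpr(matches, teams):
    """matches: list of (red_teams, blue_teams, red_score, blue_score).
    Returns {team: (OPR, CCWM, DPR)}."""
    idx = {t: i for i, t in enumerate(teams)}
    rows, scores, margins = [], [], []
    for red, blue, red_score, blue_score in matches:
        for alliance, own, opp in ((red, red_score, blue_score),
                                   (blue, blue_score, red_score)):
            row = np.zeros(len(teams))
            for t in alliance:
                row[idx[t]] = 1.0
            rows.append(row)
            scores.append(own)         # OPR fits presence against the alliance's own score
            margins.append(own - opp)  # CCWM fits presence against the winning margin
    A = np.array(rows)
    opr, *_ = np.linalg.lstsq(A, np.array(scores), rcond=None)
    ccwm, *_ = np.linalg.lstsq(A, np.array(margins), rcond=None)
    dpr = opr - ccwm                   # the relation quoted above
    return {t: (opr[i], ccwm[i], dpr[i]) for t, i in idx.items()}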
shrkeatsman
27-04-2013, 20:37
If you want a number for DPR, there already is one. The DPR analogy to OPR is given by the following simple relation:
DPR = OPR - CCWM
But it's not very useful, for reasons discussed elsewhere ("DPR" site:chiefdelphi.com).
Thank you, but can you translate CCWM for me?
thanks
DampRobot
28-04-2013, 15:00
How well does one metric poor at evaluating robot performance correlate to another metric poor at evaluating robot performance?
I know your user title has said something to this effect for a while, but I really have to disagree. OPR is never perfect, but I found that it worked pretty well this year. I used it to predict the Saturday seedings at SVR on Friday night with a fair degree of accuracy. It was a pretty good measurement of how much we contributed to our alliances, and we found that it represented other teams' performance fairly well too.
Of course, OPR can't predict exactly how many points you (or anyone else) will score in a given match. But in my experience, it does provide a fairly accurate way of predicting which alliance will win a given match.
Why did champs seeding not correlate well to OPR? Because 8 matches is way too small of a statistical sample.
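To put a rough number on how much the match count matters, here is a toy Monte Carlo sketch built entirely on my own assumptions (made-up team strengths, normally distributed scoring noise, ranking by wins only), comparing how well an 8-match schedule with 100 teams and a 12-match schedule with 64 teams seed the objectively best robot:

import random

def average_rank_of_best(num_teams, matches_per_team, noise=25.0, trials=200):
    """Average seed (1 = top) earned by the objectively strongest team."""
    ranks = []
    for _ in range(trials):
        strengths = [random.gauss(30, 10) for _ in range(num_teams)]
        wins = [0] * num_teams
        plays = [0] * num_teams
        # Crude scheduler: keep drawing the six least-played teams into a match.
        while min(plays) < matches_per_team:
            pool = sorted(range(num_teams), key=lambda t: plays[t])[:6]
            random.shuffle(pool)
            red, blue = pool[:3], pool[3:]
            red_score = sum(strengths[t] for t in red) + random.gauss(0, noise)
            blue_score = sum(strengths[t] for t in blue) + random.gauss(0, noise)
            for t in (red if red_score > blue_score else blue):
                wins[t] += 1
            for t in pool:
                plays[t] += 1
        best = max(range(num_teams), key=lambda t: strengths[t])
        seeding = sorted(range(num_teams), key=lambda t: wins[t], reverse=True)
        ranks.append(seeding.index(best) + 1)
    return sum(ranks) / len(ranks)

print("100 teams, 8 matches:", average_rank_of_best(100, 8))
print("64 teams, 12 matches:", average_rank_of_best(64, 12))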
PayneTrain
28-04-2013, 15:08
How well does one metric poor at evaluating robot performance correlate to another metric poor at evaluating robot performance?
I can tell you 422 did the opposite of its OPR projection of 2-6 and posted a 6-2 record, and everyone will agree that CMP was, on the whole, nowhere near as good as DCR; the actual robot scores show that.
OPR is becoming less reliable as teams in CMP divisions are coming in with pre-CMP match differences of up to 50.
I can tell you 422 did the opposite of its OPR projection of 2-6 and posted a 6-2 record, and everyone will agree that CMP was, on the whole, nowhere near as good as DCR; the actual robot scores show that.
OPR is becoming less reliable as teams in CMP divisions are coming in with pre-CMP match differences of up to 50.
Another reason for that is that OPR doesn't compare well across events. Suppose regional A had a ton of defense, and regional B had wide-open scoring. Two robots of equal quality could each go to their regionals and have vastly different OPRs because at regional A, nobody could score well and at regional B, everyone was free to score. Projections based on OPR from each regional might show that the teams that went to regional A were much worse than the teams from regional B, but they're actually pretty strong.
It also ignores robot, team, strategy, and luck upgrades that might occur between regionals and championships.
dtengineering
28-04-2013, 18:12
I'm okay with the qualifying system as it exists... I think it matches the operational requirements of the tournament with the goal of correctly sorting the teams. It's not perfect, of course, but it is as good as a round-robin is likely to get.
Round robin tournaments aren't the only way to organize a tournament, however. I think a Swiss-system (http://en.wikipedia.org/wiki/Swiss-system_tournament) style tournament could be organized where all the teams start at the same level, but either advance or drop down based on their record.
This way, while teams play their first match in random alliances, it would quickly work out so that top teams ended up playing against/with other top teams, while weaker teams would play against/with other weak teams.
It would take some planning, and I'm certainly not saying that change is needed... I'm just pointing out that change is possible.
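To make the idea concrete, here is a very rough sketch of how one Swiss-style round might group teams with similar records into matches. The pairing logic is purely hypothetical (not anything FIRST uses), and it leaves the 3-vs-3 split and the leftover teams to a real scheduler:

import random

def swiss_round(records, teams_per_match=6):
    """records: dict of team -> wins so far. Returns groups of six teams with
    similar records; each group would be split into a 3-vs-3 match."""
    ordered = sorted(records, key=lambda t: records[t], reverse=True)
    groups = []
    for i in range(0, len(ordered) - teams_per_match + 1, teams_per_match):
        group = ordered[i:i + teams_per_match]
        random.shuffle(group)  # randomize alliances within the same strength band
        groups.append(group)
    # Teams left over at the end would need a bye or filler match in practice.
    return groups

# Round 1: everyone is 0-0, so the pairing is effectively random.
records = {team: 0 for team in range(1, 101)}
matches = swiss_round(records)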
Jason
Citrus Dad
29-04-2013, 17:19
The qualification ranking used in FRC is pretty poor. Even if you had some very good mathematical evaluation of robot performance (which in certain years, including this year, OPR is), it wouldn't correlate well because the standings are very often not a good evaluation of robot performance.
So really, there's a corollary to Lil Lavery's post:
-A bad performance indicator (OPR in his opinion) doesn't correlate well with another bad performance indicator (standard qual rankings with 8 matches played)
-Also, a good performance indicator (OPR in others' opinions) doesn't correlate well with a bad performance indicator.
Given the definition of OPR: "the average number of points a robot's presence adds to its Alliance's score", the real question about OPR is not whether it predicts final standings, but rather whether it accurately predicts an upcoming match score given the input of matches played before.
Based on my statistical analysis, the OPR this year was a better predictor than last year's. I was surprised, but teams improved more over the course of last season than they did this season. I haven't yet run a full analysis of this year's predictive capability, but last year, using average OPRs, I could predict qualifying match outcomes 75% of the time. I switched to max OPRs this year (and made some adjustments based on our own scouting data) and I think the prediction rate improved.
Standings can only be predicted by totaling the OPRs for all alliances in all matches and computing the resulting win-loss record. Remember, this is a team-of-teams sport, not an individual one, so a single team's OPR won't carry many matches.
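As an illustration of that standings calculation, here is a back-of-the-envelope sketch that totals alliance OPRs match by match and accumulates a projected record. The schedule and OPR data structures are my own assumptions; the real inputs would come from event data:

def project_standings(schedule, opr):
    """schedule: list of (red_teams, blue_teams); opr: dict of team -> OPR.
    Returns teams sorted by projected wins."""
    record = {t: [0, 0] for t in opr}            # team -> [wins, losses]
    for red, blue in schedule:
        red_total = sum(opr[t] for t in red)
        blue_total = sum(opr[t] for t in blue)
        # Call the alliance with the higher OPR total the projected winner.
        winners, losers = (red, blue) if red_total > blue_total else (blue, red)
        for t in winners:
            record[t][0] += 1
        for t in losers:
            record[t][1] += 1
    return sorted(record.items(), key=lambda kv: kv[1][0], reverse=True)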
Citrus Dad
29-04-2013, 17:24
I can tell you 422 went the opposite of the OPR projection of 2-6 and posted a 6-2 record and everyone will agree that CMP was by far nowhere near as good on the whole as DCR, and the actual robot scores show that.
Note that defense is much more important at the championships than at any regional. Robots are much more beaten up after 8 matches than after 10-12 at regionals. That means that individual robot scores will drop.
minhnhatbui
29-04-2013, 23:41
Why did champs seeding not correlate well to OPR? Because 8 matches is way too small of a statistical sample.
And also, at most regionals I think a team is more likely to play with or against every other team at least once (9 to 12 matches with 35 to 60 teams, depending on the regional), which makes the OPR more accurate than in a championship division, where you get to play with or against only 40 to 45 of the 100 teams (8 or 9 matches, depending on the year).
OPR is meant to be more effective if a team plays against or with every other team at least once.
PayneTrain
29-04-2013, 23:44
Note that defense is much more important at the championships than at any regional. Robots are much more beaten up after 8 matches than after 10-12 at regionals. That means that individual robot scores will drop.
On the flip side of the coin, some teams have their robots at peak performance by championships.
And also, at most regionals I think a team is more likely to play with or against every other team at least once (9 to 12 matches with 35 to 60 teams)
In each match, a team plays with 2 other teams and against 3 other teams. So in 12 matches, any given team could play against at most 36 other teams, and with at most 24 other teams.
But this ideal is rarely realized.
Here's an analysis of the qual match scheduling in the 2013 season:
http://www.chiefdelphi.com/media/papers/2822
MamaSpoldi
30-04-2013, 14:27
Note that defense is much more important at the championships than at any regional. Robots are much more beaten up after 8 matches than after 10-12 at regionals. That means that individual robot scores will drop.
REALLY?? We found far, FAR less defense at Championships than at either of the Regionals we attended this year (BAE - week 1, and CT - week 5).
OrangeCataclysm
30-04-2013, 15:33
REALLY?? We found far, FAR less defense at Championships than at either of the Regionals we attended this year (BAE - week 1, and CT - week 5).
I would say more defense was played in elims, especially in Curie. My team can attest to that.
MamaSpoldi
02-05-2013, 12:05
I would say more defense was played in elims, especially in Curie. My team can attest to that.
I didn't really get to see division elims as I was in the pit helping pack up, but I was surprised at the general lack of defense on Einstein.
OrangeCataclysm
02-05-2013, 16:39
I didn't really get to see division elims as I was in the pit helping pack up, but I was surprised at the general lack of defense on Einstein.
I saw more defense in Curie vs. Archimedes than in any other matchup. To be honest, I wasn't expecting a whole lot of defense. Once the full-court shooters stopped shooting full court or were eliminated, much of the defense ceased.
Citrus Dad
03-05-2013, 15:07
I would say more defense was played in elims, especially in Curie. My team can attest to that.
We selected 862 in large part for its defensive capabilities and they held off 67. 4814 demonstrated in Curie how important defense can be.
Citrus Dad
03-05-2013, 15:09
REALLY?? We found far, FAR less defense at Championships than at either of the Regionals we attended this year (BAE - week 1, and CT - week 5).
Maybe you needed to play in Curie. 4814 made it to the division finals through defense and counter defense. We beat 2056 through defense.
minhnhatbui
03-05-2013, 15:10
I didn't really get to see division elims as I was in the pit helping pack up, but I was surprised at the general lack of defense on Einstein.
As my students pointed out: defense was well planned on Curie, but it went full speed to the landfill on the Einstein field because, well, playing in front of 20,000 spectators is more than a little stressful. I tend to agree with them on that one. :rolleyes:
Maybe you needed to play in Curie. 4814 made it to the division finals through defense and counter defense. We beat 2056 through defense.
Curie was great, but I didn't see anything in terms of defense there that hadn't happened at other events. Similar D and counter-D are exceedingly common this year (for good reason).
Citrus Dad
19-05-2013, 15:01
Note: I was reviewing the 2834 database and think I found that the Championship OPRs are in error. The sums of the individual components often do not add up to the Total. (3824's in Curie is off by 32.) A quick scan of the regionals finds in some cases no deviations whatsoever and <2 pts maximum in others. I suggest going back and recomputing the OPRs.
Note: I was reviewing the 2834 database and think I found that the Championship OPRs are in error. The sums of the individual components often do not add up to the Total. (3824's in Curie is off by 32.) A quick scan of the regionals finds in some cases no deviations whatsoever and <2 pts maximum in others. I suggest going back and recomputing the OPRs.
redirect here:
http://www.chiefdelphi.com/forums/showthread.php?p=1276030#post1276030