National Rankings

I have been gathering all of the Standings data in a database over the last few weeks of regionals and I would like to build a National Standings page, just like the regional standings page, but aggregating all the standings data together.

Some of the fields to include are pretty obvious, but I am having trouble figuring out how best to present some of the fields when teams play at more than one regional, and I am hoping a statistics genius out there might lend a hand…

Rank – Just like the regional, sort by QS, RS, then MP
Team – Another no brainer, one entry per team
Wins – Sum of all wins
Losses – Sum of all losses
Ties – Sum of all ties
Played – Sum of all matches played

Now here is where I get a bit lost…
QS – Can’t just do a sum because the more regionals, the higher this value would be. Maybe a weighted average? How best to calculate this?
RS – Same thoughts as QS, maybe a weighted average?
MP – I think this is easy, MAX(MP) from all regionals the team played

I would also add a new column…
Regionals – Number of regionals where the team competed.

Any other data I should include?

Thanks in advance =:o

I am not sure how to normalize the data per regional for just wins and losses. The teams that have participated in three regionals have more opportunity for wins and losses. One of the things that I have done is look at the average score per match.

If one team consistently has more points in every match no matter who the alliance partners happen to be, we might be able to assume that this was a result of a combination of coaching, strategy, teamwork, and scoring. For example, at the LA Regional, using the Twitter data, one team averaged about 85 points while the next teams were grouped at about 65. Call this an alliance score rating. Just some thoughts…
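The alliance score rating described above can be sketched as a simple per-team average. This is a minimal sketch, assuming the match data is a list of (team, alliance_score) pairs with one entry per team per match; the exact shape of the Twitter data feed is an assumption here.

```python
from collections import defaultdict

def alliance_score_rating(matches):
    """Average alliance score per team across all of its matches.

    `matches` is a list of (team, alliance_score) pairs, one entry per
    team per match -- this data shape is an assumption, not the actual
    Twitter feed format.
    """
    totals = defaultdict(lambda: [0, 0])  # team -> [score_sum, match_count]
    for team, score in matches:
        totals[team][0] += score
        totals[team][1] += 1
    return {team: s / n for team, (s, n) in totals.items()}

# Made-up example scores:
ratings = alliance_score_rating([(254, 85), (254, 90), (1114, 60), (1114, 70)])
print(ratings)  # {254: 87.5, 1114: 65.0}
```

A team that keeps its rating high regardless of partners would stand out in this column the way the 85-point team did at LA.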

Normalize it. Sum the QS values from each regional and divide by the total number of matches played to get a value from 0 to 2. Multiply by a large constant (e.g. 100) if you don’t like fractions.

RS – Same thoughts as QS, maybe a weighted average?

The Ranking Score is already an average, so yeah. Multiply by the number of matches in the regional, add, then divide by the total number of matches played.
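Putting both suggestions together, here is a minimal sketch for one team's national QS and RS. It assumes each regional contributes a (qs_total, rs_avg, matches_played) tuple; the field names and the 100× scale factor are taken from the suggestions above, not from any official formula.

```python
def national_qs_rs(regionals, scale=100):
    """Normalize QS and RS across regionals for one team.

    `regionals` is a list of (qs_total, rs_avg, matches_played) tuples,
    one per regional the team attended -- the tuple layout is an
    assumption for this sketch.
    """
    total_matches = sum(m for _, _, m in regionals)
    # QS: raw sum over matches played gives a 0..2 value; scale if desired.
    qs_norm = sum(qs for qs, _, _ in regionals) / total_matches * scale
    # RS is already a per-match average, so weight each regional's RS
    # by its match count before re-averaging.
    rs_norm = sum(rs * m for _, rs, m in regionals) / total_matches
    return qs_norm, rs_norm

# A team that went 7-2-0 (QS 14) in 9 matches, then 5-4-0 (QS 10) in 9:
qs, rs = national_qs_rs([(14, 45.0, 9), (10, 60.0, 9)])
# qs is about 133.3 on the 0-200 scale; rs is 52.5
```

The weighted RS average keeps a 12-match regional from counting the same as a 9-match one, which a plain mean of the two RS values would do.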

Even with normalizing, I don’t think the results are comparable. A win at the Midwest regional with its pile of top-end teams and multiple matches over 100 points is worth more than a win at, say, Waterloo. A team that went 4-4-0 at midwest might go 8-2-0 at Waterloo because of the lower level of competition there.

You could try ranking teams by OPR. That would be a bit more comparable between regionals.
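For anyone who wants to try the OPR route, the standard formulation is a least-squares fit: build a matrix with one row per alliance per match, a 1 in each column for a team on that alliance, and solve against the alliance scores. This is a generic sketch of that technique, not anyone's published OPR code; the toy two-team alliances are just to keep the example solvable by hand.

```python
import numpy as np

def opr(match_alliances, alliance_scores, teams):
    """Least-squares OPR: solve A x ~= b, where each row of A marks the
    teams on one alliance and b holds that alliance's score."""
    index = {t: i for i, t in enumerate(teams)}
    A = np.zeros((len(match_alliances), len(teams)))
    for row, alliance in enumerate(match_alliances):
        for t in alliance:
            A[row, index[t]] = 1.0
    x, *_ = np.linalg.lstsq(A, np.asarray(alliance_scores, float), rcond=None)
    return dict(zip(teams, x))

# Toy event: two-team alliances (real FRC alliances have three teams).
teams = ["A", "B", "C"]
alliances = [("A", "B"), ("B", "C"), ("A", "C")]
scores = [50, 40, 60]
print(opr(alliances, scores, teams))  # A: 35, B: 15, C: 25
```

Running it over every qualification match nationwide, instead of one event at a time, is what a national OPR table would amount to; the caveat about different opponent pools still applies.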

What do you need your statistics to show? What will they be used for? Statistics for the sake of statistics usually means that there is some bias, hidden or invalid data, and/or bad predictions on the outcome of the statistics. In FRC, a team that plays at more than 1 regional is usually better than a team that plays at only 1 regional. In addition (as Bongle just mentioned) you have no strength of schedule weighted into any of this. Competition may be fierce with magnificent strategies and robots at one event, whereas robots at another event may barely move and the one that simply works the best wins.

When considering how to rank nation-wide OPR statistics, I had to consider the end application for the way I want to use the statistics. Right now, I want to somewhat predict every match in Atlanta based upon OPR, CCWM, and SOS statistics. So I decided to use only the data from a team’s most recent regional, since many teams upgrade before Atlanta. However, if I were to see ‘what robot is better’ for the whole year (useful for FF picks), I would have to factor in all of a team’s competitions. If the team was terrible at one regional then upgraded for the next regional and was great, that should factor in to my predictions for next year’s teams’ performance.

The ranking score, which is based on the losing alliance’s score, is in fact a strength of schedule rating. The higher it is, the tougher your schedule. Last year, if you had an RS of 80+, you had a very, very tough schedule. Even IRI only had one or two teams with it that high…one of which wound up winning the event.

Is this being used for serious ranking, or just fun?

QS is quite simple: win percentage. QS is a function of W-L-T anyway, so create a win% (with ties counting as .5 wins), and it would accomplish the same thing.
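As a one-liner, the win% suggestion looks like this (the only assumption is the ties-as-half-a-win convention stated above):

```python
def win_pct(wins, losses, ties):
    """Win percentage with each tie counted as half a win."""
    played = wins + losses + ties
    return (wins + 0.5 * ties) / played if played else 0.0

print(win_pct(7, 1, 1))  # 7.5 / 9, about 0.833
```

Because it divides by matches played, a three-regional team and a one-regional team land on the same 0-to-1 scale automatically.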

As a serious ranking, it simply isn’t feasible. No currently existing metric, whether official FIRST or Chief Delphi-created, can give accurate rankings from different events (even ignoring the issues of sample size). Even OPR and DPRs are skewed by different opponents (a team is going to score a lot more at an event with more “easy targets”).
If you could somehow create a complex algorithm based on teams who competed at multiple events to create a standard across those events, then maybe. But even then you run into problems with teams who change their robot, or otherwise get better or worse, from event to event. Israel literally has no crossover teams to be used, and many events have very few teams who compete at other regionals.

What I am trying to accomplish is the equivalent of the Regional ranking page for each regional but aggregating data from all the regionals.

If each team only competed in one regional, the task would be easy, I would just aggregate all the data. I need to deal with teams that compete in more than one regional, and I am leaning towards Alan’s suggestion of normalization.

I’m not trying to correct for other variables such as alliance strength that the OPR type of algorithm would do, as far as I know ;>

And for the question of whether this is for fun or serious business, well, it’s always for fun and I hope it produces something useful or interesting for others.

Even if each team only competed at one regional, they would still play different numbers of matches, especially since some go to eliminations and others don’t, unless you’re just basing this on qualifications. But even so, a team at Florida played 9 qualification matches, while a team in New Orleans may have played 12 or 13 (not sure of that number). I think an OPR over the whole nation would give you the closest thing to an accurate ranking system. It would not be perfect, but it would be close.

I’m not intending to tear down your idea, because it’s a good one, but team rankings aren’t really a great way to measure success or to value one robot over another…

We already have the OPR rankings that get done which are quite possibly the best possible way to rank teams based solely on match scores.

Yeah, I agree…a team could compete in 3 events and have a much higher win-loss record than a team that only competed in 1 event. This doesn’t mean the other team is bad; they just haven’t competed in as many matches. I think it needs to be based more on how much each team can score and defend, rather than on win-loss record.

I disagree that RS can be used to show the strength of schedule, as a team that plays good defense will have a lower RS than one which does not.

So here is a question that may stimulate more discussion…

What does the Standings page on the USFirst.org web site show? It is the same data as the display in the pits during the regional.

I know technically what it is showing, but I mean in relation to this discussion.

Some here say that OPR gives a better idea of the overall strength of the team and thus provides better scouting info. That makes sense.

But yet there is always a crowd around the display in the pits to see what position in the regional standing the team is in. Would it be as interesting to see the same information across all regionals?

I’m into the SOS (strength of schedule) discussion. For example, at one recent regional, one team ended up ranked 6th and their wins included 3 no-shows. Another highly ranked team had no ability to score, just drive and deliver Empty Cells. (This is not a slam against any team, it’s a mathematical discussion.)

So, what about this for SOS:
Using historical OPR, RS, or whatever stat is most relevant, go back and give alliances theoretical score predictions. If the Blue Alliance of 1-2-3 has an OPR of 100 versus the Red Alliance of 4-5-6 and their OPR of 20, 1-2-3 should win by 80. That would be an “easy” match for 1-2-3 and a “hard” match for 4-5-6. Each alliance would get a 1 for “easy”, 2 for “medium”, or 3 for “hard”; then average those ratings over a team’s matches for its SOS.

You could go further, by using the difference of the Alliance OPR and the actual match score difference.
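A rough sketch of that bucketing scheme, assuming you already have an OPR table; the easy/hard margin thresholds of ±20 points are arbitrary placeholders, not anything from the post above.

```python
def sos(team_matches, opr, easy=20, hard=-20):
    """Strength-of-schedule sketch from predicted OPR margins.

    `team_matches` is a list of (own_alliance, opposing_alliance) team
    tuples for one team; `opr` maps team -> OPR. The +/-20 point
    thresholds for "easy" and "hard" are placeholder assumptions.
    """
    ratings = []
    for own, opp in team_matches:
        margin = sum(opr[t] for t in own) - sum(opr[t] for t in opp)
        if margin >= easy:
            ratings.append(1)   # predicted easy win
        elif margin <= hard:
            ratings.append(3)   # predicted hard match
        else:
            ratings.append(2)   # medium
    return sum(ratings) / len(ratings)

# Toy OPRs and two lopsided matches for team "A":
table = {"A": 100, "B": 0}
print(sos([(("A",), ("B",)), (("B",), ("A",))], table))  # (1 + 3) / 2 = 2.0
```

The refinement in the previous paragraph would replace the three fixed buckets with the raw difference between predicted margin and actual margin, which avoids picking thresholds at all.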

Just thinkin’, maybe I’ll get around to doing it.