Statistical Predictions


Greg Marra
25-04-2014, 00:39
Has anyone successfully run predictive statistical models (a la Nate Silver) to predict later-season competitions based on earlier-season results?

I attempted to build a simple model that predicted Curie matches by summing, for each alliance, each team's best event OPR from earlier in the season. It got only 21 of 81 Thursday matches right. You'd be better off taking the opposite of my model by a long shot!
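
For concreteness, here's a minimal Python sketch of that model (team numbers and OPR values are made up; a real version would pull each team's best event OPR from season data):

# Minimal sketch of the max-event-OPR model described above.
best_opr = {254: 95.0, 1114: 102.0, 2056: 88.0,
            118: 76.0, 469: 81.0, 67: 90.0}

def predict_winner(red, blue):
    """Pick the alliance with the larger sum of best event OPRs."""
    red_sum = sum(best_opr[t] for t in red)
    blue_sum = sum(best_opr[t] for t in blue)
    return "red" if red_sum >= blue_sum else "blue"

print(predict_winner(red=[254, 118, 469], blue=[1114, 2056, 67]))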

efoote868
25-04-2014, 01:06
Team age seems to be doing a little better than a coin flip: 44/79 (ignoring the tie in match 39).

Looking at the summary statistics of the match score differences, Curie has the closest matches:


Division     Correct   Q1      Q2     Q3       AVE    STDEV
Archimedes   51        39      63     89.25    70.3   51.1
Curie        44        19      41.5   74.5     61.7   58.8
Galileo      51        29      55     101.25   71.9   55.8
Newton       49        25.75   47     102.5    66.3   53.9


(Quartiles, average, and standard deviation are taken from the differences of match scores; "Correct" is the number of matches the team-age pick got right in each division.)
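
For reference, here's a small Python sketch of how those summary statistics can be computed (the margins below are placeholders; real numbers would come from each division's match results):

import numpy as np

# Differences of match scores for one division (placeholder values).
margins = np.array([12, 3, 45, 88, 20, 67, 5, 110, 38, 74])

q1, q2, q3 = np.percentile(margins, [25, 50, 75])
# ddof=1 gives the sample standard deviation; the table may use the
# population version instead.
print(f"Q1={q1:.2f}  Q2={q2:.2f}  Q3={q3:.2f}  "
      f"AVE={margins.mean():.1f}  STDEV={margins.std(ddof=1):.1f}")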

Basel A
25-04-2014, 09:29
Has anyone successfully run predictive statistical models (a la Nate Silver) to predict later-season competitions based on earlier-season results?

I attempted to build a simple model that predicted Curie matches by summing, for each alliance, each team's best event OPR from earlier in the season. It got only 21 of 81 Thursday matches right. You'd be better off taking the opposite of my model by a long shot!

I've been doing it for several years, and this year it's been the worst by far. The game could hardly be more interdependent, which makes any OPR-based predictions bunk. Also, I should note that I've generally found Most Recent OPR to be a better predictor than Max OPR.

MikeE
25-04-2014, 11:05
I've been doing it for several years, and this year it's been the worst by far. The game could hardly be more interdependent, which makes any OPR-based predictions bunk. Also, I should note that I've generally found Most Recent OPR to be a better predictor than Max OPR.

Basel gets at the core of the issue for this year's game. The independence assumption inherent in OPR-like regressions doesn't hold for a game where a plurality of the points scored come from cooperation between alliance members.
Unfortunately, the data from matches is too sparse to model interactions between specific teams, so we would have to use more advanced techniques (e.g. clustering teams into equivalence classes), which brings an additional set of approximations.
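
To make that independence assumption concrete: OPR fits per-team contributions x by least squares on Ax ≈ s, where each row of A flags the three teams on one alliance and s holds that alliance's score; there is no term for pairs or triples of teams. A minimal Python sketch with made-up data:

import numpy as np

teams = [254, 1114, 2056, 118, 469, 67]
idx = {t: i for i, t in enumerate(teams)}

# One row per alliance appearance: (member teams, alliance score).
alliances = [([254, 118, 469], 142.0),
             ([1114, 2056, 67], 155.0),
             ([254, 2056, 118], 138.0),
             ([1114, 469, 67], 149.0),
             ([254, 1114, 469], 171.0),
             ([2056, 118, 67], 126.0)]

A = np.zeros((len(alliances), len(teams)))
s = np.array([score for _, score in alliances])
for row, (members, _) in enumerate(alliances):
    for t in members:
        A[row, idx[t]] = 1.0

# Additive model: score = sum of individual OPRs, no interaction terms.
opr, *_ = np.linalg.lstsq(A, s, rcond=None)
for t in teams:
    print(t, round(float(opr[idx[t]]), 1))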

GBilletdeaux930
25-04-2014, 20:15
Won't go into too much detail now because I'm on my phone; I'll return later. But I've successfully predicted ~78% of matches up until about 2:00 today across all fields at champs. A sizable percentage of the matches it gets wrong are close matches.

Essentially, I score each team by comparing the average score of the matches they are in against the average score of the matches they aren't in. Maybe I'll do a small white paper on it to explain further. Nonetheless, I'll be back this evening if people are curious.

Ether
25-04-2014, 20:39
Won't go into too much detail now because I'm on my phone; I'll return later. But I've successfully predicted ~78% of matches up until about 2:00 today across all fields at champs. A sizable percentage of the matches it gets wrong are close matches.

Essentially, I score each team by comparing the average score of the matches they are in against the average score of the matches they aren't in. Maybe I'll do a small white paper on it to explain further. Nonetheless, I'll be back this evening if people are curious.

Why do you suppose this should give a better result than an OPR, CCWM, or EPA calculation?

cmrnpizzo14
25-04-2014, 20:56
Why do you suppose this should give a better result than an OPR, CCWM, or EPA calculation?

What do OPR, CCWM, or EPA yield for percentages?

GBilletdeaux930
26-04-2014, 00:50
Why do you suppose this should give a better result than an OPR, CCWM, or EPA calculation?

Hi Ether,

Going to start this off with an anecdotal statement. We ran these same statistics for last season and compared them to OPR. Last season, OPR was a little better at predicting matches than what we are calling Main Effects. So I cannot claim that Main Effects always gives better results than OPR or the rest.

As I said, we calculate ME by taking the average of every match a team is in and comparing it to the average of the matches they aren't in. We do this for each piece of the match we can get from Twitter: Auton, Tele, and Foul points. That gives us three numbers that tell us how good or bad a team is compared to the average team. An ME of 0 in Auton means that a team's auton average is equal to the average auton for every alliance this season. We sum those three numbers to get a Final ME for the entire match.
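
Roughly, in Python (the alliance records are invented for illustration, and the exact averaging we use may differ slightly):

from statistics import mean

# Each record: (alliance team list, component scores for that alliance).
records = [
    ([254, 118, 469],  {"auton": 25, "tele": 90, "foul": 20}),
    ([1114, 2056, 67], {"auton": 40, "tele": 95, "foul": 0}),
    ([254, 2056, 118], {"auton": 30, "tele": 85, "foul": 10}),
    ([1114, 469, 67],  {"auton": 35, "tele": 100, "foul": 50}),
]

def main_effects(team):
    """ME per component: average with the team minus average without."""
    me = {}
    for comp in ("auton", "tele", "foul"):
        with_team = [sc[comp] for tm, sc in records if team in tm]
        without = [sc[comp] for tm, sc in records if team not in tm]
        me[comp] = mean(with_team) - mean(without)
    me["final"] = sum(me.values())  # Final ME = sum of the three components
    return me

print(main_effects(1114))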

Because of the nature of this game and how much defense and fouls come into play, we do the same thing but look at the opposing alliance's scores: when team X is on an alliance, how does the opposing alliance score? This is incredibly telling for foul points and can very easily pinpoint a robot that has been in a lot of foul-heavy matches. If a team is incredibly good at defense (or so good at offense that its presence forces the other alliance to dedicate robots to defense), its Opposing Final ME will be negative.

For match predictions, then, I tally up the red alliance's "score" as follows: I sum the three Final MEs of the teams on the red alliance with the three Opposing Final MEs of the teams on the blue alliance. I do the same for blue and use those two numbers to predict the outcome of the match.
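
As a sketch, that tally looks like this (the ME values here are invented for illustration; real ones come from the calculation above):

# Final ME and Opposing Final ME per team (invented numbers).
final_me = {254: 60, 1114: 102, 2056: 45, 118: 20, 469: 35, 67: 55}
opp_final_me = {254: -5, 1114: -2, 2056: 3, 118: 8, 469: -1, 67: -4}

def predict(red, blue):
    """Alliance score: own Final MEs plus opponents' Opposing Final MEs."""
    red_score = sum(final_me[t] for t in red) + sum(opp_final_me[t] for t in blue)
    blue_score = sum(final_me[t] for t in blue) + sum(opp_final_me[t] for t in red)
    return "red" if red_score >= blue_score else "blue"

print(predict(red=[1114, 254, 469], blue=[2056, 118, 67]))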

So just to display an example team ME spread, here is 1114 from this year:

TEAM  MATCHES  FINAL  HYBRID  TELE  FOUL  OPPOSING_FINAL  OPPOSING_HYBRID  OPPOSING_TELE  OPPOSING_FOUL
1114  38       102    19      71    13    -2              1                -3             2


1114 scores, on average, 102 points higher than the average alliance. This is largely attributable to their Tele ME of 71. We can also see that alliances playing against them score, on average, 2 points less than the average alliance.

Why do I think this has had so much success this year where OPR hasn't? I think it can largely be attributed to the effect fouls have had on the game, as well as the fact that a good alliance consists of more than one good robot this year. I personally believe that fact is what makes this game so good. Others clearly have opposing opinions, but the fact that one robot can't single-handedly carry an alliance to victory supports FIRST's mission far better than everyone sitting back while one team drags them to victory.

There are most likely many flaws in my reasoning, which I'm willing to discuss, because I honestly do not believe I have the definitive answer as to why it has been successful. But I would like to find out.

For everyone else who is paying attention: after every match today, ME has successfully predicted 465 out of 600 matches, a 77.5% accuracy at championships using regional data. This coincides with my results throughout the season, which landed between 70% and 80% accuracy at each regional after the fact. It seems that the last 20% or so is most likely too random (robots breaking) to predict.