Introducing Project Nautilus, a match prediction Twitter bot from Team 1410

Team 1410 is proud to finally announce Project Nautilus, a Twitter bot that predicts the outcomes of FRC matches.

Nautilus uses three different methods to predict match outcomes. The first is a modified Elo system based on Caleb Sykes’ system (paper: FRC Elo 2008-2016 - CD-Media - Chief Delphi). The second is a new match prediction method, developed over the past several months, that we call MAR. MAR was inspired by David Burge’s Marble game, an effort to create a better ranking system for college football.
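
For anyone who hasn’t seen the formula, here is a minimal Python sketch of the standard Elo machinery that systems like ours build on; the function names and constants here are illustrative, not our actual code:

```python
def elo_win_probability(red_rating: float, blue_rating: float,
                        scale: float = 400.0) -> float:
    """Classic Elo expected score: the probability that red beats blue."""
    return 1.0 / (1.0 + 10 ** ((blue_rating - red_rating) / scale))


def elo_update(rating: float, expected: float, actual: float,
               k: float = 32.0) -> float:
    """Nudge a rating toward the observed result (actual = 1 for a win,
    0.5 for a tie, 0 for a loss); K controls how fast ratings move."""
    return rating + k * (actual - expected)
```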

In 2018, our modified ELO had a 77.6% accuracy rate in predicting matches and MAR had a 70.9% accuracy rate.

Our third match prediction method is something that we call EMZ. EMZ is a combination of ELO, MAR, and some special sauce. EMZ had an accuracy rate of 72.6% last year, and it tends to have a better Brier score than ELO or MAR. For a brief explanation of Brier scores, see Brier Score: Definition, Examples - Statistics How To. All of our match prediction methods “run hot,” in the sense that the predictions for match 72 are affected by the matches played before it.
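
For the code-inclined, the Brier score is just the mean squared error between predicted win probabilities and observed outcomes. A small sketch with made-up numbers:

```python
def brier_score(probs, outcomes):
    """probs: predicted probability that red wins each match.
    outcomes: 1 if red actually won, 0 if blue won.
    Lower is better; always guessing 50/50 scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Hypothetical example: red favored in all three matches, but loses one.
print(brier_score([0.70, 0.55, 0.90], [1, 0, 1]))  # ~0.134
```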

We ran Project Nautilus’ prediction systems for all qualification matches during week-one and week-two events. Across those events, our modified ELO predicted matches with 68.50% accuracy, MAR with 66.64%, and EMZ with 67.5%. If you decide that any match with less than 65% prediction confidence is too close to call and discount those predictions, the accuracy of ELO jumps to 75.57%, MAR to 71.55%, and EMZ to 78.32%.
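
The “too close to call” filter is simple to express in code; a sketch, assuming each prediction is stored as a (confidence, was_correct) pair:

```python
def accuracy_above(predictions, threshold=0.65):
    """predictions: (confidence, was_correct) pairs, where confidence is
    the win probability assigned to the predicted winner. Predictions
    below the threshold are treated as too close to call."""
    kept = [correct for confidence, correct in predictions
            if confidence >= threshold]
    return sum(kept) / len(kept) if kept else None
```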

Project Nautilus will predict the qualification matches of the San Francisco Regional this week, and if everything goes well, will also predict the Colorado Regional during week four. We will continue to predict matches throughout the season, but have not finalized a list of events we will be predicting. To see our predictions, follow Project Nautilus on Twitter: @Nautilus_1410

As we are launching this publicly for the first time this week, something might go wrong and we may have to temporarily shut down Project Nautilus; if we do, we will announce it on the @Nautilus_1410 account.

You can find Caleb Sykes’ Elo model, which we used as a starting point for our ELO system, here: paper: FRC Elo 2008-2016 - CD-Media - Chief Delphi

An explanation of the Marble game for college football that inspired MAR can be found here: The Marble Game in R - @alspur.

Special thanks to Lori V. from team 5414 for making Project Nautilus’ profile photo.

https://twitter.com/Nautilus_1410

11 Likes

Sounds awesome! Looking forward to watching this work this weekend.

2 Likes

Can you go into more detail on what modifications you made to my Elo model? I don’t mind at all that you did, but if you actually found a change that appreciably improves predictive power, then I’d like to know what it is so I can investigate related ideas to improve mine.

3 Likes

The biggest change we made was modifying some of the constants in the Elo formula.

2 Likes

This is super cool.

Could you be more specific? Do you mean like K-values?

1 Like

The first day of tweeting match predictions has gone pretty well; tomorrow we’ll keep predicting the qualification matches for the San Francisco Regional. Here’s today’s accuracy! There were two ties today, which are not included in the accuracy scores below.

ELO: 69.8%
MAR: 67.9%
EMZ: 67.9%

The accuracy for all predictions with more than 65% confidence:

ELO: 78.3%
MAR: 87.5%
EMZ: 85%

3 Likes

We’ve decided to set the K value at 8, and it seems to be working pretty well there. The other value that we have really changed is the 400 in the Elo formula: it’s no longer a static value, and is instead influenced by the distribution of Elos at the event being predicted.
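
As a sketch of the idea (not our exact formula; the nominal spread constant below is made up purely for illustration), one way to make the 400 event-dependent is to stretch it with the spread of Elos at the event:

```python
import statistics


def event_scale(event_elos, base=400.0, nominal_spread=150.0):
    """Hypothetical event-dependent replacement for the fixed 400 in the
    Elo expected-score formula: widen the logistic curve when this
    event's ratings are unusually spread out, and narrow it when they
    are tightly bunched. nominal_spread is invented for illustration."""
    return base * statistics.stdev(event_elos) / nominal_spread
```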

1 Like

At the end of qualification matches, here are our accuracy numbers:
ELO: 69.4%
MAR: 65.2%
EMZ: 68.1%

For matches above 65% confidence:
ELO: 75%
MAR: 85%
EMZ: 85%

2 Likes

I’m not on Twitter so I’m not following this, but this looks very interesting.

Do you have probability predictions by predicted victory margin? Do you have a threshold where victory is certain for an alliance?

2 Likes

Neat! I’m always interested in prediction software. Just as a fun reference point, our top scout this weekend successfully predicted the winning alliance in 68 out of 76 qual matches (89.47%).

If you’re going to be modifying Caleb’s Elo model, I’d suggest looking into Eugene’s as well - the two together are a winning combination :wink:

1 Like

I think you are asking if we have probability predictions for specific winning margins, like what percent chance there is of a winning margin of 12 points. If that’s what you are asking, no, we don’t have that. If you are asking whether we have accuracy percentages for predictions above a certain confidence, then yes, we can calculate that and generally do after each week.

We don’t. I’m sure we could find a point at which we predict 100% correctly, but that would cover only a couple of matches a week. Also, no matter how one-sided a match is, there is always a chance that the alliance predicted to win loses, due to fouls or the slim chance that all three robots on the favored alliance don’t move the entire match.

2 Likes

These are generally the changes we made.

1 Like

Thanks, we will certainly look at Eugene’s model.

Also, normally at events that we attend we have an “assisted” version of our predictions that uses pit, match, and super scouting data to be more accurate.

1 Like

The matches this year may be close enough in general that this threshold is very high relative to potential scores, but last year, when we built a simple match prediction model, it predicted match outcomes 100% correctly above some point, which I believe was about a 100-point predicted difference.

1 Like

Interesting - it sounds like your prediction model predicts the winning margin of a match. Our models currently only calculate the percent chance of each alliance winning the match. Winning margin predictions are certainly something that we will look into.
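
As we look into it, one connection worth noting: if a margin model’s errors are roughly normal, a win probability falls out of the normal CDF. A sketch, where margin_sd (the model’s typical margin error) is a stand-in that neither model in this thread actually defines:

```python
from math import erf, sqrt


def win_prob_from_margin(predicted_margin: float, margin_sd: float) -> float:
    """Probability that red wins, given a predicted score margin
    (red minus blue), assuming the true margin is normally distributed
    around the prediction. margin_sd is a hypothetical stand-in for the
    model's typical margin error."""
    return 0.5 * (1.0 + erf(predicted_margin / (margin_sd * sqrt(2.0))))
```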

1 Like

Here’s our accuracy for the Colorado Regional qualification matches:
ELO: 73.6%
MAR: 68.9%
EMZ: 72.4%

For matches with greater than 65% confidence:
ELO: 87.7%
MAR: 85.3%
EMZ: 93.2%

4 Likes

I doubt it, but maybe. 100% is a pretty high bar.

To get from 99% to 100%, my guess is that you would need at a minimum a complete model of physics, and even then there might still be some quantum uncertainty stuff that you’d never be able to model better than probabilistically. For what it’s worth, I’ve predicted things with 99+% confidence that were wrong, but that’s part of the job when you predict thousands of things. I “predicted” (after the fact) red at extraordinarily high confidence for all of these and blue pulled it out:

Year | Event | Match Type | Set Number | Match Number
--- | --- | --- | --- | ---
2011 | Finger Lakes Regional | qf | 1 | 1
2011 | Pittsburgh Regional | qm | 1 | 47
2014 | Windsor Essex Great Lakes Regional | sf | 1 | 1

Of course, if your sample size is small you can hit 100% in sample (I actually know a way to flip a coin so that it always comes up heads, and I just did it once).

3 Likes

Coming back to this, I now believe that you would need to reduce the % confidence to the point at which you predict only a single match, and predict it correctly. For any meaningful discussion, that point doesn’t exist and really shouldn’t even be thought about. To put your quote of me back into context:

2 Likes