I’m currently working on analyzing the awesome timeseries data from TBA. I’ll have plenty more to come, but I’ve reached a point where I have some really sweet graphs, so I thought I’d share them and describe a rough outline of my live model at the same time.
I am currently analyzing the ~1500 matches that have the best timeseries data. It’s possible that I’ll go back later and clean up the messier data, but I wanted to focus my early analysis on data I could have high trust in. What I’m currently working on is a way to predict the match winner in real-time based on this data. Here is a Brier score graph of my current model:
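For anyone unfamiliar with Brier scores: they are just the mean squared error between predicted win probabilities and actual outcomes, so lower is better. A minimal sketch (the match probabilities here are made-up illustration values, not from my model):

```python
def brier_score(predictions, outcomes):
    """Mean squared error between predicted win probabilities and
    actual outcomes (1 if the predicted alliance won, 0 otherwise)."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Hypothetical example: three matches, red predicted at 80%/60%/30%,
# and red actually won the first two.
score = brier_score([0.8, 0.6, 0.3], [1, 1, 0])
print(score)  # (0.04 + 0.16 + 0.09) / 3 ≈ 0.0967
```

A perfect predictor scores 0, and always guessing 50% scores 0.25, which is a handy baseline when reading the graph.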
The first 5 seconds of the match just use my pre-match Elo win probability, but from then on, I begin incorporating the real-time scoring (in conjunction with the pre-match Elo prediction) to create win probabilities. The Brier score is basically steady for the first 5 seconds (when I’m not incorporating match data), but also from ~19 to ~23 seconds, probably because, by this point, teams have scored their first set of cubes and are picking up their second. Also, note that even at t = 150 seconds, the Brier score is not zero, because the actual final score can differ from the last score shown on the screen.
I mentioned that I incorporate Elo ratings into the predictions; here is a graph showing how much weight I give to Elo versus live match data at each second:
This graph and the ones that follow were created by tuning my prediction model, so the values shown are the most predictive ones I found. After the first 5 seconds, the importance of Elo drops sharply down to ~65% by the end of auto, where it holds roughly steady for the same 19-23 second interval described above. This makes sense: if there isn’t much scoring, we wouldn’t expect the live scoring to increase in importance much. After that, the Elo weight decays roughly exponentially down to 0.
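The post doesn’t spell out exactly how the Elo probability and the live-score probability get combined, so treat this as a sketch: one simple way to apply a weight-versus-time curve like the one above is a linear blend, where `elo_weight` would come from the tuned curve at the current second.

```python
def blended_win_prob(elo_prob, live_prob, elo_weight):
    """Combine the pre-match Elo win probability with the probability
    implied by the live score, weighted by the tuned Elo weight for
    this second of the match. A linear blend is an assumption here,
    not necessarily the exact combination the model uses."""
    return elo_weight * elo_prob + (1 - elo_weight) * live_prob

# e.g. at the end of auto, with an Elo weight of ~0.65:
p = blended_win_prob(0.70, 0.90, 0.65)
print(p)  # 0.65*0.70 + 0.35*0.90 = 0.77
```

At t < 5 seconds the weight is 1.0 (pure Elo), and by the end of the match it has decayed to 0 (pure live data), matching the curve described above.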
The general form of my model (excluding Elo) is: red win probability = 1 - 1/(1 + 10^((current red winning margin)/scale)), where “scale” is how much of a lead red would need in order to have a 10/11 ≈ 91% chance of winning at that point in the match. Let’s call this “scale” the “big lead” amount, so as not to confuse it with the scale on the field. If a team is up by 40 at a point in the match where the “big lead” value is 40, that team has a 91% chance of winning, but if they are up by 80 (two big leads), that team has a 99% chance of winning. Obviously, what is considered a big lead will vary over the match, so here is a graph showing that change over time:
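The formula above translates directly into code; here it is with the 40-point “big lead” examples worked out:

```python
def red_win_probability(margin, big_lead):
    """Win probability from the logistic form above:
    P(red win) = 1 - 1/(1 + 10**(margin / big_lead)).
    'big_lead' is the lead that gives a 10/11 chance of winning."""
    return 1 - 1 / (1 + 10 ** (margin / big_lead))

print(red_win_probability(0, 40))   # 0.5  (tied game is a coin flip)
print(red_win_probability(40, 40))  # 10/11 ≈ 0.909 (one big lead)
print(red_win_probability(80, 40))  # 100/101 ≈ 0.990 (two big leads)
```

Note the symmetry: a negative margin gives blue the mirrored probability, since 10^(-m/s) flips the odds.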
I excluded the first 5 seconds, where the values are indeterminate because I don’t incorporate match data yet. The next few seconds of auto are also a bit weird, probably because not much happens at this point in most matches, and even in the matches where things do happen, a “big lead” of 30+ points is not very intuitive, since there is no way a team could even have this much of a lead this early (excluding penalties). By the end of auto, though, we see the big lead value settle at around 20, which sounds about right: teams up by 20 after auto are probably feeling pretty good, since they likely have control of both the switch and the scale. After auto, what is considered a big lead increases steadily until peaking at around 60 points at around 60 seconds in. This seems to make sense, because a team up by 20 after auto should be up by 60 forty seconds later if they control the scale the whole time and nothing else changes. After this, the “big lead” holds steady until 110 seconds, when it sharply drops and then recovers at ~122 seconds. I don’t know the explanation for this, but my gut tells me it has to do with climbing positioning. After that, the “big lead” drops until ending at 29 points. This means that if your alliance is up by 30 points on the screen at the end of the match, you are about 90% likely to win in the final score, and if you are up by 60 points, about 99% likely.
I mentioned that the form of my model uses the red winning margin, but that’s not precisely true. In fact, I use an adjusted red winning margin that accounts for ownership of the scale and of the switch. Basically, I found how much “value” to give to switch and scale ownership at each point in the match. What I mean by “value” is this: if red is down by X points but controls the scale, what is the value of X such that red and blue have an equal chance of winning the match? Here is a graph of value versus time for the scale:
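One way to read that definition of “value”: ownership gets folded into the margin before the win-probability formula is applied, so that being down by exactly the scale’s value while owning the scale comes out to a 50/50 game. A sketch, assuming a simple additive adjustment and a +1/-1/0 ownership convention (the exact mechanics are my assumption, not spelled out above):

```python
def adjusted_margin(raw_margin, scale_owner, switch_owner,
                    scale_value, switch_value):
    """Adjusted red winning margin. Owners are +1 if red owns the
    element, -1 if blue does, 0 if neither; the values come from the
    tuned value-versus-time curves. Additive form is an assumption."""
    return raw_margin + scale_owner * scale_value + switch_owner * switch_value

# Red down by 30 raw, but owning a scale currently worth 30 points:
# adjusted margin is 0, i.e. a 50/50 match, matching the definition above.
print(adjusted_margin(-30, +1, 0, 30, 15))  # 0
```

This adjusted margin would then be fed into the win-probability formula in place of the raw score difference.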
Again, skipping the first 10 seconds of auto, we see scale ownership to be worth ~30 points after auto. It then drops to a minimum of 25 points at 23 seconds; this drop might be due to the initial scuffle for the scale. Scale value then peaks at 49 points by 45 seconds, followed by a jittery drop through the end of the match. The same dip seen in “big lead” also appears in scale value at around 120 seconds. Interestingly, scale value does not go to 0 at the end of the match, but rather ends at 8 points. Perhaps scale ownership provides some indication of climb success?
Here is a similar graph for switch value:
Most of the same trends as in the previous graph also appear in this one. The biggest difference, though, is that switch value actually does go to 0 by the end of the match.
Let me know if you have any questions. I’ll have more to come soon, including win probability graphs and match “excitement” and “comeback/upset” scores.