# Paper: Miscellaneous Statistics Projects 2018

Are you referring specifically to any of my books or just speaking generally?

I’m assuming the latter here. My general match prediction algorithm is a raw average of the predicted contribution win probability and the Elo win probability. Neither of these methods “averages scores” the way you seem to be describing. Both of them incorporate only raw match scores, though, not any scouting data or detailed score breakdowns. I’m looking to make a second, more advanced Elo model soon that incorporates some aspects of the published score breakdowns, which will hopefully be noticeably more predictive than my current Elo model.
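
A minimal sketch of that raw average, for illustration only. The Elo function uses the textbook logistic form with a scale of 400, which is an assumption and may not match the scale used in the actual model. The predicted contribution probability is taken as an input, since its calculation is not described here. Both function names are hypothetical.

```python
def elo_win_probability(red_elo: float, blue_elo: float, scale: float = 400.0) -> float:
    """Textbook logistic Elo expectation for the red alliance.

    The scale of 400 is an assumed default, not a value taken from the thread.
    """
    return 1.0 / (1.0 + 10.0 ** ((blue_elo - red_elo) / scale))


def combined_win_probability(pc_prob: float, elo_prob: float) -> float:
    """Raw (unweighted) average of the two model probabilities."""
    return (pc_prob + elo_prob) / 2.0
```

Averaging the two probabilities directly keeps the combination parameter-free; a weighted average would need a weight tuned on past matches.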

Not sure I answered your question though, so let me know if you were asking something else.

I’ve uploaded a new book called “CC Ranking Comparison.xlsx”. I’m trying something new, though: instead of making a long post here, for my next few projects I’m going to explain them on a site I just made called https://frcstats.blog/. We’ll see how this goes. Feel free to give me feedback, either here or there, on which format you prefer or what I could do differently in either medium to improve my content.

Thanks for the shout out in the blog post! I’m glad my predictions were actually helpful for something other than placating my boredom.

I’m looking forward to seeing where this takes you next, and please let me know if there’s anything else I can do to help.

I’ve got another post up, this one regarding finding the best way to “measure” strength of schedule. I’ve decided to use a metric I’m calling “average rank difference” moving forward, which is found by:
Averaging the sum of all opponent ranks and subtracting out all partner ranks, and then also subtracting ((# of teams + 1)/2) and then dividing by the number of teams.
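
One possible reading of that description, sketched in code. It assumes the per-match quantity is (sum of opponent ranks) minus (sum of partner ranks), averaged over a team's matches, before the offset and the division. The function name and input layout are hypothetical.

```python
def average_rank_difference(matches, num_teams):
    """One reading of the "average rank difference" metric.

    matches: list of (partner_ranks, opponent_ranks) pairs for a single team,
             e.g. two partner ranks and three opponent ranks per match.
    num_teams: number of teams at the event.
    """
    # Per-match difference: opponent ranks summed, partner ranks subtracted out.
    per_match = [sum(opponents) - sum(partners) for partners, opponents in matches]
    mean_diff = sum(per_match) / len(per_match)
    # Subtract the average rank, then normalize by event size.
    return (mean_diff - (num_teams + 1) / 2) / num_teams
```

Under this reading, with two partners and three opponents who all hold the average rank of (# of teams + 1)/2, the per-match difference equals that average rank, so the metric comes out to 0.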

Another post is up talking about schedule strengths from 2005-2018.

All teams’ average schedule strengths (higher = harder, 0 = neutral):

```
team	average schedule strength
```

Best schedules: 1031, 1988, 2236, 616, 3742
Worst schedules: 4323, 3326, 4621, 5296, 4812

I decided to take another look at the hypothetical “serpentine valley” I investigated last year. Back then, I was more interested in whether teams going into an event had an incentive to perform worse than their ability. I found that if there was a so-called serpentine valley, it was very small and centered around rank 10.

Here, I did a similar investigation using end of event rank instead of start of event Elo. A serpentine valley here might indicate that teams in matches near the end of the quals might be incentivized to get fewer RPs if they want to maximize their chances of winning the event. Here is a book with that data as well as a summary sheet: serpentine_valley_v2.xlsx (1.4 MB)

I pulled rankings and results from all events since 2008, which gave me a sample size of about a thousand events. I counted how many event wins and how many wildcards were achieved from each ranking position; dividing those counts by the number of events where a team got this rank gives a win probability and a win/wildcard probability for each rank. Below are graphs summarizing that data:

These graphs actually look remarkably similar to the pre-event Elo graphs from my earlier work. The “serpentine valley”, if it exists at all, is centered around rank 8 or 9. For reference, the gap between rank 8 (bottom of the valley) and rank 10 (top of the valley) is 1.1 percentage points, jumping from 3.8% to 4.9%, or about 12 out of 1000 events. This is much smaller than I would have expected, and that is comparing the lowest point against the highest point. Comparing rank 7 or rank 9 to rank 10 gives a nearly identical gap. The “valley” that we are seeing could feasibly be largely noise, as there are larger “valleys” at ranks 15 and 23, and I have no reasonable explanation for why there would be dips around those ranks.
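
The per-rank win probabilities discussed above could be computed along these lines. This is a sketch under assumptions: the input layout (one dict per event with `num_teams` and `winner_ranks`) and the function name are hypothetical, not taken from the book.

```python
from collections import Counter


def win_probability_by_rank(events):
    """Fraction of events won from each final qualification rank.

    events: iterable of dicts with
        "num_teams":    number of ranked teams at the event
        "winner_ranks": final qual ranks of the teams on the winning alliance
    """
    appearances = Counter()  # events in which some team finished at this rank
    wins = Counter()         # event wins achieved from this rank
    for event in events:
        for rank in range(1, event["num_teams"] + 1):
            appearances[rank] += 1
        for rank in event["winner_ranks"]:
            wins[rank] += 1
    return {rank: wins[rank] / appearances[rank] for rank in appearances}
```

Dividing by appearances rather than by the total number of events matters for high rank numbers, since small events never have a team at those ranks.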

My takeaway is that this methodology really doesn’t provide evidence that any reasonable number of teams are incentivized to throw matches. That doesn’t mean those incentives don’t exist, just that you’d really have to dive much deeper into the data to prove that they actually do. Maybe someday when I add alliance selections into my event simulator I’ll revisit.

Fun fact: the lowest any team has ever ranked and still won the event was 62nd, which happened twice: once at the 2015 LS Regional and once at the 2010 Michigan State Champs.

Caleb, since you have the data… Instead of win probability, what about probability of being on X alliance? I’m wondering if that would show any difference.

Here’s an updated book that includes alliance data. I limited the data set to 2014+ since earlier data gets iffy:
serpentine_valley_v3.xlsx (1.0 MB)

Probability of being on each alliance based on rank:

Playoff participation rate vs rank:

Average playoff alliance vs rank:

So roughly rank 11 gives the worst average alliance assuming that you play in playoffs. I clipped the graphs at about rank 30 since the averages start to go crazy around there due to smaller sample sizes, but people can play around with the data as they want.
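
A rough sketch of how the participation rate and the conditional average alliance could be tallied per rank. The record layout (one `(rank, alliance)` pair per team per event, with `None` for teams that missed playoffs) and the function name are assumptions for illustration.

```python
from collections import defaultdict


def playoff_summary_by_rank(records):
    """Playoff participation rate and average alliance number per rank.

    records: iterable of (rank, alliance) pairs, where alliance is the
             alliance number (1-8) or None if the team did not play in playoffs.
    """
    totals = defaultdict(int)       # teams seen at each rank
    alliances = defaultdict(list)   # alliance numbers of playoff teams at each rank
    for rank, alliance in records:
        totals[rank] += 1
        if alliance is not None:
            alliances[rank].append(alliance)

    summary = {}
    for rank, total in totals.items():
        played = alliances[rank]
        summary[rank] = {
            "participation_rate": len(played) / total,
            # Average is conditional on playing in playoffs; None if nobody did.
            "average_alliance": sum(played) / len(played) if played else None,
        }
    return summary
```

Because the average is conditional on participation, ranks with few playoff teams rest on very small samples, which is consistent with the averages becoming erratic past about rank 30.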

One more fun graph, this shows the most common alliance a team at each rank plays on, again assuming that they participate in playoffs:

It’s probably all noise at ranks 20ish+; lower ranks are more interesting.


And here’s a fun gif of the data:

Look at that nice wave

I think I see a little ripple bounce back starting at around rank 21, which I expect is the effect of the serpentine draft. Difficult to say for certain though.


Man, this is such a fun data set. Just when I think I’m done, I keep finding other cool ways to break down the data.

I tried here to take the correlation between alliance probabilities for teams at adjacent ranks. However, I shifted the data in the lower rank forward one alliance (to simulate the forward draft direction) as well as backward one alliance (to simulate the reverse, aka “serpentine”, draft direction). The correlations between these can give us a sense of the relative impacts the “forward” and “reverse” draft directions have on teams at each rank. Here is a graph showing these results:

As we would expect, the serpentine direction has a negligible effect on which alliance teams ranked 1-9 end up on. However, the effect of the serpentine draft jumps quickly at rank 10, to slightly less than that of the forward direction. This holds until about rank 15, at which point both directions have roughly equal impact until about rank 19. Teams ranked 19-22 are slightly more impacted by the serpentine draft direction than by the forward draft direction.
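
The shifted correlations behind this comparison might look something like the sketch below. It assumes each rank has a list of eight alliance probabilities (index 0 = alliance 1) and that "shifting" means offsetting one list by one alliance and dropping the unmatched end rather than wrapping around. Both the shift convention and the function names are assumptions, not the exact procedure used for the graph.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def shifted_correlations(probs_rank, probs_next_rank):
    """Return (forward, reverse) correlations for two adjacent ranks.

    probs_rank:      alliance probabilities for rank r
    probs_next_rank: alliance probabilities for rank r + 1
    """
    # Forward draft: rank r + 1 tends to land one alliance later than rank r,
    # so alliance i at rank r is paired with alliance i + 1 at rank r + 1.
    forward = pearson(probs_rank[:-1], probs_next_rank[1:])
    # Reverse (serpentine) draft: rank r + 1 tends to land one alliance earlier,
    # so alliance i + 1 at rank r is paired with alliance i at rank r + 1.
    reverse = pearson(probs_rank[1:], probs_next_rank[:-1])
    return forward, reverse
```

A higher forward value than reverse value at a given rank would suggest the first-round draft direction explains more of where those teams end up, and vice versa.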

I had to clip the data around here because it gets really noisy starting at rank 23. I’m not sure if there is any real practical use for this data, but it is fascinating to see direct impacts of the serpentine draft represented in data. The relative importance of the first and second draft rounds for a given team was such a nebulous idea to me prior to doing this; I just kind of stumbled into what I think is a reasonable way to quantify it.

One potential implication of this is that if you are ranked ~17 or better, it might be best to sell yourself to the higher seeded teams as a first round pick, and if you are ranked worse than this, your time might be better spent trying to sell yourself as a second round pick.


Well, I’m not sure exactly how best to approach this, but here’s an Excel book that I will expand upon in a blog post tomorrow. I’ll post a link to the blog post after it comes out. The data may look nonsensical to you until then.

FRC_dynasties.xlsx (9.2 MB)

Here is the blog post where I describe what the above book is all about: https://blog.thebluealliance.com/2019/01/23/1114-is-frcs-greatest-dynasty/?fbclid=IwAR1cZCbl1c-KCT-REk3QW7wSZWHrk3graLk-xbJAVdHFppbSV04FskLk-ks

During the discussion of this, I was asked if I could get a team’s average Elo over their own lifetime. I decided to use end of season Elo since I think that’s fairer for rookies. Here is a book that has all teams:

And here is a table with the top 25 teams:

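| Team | Lifetime average Elo |
|------|----------------------|
| 2056 | 1935 |
| 1114 | 1899 |
| 254 | 1873 |
| 987 | 1802 |
| 67 | 1800 |
| 1717 | 1795 |
| 330 | 1785 |
| 469 | 1783 |
| 118 | 1780 |
| 1986 | 1761 |
| 233 | 1758 |
| 33 | 1747 |
| 25 | 1745 |
| 148 | 1741 |
| 1678 | 1737 |
| 5172 | 1737 |
| 4488 | 1731 |
| 3476 | 1729 |
| 16 | 1727 |
| 27 | 1725 |
| 111 | 1722 |
| 2122 | 1718 |
| 2481 | 1718 |
| 5406 | 1715 |
| 2590 | 1712 |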
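
A small sketch of how a lifetime average like the one tabulated above could be computed from end of season ratings. The input layout (team number mapped to a dict of year to end of season Elo) and the function names are hypothetical, and every season a team competed in is weighted equally here, which is an assumption.

```python
def lifetime_average_elo(end_of_season_elos):
    """Average each team's end of season Elo over the seasons it competed in.

    end_of_season_elos: dict mapping team number -> {year: end of season Elo}
    """
    return {
        team: sum(by_year.values()) / len(by_year)
        for team, by_year in end_of_season_elos.items()
        if by_year  # skip teams with no recorded seasons
    }


def top_teams(averages, n=25):
    """Return the n (team, average) pairs with the highest average, best first."""
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]
```

Averaging only over seasons a team actually played means a team with a few strong seasons can rank alongside long-running teams, which fits the presence of newer team numbers in the table.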

Here are my updated 2019 Start of Season Elo ratings based on the changes I described in my FRC Elo thread:
2019 SOS Elos.xlsx (171.8 KB)

And here are the new top 25 as well:

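| Team | Start of Season Elo |
|------|---------------------|
| 254 | 1904 |
| 2056 | 1872 |
| 1678 | 1866 |
| 195 | 1860 |
| 148 | 1857 |
| 2481 | 1845 |
| 225 | 1832 |
| 118 | 1829 |
| 2590 | 1822 |
| 610 | 1822 |
| 1114 | 1820 |
| 2046 | 1815 |
| 694 | 1805 |
| 2767 | 1797 |
| 3357 | 1792 |
| 3310 | 1790 |
| 1519 | 1790 |
| 3309 | 1790 |
| 494 | 1788 |
| 1538 | 1788 |
| 1241 | 1786 |
| 217 | 1786 |
| 624 | 1786 |
| 230 | 1785 |
| 1619 | 1782 |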

Also, does anyone know how to make scrolling lists? I could post all team ratings here but I don’t want to take up the whole screen.


Here is the zipped folder of generated balanced schedules I mentioned in my TBA Blog post:
balanced schedules.zip (963.7 KB)

I’m super curious about running the strength of schedule metrics you used here: https://blog.thebluealliance.com/2019/02/12/schedule-strengths-2-of-3-historical-schedule-strengths/

Is there any way I could run this myself right after a schedule is released?

I don’t have a simple way to run these right now; maybe sometime I could put it into my simulator. You can, however, run my event simulator and look at how each team’s ranking probabilities change before and after the schedule is released. The “average rank change” resulting from this is one of the metrics I evaluated in part 1 of that series.