paper: District Points Analysis

Thread created automatically to discuss a document in CD-Media.

District Points Analysis
by: AriMB

Why does that guy keep posting about District Points in threads having nothing to do with DCMP qualification? This is why.

The District Points model is much more useful than just for deciding which district teams got to DCMP and CMP. It can also be used for post-performance analysis, predictions, and other things. You can find the code referenced in this document, with further documentation and constant updates on GitHub here.

District Points Analysis.pdf (104 KB)


Very well written and interesting paper. I was wondering how you deal with inter-season predictions for teams with 2 or 3 years of experience, where they have multiple seasons but can’t satisfy the 4-section method that you outlined in the document?

At this point, I just use an Adjusted DP of zero for any year a team did not participate. Since the Inter-Season Score weights drop off steeply after the most recent season, a second-year team would still have 68% of their score from their one year of competition. The other 32%, which is all zeros, is a ranking “punishment” for not having competed in multiple seasons, meaning they probably still have a lot to learn. Once they’ve competed for two years, they’re only missing 14%; by four years it’s down to 5%. If you perform well at your competitions, it’s not hard to overcome the “lost” points after the second year, but the weighting gives a small bonus to teams that have been performing well consistently for years.
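For anyone who wants to see the zero-fill idea concretely, here is a minimal sketch. The weight values are my own illustrative assumptions, chosen only to roughly match the percentages quoted above (68% on the most recent season); the actual constants are in the GitHub repo.

```python
# Illustrative sketch of the zero-fill approach. The weights are
# assumptions matching the percentages quoted above, not the real
# constants (those live in the GitHub repo).
SEASON_WEIGHTS = [0.68, 0.18, 0.09, 0.05]  # most recent season first

def inter_season_score(adjusted_dps):
    """adjusted_dps: per-season Adjusted DP, most recent season first.
    Seasons a team did not compete in count as zero."""
    padded = list(adjusted_dps) + [0.0] * (len(SEASON_WEIGHTS) - len(adjusted_dps))
    return sum(w * dp for w, dp in zip(SEASON_WEIGHTS, padded))

# A second-year team keeps only the most recent season's 68% share:
print(inter_season_score([80.0]))              # 0.68 * 80 = 54.4
print(inter_season_score([80.0, 75.0, 70.0]))  # veteran with three seasons
```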

If you have a better idea, I’m all ears.

Interesting paper, although since we’re part of a district, it seems more obvious to me.

I would like to see more supporting evidence in the write-up. Specifically, you should present some data to justify the numbers for your Inter-Season Score. Also, some evidence of the predictive power would be nice.

I don’t necessarily have a better idea, but it may be worth looking into a system where you just count their previous season without a “penalty”. I can think of lots of cases where it would make more sense to just use the previous year without a penalty (e.g., the 2056 and 5406 rookie seasons). Though this may just be me thinking of outliers with abnormally good rookie seasons and unprecedented sustainability. I would be interested to see whether the “penalties” in your current system add to the accuracy across all teams or whether they skew the results towards older teams.

I do have some data showing the predictive power of the District Ranking model, but I chose not to put it in the document because hopefully the predictive power will improve as I fine-tune the model. Early in development I did a small study of CMP performance (measured by DP) as a function of pre-CMP performance (measured by Adjusted DP) using 2018 data. I found a coefficient of determination (R²) of about 0.40 between the two. That allowed me to predict 72% of CMP matches correctly using my MatchPredictor, described in the GitHub readme. I do plan on taking a closer look at how powerful this model truly is, and I will release those results when I have them.
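For a rough sense of what a DP-based predictor does (this is just the concept, not the actual MatchPredictor; see the GitHub readme for that): sum each alliance’s Adjusted DP and take the higher total. The team scores in the example are made up.

```python
# Conceptual sketch only, not the actual MatchPredictor from the repo.
# Pick the alliance with the higher total Adjusted DP.

def predict_winner(red_dps, blue_dps):
    """red_dps/blue_dps: Adjusted DP for each team on the alliance."""
    red, blue = sum(red_dps), sum(blue_dps)
    if red == blue:
        return "tie"
    return "red" if red > blue else "blue"

# Hypothetical alliance scores:
print(predict_winner([62.1, 48.5, 30.0], [55.7, 44.2, 39.9]))  # red
```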

I got the scaling coefficients by predicting the results for 2016, 2017, and 2018 using the previous seasons, finding the coefficients that give the lowest RMSE (root mean square error) while still summing to 1, and averaging the coefficients across those years. The “penalty” that I described is just my way of explaining what the coefficient-finding process produced, not an inherent requirement of the model. In theory, if a team’s performance this season depended only on the previous season, as you suggest, the coefficients I found would reflect that and weight the previous season even higher than they currently do.
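One way to picture that search is a brute-force sweep over weight vectors that sum to 1, scored by RMSE against the actual season. The data shapes, step size, and four-season horizon in this sketch are all assumptions, not the repo’s actual fitting code.

```python
# Hedged sketch of the coefficient search described above: enumerate
# weight vectors summing to 1, score each by RMSE, keep the best.
import itertools
import math

def rmse(predicted, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def best_weights(history, actual, n_seasons=4, step=0.05):
    """history: per-team tuples of past Adjusted DP, most recent first.
    actual: the same teams' actual Adjusted DP for the target season."""
    best_err, best_w = float("inf"), None
    steps = round(1 / step)
    for combo in itertools.product(range(steps + 1), repeat=n_seasons - 1):
        if sum(combo) > steps:
            continue
        w = [c * step for c in combo]
        w.append(1 - sum(w))  # constrain the weights to sum to 1
        preds = [sum(wi * dp for wi, dp in zip(w, past)) for past in history]
        err = rmse(preds, actual)
        if err < best_err:
            best_err, best_w = err, w
    return best_w, best_err
```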

Just for fun, I tried checking the RMSEs using just teams’ previous-season scores as the prediction. All three RMSEs were significantly higher than those from the exponentially decreasing weighting I described in the report.

OK, thanks for the response! Very interesting stuff.

Just to clarify, I was only suggesting using just one year’s data in the case of second year teams.

Ah ok, I misunderstood your suggestion. I tried testing this too, and it didn’t seem to make a big difference. Two of the RMSEs were 0.01 lower (better), and one was 0.1 higher (worse). For a sense of scale, the original RMSEs were 16.00, 16.93, and 16.69 (units are Adjusted District Points). So those small differences aren’t statistically significant compared to the differences between years.

Cool stuff, I’m looking forward to competing with you next season to see who can make better predictions. :wink:

Does anyone have a whitepaper/video describing exactly how the district point system was developed? I feel like I remember a video where Jim Zondag talked about how they developed it for FiM back in 2008 but I have no idea where that would be or if better resources exist on this.

I look forward to losing to you :grin:

Sorry my response took so long because of the update, but thanks for communicating this. Once again, very interesting stuff. Best of luck against Caleb… you’ll need it!

For anyone who’s interested, I launched my new website last night, which features a continually recalculated district points ranking. It should update every night at 5am my time (10pm Eastern). You can also look at end-of-season rankings from years past.
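If anyone wants to build a similar nightly refresh, it could be as simple as a scheduled job. This sketch uses the third-party `schedule` package; the function name and body are placeholders, not the site’s actual code.

```python
# Hypothetical nightly recalculation job using the third-party
# `schedule` package (pip install schedule). Placeholder code only.
import time
import schedule

def recalculate_rankings():
    ...  # recompute Adjusted DP rankings and publish the new tables

schedule.every().day.at("05:00").do(recalculate_rankings)

while True:
    schedule.run_pending()
    time.sleep(60)
```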