Quote:
Originally Posted by Brian Maher
Interesting numbers, thank you for sharing! It's interesting to compare and contrast this list with the more qualitative evaluations shared on this thread. I'm a bit of a stat geek and have wondered about quantifying Chairman's chances for a while now (just for fun, of course), so I have a few questions:
1. How is CPR calculated? Is it based exclusively on Chairman's or are other awards and factors considered?
2. Exactly how accurate is it in predicting Regional/District/DCMP Chairman's Awards?
3. How does it do with predicting historic Championship Chairman's Award winners?
4. How much of an improvement is CPR over more naive metrics, such as total previous Chairman's Awards?
5. How well does it anticipate "outsiders" who have never won Chairman's receiving the award at the Regional/District level?
I'd love to tinker with your algorithm/code if you'd be willing to share.
1. To quickly summarize: CPR is calculated from Chairman's and Engineering Inspiration wins over the past four years (inclusive of the current one). Chairman's and E.I. each have a base point value (currently 1 and 0.33), which is reduced based on when the award was won and then added to the team's CPR (i.e. winning Chairman's in 2016 is worth 1 point, 2015 is worth 0.75, 2014 is worth 0.5, etc.). There's also a point bonus for winning Chairman's at an event with another "high"-CPR team present. I've experimented with also awarding points for streaks (winning multiple Chairman's and/or E.I. awards in a row), but I've found it tends to cause fluctuations in accuracy. I'm always tweaking the point values for everything and attempting to add new ways of increasing accuracy, but this is currently the most accurate combination of factors I've gotten.
2. For regionals and district championships, the script is fairly accurate at guessing the winners/likely winners. It guessed the correct Chairman's and E.I. winners (1st in the rankings being the Chairman's winner, 2nd often being E.I.) for numerous events, and when it was wrong, it was within a rank or two of being correct (i.e. the rank-3 team winning Chairman's). It tends to be less accurate at district events due to the larger number of "upsets" and the occasional lack of any teams with a Chairman's background. As a rule, the more Chairman's history the teams at an event have, the more accurate the predictions will be.
3. Champs predictions are more difficult because at champs the winner is chosen based on the quality of their submissions and outreach, which my script doesn't (and can't) consider when computing CPR. However, the script is very good at supplying a small list of 10-20 teams that "might" win the award. For at least the past 10 years, the HoF winner has been on the list the script provides, and is usually in the top 10 at the very least. For example, 987 was 8th in 2016's predictions, 597 was 10th in 2015's, 27 was 4th in 2014's, and 1538 was 3rd in 2013's.
4. I would say CPR is quite a bit better than just counting Chairman's Awards. The script actually started out as a way to quickly display how many Chairman's Awards the teams at an event had won, and the "predictions" it helped generate then were a lot less accurate than what CPR gives now.
5. This is probably the biggest weakness of the current script. All of its stats are based in some way on a team's Chairman's and E.I. history, so it never expects a team with a very weak (or non-existent) Chairman's record to win.
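Roughly, the core scoring in (1) and the event ranking in (2)-(3) boil down to something like this (a simplified sketch, not the actual script; the bonus for co-located high-CPR teams and the other tweaks are left out, and all names here are illustrative):

```python
CHAIRMANS_BASE = 1.0  # base point value for a Chairman's win
EI_BASE = 0.33        # base point value for an E.I. win

def award_value(base, years_ago):
    """Full points for the current year, 25% less per year back,
    and zero outside the four-year window."""
    if not 0 <= years_ago <= 3:
        return 0.0
    return base * (1.0 - 0.25 * years_ago)

def cpr(awards, current_year):
    """awards: iterable of (year, kind), with kind 'chairmans' or 'ei'."""
    return sum(
        award_value(CHAIRMANS_BASE if kind == "chairmans" else EI_BASE,
                    current_year - year)
        for year, kind in awards
    )

def rank_event(history_by_team, current_year, n=20):
    """Rank the teams at an event by CPR, best first, keeping the top n."""
    scores = {team: cpr(history, current_year)
              for team, history in history_by_team.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

So, for example, a team that won Chairman's in 2016, 2015, and 2014 sits at 1 + 0.75 + 0.5 = 2.25 points going into 2016 events.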
I'll be making the code public soon. I'm actually in the process of rewriting most of it to be user-friendly and readable by other developers. If you have any more questions about it (or want to see predictions for certain past or future events), just let me know.

I may start posting CPR predictions for 2017 regionals once they start getting closer, though I've already been running some internally.