#1




paper: Miscellaneous Statistics Projects 2018
Thread created automatically to discuss a document in CDMedia.
Miscellaneous Statistics Projects 2018 by Caleb Sykes
This whitepaper is a continuation of my miscellaneous statistics projects whitepapers from last year. For those not familiar, here is a summary of why I do this: I frequently work on small projects that I don't believe merit entire threads on their own, so I have decided to upload them here and make a post about them in an existing thread. I also generally want my whitepapers to have instruction sheets so that anyone can pick them up and understand them. However, I don't want to bother with this for my smaller projects. I have decided to make a new thread this year in order to not overload my other thread with too many whitepapers, and because I will be analyzing 2018 specific things here. As always, feel free to provide feedback of any kind, including pointing out flaws in my data or my analysis.
#2




Re: paper: Miscellaneous Statistics Projects 2018
My first book for this year is an investigation of what my Elo model might look like if I tried to incorporate non-WLT RPs. This idea was spawned by posts 41-44 in this thread. This workbook has identical data to my normal FRC Elo book for 2005-2015, but for 2016-2018, I make adjustments to incorporate the other ranking points. I was unable to find a nice data set to use for 2012 co-op RPs; if someone knows of one, let me know and I might try to do this same analysis for that year. For each year in 2016-2018, there were two additional non-WLT ranking points available in each quals match. In 2016 and 2017, the tasks required to achieve these ranking points were also worth bonus points in playoff matches.
The concern that spawned this effort is that, in quals matches, many/most teams are not strictly trying to win, but rather are trying to maximize the number of ranking points they earn. Without some kind of RP correction, this means that teams who are good at earning these RPs might be underrated by Elo, since they might be more likely to win matches if they weren’t expending effort on the RPs. Additionally, since playoffs had different scoring structures than quals in 2016-2017, the teams that do well earning these RPs in quals will presumably be even more competitive in playoffs due to the bonuses.
My approach for this effort was to find the optimal value to assign to the qualification RPs, and to add this value to teams’ winning margins for the quals matches in which they achieve this RP. I wanted to find the optimal value for each of the six types of RPs between 2016-2018. Although there are other approaches to incorporating RP strength into an all-encompassing team rating, I always prefer to use methods which can be used to maximize predictive power over methods that don’t, since I can justify why I chose the values I did over just taking guesses about how much different things are worth.
There are a few different metrics I could have chosen to optimize, but I settled on overall playoff predictive power over the full period 2016-2018. I chose to optimize for playoff performance since in playoff matches teams are almost strictly just trying to maximize their winning margin (or win). This contrasts with quals matches where teams may have other considerations for the match, potentially including going for RPs or showing off so they are more likely to be selected.
I also chose to maximize predictive power over the full 2016-2018 period instead of each year separately, since Elo ratings carry over somewhat between years. The optimal value for 2016 RPs when maximizing predictive power for 2016 alone will be a bit different than the optimal value when maximizing over all three years, since the latter also accounts for how well the rating carries over between years.
Here were the optimal (±20% or 1 point, whichever is greater) values I found for each of the 6 RPs, measured in units of their respective year’s points:
2016 Teleop Defenses Breached: 2
2016 Teleop Tower Captured: 8
2017 kPa Ranking Point Achieved: 80
2017 Rotor Ranking Point Achieved: 40
2018 Auto Quest Ranking Point: 7
2018 Face The Boss Ranking Point: 45
All of these values are positive, which indicates that on average teams that get these RPs in quals are more likely to do better in playoffs than similar teams who do not.
You can see the effects of these adjustments by looking at the attached book and looking at the “Adjusted Red winning margin” column. This value should be equal to the red score minus the blue score, with further additions/subtractions depending on the RPs both alliances received. For example, in 2018 Great Northern qm 31, blue wins 305 to 288, so red’s unadjusted winning margin is -17. Red got the auto RP and blue got the climb RP in this match though, so after accounting for these, red’s adjusted winning margin is -17+7-45=-55.
Here are my probably BS rationalizations of why these RPs have the values above:
2016 Teleop Defenses Breached: It really doesn’t surprise me that this value is so low. Teams tended to deal with the defenses in quals in much the same way they dealt with them during playoffs.
Although there was a 20 point bonus in playoffs for the breach, any alliance worth their salt was going to get this anyway, so a team that got this RP consistently in quals wasn’t set up to do that much better in playoffs than a similar team who got this RP less consistently.
2016 Teleop Tower Captured: I don’t want to analyze this RP too much since its definition changed for championships, an event where teams were getting this RP much more frequently than at a standard regional/district. I wouldn’t have expected this value to exceed 10, since it generally took at least a pair of competent scorers to get 8 or 10 balls, and the 20 point playoff bonus divided by 2 is 10. I don’t think teams would have played much differently in quals if this RP had not existed, except maybe being more conservative in the last 30 seconds to make sure everyone surrounded the tower.
2017 kPa Ranking Point Achieved: This is by far the RP that had the most value. There are a couple of reasons I think it is so high. To start, there was a 20 point playoff bonus for this task that was unavailable in quals, and unlike the teleop tower captured in 2016, getting this RP was generally an individual effort, so a team that gets this RP consistently in quals should be worth at least 20 points more in playoffs than a similar team that does not. On top of this, because there were so few ways to score additional points in playoffs, the 40-70 fuel points scored in playoffs are in a sense more valuable than the points scored with other methods. There were diminishing returns on gear scoring after getting the third rotor, and no value at all in scoring gears after the fourth rotor, and there’s not much teams could do to get more climbing points except potentially lining up a bit earlier to avoid mistakes.
Fuel points though were unbounded, so a team that consistently got the kPa RP in quals was going to be much better off in playoffs just because they could get 60-90 points that were unachievable for a non-fuel opposing alliance.
2017 Rotor Ranking Point Achieved: Similar to 2016 teleop tower captured, I think most of the value of this RP comes from the playoff bonus of 100 points. This task required at least two competent robots to perform, which means I would have expected the value of this RP to be bounded above by 50. I don’t think the strategy changed much in playoffs due to this RP, since the goal of 40 points + RP in quals is comparably lucrative to 140 points in playoffs.
2018 Auto Quest Ranking Point: I expected this RP to be worth around 5 points and I was correct. Teams likely opted for higher risk and higher average reward autonomous modes in playoffs than they did in quals because they could afford to have one robot miss out on the crossing or be okay with not getting the switch if they could get one more cube on the scale. This wasn’t a huge effect but it does exist.
2018 Face The Boss Ranking Point: I expected the value of this RP to be around 20 points because there is no playoff bonus for this task and I didn’t think the opportunity cost was particularly high, although certainly higher than the auto RP. This was the value that most surprised me at 45 points. In my original analysis, I was thinking of the opportunity cost of going for the climb RP, not the extra value of a team implied by said team achieving the climb RP. I think the distinction is important because relatively few teams were able to consistently achieve the climb RP, and the teams that did so were generally very competitive teams. This means that in the playoffs they can afford to spend a few more seconds scoring elsewhere in the field before going for the climb, and can climb much faster on average than teams that were not consistently getting the climb RP in quals.
If I had thought about it more from this perspective, I might have predicted this RP to be worth around 30 points instead of 20. The remaining 15 still surprises me though. One possible explanation is that this value is overrated since we haven’t had the 2019 season yet, so the model doesn’t properly account for teams’ future success.
Overall, this was an interesting analysis, but I will almost certainly not be incorporating a change like this into my Elo ratings moving forward for the following reasons:
The adjustments made here do not provide enough predictive power for me to consider them worthwhile. These adjustments improved the Brier score for playoff matches in 2016-2018 by about 0.001. I would have needed the improvement to be at least 0.003 to consider it worthwhile, since I am reasonably sure there exist other improvements to my model which can provide this much or more improvement.
We have no guarantee that future games will have similar RP incentives. I try hard to keep the number of assumptions in my model to a minimum. I do this because I want my model to be valuable even when we get thrown a curveball for some aspect of the game like we did this year for time-dependent scoring. Assuming we will continue getting games with this RP structure is just not a very good assumption in my opinion.
There isn’t a clear way to find good values to use for the RPs during the season in some years. I am backfitting data right now so I have a good sample size of quals matches where teams get the RPs. However, if we get a game like 2017 again, where we didn’t get above a 2% success rate for either RP until week 4, there just wouldn’t be a good sample size of matches to use to find good values until late in the season.
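To make the margin adjustment concrete, here is a minimal sketch of how the "Adjusted Red winning margin" and a Brier score could be computed. The RP values come from the list in this post; the function names, the RP labels used as dictionary keys, and the data layout are assumptions made up for illustration and are not taken from the attached workbook.

```python
# RP values (in that year's points) from the list in this post.
# The short labels used as keys are hypothetical names chosen for this sketch.
RP_VALUES = {
    2016: {"breach": 2, "capture": 8},
    2017: {"kpa": 80, "rotor": 40},
    2018: {"auto_quest": 7, "face_the_boss": 45},
}


def adjusted_red_margin(year, red_score, blue_score, red_rps, blue_rps):
    """Red score minus blue score, plus the value of every RP red earned,
    minus the value of every RP blue earned."""
    values = RP_VALUES[year]
    margin = red_score - blue_score
    margin += sum(values[rp] for rp in red_rps)
    margin -= sum(values[rp] for rp in blue_rps)
    return margin


def brier_score(predictions, outcomes):
    """Mean squared difference between predicted win probabilities
    and actual results (1 for a win, 0 for a loss)."""
    pairs = list(zip(predictions, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)


# 2018 Great Northern qm 31: blue wins 305 to 288,
# red earned the auto RP and blue earned the climb RP.
print(adjusted_red_margin(2018, 288, 305, ["auto_quest"], ["face_the_boss"]))  # -55
```

A lower Brier score means better calibrated predictions, so the "improvement of about 0.001" mentioned above would be a decrease of that size in `brier_score` over the playoff matches.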
#3




Re: paper: Miscellaneous Statistics Projects 2018
Continuing the investigation of autonomous mobility I did last year, I thought it would be interesting to look at auto mobility rates for every year since 2016. I have attached a book titled "20162018_successful_auto_movement" which provides a summary of this investigation. For each team that competed in 2018, it shows their matches played, successful mobilities, and success rates for each year 2016-2018. It also contains these metrics in aggregate over all of these years, as well as a reference to the first match in which the team missed auto mobility. I counted both "Crossed" and "Reached" in 2016 as successful mobilities.
Note that this is using data provided by the TBA API, which pulls directly from FIRST, so there are certainly some matches where auto mobility is incorrectly credited or incorrectly withheld. There are many possible reasons for this, but one of the ones I identified last year was that referees at some events were entering mobilities based on team positions and not team numbers.
Here's a fun graph of 2017 auto mobility rates versus 2018 auto mobility rates:
Here are the teams that have competed each year 2016-2018, and have never missed auto mobility points according to this dataset:
Code:
team  matches
1506  149
5554  112
4550  86
4050  77
5031  69
3061  61
6175  59
6026  55
1178  48
3293  45
4462  39
6054  36
6167  35
5119  35
6155  35
5508  34
884   32
3511  32
4630  30
4728  29
6164  28
1230  28
2264  28
4054  27
4648  26
5171  26
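For anyone who wants to build a similar summary, here is a rough sketch of the bookkeeping. It assumes the match data has already been flattened into one `(team, year, moved)` record per team per match; that record layout, the function names, and the team numbers in the sample are all hypothetical, and pulling the real fields out of the TBA score breakdowns is left out.

```python
from collections import defaultdict


def summarize_mobility(records):
    """Aggregate (team, year, moved) records into per-team totals."""
    stats = defaultdict(lambda: {"matches": 0, "successes": 0, "years": set()})
    for team, year, moved in records:
        entry = stats[team]
        entry["matches"] += 1
        entry["successes"] += int(moved)
        entry["years"].add(year)
    for entry in stats.values():
        entry["rate"] = entry["successes"] / entry["matches"]
    return dict(stats)


def never_missed(stats, required_years=(2016, 2017, 2018)):
    """Teams that competed in every required year and never missed mobility,
    sorted by matches played (most first)."""
    perfect = [
        (team, entry["matches"])
        for team, entry in stats.items()
        if entry["successes"] == entry["matches"]
        and set(required_years) <= entry["years"]
    ]
    return sorted(perfect, key=lambda pair: pair[1], reverse=True)


# Made-up sample records; the team numbers are placeholders, not real results.
records = [
    (9991, 2016, True), (9991, 2017, True), (9991, 2018, True),
    (9992, 2016, True), (9992, 2017, False), (9992, 2018, True),
    (9993, 2017, True), (9993, 2018, True),
]
stats = summarize_mobility(records)
print(never_missed(stats))  # [(9991, 3)]
```

In the sample, team 9993 has a perfect record but is excluded because it has no 2016 matches, which mirrors the "competed each year 2016-2018" condition on the list above.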
#4




Re: paper: Miscellaneous Statistics Projects 2018
I'm looking to make predicted schedules soon to use for a couple of projects. I would like the capability to do this even before the total number of matches at the event is known. With this in mind, I have attached a simple book which looks at, for every 2018 event, the number of teams at the event versus the number of qual matches/team.
Here is a plot for all events:
And here is a plot for regional events:
I'll likely just set district events (including district champs) to 12 matches/team, champs events to 10 matches/team (although I may change this depending on the structure of champs in future years and the game), and regional events according to the formula matches/team = 17.0 - (0.13*teams).
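The rule of thumb above can be written as a small helper. This is only a sketch: the event type labels and function names are made up here, and the conversion to a total match count (six teams per match, rounded up) is my own assumption about how the estimate might be used, not something taken from the attached book.

```python
import math


def predicted_matches_per_team(event_type, num_teams):
    """Estimated qual matches per team, following the rule of thumb above.

    event_type is one of the hypothetical labels
    "district", "district_champs", "champs", or "regional".
    """
    if event_type in ("district", "district_champs"):
        return 12.0
    if event_type == "champs":
        return 10.0
    if event_type == "regional":
        return 17.0 - 0.13 * num_teams
    raise ValueError(f"unknown event type: {event_type!r}")


def predicted_total_matches(event_type, num_teams, teams_per_match=6):
    """Assumed conversion to a match count: each match has six team slots."""
    per_team = predicted_matches_per_team(event_type, num_teams)
    return math.ceil(per_team * num_teams / teams_per_match)


print(round(predicted_matches_per_team("regional", 40), 1))  # 11.8
print(predicted_total_matches("district", 30))  # 60
```

For a 60-team regional the formula gives about 9.2 matches per team, so larger regionals are predicted to have noticeably fewer matches per team than small ones.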
#5




Re: paper: Miscellaneous Statistics Projects 2018
I just uploaded a workbook called "2018_schedule_strengths_v1".
I'm planning to make a new thread soon to discuss "strength of schedule" in FRC, so I made this book to hopefully inform that discussion. I labeled it v1 because I imagine I'll need to go back and calculate other metrics as the discussion in my upcoming thread progresses.
Essentially, all I did was run my event simulator at each event twice, once before the schedule was released and once after. By looking at each team's ranking distribution change in this time period, we can pinpoint what effect exactly the schedule had. At least that's the idea anyway. I have included some summary statistics of each team's ranking predictions for both of these periods, as well as the changes between them.
Additionally, I have what is my first pass at a "strength of schedule" metric. I calculate this by finding the probability that the given team will seed better with the actual schedule than they would have with a random schedule. So a "schedule strength" of 0% means that you will never seed higher with the existing schedule than you would have with a random schedule, and a "schedule strength" of 100% means that you are guaranteed to seed higher with the actual schedule than you would have with a random schedule.
What I like about this metric:
It compares the given schedule against other hypothetical schedules
It is customized for each team, that is, it compares your hypothetical results with a random schedule with your hypothetical results with the given schedule. I'm not the biggest fan of team-independent metrics since, for example, a schedule full of buddy climb capable partners is amazing for a team without a buddy climber, but just alright for a team that has a good buddy climber, and team-independent metrics would have to give the schedule a single score for both of these teams.
It's on an interpretable scale (0% to 100%) and has meaningful significance
It's able to be calculated before the event occurs (I don't like metrics that require hindsight unless maybe we want to use SoS as a tiebreaker for something)
What I don't like about this metric:
Requires a full event simulator to calculate
Teams that are basically guaranteed to seed first (like 1678 at their later regionals) will inevitably be shown to have bad schedules, since there is no schedule that would give them much of a better chance of seeding higher than their expectation (1st). Switching to greater than or equal ranks just flips the problem to high scores instead of low scores for these scenarios
Average value is 48.1% instead of 50%
Anyway, feel free to use this as proof of how bad your schedule was. The worst schedules this year according to my metric (excluding the expected 1 seeds) were:
2096 on Hopper
4065 at Orlando
6459 on Roebling
And the best schedules were:
2220 on Archimedes
5104 on Newton
1806 on Turing
Last edited by Caleb Sykes : 07-15-2018 at 04:05 PM.
#6




Re: paper: Miscellaneous Statistics Projects 2018
Quote:

#7




Re: paper: Miscellaneous Statistics Projects 2018
Quote:
schedule strength = sum over r of [ P_after(r) * sum over q>r of P_before(q) ]
Where r and q are ranks and are summed over all ranks and all ranks greater than r respectively, P_before is the team's predicted seeding distribution before the schedule is released, and P_after is the distribution after it is released. Changing the second summation to over all q>=r would provide a very similar metric, just that it would err on the high side instead of the low side.
For example, say that before the schedule is released a team is predicted to have a 20% chance of seeding first, 30% second, and 50% third. After the schedule is released, they have a 5% chance of seeding first, 25% second, and 70% third. Their schedule strength would then be: 0.05*(0.3+0.5)+0.25*(0.5)=0.04+0.125=0.165 or 16.5%.
Looks like a pretty strong correlation to me. Excepting the teams which are heavy favorites to seed first, it seems to be doing its job.
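The worked example above can be reproduced in a few lines. This sketch reads the metric as: for each rank r, the probability of seeding at r with the actual schedule, times the probability of seeding anywhere worse than r with a random schedule, summed over all r. The function name and the list-based representation of the rank distributions are assumptions for illustration, not the simulator's actual code.

```python
def schedule_strength(before, after):
    """Probability of seeding strictly better with the actual schedule
    than with a random one.

    before[i] and after[i] are the probabilities of seeding at rank i + 1
    before and after the schedule is released, respectively.
    """
    if len(before) != len(after):
        raise ValueError("rank distributions must have the same length")
    return sum(
        after[r] * sum(before[r + 1:])  # worse ranks q > r under a random schedule
        for r in range(len(after))
    )


# Worked example from the post: three possible ranks.
before = [0.20, 0.30, 0.50]
after = [0.05, 0.25, 0.70]
print(round(schedule_strength(before, after), 3))  # 0.165
```

Swapping `before[r + 1:]` for `before[r:]` gives the q>=r variant mentioned above, which errs on the high side for teams that are heavy favorites to seed first.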

