07-09-2016, 21:51
Siri
Re: paper: Stop the Stop Build

Quote:
Originally Posted by Lil' Lavery View Post
[Figure 2]...I would like to see the actual loss ratios for each bucket rather than just raw totals.

[Figure 5]...By removing 1-event teams from the 2nd event population, you're narrowing the sample to the teams that had the resources to compete twice and introducing a selection bias. There are even stronger selection biases with multiple-event teams once you start factoring in teams that attended their district championships and/or FRC championship.

[Figure 6] This selection bias is demonstrated in fig(6). Teams playing 1 event have a lower OPR at their first event than teams playing 2 events...
Agreed. I'd suggest a new multi-panel figure showing the event-over-event OPR distribution separately for each population segment: Fig. Na would show the 1st- and 2nd-event curves for 2-event teams, Fig. Nb the three curves for 3-event teams, etc. It may also help to normalize the y-axis as a percentage of each population, and I'm sure there are ways to improve this suggestion with better cross-comparison. Still, it seems like we're making a big leap here that we have (well, Jim has) the data to fill.
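To make the segmentation concrete, here's a minimal sketch (Python, with made-up OPR numbers standing in for the actual dataset) of how the per-segment, percentage-normalized curves could be built before plotting:

```python
from collections import defaultdict

# Hypothetical records: (team, [OPR at event 1, OPR at event 2, ...]).
# Illustrative values only -- not from the paper's data.
teams = [
    ("A", [20.0, 25.0]),
    ("B", [35.0, 33.0]),
    ("C", [15.0]),
    ("D", [40.0, 46.0, 50.0]),
    ("E", [22.0, 27.0, 31.0]),
]

def segment_curves(teams, bin_width=10.0):
    """Per-segment (grouped by events played) OPR histograms for each
    event slot, normalized to percent of that segment so the panels
    stay comparable across differently sized populations."""
    segments = defaultdict(list)
    for _, oprs in teams:
        segments[len(oprs)].append(oprs)
    curves = {}
    for n_events, rows in segments.items():
        for slot in range(n_events):
            hist = defaultdict(float)
            for oprs in rows:
                hist[int(oprs[slot] // bin_width) * bin_width] += 1
            total = sum(hist.values())
            # Key (n_events, event number) -> {OPR bin: percent of segment}
            curves[(n_events, slot + 1)] = {
                b: 100.0 * c / total for b, c in sorted(hist.items())
            }
    return curves

curves = segment_curves(teams)
```

Each `(segment, event)` curve then maps straight onto one line in one panel of the proposed figure.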

I'll also echo the desire for loss ratios by OPR bucket in Figure 2. That data probably has a lot of noise, though, so if it's possible, the case would likely be stronger after normalizing the OPRs and aggregating multiple years. I don't know what your database looks like, though, so this might be a pain.
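One way the cross-year aggregation could work is a per-year z-score on OPR before bucketing, so a "+1 sigma" team in 2015 lands in the same bucket as a "+1 sigma" team in 2016. A sketch with invented win/loss rows (not the paper's data):

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical rows: (year, opr, wins, losses) -- stand-ins, not real results.
rows = [
    (2015, 20.0, 4, 6), (2015, 40.0, 7, 3), (2015, 60.0, 9, 1),
    (2016, 15.0, 3, 7), (2016, 35.0, 6, 4), (2016, 55.0, 8, 2),
]

def loss_ratio_by_bucket(rows, bucket_width=1.0):
    """Z-score OPR within each year, then aggregate losses/(wins+losses)
    per z-score bucket across years to reduce single-season noise."""
    by_year = defaultdict(list)
    for year, opr, _, _ in rows:
        by_year[year].append(opr)
    stats = {y: (mean(v), pstdev(v)) for y, v in by_year.items()}
    buckets = defaultdict(lambda: [0, 0])  # bucket -> [wins, losses]
    for year, opr, wins, losses in rows:
        mu, sd = stats[year]
        b = round((opr - mu) / sd / bucket_width) * bucket_width
        buckets[b][0] += wins
        buckets[b][1] += losses
    return {b: l / (w + l) for b, (w, l) in sorted(buckets.items())}

ratios = loss_ratio_by_bucket(rows)
```

With real data you'd want finer buckets and a minimum match count per bucket before trusting the ratio.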

I think there's also a way to address the questions that arise with Figure 6, but I'm not sure what it is yet. There should be a way to directly separate the relative difference in OPR between the populations from the changes within each over time (demonstrating the salience of each factor). Similar to what Sean mentioned, for 2- versus 3-event teams, the fact that you are a 3-event team appears to be almost as useful as, if not more useful than, actually playing your third event. I would guess that's largely because 3-event teams are the ones qualifying for DCMP (or CMP) based on prior performance. This is not to dismiss the paper's Point 5, which the figure supports, but the data is interesting.
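Those two factors could be pulled apart directly: compare the between-population gap already present at the first event against the within-team gain from actually playing the extra event. A toy sketch with invented per-event OPRs (not the paper's numbers):

```python
from statistics import mean

# Hypothetical per-event OPRs per team -- stand-ins, not real data.
two_event = [(20.0, 25.0), (30.0, 33.0)]                # (event 1, event 2)
three_event = [(32.0, 36.0, 41.0), (40.0, 43.0, 47.0)]  # (event 1, 2, 3)

# Between-population factor: 3-event teams already start higher
# at their first event, before any extra play happens.
gap_at_event1 = (mean(t[0] for t in three_event)
                 - mean(t[0] for t in two_event))

# Within-team factor: the gain 3-event teams get from actually
# playing event 3 after event 2.
gain_event2_to_3 = mean(t[2] - t[1] for t in three_event)
```

If `gap_at_event1` dominates `gain_event2_to_3` in the real data, that would quantify how much of Figure 6's difference is selection rather than improvement.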

Overall, I think this case could benefit from talking more about the dataset. In the OPR progression, how many of the performances in Figure 5's green 3rd-event line are DCMP or CMP appearances versus just a 3rd "normal" (district or regional) event? Is there enough data from "normal" 3rd events to look at this directly, or is there another proxy adjustment available? Dropping teams that didn't qualify for DCMP is certainly going to shift the OPR distribution regardless of the number of events played.
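Answering that composition question is just a tally over each team's 3rd-event type. A sketch with placeholder labels (the real tally would come from the event database):

```python
from collections import Counter

# Hypothetical 3rd-event types, one per 3-event team -- placeholder labels.
third_event_types = ["district", "dcmp", "dcmp", "cmp", "regional", "dcmp"]

counts = Counter(third_event_types)
normal = counts["district"] + counts["regional"]  # "normal" 3rd events
championship = counts["dcmp"] + counts["cmp"]     # DCMP/CMP 3rd events
championship_share = championship / len(third_event_types)
```

If `championship_share` is large, the green line is mostly measuring championship qualifiers, and the "normal" 3rd-event subset deserves its own curve.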