This is a really well written paper, thanks for putting it together!
I have some questions about how to choose VarD/VarO and VarN/VarO since I'm unfamiliar with MMSE estimation. How would you go about choosing these values during/before an event?
Quote:
Originally Posted by Page 31
The (VarD, VarN) numbers show the values of VarD/VarO and VarN/VarO in the MMSE search that produced the best predicted outcome on the Testing data.
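To check my understanding of that sentence, this is roughly the search I picture (a rough Python sketch; `evaluate`, `train_games`, `test_games`, and the ratio grid are all my own hypothetical names and guesses, not anything from the paper):

```python
import itertools

def search_ratios(evaluate, train_games, test_games):
    """Grid-search (VarD/VarO, VarN/VarO), scoring every pair on the
    Testing data and keeping whichever pair predicts it best.
    `evaluate(var_d, var_n, fit_games, score_games)` is a stand-in for
    fitting the MMSE ratings on fit_games and returning the prediction
    accuracy on score_games (higher = better)."""
    ratio_grid = [i / 100 for i in range(11)]  # 0.00 .. 0.10, my guess at the range
    best_score, best_pair = None, None
    for var_d, var_n in itertools.product(ratio_grid, repeat=2):
        score = evaluate(var_d, var_n, train_games, test_games)  # scored on Testing data
        if best_score is None or score > best_score:
            best_score, best_pair = score, (var_d, var_n)
    return best_pair, best_score
```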
Does this method lead to the same overfitting that the LS estimators showed when the training data was also used as the testing data? Choosing the a priori variances after the fact to get the best results seems wrong, or is the effect actually too small in practice to matter? It seems like each set of training data should be used to find the variances that work best, which would then be applied to the testing data, rather than "searching" for the best values and applying them after the fact.
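Something along these lines is what I have in mind, with the same hypothetical `evaluate` stand-in as above: pick the ratios using only a split of the training data, and then touch the testing data exactly once.

```python
def select_on_training_then_test(evaluate, train_games, test_games):
    """Choose (VarD/VarO, VarN/VarO) with a held-out slice of the
    Training data only, then apply that single pair to the Testing data."""
    ratio_grid = [i / 100 for i in range(11)]      # same guessed range as above
    split = len(train_games) // 2                  # crude single split; CV would be better
    fit_part, val_part = train_games[:split], train_games[split:]
    _, var_d, var_n = max(
        (evaluate(vd, vn, fit_part, val_part), vd, vn)
        for vd in ratio_grid for vn in ratio_grid
    )
    # The Testing data is only used here, once, with the pre-chosen pair.
    return (var_d, var_n), evaluate(var_d, var_n, train_games, test_games)
```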
Quote:
Originally Posted by Page 44
// pick your value relative to sig2O, or search a range.
// 0.02 means you expect defense to be 2% of offense.
From this I'd expect the values of VarD/VarO to be largely dependent on the game, yet the data shows that the "best" values depend very little on it. For example, in the 2014 Newton Division the best VarD/VarO for sCPR was 0.10, but for 2014 Galileo it was 0.00, at the complete opposite end of the search range! How can two divisions in the same year have such different values?