#7, 06-06-2015, 20:17
wgardner (Team Role: Coach, Charlottesville, VA)
Re: Overview and Analysis of FIRST Stats

The attached PNG file shows some interesting data.

This is for the 2014 casa tournament with simulated data using the model from the paper, with Var(O)=100 (i.e. stdDev(O)=10), Var(D)=0, and Var(N)=3*Var(O). The tournament had 54 teams with 6 teams playing per match, so each team played once every 9 matches. There were 108 total matches, or 12 matches played by every team.
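
Here's a rough Python sketch of this kind of simulation (not the exact script behind the plots; the mean of O and the purely random schedule here are placeholder choices on my part):

[CODE]
import numpy as np

rng = np.random.default_rng(0)

n_teams, n_matches = 54, 108
var_o = 100.0                    # Var(O) = 100, i.e. stdDev(O) = 10
var_n = 3.0 * var_o              # Var(N) = 3 * Var(O); Var(D) = 0 so defense is ignored
mean_o = 50.0                    # assumed mean offense; the model only fixes the variance

# True offensive abilities for each team.
O = rng.normal(mean_o, np.sqrt(var_o), n_teams)

# Build the alliance/design matrix A and the alliance scores y.
# A real FRC schedule balances matches per team (12 each here); random draws
# only approximate that, which is good enough for a sketch.
A_rows, scores = [], []
for _ in range(n_matches):
    teams = rng.choice(n_teams, size=6, replace=False)
    for alliance in (teams[:3], teams[3:]):
        row = np.zeros(n_teams)
        row[alliance] = 1.0
        A_rows.append(row)
        scores.append(O[alliance].sum() + rng.normal(0.0, np.sqrt(var_n)))

A = np.vstack(A_rows)     # (2 * n_matches) x n_teams
y = np.array(scores)      # one score per alliance
[/CODE]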

Each of the first 4 plots shows the estimated OPRs vs. the number of matches played per team, so X=1 means 9 total matches, X=2 means 18 total matches, and so on up to X=12 (the whole tournament). The 13th point on the X axis shows the actual underlying O values.

Plot 1 corresponds to the traditional Least Squares (LS) OPRs, which is also the MMSE solution where Var(N) is estimated to be equal to 0. Note that there are no OPR values until each team has played 4 matches, as that's when the least-squares matrix becomes invertible for this schedule.
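
Using the A and y from the sketch above, the LS OPRs are just a least-squares solve (np.linalg.lstsq returns the minimum-norm solution when the matrix isn't invertible yet, whereas the plot simply leaves those early points out):

[CODE]
# Traditional OPR: minimize ||A x - y||^2 over the per-team offense estimates x.
opr_ls, *_ = np.linalg.lstsq(A, y, rcond=None)
[/CODE]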

Plot 2 corresponds to the MMSE OPR estimates where Var(N) is estimated to be equal to 1*Var(O). Since the actual Var(N)=3*Var(O), this underestimates the noise in each match.

Plot 3 corresponds to the MMSE OPR estimates where Var(N) is estimated to be equal to 3*Var(O) (the "correct" value).

Plot 4 corresponds to the MMSE OPR estimates where Var(N) is estimated to be equal to 10*Var(O), greater than the actual noise.
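
For reference, here's a sketch of this kind of MMSE estimator, assuming iid Gaussian team offenses and alliance noise as in the model; it's the standard Bayesian/ridge form with weight Var(N)/Var(O), not necessarily identical to the paper's exact implementation (and in practice the prior mean would be estimated from the scores, e.g. as the average alliance score divided by 3, rather than passed in as a known value):

[CODE]
def mmse_opr(A, y, var_o, var_n_est, mean_o_est):
    """MMSE-style OPR sketch: posterior mean of O given y under the Gaussian
    model above. Equivalent to ridge regression on the mean-removed scores
    with weight lam = Var(N)/Var(O)."""
    n_teams = A.shape[1]
    lam = var_n_est / var_o
    resid = y - A @ np.full(n_teams, mean_o_est)       # remove the prior mean
    return mean_o_est + np.linalg.solve(
        A.T @ A + lam * np.eye(n_teams), A.T @ resid)

# Var(N) estimated at 1x, 3x, and 10x Var(O), as in Plots 2-4.
opr_mmse = {k: mmse_opr(A, y, var_o, k * var_o, mean_o) for k in (1, 3, 10)}
[/CODE]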

Plot 5 shows the percentage error each curve has in estimating the actual underlying O values.
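
One plausible form of the percentage error (the exact metric behind Plot 5 is an assumption on my part) is the RMS estimation error relative to the RMS of the true O values:

[CODE]
def pct_error(estimate, O):
    # 100 * ||estimate - O|| / ||O||, i.e. relative RMS estimation error.
    return 100.0 * np.linalg.norm(estimate - O) / np.linalg.norm(O)
[/CODE]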

Comments:

The LS OPR values start out crazy and then settle down a bit. Looking at the step from X=12 (the final OPRs) to X=13 (the "real" O values), you can see that the final OPRs have more variance than the real O values. This means that the final OPRs are still overestimating the variance of the abilities of the teams.

Look at the X=1 points for Plots 2-4. The MMSE estimates start conservatively with the OPRs bunched around the mean and then progressively expand out. Plot 4 shows the noise overestimated (the most conservative estimate), so the OPRs start out very tightly bunched and stay that way. Plot 2 starts out wider, and Plot 3 starts out in the middle.

Interestingly, at the X=1 point of each MMSE plot you can see groups of 3 teams with the same estimate. This makes sense: after 1 match, the 3 teams on an alliance are indistinguishable from each other, and it takes more than 1 match played by each team to start separating them.

Look at the X=12 (the final estimates) vs X=13 (the "real" O values) points for Plots 2-4. Plot 2 looks like it's still overestimating the variance, Plot 3 has it about right, and Plot 4 has underestimated the true variance even at the end of the tournament (you can see the Plot 4 OPRs expand out from X=12 to X=13). [Edit: checking the numbers for the run shown, the variances of the OPRs computed by LS, MMSE 1, MMSE 3, and MMSE 10 were respectively 164, 138, 102, and 47, confirming the above comment. The MMSE 3 solution using the "right" Var(N) estimate is quite close to the true underlying variance of 100. Over multiple runs, the MMSE 3 solution is slightly biased under 100 on average, showing that more matches are needed for it to converge to the "right" variance. All of the techniques do eventually converge to the right solution and variance if the tournament is simulated to run for many more than 108 matches.]
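
Something along these lines computes the variances being compared (using opr_ls and opr_mmse from the sketches above; the numbers will differ from the run shown since the sketch uses its own random schedule and seed):

[CODE]
for name, est in [("LS", opr_ls), ("MMSE 1", opr_mmse[1]),
                  ("MMSE 3", opr_mmse[3]), ("MMSE 10", opr_mmse[10])]:
    print(f"{name}: var(OPR) = {np.var(est, ddof=1):.0f}")
print(f"true O: var = {np.var(O, ddof=1):.0f}  (Var(O) was set to 100)")
[/CODE]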

In Plot 5, the performances of the different techniques get close to each other as the tournament nears completion. They should all converge as the number of matches grows large, since the LS and MMSE solutions eventually converge to each other. But they are off by quite a bit early on. Even though the MMSE 1 solution underestimates Var(N) at 1*Var(O), it still gives pretty good results.
Attached Thumbnails: OprComparison.PNG (the OPR comparison plots discussed above)
__________________
CHEER4FTC website and facebook online FTC resources.
Providing support for FTC Teams in the Charlottesville, VA area and beyond.

Last edited by wgardner : 06-06-2015 at 20:40.