2015-2016 correlation coefficient matrix

I am continuing to develop tools to help me make a predictive model of FRC matches. Here, I take the calculated contributions for a bunch of 2015 categories and compare them with the contributions for 2016 categories.

I have attached a correlation coefficient matrix which compares week 1 2016 contributions of all 2016 non-rookies that competed in week 1 with their 2015 contributions from their last official event. I didn’t bother computing some of the derived categories published in the 2016 4536 scouting database such as subtracted tower strength.
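
For anyone who wants to build something similar, the computation is roughly the sketch below. This is not my actual code, and the file names and column layout are made up; it just assumes each year's calculated contributions are in a table with one row per team and one column per scoring category.

```python
# Rough sketch, not my actual pipeline. Hypothetical file names and column layout.
import pandas as pd

c2015 = pd.read_csv("contributions_2015_last_event.csv", index_col="team")
c2016 = pd.read_csv("contributions_2016_week1.csv", index_col="team")

# Keep only the 2016 week 1 non-rookies that also have 2015 data.
common = c2015.index.intersection(c2016.index)
c2015, c2016 = c2015.loc[common], c2016.loc[common]

# Pearson correlation of every 2015 category against every 2016 category.
matrix = pd.DataFrame({col: c2015.corrwith(c2016[col]) for col in c2016.columns})
matrix.to_excel("2015-2016 correlation coefficient matrix.xlsx")
```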

Essentially, this gives some indication of how the 2016 game tasks were related to the 2015 game tasks. Most of the correlations are positive, and the strongest correlation is often “total points” or OPR. For 2017, after we have completed week 1 events, I will probably use calculated contribution to total points to create estimated calculated contributions for all teams which did not compete in week 1.
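
One simple way to implement that estimation (not necessarily what I will end up doing) would be to fit a line, per category, between each team's prior-year total points contribution and its week 1 contribution, then apply that fit to the teams with no week 1 data yet. The column and file names below are hypothetical:

```python
# Hypothetical sketch of estimating contributions for teams without week 1 data.
import numpy as np
import pandas as pd

prev = pd.read_csv("contributions_2016_last_event.csv", index_col="team")
week1 = pd.read_csv("contributions_2017_week1.csv", index_col="team")

seen = week1.index.intersection(prev.index)    # teams with both years of data
missing = prev.index.difference(week1.index)   # teams with no week 1 event yet

estimates = {}
for category in week1.columns:
    # Least-squares line: last year's total points contribution -> this year's category.
    slope, intercept = np.polyfit(prev.loc[seen, "total_points"],
                                  week1.loc[seen, category], 1)
    estimates[category] = slope * prev.loc[missing, "total_points"] + intercept

estimated = pd.DataFrame(estimates)  # estimated week 1 contributions for missing teams
```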

I didn’t think this effort was worth a whitepaper, but I thought I would attach it in case anyone is interested.

2015-2016 correlation coefficient matrix.xlsx (16 KB)

Thanks for sharing this. I think it’s very cool to see how the more complex game elements in 2015 (auto points) had the strongest correlation with the most complex game elements in 2016 (scaling).

I recently took an interest in FRC data analysis and was wondering if there were any adjustments you made when calculating contribution for game elements that have a commonly reached ceiling (e.g. defense crossings, breach achieved). The calculated contribution values for these statistics seem to bunch up near the average value because the same result is reached in most matches. For example, with defense crossings at champs divisions, all of the calculated contribution values are between 2.2 and 3.2 when there were clearly some teams that focused more heavily on breaching than others.

Did you run into this problem? If so, what techniques would you recommend for getting more accurate/adjusted calculated contributions?

That was one of my favorite insights as well.

I recently took an interest in FRC data analysis and was wondering if there were any adjustments you made when calculating contribution for game elements that have a commonly reached ceiling (e.g. defense crossings, breach achieved). The calculated contribution values for these statistics seem to bunch up near the average value because the same result is reached in most matches. For example, with defense crossings at champs divisions, all of the calculated contribution values are between 2.2 and 3.2 when there were clearly some teams that focused more heavily on breaching than others.

I did not treat any categories differently based on the likelihood that they would occur. You are correct, though, that calculated contributions have less value when looking at categories which happen incredibly frequently or incredibly infrequently. Defense crossings last year were a great example.

Did you run into this problem? If so, what techniques would you recommend for getting more accurate/adjusted calculated contributions?

I would prefer not to change how I determine normal calculated contributions, because they mean something very specific mathematically, and they would lose that meaning if we performed adjustments. I might be willing to provide supplemental categories with adjustments, but based on the data the API provides, I really can’t think of a good way to, for example, determine which teams spent more time doing defense crossings if nearly every match has close to the same number of defense crossings. I’d be interested in ideas though if anyone has any.

However, if I were trying to predict the matches in which breaches or captures would occur, I would likely proceed the following way:
First, I would try to pull in as much relevant information as possible. For breaches, that would include looking at the “A crossings,” “low bar crossings,” etc. categories. For captures, that would include looking at the “subtracted tower strength” and “challenge or scale count” categories.
Next, I would find the best way to add each team’s calculated contribution for each of these categories together to create predicted average points for each category. The easiest way to add them together would be to just have predicted score p = a + b + c, but I can imagine situations where it would be beneficial to add contributions in log space (ln(1+p) = ln(1+a) + ln(1+b) + ln(1+c)), in quadrature (p^2 = a^2 + b^2 + c^2), or with a weighted sum (p = kA*a + kB*b + kC*c); see the toy sketch after this list.
Then, I would look at the correlations between categories. Is a breach more likely for an alliance if they have a high predicted C crossing score and a low predicted A crossing score, or if the alliance has an average C crossing score and an average A crossing score?
Finally, I would add in uncertainty to come up with a likelihood of a breach or a capture.
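
To make those combination options concrete, here is a toy sketch. The contribution values, the weights, and the simplified 8-crossing breach condition are all made up for illustration; this is not my actual model.

```python
# Toy illustration only; numbers, weights, and the breach condition are made up.
import numpy as np
from scipy.stats import norm

# Hypothetical predicted damaging-crossing contributions for one alliance's three robots.
a, b, c = 1.4, 0.9, 2.1

plain      = a + b + c                                          # p = a + b + c
log_space  = np.expm1(np.log1p(a) + np.log1p(b) + np.log1p(c))  # ln(1+p) = ln(1+a)+ln(1+b)+ln(1+c)
quadrature = np.sqrt(a**2 + b**2 + c**2)                        # p^2 = a^2 + b^2 + c^2
kA, kB, kC = 1.0, 0.8, 1.2                                      # made-up weights
weighted   = kA*a + kB*b + kC*c                                 # p = kA*a + kB*b + kC*c

# Final step: treat the prediction as roughly normal with some spread and ask how
# often the alliance clears the (simplified) 8 damaging crossings a breach needs.
sigma = 1.5  # made-up uncertainty
p_breach = 1 - norm.cdf(8, loc=plain, scale=sigma)
print(f"Estimated breach probability: {p_breach:.2f}")
```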

My long list of things to look at before week 1 events does include predicting breaches/captures from last year in preparation for predicting 4 active rotors and pressure threshold reached this year.

Here is another correlation coefficient matrix, this one using 2016 exclusively. It compares 2016 scoring categories with each other.

Although the result is in the same form as the 2015-2016 matrix, the methodology to get this was dramatically different. Essentially, I made predicted scores for every category of each week 1 match and recorded the errors between my predictions and the actual result. This matrix represents the correlation coefficients of the errors between categories.
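
In sketch form (hypothetical file and column names), the calculation is something like:

```python
# Rough sketch of the error-correlation calculation; file and column names are
# hypothetical. One row per week 1 match, one column per scoring category.
import pandas as pd

predicted = pd.read_csv("week1_predicted_scores.csv", index_col="match")
actual = pd.read_csv("week1_actual_scores.csv", index_col="match")

errors = predicted - actual   # prediction error per match in every category
error_corr = errors.corr()    # correlation coefficients of the errors between categories
error_corr.to_excel("2016 correlation coefficient matrix.xlsx")
```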

2016 correlation coefficient matrix.xlsx (13.8 KB)

Small error in the above attachment: “teleop Boulders Low” actually has a stronger correlation with “teleop Tower Captured” than does “teleop Boulder Points.” This error is corrected in the attached sheet.

2016 correlation coefficient matrix.xlsx (13.8 KB)

Here are the correlation coefficients between 2017 calculated contribution categories. Try saying that five times fast. Only each team’s most recent event is used.

2017 category comparison.xlsx (30 KB)

Very interesting data!

It’s interesting that “teleop points” has a 91% correlation with “teleop takeoff points” :stuck_out_tongue:

I ran correlations with OPR in 2012 vs. 2013 and 2013 vs. 2014, and some component correlations as well, for preseason scouting. In those years, the correlations were in excess of 0.60, so the 2015-16 correlations appear to be weaker, perhaps because of the differences in the games.

#DevalueClimbing

#MakeHavingMultipleGameObjectivesRelevantAgain