As Karthik has said in one of his lectures, “a lot of people are just like: ‘what’s the average points per match? what’s the average?’” In Karthik’s case, he was advocating for people to pay attention to standard deviations: for people to care about the spread of a team’s performance as well as measures of central tendency like the mean. I’d like to approach what Karthik observed from a different angle: teams are always saying “what’s the average?”, and I’d like to encourage them to ask: “what’s the median?”*

Let’s be real: it’s extremely unlikely for any official competition to have more than 12 qualification matches per team. The Waterloo regional had 13 qual matches per team, but I can’t think of any other official event with that many. So, when we go to make statistical decisions about teams for match strategy and for picklisting, teams ought to remember that they are working with a **maximum** of n=12, and often far fewer. Important picklisting decisions are usually made with data from only 8 or 9 matches, even at district events. I haven’t competed in the regional system yet, but I expect that regional teams picklisting the night before alliance selection are working with even less data.

That sample size matters when it comes to outliers. The median is resistant to outliers, while the mean isn’t. With only 8 or 9 matches, a single unusual match can move the mean a long way. If you rely on the mean alone, you are choosing to be influenced strongly by outlier matches. In some situations, this can be valuable, but most alliance captains, and most teams making strategy for qualification matches, are more interested in a team’s likely outcome than in its tail results. It’s not like the median is difficult to calculate, either. More teams than ever use complicated electronic scouting systems, and advanced statistics like OPR are commonplace among teams making any sort of strategic assessment. If you’re doing the linear algebra to calculate OPR, how hard is it to type =MEDIAN(range) into Excel?
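To make the outlier point concrete, here is a small sketch in Python. The scores are invented for illustration (nine matches for one hypothetical team, with one unusually high match); they are not real scouting data.

```python
from statistics import mean, median

# Hypothetical points scored by one team across 9 qualification matches.
# The 70 is an invented outlier match; every value here is made up.
scores = [20, 22, 18, 25, 21, 19, 70, 23, 20]

team_mean = mean(scores)      # pulled upward by the single 70-point match
team_median = median(scores)  # middle value of the sorted scores

# The same team with the outlier match left out, for comparison.
typical = [s for s in scores if s != 70]
typical_mean = mean(typical)
typical_median = median(typical)

print(f"with outlier:    mean={team_mean:.1f}, median={team_median}")
print(f"without outlier: mean={typical_mean:.1f}, median={typical_median}")
```

In this made-up sample, the one big match moves the mean by more than five points, while the median moves by half a point.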

Of course, the mean has its place. It’s a good measure of central tendency, and it has the advantage of combining all the available data points into one summary statistic rather than selecting one representative (or splitting the difference between two). The mean is the basis of most methods of statistical inference, so if teams want to use z-scores, confidence intervals, or p-values in their scouting and strategy decisions, the mean is critical. But how many FRC teams use statistical inference? I remember seeing z-scores in 1678’s scouting documentation from 2016, and I know that 1712 used gear confidence intervals and tried to approximate the probability of a 4-rotor or 40 kPa match in 2017, but realistically most teams aren’t building statistical inference into their scouting decisions.
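For readers who haven’t seen one, here is a rough sketch of how a z-score builds on the mean. This is one common way to compute it, not a description of how 1678 or 1712 actually did it; the team labels and per-match averages are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical average points per match for eight teams at one event.
# Team labels and values are invented for illustration only.
event_averages = {
    "A": 21.0, "B": 35.5, "C": 18.0, "D": 27.5,
    "E": 30.0, "F": 22.0, "G": 15.0, "H": 31.0,
}

values = list(event_averages.values())
field_mean = mean(values)  # mean of the team averages across the event
field_sd = stdev(values)   # sample standard deviation of those averages

# z-score: how many standard deviations a team sits above or below the field mean.
z_scores = {team: (avg - field_mean) / field_sd
            for team, avg in event_averages.items()}

for team, z in sorted(z_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{team}: z = {z:+.2f}")
```

Both the center (the field mean) and the spread (the standard deviation) here are defined in terms of the mean, which is why the mean is hard to avoid once you move into inference.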

Spending half a minute to add the median to your analysis toolbox is going to be more valuable than spending half a week to add OPR to your analysis toolbox. That’s not to say that OPR isn’t a helpful statistic (which is an entirely different conversation), but rather that the median is heavily undervalued for the minimal time and effort it takes to use effectively. Teams should care about the median, not just the mean.

* Yes, the median is technically an average. But it’s pretty obvious that people asking “what’s the average?” are asking about the arithmetic mean.