What data points do you want to see, but haven’t yet?

What are some of the coolest data points, graphs, or charts you want to see, but haven’t yet due to not knowing where to look or the amount of raw data you’d have to sift through?

I’ve always wanted to see leaderboards for the more obscure awards or figure out the correlation between primary team color and EPA.


I wanna see who has the most imagery awards so we can take them down (in a graciously professional way).


I want to see Cost of Entrance Fees / Number of Matches per team per year

And cost of entrance fees / Number of Wins

The denominator is obviously easier to get because of TBA. Numerator would take some leg work.
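
For what it's worth, once both halves are collected the ratios themselves are simple. Here is a minimal sketch, assuming you have already gathered each team's total fees for a season by hand and pulled match and win counts from somewhere like TBA. The `TeamSeason` record and its field names are made up for illustration, not any real API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TeamSeason:
    # Hypothetical record; every field is assumed to be collected beforehand.
    team: int
    year: int
    fees: float   # total entrance fees paid that season
    matches: int  # matches played that season
    wins: int     # matches won that season


def ratio(cost: float, count: int) -> Optional[float]:
    """Cost divided by count, or None when there is nothing to divide by."""
    return cost / count if count > 0 else None


def cost_metrics(seasons):
    """Map (team, year) to cost per match and cost per win."""
    return {
        (s.team, s.year): {
            "cost_per_match": ratio(s.fees, s.matches),
            "cost_per_win": ratio(s.fees, s.wins),
        }
        for s in seasons
    }
```

A winless season returns `None` for cost per win rather than raising a division error, so it can be filtered out or shown separately on a chart.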


I’m pretty sure you can get matches played from a Statbotics CSV download; you’d just be downloading a ridiculous amount of data alongside it.

Data I’d love to see but probably never will:

  • Actual data on submitted awards. How do submission rates vary between awards or across regions or year to year?
  • Volunteer data. What do volunteer demographics look like across the program? How about in specific regions? Or in specific roles? Has the size of the volunteer base contracted post-COVID?
  • Mentor data. What does the distribution of registered mentors per team look like? Are there any correlations between number of mentors associated with a team and the team’s longevity?
  • More granular financial data. What does HQ support for different regions or districts look like? How efficient is funding regionals vs districts?
  • Non-anonymous usage reporting data. I don’t really get why usage reporting is anonymous but again this is a source that I’d love to look for trends in. Do LabVIEW teams have a higher probability of folding? How about teams who are still tank drive?

I’d love to see stats about LabVIEW teams compared to teams using other programming languages


you guys don’t know but this guy is the actual data science goat bruh put some respect on his name

Part of the problem you may find in mentor or even student data is whether it is truly submitted.
How many people are on the roster, and how many mentors may be held over or not actually show up/help?

I would love to find a better way to pull more specific stats, such as a list of every team that has ever competed in a region (say, for example, that 250 different teams have signed up in NY), or lists of teams each year broken out more easily. I know you can pull the data and set some of this up, but not easily, I believe.


James, pchild posts his data here quite often.

Some sort of robot similarity index.

Compare robots based on:

  1. Appearance/Subsystems
  2. Deanonymized Usage/Code Releases
  3. Scoring metrics

Building upon A/B comparisons, having a seasonal robot diversity index that shows which seasons robots were most similar or least similar.
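
One rough way to sketch the subsystem part of this: treat each robot as a set of subsystem tags and use Jaccard similarity (shared tags divided by total distinct tags) for the A/B comparison. The seasonal diversity index below is just one minus the average pairwise similarity. The tags and the choice of Jaccard are assumptions for illustration. Code releases and scoring metrics would need their own measures.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two tag collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty robots are treated as identical
    return len(a & b) / len(a | b)


def diversity_index(robots):
    """1 - mean pairwise Jaccard similarity over a season's robots.

    `robots` maps a team identifier to its subsystem tags.
    Returns 0.0 when there are fewer than two robots to compare.
    """
    pairs = list(combinations(robots.values(), 2))
    if not pairs:
        return 0.0
    mean_similarity = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity
```

A season where every robot carries the same tags scores 0, and a season with no overlap at all scores 1, so seasons can be ranked from most similar to most diverse.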


Two reasons:

  • Teams who consider such information their “secret sauce” (even though it really shouldn’t be!) and thus would expend effort to falsify or disable it entirely, which makes the data useless for its primary purpose (informing the software developers to focus on certain things / make decisions)
  • Bandwagon effects/false equivalence: “Team X used Y, so we should too!” “Teams on Einstein used Y, so clearly that’s the best thing to use” Etc. Yes, some teams publish this info, but many do not, so see the previous point.

Are there such teams? I struggle to see how there’s much “secret sauce” that could be revealed from the data fields collected in usage reporting. The only instance of fake reporting that I can think of is a joke.

For anything hardware related this happens regardless. I’m not convinced that anonymizing usage data has any meaningful impact on it.

It could be helpful to see that team X uses certain items, so you could reach out to them if you had a problem with the same ones.


I do too. But the issue is that it’s all about perception. What you and I think doesn’t matter; it’s what individuals on teams choose to believe, and we don’t have a good way to find that out for all ~4000 teams. Considering the number of teams who keep their (even very simple) code private post-season even though there’s no secret sauce to protect… 🤷


Here’s the top 20, counting just those who have won specifically an award with “Imagery” in the title (i.e. I didn’t count “most photogenic”.)

And here is a table of the number of teams with each number of Imagery awards.

Code I used to generate this is here.
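
For anyone who wants to try something similar, here is a minimal sketch of the counting step. It is not the linked code, and it assumes you already have the award records as `(team, award_name)` pairs from whatever source you use.

```python
from collections import Counter


def imagery_counts(awards):
    """Count awards per team whose name contains "Imagery" (case-insensitive)."""
    return Counter(team for team, name in awards if "imagery" in name.lower())


def top_teams(counts, n=20):
    """The n teams with the most Imagery awards, highest first."""
    return counts.most_common(n)


def distribution(counts):
    """Map 'number of Imagery awards' to 'number of teams with that many'."""
    return dict(sorted(Counter(counts.values()).items()))
```

Matching on the award name keeps things like "most photogenic" out of the count, the same rule described above.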


Ayy I’m part of the 5 Imagery Awards Club!!


What EPA graphs sorted from highest to lowest look like per region, and the distribution of different levels of teams.
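
The data prep for that kind of graph could look roughly like this, assuming you have `(region, epa)` pairs per team from an export such as the Statbotics CSV. The input shape is an assumption, and the plotting itself is left out.

```python
from collections import defaultdict


def epa_curves(teams):
    """Group EPA values by region and sort each group from highest to lowest.

    `teams` is an iterable of (region, epa) pairs, one per team.
    """
    by_region = defaultdict(list)
    for region, epa in teams:
        by_region[region].append(epa)
    return {region: sorted(values, reverse=True) for region, values in by_region.items()}
```

Each region's sorted list can be plotted against rank (1st, 2nd, 3rd, ...) to compare how quickly EPA falls off from the top teams.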


Results of the post-event surveys (specifically event and season satisfaction) broken down by districts vs regionals.


To add on:

  1. What inspires people to volunteer the first time?
  2. What causes volunteers to continue for their second, third, etc. event?
  3. What causes volunteers to NOT return?

Would also love to see event/region feedback. It could be helpful for people who want to get involved or who might be able to help improve other people’s experience.