Predicting team startup growth

Every year a lot of people face the question “how many teams can we grow this year?” It is a complex question, but for fun let’s throw a little theory at it. If someone throws out an arbitrary number as a growth target, is the number pulled out of a hat or is it predictable?

Question: How fast can FIRST and VEX (plus others) grow? What are the probable limits on program growth?

The answer may lie in some mathematical theory that describes unconstrained social networks. The reason it is an “unconstrained social network” is because participation is essentially by behavioral choice, not government or institutional mandate. Even if participation is mandated, performance will still follow the curve.

In your own experience you know some team members are ‘higher producers’ or more participatory than other members. If you could quantify ‘participation or productivity’ and then plot productivity versus number of students you would get a chart that looks like the following.

http://www.kellrobotics.org/my_images/Long_tail.png

The same plot describes a million other things, such as company financial ranking versus number of companies (plot the S&P 500), Zipf’s law of city-town distribution, popularity of Girl Scout cookies versus recipe, soft drink sales versus variety, and all kinds of things in the physical and biological universe. The list goes on and on.

This is the basis of the “Pareto Principle” (also known as the 80-20 rule, the law of the vital few, and the principle of factor sparsity), and it can help you manage those things that really make a difference to your results.
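To make the long-tail shape concrete, here is a minimal sketch that assumes contribution falls off as a pure power law of rank. The function names, the exponent of 1, and the 100 contributors are illustrative assumptions, not measurements of any real team.

```python
def power_law_weights(n, exponent=1.0):
    """Weight of each rank 1..n under an idealized power law (assumed shape)."""
    return [1.0 / (rank ** exponent) for rank in range(1, n + 1)]


def top_share(weights, top_fraction=0.2):
    """Fraction of the total contributed by the top `top_fraction` of ranks."""
    ordered = sorted(weights, reverse=True)
    cutoff = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:cutoff]) / sum(ordered)


if __name__ == "__main__":
    weights = power_law_weights(100)  # hypothetical 100 contributors
    print(f"Top 20% of ranks contribute {top_share(weights):.0%} of the total")
```

With these assumed inputs the top fifth of ranks carries roughly 70 percent of the total, which is in the neighborhood of the 80-20 rule. A steeper exponent pushes the share higher.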

So for fun I’d like to throw around some numbers – quick estimation without pulling out a slide rule. Here in Georgia there are roughly 380 public high schools but for this exercise I’ll round it off to 400.

I’m guessing maybe 40 of them will become FRC teams.

Roughly halving the cost of the program maybe quadruples the 40 number so that produces 160 FTC/VRC teams.

One more round 4 * 160 = 640 FLL teams.

So you get something like 40/160/640 for an FRC / (FTC/VEX) / FLL spread. Is the model approximately correct? Does it correlate with other data? For example, you could look across the states in a region at the number of highly competitive football teams versus total football teams, or at BOA marching bands versus total marching bands.
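The back-of-envelope estimate above can be sketched in a few lines. The 10% FRC share and the multiplier of 4 are just the guesses from this post, and the function name is made up for illustration.

```python
def tier_estimate(high_schools, frc_share=0.10, multiplier=4):
    """Estimate team counts per tier from a school count.

    frc_share and multiplier are the post's rough guesses,
    not measured values.
    """
    frc = round(high_schools * frc_share)
    ftc_vex = frc * multiplier   # cheaper program, roughly 4x the teams
    fll = ftc_vex * multiplier   # one more round of the same multiplier
    return {"FRC": frc, "FTC/VEX": ftc_vex, "FLL": fll}


if __name__ == "__main__":
    print(tier_estimate(400))  # Georgia, rounded to 400 high schools
```

Keeping the share and multiplier as parameters makes it easy to try other guesses and see how sensitive the spread is to them.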

It is a neat exercise that might be useful as a predictor of the number of teams produced in a region. To use an economist’s terms, exceeding those boundaries would incur a higher marginal cost per team startup. I am NOT arguing that we should not pursue a team in every school, but that we should have a good grip on how we approach team growth.

Some interesting stats:

California 34M population
145 teams = 4.3 FRC teams / M
145 teams / 2079 schools = 0.07 FRC teams / school

Michigan 10M+ population
132 teams = 13.2 FRC teams / M
132 teams / 752 schools = 0.18 FRC teams / school

Georgia 9M+ population
28 teams = 3.1 FRC teams / M
28 teams / 356 schools = 0.08 FRC teams / school
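For anyone who wants to rerun these ratios with their own state, here is a small sketch. The figures are the ones quoted above (with “10M+” and “9M+” taken as 10 and 9 million, which is an assumption), and the names are made up for illustration.

```python
def team_density(teams, population_millions, schools):
    """Return (teams per million residents, teams per high school)."""
    return teams / population_millions, teams / schools


# (FRC teams, population in millions, high schools) as quoted in the post
STATES = {
    "California": (145, 34.0, 2079),
    "Michigan": (132, 10.0, 752),
    "Georgia": (28, 9.0, 356),
}

if __name__ == "__main__":
    for name, (teams, pop_m, schools) in STATES.items():
        per_million, per_school = team_density(teams, pop_m, schools)
        print(f"{name:<11} {per_million:5.1f} teams/M  {per_school:.2f} teams/school")
```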

Having said all that, it seems achieving > 50% participation of high schools in FRC/FTC/VEX is very doable in the not too distant future. Most of the growth (in raw numbers) would be in FTC/VEX.

For the truly mathematically inquisitive, you can start reading here:

http://en.wikipedia.org/wiki/Scale-free_network

http://en.wikipedia.org/wiki/Power_law

Ed

First off, I wonder if FIRST has done this type of analysis.

Second, thanks, this is really cool information. One question, instead of using the raw population might there be a benefit to using the population between 5 and 18, the people FIRST targets?

These numbers look like

California 9.365M population
145 teams = 15.48 FRC teams / M
145 teams / 2079 schools = 0.07 FRC teams / school

Michigan 2.39M population
132 teams = 55.2 FRC teams / M
132 teams / 752 schools = 0.18 FRC teams / school

Georgia 2.549M population
28 teams = 11.0 FRC teams / M
28 teams / 356 schools = 0.08 FRC teams / school

The population data was pulled from wolframalpha. I am assuming it is based on census records but haven’t really looked into it.

Again, thanks for bringing this up.

My impression … and it is only an impression is that the target numbers from FIRST are just numbers pulled out of the air, sort of the same way a sales manager at a car dealership creates numbers. But I could be completely and thoroughly wrong.

Probably irrelevant, assuming actuarial data across age groups are more or less consistent across the US. Meaning you could just as easily correlate to the number of 1 year old babies. What we need to look for are “comparative team/population ratios” between states/regions, not age groups. Methinks…


I am inclined to agree now that you mention it. But just to make sure I understand what you are saying (this sort of stuff is WAY out of my league; I can get information, but analyzing it is not my forte): the only way that calculating with only the specific age group would be relevant is if the age groups were heavily skewed in an area?

http://datacenter.kidscount.org/data/acrossstates/Rankings.aspx?ind=99

This site has the breakdown of “under 18 year olds” by state. Low: 21% in Maine (19% in DC). High: 31% in Utah.

That’s a 50% difference in that demographic by percentage, so I think it is worth taking into account.

(Granted, this number represents age 0 - 18, and I don’t know a lot of newborns who are participating in FIRST. But in 5 years they could be in Jr. Lego League).

EDIT: Here is a better set of numbers that has a finer breakdown by state. But you’d need to compute percentages yourself.

EDIT 2: I did the math. US average for 14-17 year olds (the majority of FRC participants - unfortunately 18 year olds were in the next bin) is 5.72% of population. Maximum: 7.30% in Utah. Minimum: 5.11% in Massachusetts (3.94% in DC).

We don’t have to be rocket scientists (we will leave that to Dave) to get our head around the basic principles and estimates of what we are doing. This is an exercise in estimation, not formal precision. We don’t run the risk of crashing a spacecraft, just not getting kids inspired.

Jared just blew a big hole in my actuarial assumption. We now have statistical proof that Utahns make more babies than Mainers. So I’m now sold on using more refined data, but we should limit it to 9 to 18 year olds and not worry about what is falling off those age fringes.

Actually just picking a single age range like 12 years old would probably work just as well as it should be proportionately the same and still would accurately explain differences between regions. Unless 11 year olds are fleeing Maine for Utah.

edit:

July 2010 projections, US Census, 14 to 17 year olds

California 2,152,803, 14 to 17 year old population
145 teams = 67 FRC teams / Million
145 teams / 2079 schools = 0.07 FRC teams / school

Michigan 575,071, 14 to 17 year old population
132 teams = 229 FRC teams / Million
132 teams / 752 schools = 0.18 FRC teams / school

Georgia 534,000, 14 to 17 year old population
28 teams = 52 FRC teams / Million
28 teams / 356 schools = 0.08 FRC teams / school

The state-to-state ratios (for these three) haven’t significantly changed from the beginning of the exercise, but the age breakdown does need to be taken into account when you pull in Maine and Utah.
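One way to check that claim is to express each state’s team density relative to Georgia under both denominators. This is only a sketch using the figures quoted in this thread (total populations rounded to 34, 10, and 9 million); the names are made up for illustration.

```python
# Team counts and populations as quoted in the thread.
TEAMS = {"California": 145, "Michigan": 132, "Georgia": 28}
TOTAL_POP = {"California": 34_000_000, "Michigan": 10_000_000, "Georgia": 9_000_000}
AGE_14_17 = {"California": 2_152_803, "Michigan": 575_071, "Georgia": 534_000}


def relative_density(teams, population, baseline="Georgia"):
    """Teams per capita for each state, divided by the baseline state's value."""
    base = teams[baseline] / population[baseline]
    return {state: (teams[state] / population[state]) / base for state in teams}


if __name__ == "__main__":
    by_total = relative_density(TEAMS, TOTAL_POP)
    by_teens = relative_density(TEAMS, AGE_14_17)
    for state in TEAMS:
        print(f"{state:<11} {by_total[state]:.2f}x (all ages)  "
              f"{by_teens[state]:.2f}x (ages 14-17)")
```

For these three states the relative standings move by well under ten percent when the denominator changes. States with an unusual age mix, such as Utah or Maine, would be expected to shift more.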


At the least it might be a decent metric for where teams have good potential to start. Here is some Upper Midwest (and surroundings) FRC data (FTC/VEX does not have a large presence in the Upper Midwest to my knowledge):

Number of high schools taken from here and population of 14-17 year olds from 2010 census estimates (here) (which should be the same metrics as Ed’s).

North Dakota: 5 teams, 177 high schools, 36,059 people
138.7 teams/M people
0.028 teams/high school

South Dakota: 0 teams, 198 high schools, 42,612 people
0 teams/M people
0 teams/high school

Nebraska: 0 teams, 328 high schools, 96,724 people
0 teams/M people
0 teams/high school

Missouri: 52 teams, 615 high schools, 316,255 people
164.4 teams/M people
0.085 teams/high school

Iowa: 3 teams, 391 high schools, 163,257 people
18.4 teams/M people
0.008 teams/high school

Minnesota: 81 teams, 756 high schools, 284,312 people
284.9 teams/M people
0.107 teams/high school

Wisconsin: 29 teams, 578 high schools, 298,927 people
97.0 teams/M people
0.050 teams/high school

Illinois: 46 teams, 869 high schools, 703,099 people
65.4 teams/M people
0.053 teams/high school

Indiana: 30 teams, 384 high schools, 364,770 people
82.2 teams/M people
0.078 teams/high school

Disclaimer: I am a little suspect of the high school count for Minnesota as it seems quite high.

Observations/questions:
Are areas without cities of 100K (rough approximation) really going to have difficulty with FRC teams?
Obviously, looking at Nebraska, Iowa, and South Dakota one might think this is the case. I happen to know that all three teams in Iowa are near larger cities (or universities). At the same time though the largest city in North Dakota that has a team is Minot which is under 40K people.
Off the top of my head, I can think of a few cities in NE, IA, and SD that would be able to have teams from a population standpoint: Lincoln, NE; Omaha/Council Bluffs, NE/IA; Des Moines, IA; Sioux City, IA; Sioux Falls, SD; and maybe Rapid City, SD (the SD School of Mines is there). The question then becomes the lack of other teams in the area for each of them and the longer drives to regionals (probably around 4 hours for all of them to the closest regional).
For schools outside of these larger cities, the challenges would seem to go beyond that to major difficulties in fund raising (yes, I know it is a challenge for most teams, but it would seem to be even more the case when the number of potential sources of funds is smaller).
In addition, there is the difficulty of people. Example: my mother and father went to high schools of around 20-30 people in each graduating class in rural South Dakota (they played 9-man football). Everyone was in the major sports because there was one coach. Now, this was close to thirty years ago, but numbers haven’t changed that much in those areas. I would be interested to hear from teams that have less than 50 people in their school to understand how it works. I already know that small numbers of people on the team can be difficult as I mentor a team with 6 students (it is 4-H and mostly home schooled students so the pressures are different).
In addition to the lack of students, can teams in those regions easily overcome a shortage of mentors? I know a lot of schools have cut industrial technology when faced with budget cuts. Again, I would be interested in knowing from those teams that are not near a larger city to draw people from industry (and yes, I do realize that farmers often have a lot of technical know-how even if not trained in an educational setting beyond high school industrial technology and on-the-job with a parent or neighbor, so that might be the solution).

In this context, the growth in Minnesota over the last few years is all the more amazing as I would guess it is one of the highest teams/M ratios around.

For SC,

The Minnesota data looks correct. It is geographically a really big state. Just a lot of smaller schools.

South Carolina: 23 teams, 256 high schools, 239,930 people
96 teams/M people
0.09 teams/high school

SC used to be quite a backwater when I was little but has come a long way. Twice the team density of GA. They have several small towns, including one that is maybe 10,000, but that is nothing like the kind of small they have out on the Great Plains with 9-man football. The main point about the SC thing is I know of several teams in towns that have only a few thousand people.

The small town issue brings up another really important point. Let’s go back to the graph again.

http://kellrobotics.org/my_images/Long_tail.png

For this exercise the X-axis is “rank of city” in size. So on the far left in the 1st position would be NYC, then LA, Chicago, etc. On the far right are the towns with the 9-man football teams.

The Y-axis is the population. Integrate under the curve and you have the population.

So the first few cities, the ones in brown, easily have 1/4 of the total country population. The people in the yellow towns are in ‘fly-over’ country. It used to be that the yellow towns had a LOT of difficulty in producing/delivering opportunity to their residents.

What has changed in the past 15 years is the cost of communicating has fallen through the floor. The internet is here. The world has gotten flattened. You can mentor through the net. NO, it isn’t nearly as good as personal one-on-one mentoring, but it vastly improves the ability of the on-site mentors and students to learn from those that know the ropes.

One of the next challenges is to develop really good online e-learning resources and place them in a repository so the yellow country folks can get connected. (Yellow country is really a metaphor for anyone isolated from resources: urban, suburban, or rural.) BTW the new FIRST site does a much better job of cataloging ‘best practices’.

We no longer have to give up the value that the yellow country provides. We have a way to get there.

Changing discussion slightly:

Let’s say the Y-axis was dollars spent per team. Then the X-axis would simply be ranking, 1 to n, just like in the previous example for population vs. ranking.

We will probably see something like a 3 region chart. The leftmost region dominated by FRC, central region by FTC/VEX, and the long tail by FLL. The area under the curve would be total dollars spent. Long term I’d bet the area under the curve for FTC/VEX will exceed all other categories even though FRC has the big impressive peak on the left.

And the people that do this theoretical stuff will tell you the curve compresses to the left, the top taller, the tail longer, the curve more L shaped as more and more participants join at all levels.

Ed

2010 census estimate data - 14 to 17 year old population group
http://www.census.gov/population/projections/SummaryTabB1.xls

high school information
http://schooltree.org/high/

2009 Lunacy team data
http://usfirst.org/whatsgoingon.aspx

The reason it looks suspect in my mind is that Michigan is of similar size area-wise with 2x (roughly) the number of 14-17 year olds yet there are a few more schools in Minnesota. Now, I wouldn’t expect the number of schools in Michigan to be double that of Minnesota’s, but at least a fair amount more.

If they’re in close proximity, you build a bigger school. Less administration to go through, more teachers and students.

Here is a rough model in this spreadsheet

How it works.
Every tenth high school gets an FRC team.
Four times that number get an FTC/VEX team.
There are eight times as many occurrences of FLL teams.

Slightly over half of ALL high schools will have one of FRC/FTC/VRC and that is fed by over 33,392 FLL teams.

The model is domestic US only.

Color coding scheme

. green - exceeds targets
. yellow - approaching exceeding target
. copper - work in progress
. red - inhales vigorously, houston we have a problem
. the copper/red cutoff point is my arbitrary number
. blue - aggregate data
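Here is a rough sketch of how the per-state FRC target and color band could be computed. Rounding the target up is an assumption, and the yellow and copper cutoffs below are hypothetical placeholders, since the spreadsheet’s actual thresholds are not stated here. The FLL multiplier is left out for the same reason.

```python
import math


def frc_target(high_schools):
    """Every tenth high school gets an FRC team (rounding up is an assumption)."""
    return math.ceil(high_schools / 10)


def status(current_teams, target, yellow=0.75, copper=0.25):
    """Color band for progress toward the target.

    The yellow and copper cutoffs are hypothetical placeholders; the post
    does not state the spreadsheet's actual thresholds.
    """
    progress = current_teams / target
    if progress >= 1.0:
        return "green"
    if progress >= yellow:
        return "yellow"
    if progress >= copper:
        return "copper"
    return "red"


# (current FRC teams, high schools) as quoted earlier in the thread
STATES = {"Michigan": (132, 752), "Georgia": (28, 356), "Iowa": (3, 391)}

if __name__ == "__main__":
    for name, (teams, schools) in STATES.items():
        target = frc_target(schools)
        print(f"{name:<9} {teams:>3}/{target:<3} FRC  {status(teams, target)}"
              f"  (FTC/VEX target {4 * target})")
```

Under these assumed cutoffs Michigan lands in green, Georgia in yellow, and Iowa in red. Moving the cutoffs changes the colors but not the underlying gap to target.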

Feel free to blow this thing up.

Enjoy,

Ed Barker

http://www.kellrobotics.org/my_images/Long_tail2.png

2010 census estimate data - 14 to 17 year old population group
http://www.census.gov/population/projections/SummaryTabB1.xls

high school information
http://schooltree.org/high/

2009 Lunacy team data provides the team stats
http://usfirst.org/whatsgoingon.aspx

http://en.wikipedia.org/wiki/Scale-free_network

http://en.wikipedia.org/wiki/Power_law


Not exactly. The evidence is that there are more people of that age in UT, but not necessarily that they were born there or born of residents of UT. It might be a reasonable assumption, but it’s not proof.

But, that’s just a spurious quibble, and doesn’t invalidate anything here.

Fascinating topic Ed, and from what little I know about the subject, it seems to be valid. Sure, the “real” curve might look slightly different from the idealized one you show, but the concept behind it doesn’t get changed.

I, too, wonder if FIRST has done this kind of analysis. My gut reaction is that yes, they have (but, I admit, that’s speculation because I have not asked). Remember, we’re not talking about government politicians, these are people who made their money by being smart. I bet that Woodie or Dean could write a white paper on this. (I mean, if I know a little about it…)

What jumped out at me from this is it can be used as a means of deciding where to focus efforts towards recruiting new teams.

I’d also be interested in an analysis of why some areas are so very different from others.

And, is a state the proper geographic delineation? I mean, what’s the difference between Bergen County NJ and Rockland County NY? (look on a map). Montvale and Chestnut Ridge are extraordinarily similar in most every way, except there happens to be a state line between them.

More questions than answers, I know.

It might be a reasonable assumption, but it’s not proof.

very true, I just made the comment trying to be silly…

Sure, the “real” curve might look slightly different from the idealized one you show, but the concept behind it doesn’t get changed

The thing that got us thinking was this TED lecture.
It was a subject we were thinking about when we were trying to develop FIRSTsteps: communication, coordination, collaboration. There are all sorts of natural phenomena that use this math.

I, too, wonder if FIRST has done this kind of analysis?

Maybe yes, maybe no. If not, it has probably been in the back of a lot of people’s minds but never really put on paper. I say that simply because of the effort needed to drill out the data. And FIRST, being a really busy place, may not have had time. Who knows.

What jumped out at me from this is it can be used as a means of deciding where to focus efforts towards recruiting new teams.

Very true and I think there are multiple things being said.

a) where to focus resources for team startup

b) resolving the ‘unpleasantness’ between FRC / FTC / VRC. There is a relationship between all groups. Movement in one group causes movements in all groups. Ask Coke and Pepsi. Advertise more Coke, and both Coke and Pepsi sales increase.

c) identifying a suitable metric of ‘success’. If you doubled the target numbers on the spreadsheet you would have a team in every school.

d) It helps with the problem of trying to create some realistic budget numbers for policy makers.

I’d also be interested in an analysis of why some areas are so very different from others.

Speculating yet again, I’m guessing it’s a history lesson, something along the lines of what might be described by James Burke’s Connections - history interacting as a web of social relationships.
Dean and Woodie and the rest knew someone that went home and started some teams and it organically grew based on personal relationships and professional networking. Folks keep passing it forward. Kinda like how some Georgia folks are reaching into Tennessee right now.

And, is a state the proper geographic delineation?

It is simply a way to perform assessment. If the analysis is good enough for us all to do our job then it probably doesn’t matter. It is mostly driven by the ease of getting institutionally produced reliable data.

Ed

Would it be feasible to calculate the results on a county-by-county basis?

Some larger states may have a large metropolitan area which “carries” the state, while leaving a lot of suburban or rural areas under-served. By using county lines as the boundary for analysis, we can more easily find under-served regions without worrying about a metropolitan area carrying a state. Except for a small number of extraordinarily large counties in the Rockies, having one FRC team per county wouldn’t be too far of a commute for its members.

Sorry for the bump, but the spreadsheet is updated into a consolidated document here

Most of the numbers I’ve been reading (and I’ll admit that I skipped a few) are on a “one team per high school” basis. My team is composed of two schools. And I know another team composed of four or five, and another composed of two. Moral of the story? FIRST may also expand in how many schools pool together to create a team (which I think would be better), not just in the number of teams…

Only if there is existing data at that level, otherwise gathering it would be too burdensome.

But, in general, I think by county might be better for some regions. But, as Ed mentioned, state is probably fine at this point.

With the exception of perhaps New Jersey, population density at a certain level is a fractal problem. Take a look at where people live in the following states:

Washington
New York
California
Georgia
Minnesota
(and I just grabbed these off the top of my memory stack)

All of them have populations of five million or more, but within the state the population is clustered in one (or two) major urban areas with lots of “flyover” towns and cities in the rest of the state. Even within counties, density is not level, with town vs. suburbs or rural areas occurring in many places. At every level down to the neighborhood, density will be uneven.

Counties might be a more useful geographical entity than states, but I’m guessing that you will ultimately find that telephone area codes will do a better job of approximating population distribution. (County population densities are wildly disparate – Texas, for example, has 24 million people and 254 counties, while California has 36 million people and 58 counties.) As the phone companies add area codes to handle mobile devices, they are becoming a better approximation of population distribution than other political boundaries. This list of area codes by state almost looks like a histogram: http://www.allareacodes.com/area_code_listings_by_state.htm . Of course, in some places, like Southern California, a single geographical space now may have multiple area codes, but these things get complex. :)

You might also want to look into Metropolitan Statistical Areas for analyzing the US. These are standardized areas used in urban analytics and planning. Google on metropolitan statistical areas and have fun wallowing in data!

Anyway, there are so few middle and high schools in North America with robotics STEM programs, that our collective challenge is to pursue either the easiest targets first (schools which indicate an interest) or those for whom volunteers have a special passion (never underestimate the power of someone with a dream). Trying to drive STEM education into the schools/school districts that aren’t interested is something to be done when the “interested but aren’t sure how” schools are being served. You utes might find this model interesting reading: Diffusion of innovations - Wikipedia. We are still in the Innovators and early Early Adopter phase of robotics in schools, and not yet at the level of mass adoption.
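As a rough illustration of where a region sits on that diffusion curve, here is a sketch that maps an adoption share onto Rogers’ standard adopter categories (2.5%, 13.5%, 34%, 34%, 16%). Treating FRC teams per high school as the adoption share is a crude assumption: it ignores FTC/VEX teams and teams that draw from several schools. The names are made up for illustration.

```python
# Cumulative upper bounds of Rogers' adopter categories.
ADOPTER_BANDS = [
    (0.025, "Innovators"),
    (0.16, "Early Adopters"),
    (0.50, "Early Majority"),
    (0.84, "Late Majority"),
    (1.00, "Laggards"),
]


def adoption_phase(adopters, population):
    """Name the diffusion phase reached when `adopters` of `population` have adopted."""
    share = adopters / population
    for upper_bound, label in ADOPTER_BANDS:
        if share <= upper_bound:
            return label
    return "Laggards"  # more adopters than the stated population


if __name__ == "__main__":
    # (state, FRC teams, high schools) as quoted earlier in the thread
    for state, teams, schools in [("Iowa", 3, 391), ("Georgia", 28, 356), ("Michigan", 132, 752)]:
        print(state, adoption_phase(teams, schools))
```

By this crude proxy Iowa is still in the Innovators band and Georgia is among the Early Adopters, while Michigan has just crossed into the Early Majority.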

I believe Minnesota had a huge growth primarily due to a government / education system push to get FIRST into schools. That is the big reason for the “double regional” in Minneapolis (actually 2 events the same weekend).

Yes and no.
Yes in that the University of Minnesota has turned into a huge supporter of the program, particularly for teams outside of the Twin Cities and other major cities. In addition, there has always (as far as I remember in my years there) been a major push to be one of the top states for education, part of which has been STEM.
A lot more of it has to do with some individuals who have made contacts with the corporations in the area. If you look at the list of teams and their sponsors, you see a number of corporations that are sponsoring multiple teams and a few of those that have 5+ teams (I don’t want to accidentally leave out any of these companies, so I will not list any of them). I can only think of a few teams without a corporate sponsor as their major sponsor.
I do not know the entire story (/plug - if you come to the MN regional this year, I am sure you can talk to a few of the folks who have made it happen - /plug), but my understanding is that a few people got some of the corporations and the word got passed along to the other corporations in the area with the persuasion of the original individuals and year over year success.