Bias in Chief Delphi polling seems to be an increasingly hot topic on this site. It has long been orthodoxy that polls here are skewed towards more experienced teams, but in the past week there’s been a quest to prove it. Spurred by this post, Caleb Sykes started this thread to confirm whether the bias was actually there. That thread in turn spurred Rangel (kf7fdb) to start this thread. In my view, both of these polls largely confirmed that Chief Delphi is indeed biased towards more experienced teams.
Now, I want to find out something else about bias. I’ve long had a hunch that Chief Delphi polls are biased towards North American teams, but I want to see if this is true. To that end, I thought of this simple poll.
There’s a reason this matters. If polls are biased towards North American teams, that has profound implications for strategy and polling accuracy, particularly once you reach Nationals, where teams come from all over the world and any poll result becomes less representative.
Thanks for Responding!
Update: I’ve checked the FIRST website’s Team & Event Search and found that the list that I used was still up to date. Therefore, I removed the “other” option from the poll.
Just a reminder there is a poll option in CD for adding polls to threads. I can see using an external poll site for a multi-question poll that you want to do additional stats on, but a simple “where are you” poll could have been done in CD and we could all see the results without going off to Googleland…
I’m not sure what the sudden interest in CD demographics is. It changes from year to year as roboteers, parents and mentors move through the FIRST cycle. There is a core of mentors with a collective 300 centuries of robot construction info that is yours for the asking. The rest of CD is like the billions of stars above you: some shine brighter than others, some last longer than others, but like gravity, each has a slight pull on the others, taking all of us towards greatness.
Polls here should be for fun; trying to track a trend that has a real basis is almost impossible. But good luck on yours. For the record, I picked bacon.
Teams NOT from North America tend to land in a few categories…
–Europe (maybe half a dozen)
–Israel (See the list for the Israel Regional–about 50-60)
–Australia (See similar list–only fewer)
–South America (about another dozen)
–China (Not sure offhand, but give-or-take 30 as a guess)
So that’s 150 or so teams that aren’t from North America. You might sneak it up to 200 by adding Hawaii.
That leaves about 3000 teams from North America.
The VAST MAJORITY of FRC teams (by a factor of a lot) are from North America. If CD is anything like representative of the rest of FRC, the results are going to be overwhelmingly skewed towards North America.
I’ve closed the poll. At closing time, the poll had 119 responses. This is a small pool, and I wanted more responses to work with, but I’ll just have to run with what I have.
Per usual, I am supplying a link to the spreadsheet with the complete set of responses, just in case anyone wants to do any additional number crunching.
If we use EricH’s estimates, we would expect 3000/3150 = 95% of all responses to come from North America. Instead, the poll showed 85% of all responses originating in North America.
I don’t care to dredge up enough memories of my statistics courses to run the confidence interval on a data sample of this size (that is left as an exercise for the reader), but if anything I’d say that these stats seem to show that CD is skewed towards international teams.
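For anyone who wants to take up that exercise, here is a rough sketch of one way to check it: an exact binomial tail probability. It assumes the respondents behave like a random sample, uses EricH's 3000/3150 estimate as the expected North American share, and assumes that "85%" of 119 responses means 101 respondents (the count is my rounding, not a figure from the spreadsheet).

```python
from math import comb


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n = 119                       # poll responses at closing time
p_expected = 3000 / 3150      # EricH's rough North American share (~95%)
k_observed = round(0.85 * n)  # ASSUMPTION: 85% of 119 rounds to 101 respondents

# Chance of seeing this few (or fewer) North American responses
# if the true share really were p_expected.
p_tail = binom_cdf(k_observed, n, p_expected)
print(f"P(X <= {k_observed}) = {p_tail:.2e}")
```

A very small tail probability would mean the gap between 95% and 85% is hard to put down to chance alone, though only if the random-sample assumption holds.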
You’re not wrong here, and you raise a good point. I do wish I had a larger sample size, but responses had pretty much stopped coming in by the time yesterday came around, so I had to run with these results. Also, when I said that EricH’s prediction had been “correct,” I meant that the general point of his prediction, that results would be skewed towards the US, was correct. I should have specified this in the original post.
And I still think that point pans out. While the US did underperform, as you noted, it still took a much larger chunk of the pie than anyone else did. 79.8% of the responses were from the United States. Even if that is much less than 95%, the number EricH predicted, it still blows out the second biggest groups of respondents, Canada and Israel, both with 5%.
If the US’s 79.8% was much less than expected, I think this can be attributed to the relatively small sample size. If I had more responses, I think the results would be much closer to 95%.
First off, we have to assume that the responses were a random sample of ChiefDelphi users. In reality, they almost certainly weren’t. This was a voluntary response survey, and those are usually terrible at imitating randomness (for example, maybe non-US members would be more inclined to answer because they want to represent their country). But for the sake of calculations, we’ll pretend that it was.
The FRC 2015 season facts flyer says that 87.2% of FRC teams last year were in the US. To use this as our percentage of ChiefDelphi members who theoretically would live in the US, we need to assume that there is no difference in the average team size between the US and the rest of the world, all ChiefDelphi members are FRC members :rolleyes:, and the percentage of teams in the US hasn’t changed since last year.
Assuming all of the above are true or negligible: if ChiefDelphi members’ nationalities represented FIRST as a whole and the true proportion of ChiefDelphi members who are from the US is 87.2%, then the likelihood that 79.8% or less of 119 respondents would have been from the US is only 0.8%. This is very strong evidence that ChiefDelphi members are disproportionately from outside of the US.
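For anyone who wants to check that figure, here is a minimal sketch using a one-sided, one-proportion z-test (normal approximation, no continuity correction). The choice of test is my assumption about how the number was obtained, and the count of 95 US respondents is inferred from 79.8% of 119 rather than read from the spreadsheet.

```python
from math import sqrt
from statistics import NormalDist

n = 119        # poll responses
us_count = 95  # ASSUMPTION: inferred from 79.8% of 119
p0 = 0.872     # US share of FRC teams per the 2015 season facts flyer

p_hat = us_count / n               # observed US share
se_null = sqrt(p0 * (1 - p0) / n)  # standard error if p0 were the true share
z = (p_hat - p0) / se_null         # how many standard errors below p0 we landed
p_value = NormalDist().cdf(z)      # one-sided: P(share <= observed | p0)

print(f"z = {z:.2f}, one-sided p-value = {p_value:.1%}")
```

Under those assumptions the p-value comes out near 0.8%. An exact binomial calculation would give a somewhat different number, so treat the third digit loosely.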
Again assuming all of the above stipulations, we can be 95% confident that the true proportion of ChiefDelphi members who are from the US is between 72.6% and 87.0%. Whether this is a large enough difference to matter in strategy discussions, I can’t say.
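The interval can be reproduced with a standard normal-approximation (Wald) confidence interval. This is a sketch, assuming that is the method behind the figures and again taking 95 US respondents as inferred from 79.8% of 119.

```python
from math import sqrt
from statistics import NormalDist

n = 119        # poll responses
us_count = 95  # ASSUMPTION: inferred from 79.8% of 119

p_hat = us_count / n                              # observed US share
z_star = NormalDist().inv_cdf(0.975)              # about 1.96 for 95% confidence
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)   # margin of error

ci_low, ci_high = p_hat - margin, p_hat + margin
print(f"95% CI for the US share: {ci_low:.1%} to {ci_high:.1%}")
```

Note that the 87.2% figure from the flyer sits just outside the upper end of this interval, which lines up with the test result being significant at the 5% level.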
I’m starting to question my initial readings of the results, but I’m still not totally sold on this yet. Don’t all of those assumptions sound like pretty big assumptions to make?
Of course, that’s why I pointed them out; I wanted anybody reading to take the results with a grain of salt. Unfortunately, these kinds of problems are much harder to fix than the sample size, which was actually more than large enough.