This is an interactive viz that shows the average travel distance for FRC teams to every county in the US. I thought it would be interesting to see where the championship would be if the only consideration was average travel distance. Distances were calculated as the crow flies for simplicity. I also looked at just USA teams, just the Detroit and Houston teams, and just the international teams.
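For anyone who wants to reproduce the idea, here is a minimal sketch of the crow-flies averaging, assuming team and site locations are plain latitude/longitude pairs. The function names, the sample coordinates, and the two candidate sites are made up for illustration; this is not the workbook behind the viz.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles


def haversine_mi(lat1, lon1, lat2, lon2):
    """Great-circle ("as the crow flies") distance in miles; inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def average_distance_mi(teams, site):
    """Mean great-circle distance from every team to one candidate site."""
    total = sum(haversine_mi(lat, lon, site[0], site[1]) for lat, lon in teams)
    return total / len(teams)


def best_site(teams, sites):
    """Name of the candidate site with the lowest average travel distance."""
    return min(sites, key=lambda name: average_distance_mi(teams, sites[name]))


# Hypothetical data: three Midwest teams and two candidate sites
# (approximate coordinates, for illustration only).
sample_teams = [(42.3, -83.0), (42.9, -85.7), (41.9, -87.6)]
sample_sites = {"Detroit": (42.33, -83.05), "Houston": (29.76, -95.37)}

print(best_site(sample_teams, sample_sites))
```

Because each team is one entry in the list, a county with many teams pulls the result harder than a county with none, which is the weighting the map uses.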
What about if you limited it just to teams who attended champs last year? Maybe it would make Michigan teams pull things a little less (though on second thought maybe not?)
I added three new sheets that do as you suggested; it didn’t make much of a notable difference.
Correct, so for all the teams who go to the southern championship, they are traveling an average of 2036 mi (Harris County TX) as opposed to the minimum average distance possible which is somewhere in Colorado (~1900 mi).
Interesting that the South Champs has such a small automatic mileage spread on the color scale. It looks like there is a pretty wide swath of the West and South that can be chosen for South Champs with very minimal travel mileage impact.
Haha. Me too.
I guess my conceptual issue was I was thinking of distance from counties, with counties being weighted evenly… not by teams that actually exist in those counties. Explains why Alaska didn’t pull South Champs into Oregon.
Yikes, what happened to Oglala Lakota County (Shannon County) in SD? Were county seats used as an endpoint to calculate distance? This is the only explanation I can think of (Oglala Lakota County is the only county in the US (that I know of) to have its county seat outside of the county).
Is it possible to recalculate the distances as Manhattan distance? Then average the Euclidean and Manhattan? I have a feeling that will give a much closer representation of real-world travel distances. (Whether or not it is actually useful is another matter entirely.)
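A rough sketch of what that blend could look like, assuming "Manhattan distance" on a globe means a north-south leg along a meridian plus an east-west leg along the parallel at the mean latitude. That definition, and all function names here, are assumptions for illustration rather than a standard formula.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles


def haversine_mi(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles; inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def manhattan_mi(lat1, lon1, lat2, lon2):
    """Approximate grid-style distance: north-south leg plus east-west leg.

    Assumption: the east-west leg is measured along the parallel at the
    mean latitude of the two points.
    """
    dlat = abs(lat2 - lat1)
    dlon = abs(lon2 - lon1) % 360
    if dlon > 180:  # take the short way around the globe
        dlon = 360 - dlon
    mean_lat = math.radians((lat1 + lat2) / 2)
    north_south = EARTH_RADIUS_MI * math.radians(dlat)
    east_west = EARTH_RADIUS_MI * math.cos(mean_lat) * math.radians(dlon)
    return north_south + east_west


def blended_mi(lat1, lon1, lat2, lon2):
    """Simple average of the great-circle and Manhattan-style distances."""
    return (haversine_mi(lat1, lon1, lat2, lon2)
            + manhattan_mi(lat1, lon1, lat2, lon2)) / 2
```

For two points due north-south of each other the two measures agree, so the blend only changes the result for diagonal trips, where it lands between the straight-line and grid-style figures.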
If I had the free time I would dive into this: https://stackoverflow.com/questions/17267807/python-google-maps-driving-time Select maybe 2 dozen candidate cities vs the team subset and let it go to town (no pun intended). Pick the city with the lowest residuals, boom. (It might hiccup where no road network is present, e.g. Hawaii.)
Also a quick cartographic tip to the OP (not trying to be mean): since you are dealing with distance rather than a variance (i.e. positive-only values), you should be using a monochromatic color scheme. The color scheme you are currently using is really only commonly applicable to +/- situations, such as standard deviation. The point I am trying to make is that the eye is drawn to the ring of low-saturation hues, and not to the point of your map: the geographic center of a distribution.
While the map is easy enough to read, the color scheme adds complexity where there doesn’t need to be any.
From a realistic standpoint of actual travel distance, proximity to a major international airport is key, ESPECIALLY for south champs. By major, I mean one where international flights come in from many directions.
Yes, it looks like the South Champs could be in California just as easily as Houston with little impact on travel distance. Los Angeles is 2,041 miles versus 2,028 for Harris County.
The endpoints for the counties were the coordinates to the approximate center of the county, taken from here. But I’m not really sure why Oglala county is not showing up.
This map would definitely be more interesting if each distance was calculated using the Google Maps API, and I looked into it, but it was a bit over my head. Someone else can give it a shot though! Thanks for the note about the color scheme, that makes a lot of sense. The single color gradient looks much better.
So the problem with Google is that you end up paying once you go over n API requests per day. It’s not stupid expensive, but I still don’t have $100 to test all teams vs 24 unique cities (even if I used a geographically weighted subset of a few hundred teams). The key thing is travel duration. This might be possible through OpenStreetMap as well. I can toss the kinda-sorta-working Python script on GitHub if anyone is interested.
I might also be able to use ArcMap’s network features. But the Google approach seemed way easier at the time (until I found out about the cap).
Another interesting approach may be: instead of modeling distance as a linear relationship, model it as an nth-degree polynomial. This way you could fit drive-time cost based on the time of day you leave (i.e. a preference for not arriving at a hotel at 5am). I can elaborate more if anyone is interested.