# [FVC]: Analysis Shows Improvement Possible in Ranking System

Previously, I’ve mentioned that I thought grouping teams in lots of four and then having them play three matches against each other (in each possible combination of the pairings) would speed things up and might also provide a better selection mechanism. Well, curiosity got the better of me, so I put together a Monte Carlo simulation comparing that scenario with a series of random assignments.

The quick answer for those who do not wish to read further is that grouping appears to improve the selection process. A more notable finding, however, is that the current system of ranking is a fairly poor predictor compared to other methods. In fact, if other methods were used, half as many matches could be run while still achieving more accurate rankings.

I have attached a PDF file that summarizes the simulation. If people are interested, I will roll up all the spreadsheets and post those as well.

For those who are curious, I will attempt to briefly explain the method used to arrive at these conclusions. But first let me add that I am not a statistician, nor do I pretend to play one on TV. So be gentle.

That said, I was trying to answer a Boolean question: Is one approach better than the other? I knew it would be helpful to quantify the difference, but that was not my goal. Therefore, you might want to check my underlying assumptions before taking the numerical results as gospel.

In any case, it seems to me that we could line up 100 robots in a room, and argue for hours about the merits of each and probably not agree on the exact sequential ranking of the group. We could, however, probably agree that a ranking does exist; we just do not have the power to easily determine it. Fortunately, for this simulation, that is all we need and, therefore, I assigned each robot a random “seed” value, much as they do in sports such as tennis. If a simulation is using 96 robots, each one has been assigned a unique seed value from 1 to 96.

A higher seeded robot (one with a lower seed value) should win more often than a lower seeded robot. To simulate this behavior, I established a baseline score. The highest seeded robot received the full score while the lowest received none; those in between were prorated.

I made the assumption that the scores of an individual robot would follow a normal distribution with a standard deviation of 20 percent. In other words, if a robot played 100 matches and its average score was 50, in about 68 of the matches its score would have been between 40 and 60.

For each match, I added or subtracted a randomly determined amount (drawn from the normal distribution described above for the upper baseline score) to each robot’s individual baseline score (as determined by its seed value).

This was done for each of the robots on a team, and their total score was matched against the total score of the opposing pair of robots.
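The match model described above can be sketched in a few lines of Python. This is my own reading of the description, not Bill's actual spreadsheet: the names (`baseline`, `play_match`), the 0-to-100 score range, and the choice to scale the 20% deviation to the top baseline score are all assumptions (the text is ambiguous about whether the deviation scales with each robot's own baseline or the top one).

```python
import random

# Sketch of the simulation's match model (assumed parameters).
N_ROBOTS = 96
TOP_SCORE = 100.0            # assumed score for the top seed
SIGMA = 0.20 * TOP_SCORE     # 20% deviation, scaled to the upper baseline

def baseline(seed):
    """Prorated baseline: seed 1 gets TOP_SCORE, seed N_ROBOTS gets 0."""
    return TOP_SCORE * (N_ROBOTS - seed) / (N_ROBOTS - 1)

def match_score(seed):
    """One robot's score in a single match: baseline plus normal noise."""
    return baseline(seed) + random.gauss(0.0, SIGMA)

def play_match(red_seeds, blue_seeds):
    """Sum each 2-robot alliance's scores; higher total wins."""
    red = sum(match_score(s) for s in red_seeds)
    blue = sum(match_score(s) for s in blue_seeds)
    return red, blue

red, blue = play_match((1, 48), (24, 96))
```

Running thousands of tournaments with a function like `play_match` and tallying wins, losses, and points is all the Monte Carlo machinery requires.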

The result was a table of wins and losses for the robots. I also tabulated the total points scored and, because someone had suggested it, a delta score: the difference between the winning and losing scores is added to each winner’s running total and subtracted from each loser’s.

At this point, all that was left to do was to compare the final ranking with the original seed value and determine the difference. A comparison of the average difference and standard deviation would determine which assignment method was preferred.
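The comparison step could look something like the following sketch (the helper name `rank_error` and the averaging choice are mine; the summary PDF may use a different statistic): sort teams by their tournament metric, then measure how far each team's final rank landed from its true seed.

```python
def rank_error(seeds, metric):
    """Mean absolute difference between true seed and metric-derived rank.

    seeds  -- list of true seed values (1 = best)
    metric -- parallel list of tournament scores (higher = better)
    """
    # Rank teams by the metric, best first.
    order = sorted(range(len(seeds)), key=lambda i: -metric[i])
    errors = []
    for rank, i in enumerate(order, start=1):
        errors.append(abs(seeds[i] - rank))
    return sum(errors) / len(errors)

# A perfect tournament metric reproduces the seeding exactly:
perfect = rank_error([1, 2, 3, 4], [40.0, 30.0, 20.0, 10.0])  # -> 0.0
```

Averaging this error (and its standard deviation) over many simulated tournaments is what lets the two assignment methods be compared head-to-head.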

After running the simulation 100 times, it appears that using the group assignment method results in more accurate rankings—particularly if the win/loss method of scoring is used.

Now for the interesting part: It quickly became apparent that the total points scored by a team was a much better predictor than the win/loss record. In an effort to simulate the current scoring system, I added another measure which ranked the robots by wins and losses and then by total points. This helped somewhat, but it is still nowhere near the ability of the total points method.

In fact, you can run fewer matches and still have a better result using the total point method of scoring. To get a feeling for how many matches would be necessary, I ran simulations for 3, 6 and 9 matches in tournaments having 32, 64 and 96 robots.

Ok, fire away.

FVC Simulation Summary 06-03-07.pdf (20.1 KB)


Bravo!

Bill,

With ManicMechanic, I have been working on a match scheduling program that will use some heuristics to produce (and display/print in attractive formats) match schedules for N teams each playing M matches.

Assuming nothing knocks the train off the tracks, and no one finds a fatal flaw in your analysis, I should be able to fairly easily include an option that creates match schedules of the ilk you describe. Assuming MM & I succeed, local leagues and FVC APs will have a tool that will help them employ the approach you describe.

Blake
PS: There might even be value in combining the two efforts (my non-random match schedules fed into your simulations so that the simulations don’t have to use random pairings in the match assignments).

Very cool!

This is no surprise. At the NorCal Championship, I felt bad for an above-average (IMO) team that had a winless record. Unlike several higher-ranking teams that had multiple scoreless matches, they never scored fewer than 25 points. Using the point system would have given a more representative idea of their abilities.

However, I’m not sure how well a ranking system based primarily on points (rather than W-L-T) would go over. People are used to the sports paradigm, where a win is a win, no matter how squeakily it was obtained. Also, defensive play would go out the window.

The saving grace of a limited system has been alliance selection. The above-average team described was selected in the first draft, as they should have been.

Let me preface this with the observation that perhaps FIRST intended the selection process to be purposefully uncertain. In that case, the process is working, and we should attempt to understand the rationale for this course of thinking rather than attempting to improve upon the system. If this is the case, it would be helpful if FIRST would acknowledge that it wants the system to have an element of randomness (and perhaps provide the reasoning).

Personally, I can come up with only three reasons for FIRST wanting a system such as this: (1) It teaches kids that the world is not necessarily fair or just. (2) It gives teams that have lesser robots an opportunity to experience the excitement of making it to the finals and thus inspires them to do better work in the future. (3) It adds greater emphasis to the scouting aspect of the game.

Both numbers 1 and 2 are probably valid, but I wonder about number 3. It would seem to me that you would want your top robots to become alliance captains. Because these are the teams that had the interest and energy to build a competitive robot, they are the teams that are most likely to have invested time towards devising and organizing a coherent scouting effort. They also have a clue that they might be required to act as alliance captains. Contrast this with a team that “accidentally” becomes an alliance captain. These teams are less likely to have formed a rigorous scouting effort and may not have done any work in this regard because they did not “expect” to be put in this position. This leaves them in the unfortunate circumstance of having to pick partners by using incomplete or non-existent data, which then results in alliance teams that may include the robots in the bottom quartile of the field.

In some respects, I think the unreliability of the current system has to do with the rather large disparity in abilities between the top and bottom performing robots. The nature of these pairings, coupled with the rather large step values that a win/loss/tie system creates, makes the system prone to unreliable results. The introduction of the grouping method normalizes some of this out. But as could be seen in the simulations, grouping has much less effect if a system with greater granularity (such as total points) is used. Although I did not test this, my instinct is that grouping would also show less improvement if the robots were more evenly matched. I may attempt to simulate this.

As I see it right now, the biggest weakness with grouping is that it requires a precise number of teams, in increments of thirty-two. Perhaps this could be avoided by using the pairing algorithm that Blake and MM are working on. I would be happy to run it if they give me data (or I can give them the engine).

As MM pointed out, our culture expects winning to be rewarded and using total points might not be as readily accepted or understood (although who really understands the rationale for scoring into your opponents goals to boost QPs).

In any case, it is possible to create a system which adds “bonus” points to the winner’s total points tally. I may attempt to simulate this also.

I will report back with any results.

There is one major kink in your analysis: it negates defense (especially in the points-scored sector). If you don’t think defense has an impact during an FVC event, you’ve apparently never been to one. In your overall ranking analysis, you can claim that the “baseline score” is not actually points scored but a ranking of ability; I find that claim hard to support, and even harder to accurately quantify, especially considering that the ability of the defender is also heavily dependent on the offensive robot.
In addition, I have a question: How were the groupings formed? If they were formed randomly, there is still a large chance of dropping a well-qualified team to the bottom of the rankings, as the four best (or best-qualified) teams could all land in the same grouping. In that scenario you could see one of the best dropped to the bottom, and/or multiple (if not all four) lodged in the middle of the rankings. The inverse applies as well: grouping can still propel a sub-par robot to the top. I don’t see how that would solve much (even if it slightly reduces the problem).
A large part of FIRST (imo, although I know many who support my claim) is being able to play against every (or many) different teams and robots. I wouldn’t want to interact with the same three teams the entire time.
This change would not be well received by the FIRST community as a whole. It has had little exposure and reaction because of the particular time of the year and sub-forum it is in. But if you look at the similar situation from the Week 1 FRC regionals, you will see how the FRC community reacted. At the NASA/VCU, Pacific Northwest, and Granite State regionals each team had a “partner” they competed against in every qualification match. My team, FRC 116, was included, as we competed against FRC 122 in all eight qualification matches. The reception by the teams involved, and over-all FRC community was almost entirely negative. St. Louis and New Jersey also had similar situations that weekend, but not 100% (same reactions though).

In closing: FIRST will not accept a schedule that isn’t random, and for good reason. In addition, it would be next to impossible to impose a ranking criterion based on individual team performance (and unfair if it did not include some form of defensive ranking ability), nor would all of FIRST support it.

Ahhhh - A debate worth sinking one’s teeth into

It doesn’t negate defense. It does ignore the effect defense can have on match scores.

Bill’s “model” (in the mathematical sense) for what an FVC robot does is purposefully one-dimensional. Any model is imperfect. This one has a substantial imperfection, in general, but I have yet to see anyone demonstrate that that imperfection is important in this particular discussion.

Because in this sort of analysis one can play the part of a god and can know the proper ranking of every robot in the simulation, Bill chose to make the ranking metric simple. This allowed the simulation to clearly focus on whether a few rounds of 4-team, 3-match, mini-tournaments would create a ranking that is close to the correct one (known by the god-like analyst who set up the experiment).

His results are that if you want to employ a system that lets you discover what that correct ranking is, his system of having randomly chosen groups of 4 teams stick together for three quick matches appears to figure out the correct rankings faster than randomly selecting a totally new group of four teams for each and every match.

And… he has a body of results that can be (and should be) reproduced. No amount of anyone’s opinions or arm-waving can make his results untrue. If anyone wants to reject them they need to choose one of these options and then do the math.

1. Devise a simulation that includes the effects of defense and show that Bill’s hypothetical match scheduling method does not reveal true rankings as well as the new-teams-every-single-match method(s), once defense has been included. I personally do not think that this is a foregone conclusion.

2. Show that one of his assumptions, inputs or formulae is off-target in a way that makes his results inaccurate.

3. Something else equally rigorous.

You don’t want to go there. Trust me - Bill has seen an FVC event.

Read it again. You are looking through the wrong end of the telescope.

Because Bill (or whoever duplicates/checks his very repeatable work) sets up the experiment, before the simulation runs he/they can perfectly sort/rank the simulated robots according to their scoring ability. He knows exactly how good each robot is (using scoring ability as his metric) because he assigned those scoring abilities. The simulated tournament’s job, much like the job of a real tournament, is to attempt to (re)discover what that perfect ranking is.

Monte Carlo simulations run the same problem over and over again, scrambling (in appropriate ways) the initial conditions and inputs. At the end of the simulation (limited by the fidelity of your simulation’s models), you have a fairly good picture of how well the algorithm you are testing will perform over the long haul.

Pointing out a few situations where the algorithm works poorly is not adequate evidence that the algorithm isn’t useful. I dare say that current methods and other alternatives all fail in some unusual situations. What is more important is how well they perform in most situations, and/or whether any of the algorithms being weighed produce completely unacceptable results in any pathological situations.

Bill does not recommend interacting with the same three teams all the time.

Bill suggests a quick 3-match mini-tournament for a first group of four FVC teams; followed by another 3-match mini-tournament for a different group of four teams, followed by …. until all teams have played in a mini-tournament. Then you scramble the teams and do it again… until time runs out for the entire tournament. In each/any mini-tournament the teams would be matched up like this:
Match 1: AB vs CD
Match 2: AC vs BD
Match 3: AD vs BC
You can see that in any given mini-tournament, a team never has the same ally more than once and faces an ever-changing opposing alliance in the three matches (played in rapid succession).
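Generating that three-match schedule for a group of four is mechanical; here is a small Python sketch (the function name `mini_tournament` is my own, not part of any proposed tool):

```python
def mini_tournament(group):
    """All three 2v2 pairings for a group of four teams.

    Each team allies with every other team exactly once and opposes
    every other team exactly twice across the three matches.
    """
    a, b, c, d = group
    return [((a, b), (c, d)),   # Match 1: AB vs CD
            ((a, c), (b, d)),   # Match 2: AC vs BD
            ((a, d), (b, c))]   # Match 3: AD vs BC

matches = mini_tournament(("A", "B", "C", "D"))
```

A full qualification round would just partition the field into groups of four and emit one such block per group.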

Bill has not necessarily suggested it to the FIRST community as a whole. The strong implication of his original message is that FVC tournaments might benefit from using it. Even if he had recommended it to all of FRC, FVC, FLL, JFLL; I think it is a bit presumptuous for you to assert that “the FIRST community” would dislike a method that could be proven to generally produce more accurate “top 8”s than current methods (and that is likely to be no worse than current methods in extreme situations).

PS: The results of the math will not change if they are posted at some other time or under a more highfalutin heading.

As you may remember, I was there with you at NASA/VCU and I agree that that match scheduling method was a mistake (http://www.chiefdelphi.com/forums/showpost.php?p=589163&postcount=55). This one is very different. Also a pure application of this one might not be well-suited for use with 3-team FRC alliances. I don’t think Bill suggested using it in an FRC context.

That’s an interesting opinion you have there; but I’m not sure which schedule you are talking about. Bill’s thought experiment is chock full of randomness.

And… the last I checked, neither you nor I nor anyone else was authorized to speak for all of the participants in FIRST activities or for all of the folks who organize and run the FIRST corporation.

If I understand it correctly, Bill’s approach does not attempt to do this. Bill assigns the alliance’s score to the robots in the alliance. He uses a running total of these points to rank the robots at the end of a full tournament. He does this instead of assigning 2 qual points to each robot on a winning alliance.

Blake
PS: My large font “Bravo!” reply to Bill’s original message was intended to commend the methods he employed. Whether the results stand up in a more sophisticated analysis is beside the point. Well-written presentations of results obtained through proper use of the scientific method, computer science and math should be applauded.

My general point about defense was that his formula didn’t factor in defensive play properly to create an accurate simulation. In many situations, a defensive team will not score a single point but can have a MASSIVE impact on the match as a whole. It is also impossible to quantify the value of a defensive bot (even as the person setting up the experiment), because the value of that bot changes each and every match based on the offensive bots it is facing. A purely defensive bot cannot possibly have a value any higher than the combined value of the two offensive bots it is facing (i.e., it cannot negate more points than they can score). In addition, a defensive design may or may not be effective against all offensive designs (i.e., some bots this year were excellent at defending the high goals but couldn’t scoop balls out of the low goals, and vice versa). Or consider the situation where four purely defensive bots (not capable of scoring, which is possible, particularly if the scoring method is rather complicated or difficult, such as the FRC objective this year) are in a match against each other: the result is guaranteed to be a tie (something not addressed in his simulation).

Even though you only have the same partner once, you still interact with the same three other teams three matches in a row. In a six-match tournament (fairly large by FVC standards), you will interact with only six other teams out of the entire field during qualifications. The reaction to the FRC events of this past season suggests that the FIRST community would not accept this (look at the multiple threads about the scheduling algorithm for the 2007 FRC season); in addition, those same threads suggest that the FIRST community favors random match generation (in terms of teams participating; they still value space between matches).

A similar argument could be made about the scheduling algorithm put in place for the 2007 FRC competition, which seemed to be designed to create more competitive matches (a good thing, right?). The FIRST community reacted with disgust to this, and especially to its side effects, which limited the number of other teams that each team got to interact with (ask 1504 how they felt about facing 25 four times in Atlanta). FIRSTers, or at least those on CD, have shown a desire to play with as many different teams as possible on more than one occasion in the past.

While his scheduling method doesn’t rank teams on individual performance, he also gave math showing that a more proper ranking would be based on individual points scored, which, as I said, would be next to impossible and would discredit defensive efforts.

This is a fascinating thread; Great job, billw. I have one suggestion that I believe would make the model more realistic.

The initial deterministic ranking of teams from 1 through 96 implies a linear distribution of team ‘merit’ from top to bottom. In reality, I suspect that relative team merit within a tournament more closely follows a normal distribution around a mean. What if, instead, as an input you assigned each team a baseline score between 1 and 96 with a determined mean and standard deviation? You would still be able to rank the teams 1 through 96, but the top team might have a baseline value of 92.4, the middle teams’ baseline values would be more closely bunched, etc. Then continue the exercise as you have done.

I would be interested to see if this makes a difference in the simulation outcomes.
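Josh's suggestion is easy to prototype. The sketch below (my own helper; the mean, standard deviation, and clamping bounds are assumed, not taken from the thread) draws each team's baseline from a normal distribution and then ranks the draws to recover seed values:

```python
import random

def normal_baselines(n=96, mean=48.0, sd=16.0, lo=1.0, hi=96.0):
    """Draw n baselines from a clamped normal, then seed-rank them.

    Returns (draws, seeds): draws[i] is team i's baseline value and
    seeds[i] its rank (1 = highest baseline).
    """
    draws = [min(hi, max(lo, random.gauss(mean, sd))) for _ in range(n)]
    order = sorted(range(n), key=lambda i: -draws[i])  # best first
    seeds = [0] * n
    for rank, i in enumerate(order, start=1):
        seeds[i] = rank
    return draws, seeds
```

Feeding these bunched-up baselines into the existing match model would show whether closely matched mid-field teams make either scheduling method harder to rank accurately.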

First, I apologize for taking so long to get back to this discussion—I’ve had too many balls in the air lately.

I would like to thank Blake for covering in my absence. He has accurately described the simulation, and I am in alignment with all of his comments.

I would like to elaborate on three of Sean’s concerns:

1. Non-random groupings have been shown to be fraught with problems.

I was dumbfounded when I read the FRC thread you referenced above. It seems that the outcome was entirely predictable, and unsatisfactory. Playing the same team repeatedly will only skew results, not refine them. Sadly, in the end, the true top eight teams were probably not represented as alliance captains in any of these tournaments.

During this past VEX season, it appeared to me and others that a similar problem was occurring and that many teams which belonged in the top eight were not selected. My suspicion was that because very good teams were occasionally paired with a very poor team (or one that did not even show up), they would record a loss. In a similar fashion, some poor teams would record an “undeserved” win. The fact that only a few matches were played only exacerbated the problem and did not allow the results to average out. Furthermore, the ranking points did little to help refine the results among teams that were tied.

It occurred to me, however, that forcing the group members to play against each other would do two things: it would spread these disparities more evenly, and it might add enough efficiency that more matches could be played. I was simply attempting to answer the question: Would such a system result in more accurate rankings?

The answer is: yes, it does appear to improve the results. It may also allow more matches to be run during a tournament. And although some teams interact with certain teams more than with others, the overall rankings are more accurate, which implies that the grouping algorithm is, in fact, more impartial than the current system.

Hopefully, you can now see that the goal of this system is really aligned with what you and others ultimately want—an unbiased ranking system.

2. The simulation does not account for defensive abilities:

This is true insofar as there is not a discrete input which attempts to quantify this behavior.

If another random variable were added to account for defense and then combined in a linear fashion with the existing metric, we would essentially be summing two random variables to create a third. Because the sum of two independent normal variables is itself normal (no central limit theorem is even required here), the result would still follow a normal distribution, albeit with a larger absolute standard deviation (the variances add), even though the spread relative to the combined mean can shrink. Thus, I believe that the seed value assigned to each robot inherently accounts for defense as well as offense. If the defensive and offensive capabilities combined in a strongly non-linear fashion, a second variable would be necessary (as would a much faster computer, because the simulation would take 100 times longer).
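As a quick numerical check of that statistics point (a self-contained sketch with assumed means and deviations, not taken from the simulation itself): summing two independent normal variables yields another normal variable whose variance is the sum of the two variances.

```python
import random

random.seed(0)  # fixed seed so the check is repeatable

s1, s2 = 10.0, 5.0
# "Offense" ~ N(50, 10), "defense effect" ~ N(20, 5); sum them per sample.
samples = [random.gauss(50.0, s1) + random.gauss(20.0, s2)
           for _ in range(100_000)]

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
# Expected: mean near 70, variance near 10**2 + 5**2 = 125
```

So a linear defense term leaves the match-score distribution normal, which is why a single per-robot seed can absorb it.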

A larger problem, in my mind, is that the simulation assumes that a normal distribution accurately approximates the overall performance of the robots. Given my personal experience, I would be surprised if this were the case. I base this on the fact that I have seen far too many robots experience partial or catastrophic failures which effectively disable them for the entire match. In a normal distribution this occurrence would be quite rare, but in practice it is not. On the other hand, actual data would be required to determine the true distribution. Without that data, I am forced to rely upon a distribution which is probably close enough to make the simulation valid. As an aside, there are also many very low scores in the data, so perhaps a normal distribution is not that far off.

One last note regarding defense: I am not a fan of aggressive tactics where one robot is purposely pushing (and possibly entangling) another. I will elaborate on this idea further in the game suggestions thread, but I do not see this type of strategy as being appropriate for VEX. I did not see many examples this past season of a more refined defense whereby a goal was blocked or emptied of balls, both of which I would consider appropriate. I understand that FRC is a much different platform, and I can understand all of the benefits of defensive tactics as they are applied there, but I am not convinced that they are beneficial for VEX.

3. This would never work in FRC.

This simulation had nothing to do with FRC. The grouping algorithm would require far too many matches and robots. It really only works for a paired situation as is used in VEX.

Sean, I would be happy to further discuss any of the above if you wish.

Josh, I think you make a good point. I would be curious about the outcome of normally distributing the “ability” level as well. Unfortunately, it would take quite a bit of time to incorporate this change. At some point, it may be necessary to make some other major revisions. I’ll put it in then, if possible.

New Data: Since the original run, I have been bothered by the fact that I did not add in ranking points. I have since modified the simulation and replaced the “Delta” method with a W/L/T + ranking points. The results do not show much improvement. They are attached.

For those who are not looking at the results, keep in mind that simply switching to a “Total Points” system far surpasses the ability of the current algorithm as well as that of the grouping algorithm. Furthermore, it does away with the inconvenient requirement imposed by the grouping algorithm that a specific multiple of robots be used.

One of my largest concerns with the VEX, now the FTC, program is that it will be shackled to the paradigms used by the FRC program. It is a new platform with different constraints and possibilities, all of which should be considered. Again, I will elaborate further on this in the other thread, but the real question is: Should FIRST be exploring an alternative ranking system? If you think that this is a worthy undertaking, you might want to send them an e-mail or add a note of support/dissent to this thread so that some sense of the community becomes apparent.

As I mentioned before, I would be happy to pass along the spreadsheets used to arrive at these results to anyone that is interested.

FVC Simulation Summary 06-23-07.pdf (20.1 KB)


My point was not that it would work in FRC, but merely to reference the results of non-random schedules in FRC. After the Week 1 regionals, the problem of facing the same team every match was fixed (although high rates of repeat match-ups still occurred), but the other non-random features remained. In short, the algorithm sorted teams by age and grouped them to (theoretically) create more competitive matches. http://www.chiefdelphi.com/forums/showthread.php?t=55622
Once again, the FIRST community reacted with disgust to the non-random features and to the inability to play against a wide variety of different opponents. While I recognize you aren’t doing the “social experimenting” of organizing teams by age, you are still lowering the number of different teams each participant interacts with. In an average Vex tournament, each team would get to interact with only 3 or 6 other teams during the entire qualification rounds under your system. Under the current system, they could face three times that number of different opponents.
My general point about defense is that the defender’s “ranking” within the group would vary depending on their opponents. They cannot negate more points than the opponent can score, and therefore cannot have a higher ranking than the combined rankings of their opponents. Thus, their value changes each and every match, and a single ranking assigned to them for the entire tournament would not be accurate. (Regardless of whether or not defense should be played, at this time we have to assume it will be.)
Everyone has a problem with “false” captains ranking in the top 8, but I don’t believe that this is the solution. Even if it creates more accurate rankings (which I question, given the defensive variables), you sacrifice too much interaction with other teams to make it worth it. The most reliable way, under any format (and your data shows this too), to increase the accuracy of the rankings is to play more matches. I think our efforts would be better focused on how to make competitions more streamlined and quicker-running in order to fit in more qualification matches. I assume you attended the Farmville competition, given your location in Virginia and relationship with Blake, where there were only three qualification matches for each team. Most other competitions got more than this, producing more accurate results.

And I agreed… What I challenged readers to do is to show that using a more accurate simulation will cause Bill’s suggested method to produce an end-of-tournament ranking in which the contestants’ estimated ranks (that are a consequence of their performance during the tournament) are farther from their correct ranks than are the estimated ranks that result from using the current FVC method.

In a non-linear, multi-dimensional problem space such as the one you (LL) portray fairly well in your message, ranking the entities in the space (i.e. the robots) according to a one-dimensional metric (which is exactly what BOTH Bill’s suggestion and last season’s FVC (and FRC, FLL, NFL, NBA, NCAA, FIFA, etc.) methods do) results in a compromise. If you accept that the BW suggestion and the current FVC method intrinsically suffer from this compromise, then you hopefully come back to asking two questions.
• Will a more sophisticated simulation cause the intriguing results of Bill’s first simulation to evaporate, or will the increased realism be mostly irrelevant to the result (i.e. will the BW method still create end-of-tournament ranks that are more accurate than last season’s FVC method)?

• Is there anything other than creating an accurate one-dimensional ranking (so that the “top” 8 teams can become alliance leaders) that should be used to choose the match pairings (i.e. should the number of other teams encountered on the field be important, or should creating close contests (weak-vs-weak, strong-vs-strong) be important), or…?
The question is not whether Bill’s simulation is 100% realistic. The question remains: “Will increased realism change the result? Will the rank assigned by the experimenter (which will in itself be a compromise in a non-linear, multi-dimensional problem) be reflected more accurately by the results of a BW tournament or by the results of a last-season FVC tournament?”

Let’s do the arithmetic:
ASSUMPTIONS:
• Assume 32 teams, 1 field and 6 hours

• Further assume paces of 1 match per 6 minutes for current FVC methods and 3 matches per 12 minutes in the BW method.CALCULATIONS• Last Season method
[INDENT]o Current FVC appearances on the field = 7.5 per team = {(6 Hrs x 60 Min/Hr) / (1 match / 6 min)] x (4 teams/match)}} / (32 teams /tournament)
o If each match is filled with a new ally and new opponents, a team will see 22.5 other teams on the field in a total of 7.5 matches.

o Note that last season’s methods did not guarantee that each match was a combination of teams that had never seen each other on the field before.

• Other method being discussed (BW)
o BW appearances on the field = 3.75 tri_matches per team = {[(6 Hrs x 60 Min/Hr) / (12 min/tri_match)] x (4 teams/tri_match)} / (32 teams/tournament)
o If each tri_match is filled with a trio of other teams, a team will see 11.25 other teams on the field across a total of 11.25 individual contests. More matches, more time on the field, more accurate selection of the “Top 8”, but fewer other teams.
• The suggested (BW) method with two fields available
o Choose to run two fields in parallel to get the non-trivial benefit of seeing more teams, in person, on the field, as both ally and opponent. Do this in addition to using the BW suggestion to get the non-trivial benefit of a substantially more accurate end-of-qualification ranking.

o Now each team sees 22.5 other teams on the field (same as in the single field FVC method).

o Each team gets to appear in 6.5 tri-matches. This results in 19.5 actual contests on the field, instead of the 15 they get from a two-field version of the current FVC method.[/INDENT]

Assuming that the simulation’s accuracy holds up when/if the nonlinear effects of defensive ability against different types of scoring approaches are added to it (see above and other discussions about whether this is necessary), I still see Bill’s suggestion as worth serious consideration. It has attractive advantages and tolerable disadvantages.
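The single-field arithmetic above is simple enough to check mechanically. A minimal Python sketch, assuming the same figures (32 teams, 4 teams per match, a 6-hour window, 6 minutes per single match, 12 minutes per tri-match):

```python
# Sanity check of the appearances-per-team arithmetic above.
# Assumed figures: 32 teams, 4 teams/match, 6-hour window,
# 6 min per single match, 12 min per 3-match "tri_match".

TEAMS = 32
MINUTES = 6 * 60

# Last-season method: one 4-team match every 6 minutes.
matches = MINUTES / 6                       # 60 matches total
appearances = matches * 4 / TEAMS           # field appearances per team
others_seen = appearances * 3               # 1 ally + 2 opponents per match

# BW method: one tri_match (3 matches, same 4 teams) every 12 minutes.
tri_matches = MINUTES / 12                  # 30 tri_matches total
tri_per_team = tri_matches * 4 / TEAMS      # tri_matches per team
contests_per_team = tri_per_team * 3        # actual matches played per team

print(appearances, others_seen, tri_per_team, contests_per_team)
# -> 7.5 22.5 3.75 11.25
```

The 7.5 / 22.5 and 3.75 / 11.25 figures match the calculations quoted above.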

Analogies to last season’s FRC are weak at best.

In addition to once again pointing out that I was one of the people who disliked last FRC season’s match scheduling algorithm, and that the algorithm FIRST intended apparently never got fully implemented, let me make a few other points.
• The BW suggestion should not be implemented as a surprise.

• The amount that the BW suggestion has teams opposing or allied with repeat teams is not much greater than under last FVC season’s method.
[INDENT]o The first match of any tri-match is exactly like a match in the last season method.

o The second match is similar to the many matches under last season’s method in which you encounter past allies/opponents.

o The third of the three is definitely something that would be very unusual under last season’s method.
[INDENT] However, if everyone involved knows why it is occurring, and has been shown how it produces more accurate rankings; and if everyone has been reminded that they are getting more time on-the-field as a side benefit; then I think that contestants will, in general, take the change in stride, even if they don’t wholeheartedly embrace it at first.[/INDENT] [/INDENT]

To steal a phrase from you, that was something of a “social experimenting” attempt. Also, it was based on the flawed assumption that team number is a useful enough indicator of team success on the field.

I’m not 100% sure what that fiasco was supposed to accomplish, but it was implemented incorrectly, was based on flawed assumptions, was a surprise to the general population, and, so far as I know, cannot be justified by any public simulations of its consequences. In this thread we (all of us) are not making those same mistakes.

You are going to have to point this out to me (with a link or a quote). I don’t recall seeing this in the results or in his conclusions/conjectures derived from those results.

Are you perhaps thinking of his descriptions of the experiment set-up, in which the experimenter knows/assigns a scoring ability to each simulated robot so that the experiment can later tell us whether the final rankings are a good estimate of those assigned scoring abilities? (Scoring ability is the one-dimensional metric used to order/rank the simulation’s robots 100% “correctly” when the experiment is set up.)

Blake

Bill’s method does not produce non-random schedules. Bill’s method randomly selects teams to populate (tri) matches in which the 4 randomly selected teams get to enjoy competing in all permutations of 4 teams rather than being stuck with the results of only one of those permutations. To say it another way, **random** selection of the teams in matches is combined with the tri_match notion that improves the tournament’s ability to identify valid alliance captains.

See my other message below. Are my assumptions reasonable? Is my arithmetic correct? Do the results lead you to rethink this objection?

I agree; and I agree that there is a further complication because a defender’s effect on any one match strongly depends on the scoring method of the robot they are harassing. However, I am not convinced that this will change the net result of Bill’s original simulation.
To assert that, one either needs to do the math or run an adequate simulation. No one has posted the results of doing either.

At the worst, I think that accurately incorporating defense into the simulation will reduce but not eliminate its advantages. At best I think the advantages will continue to be substantial. These are my opinions, and like all of our opinions on the subject, they have limited usefulness without some hard facts to back them up.

See my other message below. Are my assumptions reasonable? Is my arithmetic correct? Do the results lead you to rethink this objection?

If I recall Bill’s results correctly, dramatically increasing the number of matches in a tournament run using last season’s methods had far less effect on the rankings’ accuracy than the effect of switching to Bill’s suggested method.

I think you are up against a diminishing-returns situation here. See my comment immediately above and see my other message below. Are my assumptions reasonable? Is my arithmetic correct? Do the results lead you to rethink this objection?

Bill and I have spoken in person for about 15 minutes once and have exchanged a few emails. You made a good guess, but are mistaken. He and I aren’t close friends. I simply like the results of his math and the carefully-described, easily-repeated, easily-tested method he used to develop them. Future scientists, engineers, and business managers/owners should learn how to apply the same methods and rigor.

I don’t remember the exact number but I am pretty sure that teams at the Virginia FVC Championship (in Farmville) were able to participate in more than 3 qual matches. I think the number was more like 5 or 6.
Regardless, Bill’s team went to another championship event and to the FVC World Championship, and I don’t think his suggestion is prejudiced too much by his experiences at any one of those events.

Blake (still enjoying the debate) Ross

Blake, Thanks for filling the void once again.

Sean is correct in that I actually did say:

For those who are not looking at the results, keep in mind that simply switching to a “Total Points” system far surpasses the ability of the current algorithm as well as that of the grouping algorithm. Furthermore, it does away with the inconvenient requirement imposed by the grouping algorithm that a specific multiple of robots be used.

If I were to lobby for a new system, it would probably be in the direction of a random assignment of partners, playing in the grouped method, using a total points method of scoring. I haven’t actually tried to simulate the above, so I can only speculate that it would produce the most accurate and flexible system.

Sean, your comments made me curious as to the probability of repeat partners (offensive or defensive) that a team would typically encounter when assigned on a random basis. Obviously the probability goes up as more matches are played and fewer unseen partners remain.

If I have done the calculations correctly, a random selection will produce fewer repeats than the grouped method in most cases, but the number is not as dramatic as you might expect. In fact, when using 32 teams, the number of repeats is about equal over 6 matches and actually less for the grouped method when playing 9 matches.

Note that in the grouped method, you will never have the same offensive partner, and you will always have the same defensive opponent twice.

I have attached the spreadsheet used to calculate this. In general, it supposes that if you are on the third match, your team has played with/against 6 robots in the prior two matches. This means that the probability of a repeat partner in the third match would be the following for 32 robots:

```
6/31 + 6/30 + 6/29
```

The above really needs to be adjusted to account for the fact that one of the prior 6 partners may have already been a repeat. If, for example, this prior probability was 0.3, then the corrected calculation would be:

```
(6 - 0.3)/31 + (6 - 0.3)/30 + (6 - 0.3)/29
```

As always, I am open to any corrections/improvements.
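In the spirit of that invitation: the third-match repeat probability can also be estimated by straight simulation and compared against the exact value. This is my own sketch, not Bill's spreadsheet; it assumes a 32-robot field and 6 distinct prior partners/opponents:

```python
import random
from math import comb

# Monte Carlo estimate of the chance that, in your third match, at least
# one of the 3 teams drawn to join you is among the 6 robots you met in
# your first two matches (32-robot field assumed; draws are uniform).

def repeat_prob(n_teams=32, prior=6, per_match=3, trials=200_000):
    others = range(n_teams - 1)            # everyone except our own team
    met = set(range(prior))                # label the 6 prior teams 0..5
    hits = sum(1 for _ in range(trials)
               if met & set(random.sample(others, per_match)))
    return hits / trials

# Exact value: 1 minus the chance that all 3 draws avoid the 6 prior teams.
exact = 1 - comb(31 - 6, 3) / comb(31, 3)

estimate = repeat_prob()
print(round(exact, 4), round(estimate, 3))
```

One observation this surfaces: the exact probability of at least one repeat (about 0.49 here) comes out below the simple sum 6/31 + 6/30 + 6/29, because that sum counts overlapping repeat events more than once, which is in the same spirit as the correction already noted above.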

Repeat Partners Analysis.xls (40 KB)


Bill,

ManicMechanic (Yolande) and I have spent some time figuring out how to schedule matches that create no repeat opponents or allies for any/all team(s) until all the other teams in a tournament have been used in one of those capacities (i.e. until one is forced to start seeing repeat opponents/allies).

The work has gone reasonably well (and I confess that I haven’t dug through any archives to see how many times this has already been solved in the past - I’m doing it for fun as much as to create a useful tool). When finished it will be a Java program that includes a GUI that not only shows the match “pairings” but also shows info to help scouts and to help teams focus on just their own matches, if they care to.

Success with this project should make it unnecessary to see repeat opponents/allies; and if the core algorithm is adopted by APs, it should improve the last-season method of running FVC Qual matches.

Side notes:

• I haven’t tried to extend it to 6-team FRC matches yet; but I am optimistic that I will be able to do that.
• So far, the more general of the two scheduling algorithms we are working on is not a closed-form equation or procedure that is guaranteed to work in all circumstances. At present it is a set of heuristics (that work well but aren’t perfect).

Blake
PS: Does anyone out there want to do an archive search for me; or does anyone have info from other sources about past or current algorithms?

Bill & Sean,

If I remember correctly, this suggestion from Bill was about giving all robots on a victorious alliance a qual-score increase equal to the total points earned by the alliance in the match they won. It was not a suggestion that we attempt to track the number of points scored by each individual robot. This would take the place of giving each winning alliance member 2 qual points.

Am I remembering correctly?

Blake

@ Blake:
How do you get a tri_match (3 matches) in the time it takes for 2 regular matches? If it is proposed to run these matches back to back with little (or no) down time, I think that is a far worse factor than the fewer opponents. Repair/downtime is essential to being competitive, and often, even functioning. If you have no time to repair, then if something breaks in the first match of the set, then you’re essentially doomed (along with your partners) in the next two.
If that factor were removed, then you’d be running an identical number of matches to the previous style, and as shown (with the exception of 9 matches and 32 teams), you’d get less repetition, particularly in the larger competitions, where the repetition was half that of the “tri_match” format or even less.
And Farmville did only get three qualification matches per team; I’m positive about it. Each team also got a few practice rounds in during the morning, and 24 of the 29 got at least one match in the eliminations (their alliances got at least two).
And my comments weren’t that you two were friends, but rather that because he has met you, that it is likely he’s involved with Vex in Virginia to an extent, and likely went to Farmville.
Finally, I don’t think there’s six hours of actual playtime for qualification matches once you factor in opening ceremonies, inspection, practice matches, lunch, and alliance selection. At least not from what I’ve experienced at my previous one-day vex events.
And yes, I think the defense would change his results, given that you face each team twice as a defender. You could be the best at scoring Goal X, but if Defender Y can shut down Goal X, you’re likely to pick up two losses (and you’re less likely to face Defender Y twice in the current format).

@Billw
The results are curious, as I’ve been to an FRC competition (in 2006, so not with last year’s algorithm) with around 32 teams, and with 9 qualification matches (and with 3 teams per alliance, so even more total partners), and we had far less repetition than that. Perhaps FIRST’s [old] algorithm is in fact not truly random, but written to avoid repeat match-ups whenever possible.
In your simulation, it does show at the 32-team size (moderate; only a couple of Vex competitions were larger) that repetition in your system is not much worse; but in the likely 3-6 match range (based on my experience with FVC), it is still slightly worse, though perhaps tolerable. But when applied to something like the Championship event in Atlanta, which had hundreds of teams and where each team had only 4 qualification matches, it would produce a much higher repeat rate.

Let’s suppose that teams would ideally play nine matches across a 4.5-hour window. This means nine trips to/from the pits, nine attempts at getting drivers/coaches on the field, nine attempts at making sure the batteries are charged and plugged in, and nine attempts at obtaining and correctly inserting crystals. That is a lot of coordination with a lot of room for mistakes, all of which I witnessed this past season over and over when only playing 3 or 4 times. (Nine trips in Atlanta might not even be physically doable, given the distance.) So, I had this idea that it might simplify things if teams played three successive matches with two-minute breaks in between. This is not unlike what occurs during the finals. I also recognized that some robots will fail; therefore, I proposed that the previous teams remain at the field and be available to fill in for no-shows and broken bots. Even though this system adds only a minimal efficiency gain (.28 vs. .37 matches per minute by my calculations), you largely eliminate the “herding cats” problem of running nine matches.

Sean, I understand that you do not want to play any repeats, although if you can change the scoring to a “total alliance points” (not total per-team points) method, there is practically no benefit to avoiding them. I do believe, however, that there is a benefit to only having to “assemble the troops” three vs. nine times. So perhaps it would be possible to mix the teams playing on the two fields and create a no-repeat partner system that still plays three matches in relatively quick succession.

Blake, do you know what the total number of no-repeat partner combinations is for a given number of teams? If so, how many teams are necessary to enable nine no-repeat matches per team?

Although I am not writing the code, it seems to me that you could solve this algorithmically by tracking teams’ prior partners, removing them from a list of teams still available, and then randomly selecting from that pool. I do think it becomes much more complex if you do not have sufficient teams to fill out the schedule with entirely unique partners.
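That bookkeeping can be sketched as a greedy loop with random restarts. This is hypothetical illustration code (not the Java program mentioned earlier in the thread); like any heuristic of this sort, it can dead-end when teams run short, in which case it simply starts over:

```python
import random

# Greedy no-repeat scheduler sketch: for each match, walk a shuffled pool
# and accept a team only if it has never shared a field with anyone
# already chosen for that match. Restarts from scratch if it paints
# itself into a corner. Illustrative only.

def schedule(n_teams, n_matches, max_restarts=1000):
    for _ in range(max_restarts):
        seen = {t: set() for t in range(n_teams)}   # prior co-participants
        matches = []
        for _ in range(n_matches):
            pool = list(range(n_teams))
            random.shuffle(pool)
            match = []
            for t in pool:
                if all(t not in seen[m] for m in match):
                    match.append(t)
                    if len(match) == 4:
                        break
            if len(match) < 4:
                break                                # dead end; restart
            for a in match:
                seen[a].update(m for m in match if m != a)
            matches.append(match)
        else:
            return matches                           # built all n_matches
    return None

sched = schedule(16, 8)   # e.g. 16 teams, 8 matches, no pair repeated
```

With 16 teams and 8 matches only 48 of the 120 possible pairs are consumed, so the restarts almost always find a valid schedule quickly; as the paragraph above notes, the search gets much harder as the schedule approaches the point where unique partners run out.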

Given n teams, the number of no-repeat partner combos is :
(n-1) + (n-2) + (n-3) + … + 1

To get 9 unique partners, you need at least 10 teams.

# of combos//pairings listed (teams are labeled 1 through 10)

9 ///////////// 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10
8 ///////////// 2-3, 2-4, 2-5,… 2-10
7////////////// 3-4, 3-5,…3-10

1////////////// 9-10

Add up the first column, and you get 45 pairs. Since 4 teams (2 pairs) play in each match, you need 22.5, i.e. 23 total matches.
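The count above is the triangular number n(n-1)/2, which a quick check confirms (my notation, not the thread's):

```python
import math

# Number of distinct partner pairings among n teams:
# (n-1) + (n-2) + ... + 1 = n*(n-1)/2.
def unique_pairs(n):
    return n * (n - 1) // 2

# For 10 teams: 45 pairs; at 2 pairs per 4-team match, 23 matches
# are needed to exhaust them (22.5 rounded up).
pairs10 = unique_pairs(10)
matches_needed = math.ceil(pairs10 / 2)
print(pairs10, matches_needed)
# -> 45 23
```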

The hard part is not in choosing unique partners – it’s in choosing unique opponents. Since each team plays 2 opponents, you’re twice as likely to see a given opponent. Also, pre-setting the condition of avoiding repeat partners first gives less freedom in choosing opponents.

I have been looking at this as system of entirely unique groupings where none of the four participants has had any of the others as a prior offensive partner or defensive opponent.

The pair case can be counted with the standard combination formula: N! / (P!(N-P)!), where N is the total number of elements and P is the number of elements in each grouping. The problem is that it does not work for P greater than 2, because it does not enforce the restriction that members of a grouping may appear together only once with any other member.

I don’t know of a formula that answers this question, although I would expect one exists. In any case, I believe it would show that for N=8, P=4, the result is 2; for N=16, P=4, the result is 12. My brain is not wired to figure it out via brute force any further than that. And given the way the combination formula works, I very much doubt that I could derive the underlying relationship on my own.
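For the smallest case, the question can actually be brute-forced. The sketch below is my own code, under the interpretation that we want the largest collection of 4-team groups in which no pair of teams ever appears together twice; it confirms the N=8 value of 2, but the search space grows far too quickly to try N=16 this way:

```python
from itertools import combinations

# Exhaustive search for the largest collection of 4-team groups from n
# teams such that no pair of teams appears together in two groups
# (a pair-disjoint "packing" of groups of 4). Feasible only for tiny n.

def max_groups(n):
    blocks = list(combinations(range(n), 4))
    pair_sets = [frozenset(combinations(b, 2)) for b in blocks]
    best = 0

    def extend(start, used, count):
        nonlocal best
        best = max(best, count)
        for i in range(start, len(blocks)):
            if not (pair_sets[i] & used):      # no pair re-used
                extend(i + 1, used | pair_sets[i], count + 1)

    extend(0, frozenset(), 0)
    return best

print(max_groups(8))
# -> 2
```

For N=8 the answer of 2 can also be argued by hand: three pair-disjoint groups of 4 can share at most one team between any two of them, so together they would need at least 9 distinct teams.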