24-06-2007, 21:52
billw (FTC #3549, Alexandria, VA)
Re: [FVC]: Analysis Shows Improvement Possible in Ranking System

First, I apologize for taking so long to get back to this discussion—I’ve had too many balls in the air lately.

I would like to thank Blake for covering for me in my absence. He has accurately described the simulation, and I agree with all of his comments.

I would like to elaborate on three of Sean’s concerns:

1. Non-random groupings have been shown to be fraught with problems.

I was dumbfounded when I read the FRC thread you referenced above. It seems that the outcome was entirely predictable--and unsatisfactory. Playing the same team repeatedly will only skew results, not refine them. Sadly, in the end, the true top eight teams were probably not represented as alliance captains in any of these tournaments.

During this past VEX season, it appeared to me and others that a similar problem was occurring and that many teams which belonged in the top eight were not selected. My suspicion was that because very good teams were occasionally paired with a very poor team (or one that did not even show up), they would record a loss. In a similar fashion, some poor teams would record an “undeserved” win. The fact that only a few matches were played exacerbated the problem and did not allow the results to average out. Furthermore, the ranking points did little to help refine the results among teams that were tied.

It occurred to me, however, that forcing the group members to play against each other would do two things: it would spread these disparities more evenly, and it might add enough efficiency that more matches could be played. I was simply attempting to answer the question: Would such a system result in more accurate rankings?

The answer is yes: it does appear to improve the results. It may also allow more matches to be run during a tournament. And although some teams interact with certain opponents more often than others, the overall rankings are more accurate, which implies that the grouping algorithm is, in fact, more impartial than the current system.
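For anyone who wants to poke at the idea without the spreadsheets, here is a rough sketch (in Python) of the kind of Monte Carlo comparison I am describing. It is not the actual simulation; the field size, noise level, group size of four, and ranking purely by wins are all assumptions made up for illustration.

[code]
# Rough sketch only, not the original spreadsheet simulation.
import numpy as np

rng = np.random.default_rng(0)
N_ROBOTS = 24          # must be a multiple of 4 (two alliances of two)
TRIALS = 500

def play(schedule, ability):
    """schedule: list of ((red1, red2), (blue1, blue2)); returns win counts."""
    wins = np.zeros(len(ability))
    for red, blue in schedule:
        red_score = ability[list(red)].sum() + rng.normal(0, 10)
        blue_score = ability[list(blue)].sum() + rng.normal(0, 10)
        if red_score > blue_score:
            wins[list(red)] += 1
        elif blue_score > red_score:
            wins[list(blue)] += 1
    return wins

def random_schedule():
    # 36 matches of 4 randomly drawn robots: about 6 matches per robot,
    # with no guarantee of balance (real schedulers do better than this).
    sched = []
    for _ in range(36):
        r = rng.permutation(N_ROBOTS)
        sched.append(((r[0], r[1]), (r[2], r[3])))
    return sched

def grouped_schedule():
    # Two rounds of random groups of four; within each group every robot
    # partners with, and opposes, every group-mate (3 matches per group),
    # for the same 36-match total as the random schedule.
    sched = []
    for _ in range(2):
        robots = rng.permutation(N_ROBOTS)
        for a, b, c, d in robots.reshape(-1, 4):
            sched += [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    return sched

def top8_found(wins, ability):
    # How many of the 8 truly best robots end up in the ranked top 8?
    return len(set(np.argsort(ability)[-8:]) & set(np.argsort(wins)[-8:]))

for name, make in [("random", random_schedule), ("grouped", grouped_schedule)]:
    hits = []
    for _ in range(TRIALS):
        ability = rng.normal(50, 15, N_ROBOTS)
        hits.append(top8_found(play(make(), ability), ability))
    print(name, "average true-top-8 robots ranked in the top 8:", np.mean(hits))
[/code]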

Hopefully, you can now see that the goal of this system is really aligned with what you and others ultimately want—an unbiased ranking system.


2. The simulation does not account for defensive abilities:

This is true insofar as there is not a discrete input which attempts to quantify this behavior.

If another random variable were added to account for defense and then combined in a linear fashion with the existing metric, we would essentially be combining two random variables to create a third, which would still have a normal distribution, since a linear combination of independent normal variables is itself normal; an equal-weight average would have a lower standard deviation than either input, while a straight sum would have a larger one. (My statistics are very rusty; someone please jump in if this is somehow incorrect.) Thus, I believe that the seed value assigned to each robot inherently accounts for defense as well as offense. If the defensive and offensive capabilities were exponentially different, a second variable would be necessary (as would a much faster computer, because the simulation would take 100 times longer).
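If my recollection is off, here is a quick numerical check anyone can run; the weights and spreads are made-up values, purely for illustration.

[code]
# Quick check, illustration only: a linear combination of two independent
# normal variables (offense and defense here) is itself normal. Averaging
# narrows the spread; a straight sum widens it.
import numpy as np

rng = np.random.default_rng(1)
offense = rng.normal(50, 15, 100_000)
defense = rng.normal(50, 15, 100_000)

blend = 0.5 * offense + 0.5 * defense   # equal-weight average
total = offense + defense               # straight sum

print("blend std:", blend.std())   # about 15 / sqrt(2), roughly 10.6
print("total std:", total.std())   # about 15 * sqrt(2), roughly 21.2
[/code]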

A larger problem, in my mind, is that the simulation assumes that a normal distribution accurately approximates the overall performance of the robots. Given my personal experience, I would be surprised if this were the case. I base this on the fact that I have seen far too many robots experience partial or catastrophic failures which effectively disable them for the entire match. In a normal distribution, this occurrence would be quite rare, but in practice it is not. On the other hand, actual data would be required to determine the true distribution. Without that data I am forced to rely upon a distribution which is probably “close enough” to make the simulation valid. As an aside, there are also many very low scores in the data, so perhaps a normal distribution is not that far off.
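For what it is worth, here is a small illustration of that worry; the 15% failure rate and the severity model are numbers I picked out of the air, not measured values.

[code]
# Made-up mixture model: a normal "everything works" score, but some
# fraction of matches see a partial or catastrophic failure.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 10_000
base = rng.normal(50, 15, n)            # score when nothing breaks
failed = rng.random(n) < 0.15           # assume 15% of matches have a failure
severity = rng.uniform(0.0, 0.3, n)     # a failed robot keeps 0-30% of its score
scores = np.where(failed, base * severity, base)

# Compare the observed share of near-zero scores with what a single normal
# having the same mean and standard deviation would predict.
print("observed P(score < 10):", (scores < 10).mean())
print("normal-fit P(score < 10):", norm.cdf(10, scores.mean(), scores.std()))
[/code]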

One last note regarding defense: I am not a fan of aggressive tactics where one robot is purposely pushing (and possibly entangling) another. I will elaborate on this idea further in the game suggestions thread, but I do not see this type of strategy as being appropriate for VEX. I did not see many examples this past season of a more refined defense whereby a goal was blocked or emptied of balls—both of which I would consider appropriate. I understand that FRC has a much different platform, and I can understand all of the benefits of the defensive tactics as they are applied there, but I am not convinced that they are beneficial for VEX.

3. This would never work in FRC.

This simulation had nothing to do with FRC. The grouping algorithm would require far too many matches and robots there; it really only works for the paired-alliance situation used in VEX.

Sean, I would be happy to further discuss any of the above if you wish.


Josh, I think you make a good point. I would be curious about the outcome of normally distributing the “ability” level as well. Unfortunately, it would take quite a bit of time to incorporate this change. At some point, it may be necessary to make some other major revisions. I’ll put it in then, if possible.

New Data: Since the original run, I have been bothered by the fact that I did not add in ranking points. I have since modified the simulation and replaced the “Delta” method with a W/L/T + ranking points. The results do not show much improvement. They are attached.

For those who are not looking at the results, keep in mind that simply switching to a “Total Points” system far surpasses the ability of the current algorithm as well as that of the grouping algorithm. Furthermore, it does away with the inconvenient requirement imposed by the grouping algorithm that a specific multiple of robots be used.
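To make the comparison concrete, here is a toy example of the two sort keys; the numbers and the specific ranking-point and win-point rules shown are invented for illustration, not taken from the actual rules or results.

[code]
# Invented standings, illustration only.
# (team, wins, losses, ties, ranking_points, total_points)
teams = [
    ("A", 5, 1, 0, 61, 142),
    ("B", 5, 1, 0, 48, 170),
    ("C", 4, 2, 0, 77, 188),
]

# Record-based key: W/L/T first (2 points per win, 1 per tie is my assumption),
# with ranking points as the tie-breaker.
by_record = sorted(teams, key=lambda t: (2 * t[1] + t[3], t[4]), reverse=True)

# "Total Points" key: ignore the record entirely and rank by points scored.
by_total = sorted(teams, key=lambda t: t[5], reverse=True)

print("record + RP:", [t[0] for t in by_record])    # ['A', 'B', 'C']
print("total points:", [t[0] for t in by_total])    # ['C', 'B', 'A']
[/code]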

One of my largest concerns with the VEX (now FTC) program is that it will be shackled to the paradigms used by the FRC program. It is a new platform with different constraints and possibilities, all of which should be considered. Again, I will elaborate further on this in the other thread, but the real question is: Should FIRST be exploring an alternative ranking system? If you think that this is a worthy undertaking, you might want to send them an e-mail or add a note of support or dissent to this thread so that some sense of the community becomes apparent.

As I mentioned before, I would be happy to pass along the spreadsheets used to arrive at these results to anyone that is interested.
Attached Files
FVC Simulation Summary 06-23-07.pdf (20.1 KB)
