How does human processing integrate with computer processing?
Mechanical Turk has two redundant general obstacle-avoidance systems. One is similar to the initial version of this system that was used in the first DARPA challenge. The second sends a camera feed back to a LabVIEW dashboard, which superimposes a grid on top. A human operator has a keypad with each button corresponding to a square on the grid, and pushes a button whenever an obstacle occupies that square.
The automatic system is designed to have very high specificity and moderate sensitivity. In other words, it is very unlikely to generate false positives, but it will miss some obstacles, so during normal use the human operator is responsible for detecting most of them. The automatic system serves as a backup in case the operator is distracted, unclear on instructions, etc.
The result from both systems is an approximate geometric position for every obstacle in the field of view, which the robot can use to modify its route plan.
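To make the keypad idea concrete, here is a minimal sketch of how button presses could be turned into grid cells and merged with whatever the automatic system reports. Everything specific in it is an assumption for illustration: the 3x4 grid, the row-major button numbering, and the names `button_to_cell`, `cell_center`, and `merge_obstacles` are hypothetical, not taken from the actual dashboard. It also stops at image coordinates; converting a cell to a position on the field would need camera calibration that isn't covered here.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple

# Assumed layout: a 3-row by 4-column grid over the camera image, with
# keypad buttons numbered 0..11 in row-major order (top-left is button 0).
GRID_ROWS = 3
GRID_COLS = 4


@dataclass(frozen=True)
class Cell:
    row: int
    col: int


def button_to_cell(button: int) -> Cell:
    """Map a keypad button index to the grid square it stands for."""
    if not 0 <= button < GRID_ROWS * GRID_COLS:
        raise ValueError(f"button {button} is outside the grid")
    row, col = divmod(button, GRID_COLS)
    return Cell(row, col)


def cell_center(cell: Cell, image_width: float, image_height: float) -> Tuple[float, float]:
    """Return the pixel coordinates (x, y) of the middle of a grid square."""
    cell_w = image_width / GRID_COLS
    cell_h = image_height / GRID_ROWS
    return ((cell.col + 0.5) * cell_w, (cell.row + 0.5) * cell_h)


def merge_obstacles(operator_buttons: Iterable[int],
                    automatic_cells: Iterable[Cell]) -> Set[Cell]:
    """Union of operator-marked squares and automatically detected ones.

    A plain union suits a high-specificity automatic detector: its rare
    detections are trusted, and the operator fills in what it misses.
    """
    return {button_to_cell(b) for b in operator_buttons} | set(automatic_cells)
```

For example, `merge_obstacles([0, 5], [Cell(1, 1), Cell(2, 3)])` gives three occupied squares, since button 5 and the automatic detection at row 1, column 1 refer to the same square.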
For example, in Aim High in 2006 I would argue that auto-targeting bots could outshoot even the best driver’s aim. Tracking a known, invariant object has been done effectively using a variety of techniques in computer vision.
While this is certainly true (we did it with our 2009 bot), there are a couple of things to consider:
- Automatic tracking, even of a known invariant object, takes a significant amount of time for development, calibration, and debugging.
- Automatic tracking systems are more likely to fail than human vision processing. Case in point: the number of teams that needed to recalibrate cameras for field conditions.
- Automatic and manual tracking systems are not mutually exclusive. For example, a system could be designed that uses automatic, high-precision tracking normally, and switches over to operator-supplied data if its confidence rating drops below an arbitrary level.
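The switch-over in the last point could look something like the sketch below. It assumes the automatic tracker reports a confidence between 0 and 1 alongside its target estimate; the threshold value of 0.6 and the name `select_target` are made up for illustration, and a real system would need to tune the threshold on the field.

```python
from typing import Optional, Tuple

Target = Tuple[float, float]  # assumed (x, y) target position

# Arbitrary cut-off, purely an assumption; it would need tuning in practice.
CONFIDENCE_THRESHOLD = 0.6


def select_target(auto_target: Optional[Target],
                  auto_confidence: float,
                  operator_target: Optional[Target],
                  threshold: float = CONFIDENCE_THRESHOLD
                  ) -> Tuple[Optional[Target], str]:
    """Pick which tracking source to trust for this control cycle.

    Returns the chosen target and a label naming the source, so the
    dashboard can show the operator when they are in control.
    """
    # Prefer the automatic tracker while it is confident enough.
    if auto_target is not None and auto_confidence >= threshold:
        return auto_target, "automatic"
    # Otherwise fall back to whatever the operator last supplied.
    if operator_target is not None:
        return operator_target, "operator"
    # Neither source is usable; the caller decides what to do (e.g. hold fire).
    return None, "none"
```

Returning the source label along with the target is a small design choice: it makes it easy to log how often the fallback kicks in, which is exactly the kind of data that would show whether the automatic system is worth its development time.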
In order to come up with an effective strategy, you need to know the full state of the field at a moment in time.
Do you? I think this is something to investigate further. What I observed in Lunacy at least was that there were a small number of “global variables” (real-time scores, possession of super cells, etc.) that affected tactics, but most of the action at any given time seemed to cluster in smaller “cells” of 2-4 robots in a particular area. If it turns out that the amount of field state information is larger, or needs to be updated more frequently, than is feasible for the operators, it’s possible that FIRST could tweak the FMS to make data that it already collects (i.e., real-time scoring) available to robots, especially if there were many teams interested in implementing this sort of system.
Regarding intent, this is a very tricky issue for AI systems to figure out, because fully understanding intent in a general sense requires a general theory of mind, which is an extremely hard problem. However, I think that in the restricted domain of an FRC game, judging intent is feasible. For example, to build predictive tracking into our Lunacy bot, we generated a Markov model that the robot could use when it did not have a line of sight to the trailer. The model was able to predict not only the basic physics but also common evasive strategies. In addition, if one can find ways to shrink the domain even more (i.e., concentrating on a game “cell” instead of the entire match), the problem becomes easier because the number of basic actions and combinations that can occur is decreased (which lends itself better to an expert system).
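For anyone who hasn't worked with one, here is a toy first-order Markov chain that shows the general idea of predicting an opponent's next move when you can't see them. This is not our Lunacy code: the class `MarkovPredictor`, the maneuver labels, and the choice to discretize motion into a handful of named states are all assumptions for the sake of a small example.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence


class MarkovPredictor:
    """First-order Markov chain over discrete states (e.g. maneuvers)."""

    def __init__(self) -> None:
        # transitions[a][b] counts how often state b followed state a.
        self.transitions: Dict[Hashable, Counter] = defaultdict(Counter)

    def observe(self, sequence: Sequence[Hashable]) -> None:
        """Update the counts from one observed sequence of states."""
        for current, following in zip(sequence, sequence[1:]):
            self.transitions[current][following] += 1

    def next_distribution(self, state: Hashable) -> Dict[Hashable, float]:
        """Estimated probability of each next state, given the current one."""
        counts = self.transitions.get(state)
        if not counts:
            return {}  # state never seen, so no prediction is possible
        total = sum(counts.values())
        return {s: c / total for s, c in counts.items()}

    def predict(self, state: Hashable, steps: int = 1) -> List[Hashable]:
        """Follow the most likely transition for up to `steps` steps.

        Meant for when line of sight is lost: start from the last state
        that was actually observed and roll the chain forward.
        """
        path: List[Hashable] = []
        for _ in range(steps):
            counts = self.transitions.get(state)
            if not counts:
                break
            state = counts.most_common(1)[0][0]
            path.append(state)
        return path
```

Usage would be along the lines of calling `observe(["straight", "straight", "straight", "turn_left", "straight"])` on logged data, then `predict("turn_left", steps=2)` once the target disappears from view. Restricting the states to what can happen inside one game “cell” keeps the transition table small, which is the same reason the smaller domain is easier.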
Okay, that was a ridiculously long post. But basically, this isn’t intended to be a guide on “how to build a semantic robot”; it’s really meant to get people thinking about it, since there are still so many unanswered questions. And I think that ultimately the only way to test some of these things will be to build a bot using them and enter it in a competition.