It’s difficult to tell from your post if you are aware of how the Kinect system works and what the open source driver can and can’t do.
The Kinect hardware provides a 640x480 RGB image and a 320x240 monochrome depth image. It also has a motorized tilt mechanism and a multi-array microphone. All of the gesture recognition is done in software that runs on the 360 itself, not on the Kinect.
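To give a feel for what that depth stream actually contains: each pixel is an 11-bit raw reading, not meters. A commonly cited approximation from the OpenKinect community converts raw values to metric distance; here is a minimal sketch using a simulated frame (I don't have real driver output handy, so the frame contents here are made up):

```python
import numpy as np

def raw_to_meters(raw):
    """Approximate conversion from raw 11-bit Kinect depth to meters.

    Uses an approximation circulated in the OpenKinect community.
    A raw value of 2047 means 'no reading' and is mapped to NaN.
    """
    raw = np.asarray(raw, dtype=np.float64)
    meters = np.full(raw.shape, np.nan)
    valid = raw < 2047
    meters[valid] = 1.0 / (raw[valid] * -0.0030711016 + 3.3309495161)
    return meters

# Simulated 320x240 frame of raw depth values (rows x cols = 240 x 320).
raw_frame = np.random.randint(400, 1000, size=(240, 320))
depth_m = raw_to_meters(raw_frame)
print(depth_m.shape)  # (240, 320)
```

Larger raw values correspond to larger distances under this formula, which is why the no-reading sentinel sits at the top of the 11-bit range.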
From what I have read, the open source driver currently in the wild only covers the two video streams so far.
Using this device to control a robot would require writing a pretty hefty amount of image processing code to do the gesture recognition yourself. Using it as a sensor for a robot, on the other hand, is a much more approachable problem: the depth stream alone is useful as-is.
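As a sketch of the sensor use case: even without any gesture recognition, a robot can do crude obstacle avoidance by scanning a horizontal band of the depth image for the nearest return. The `nearest_obstacle` helper below is my own made-up illustration, operating on a frame already converted to meters (invalid pixels as NaN):

```python
import numpy as np

def nearest_obstacle(depth_m, band=40):
    """Closest valid distance in a horizontal band at image center.

    A crude 'is something in front of me' check: take `band` rows
    around the vertical middle of the frame and return the minimum
    finite depth, or None if the band has no valid readings.
    """
    h = depth_m.shape[0]
    strip = depth_m[h // 2 - band // 2 : h // 2 + band // 2, :]
    valid = strip[np.isfinite(strip)]
    return float(valid.min()) if valid.size else None

# Hypothetical frame: a wall 2 m away, with an object at 0.8 m dead ahead.
frame = np.full((240, 320), 2.0)
frame[100:140, 150:170] = 0.8
print(nearest_obstacle(frame))  # 0.8
```

Something this simple is already enough for "stop before you hit the wall" behavior, which is the sort of thing that makes the depth camera attractive as a robot sensor even with the driver only exposing raw streams.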