We've used the OpenCV library in the past. It has a number of utilities for vision processing and for identifying/localizing features. I was not involved in the programming, so I do not know the details. Here's the link to our github repository we used in 2013 and 2014 to find goals; we implemented it on a Raspberry Pi, because in 2012 our goal finder bogged the cRIO down too much between network traffic and CPU time. I understand that the latest Raspberry Pi offerings start at $5 and are more powerful than the ones we used in 2013 and 2014. You can connect the USB camera directly to the Pi and only send back the coordinates and key attributes of the goal/object over the network.
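In case it helps, here's a rough sketch of what that can look like with OpenCV's Python bindings on the Pi: threshold the image, find the biggest contour, and send only the target's center and size over UDP. This is not our team's actual code (that's in the repo above); the camera index, HSV range, robot IP, and port are all placeholders you'd tune for your own setup.

```python
# Sketch only: threshold a bright target, find its bounding box,
# and send just the center coordinates and size over UDP to the robot.
import json
import socket

import cv2

ROBOT_ADDR = ("10.0.0.2", 5800)   # placeholder robot IP and port
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

cap = cv2.VideoCapture(0)          # USB camera plugged into the Pi

while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Threshold in HSV space; tune the range for your target and lighting.
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (50, 100, 100), (90, 255, 255))

    # OpenCV 4.x return order: (contours, hierarchy)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        largest = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(largest)
        # Only a few bytes go over the network instead of the whole image.
        packet = {"cx": x + w // 2, "cy": y + h // 2, "w": w, "h": h}
        sock.sendto(json.dumps(packet).encode(), ROBOT_ADDR)
```

The point of the split is that the Pi does all the heavy pixel work and the robot controller only has to parse a tiny packet each frame.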
If you want an image to help drive, you can also do some vision processing to reduce the size of what's being sent back. I had good luck with the "Canny Edge Detector" algorithm in ImageMagick when I played around with it last year; it sent back a useful schematic image in just a few kilobytes. I see that OpenCV has the Canny algorithm as well.
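For the OpenCV version of that, something like the sketch below is all it takes. The thresholds and filename are placeholders, but it shows the idea: a mostly-black edge image compresses to a few kilobytes as a PNG, so it's cheap to ship back to the driver station.

```python
# Sketch only: run OpenCV's Canny edge detector on a frame and PNG-encode
# the result to see how small the "schematic" image ends up.
import cv2

frame = cv2.imread("frame.jpg")                 # placeholder input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)        # smooth first to cut noise edges
edges = cv2.Canny(gray, 50, 150)                # low/high hysteresis thresholds

ok, png = cv2.imencode(".png", edges)           # compress before sending
if ok:
    print("edge image size:", len(png), "bytes")
```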