Hello all,
Our team is working on various off-season projects, and one I would like to at least get a taste of is vision tracking with the Axis camera.
We were able to get the camera to display in the default dashboard this year but nothing more.
So how do you get the camera to track an object? We mainly use Java as our programming language, but we have very little experience with camera tracking. Also, I can't seem to find much in the way of a full tutorial, so one would be of great help.
The first thing I recommend is to understand the concept.
Typically what you are going to do is teach the vision system an image that it can track; you'll use Vision Assistant for that. Then you extract a pixel value from each processed frame and apply it to the drive motors.
NI has an AWESOME tutorial about Vision Assistant.
The basic concept is to extract the "X" value, which typically corresponds to left and right on the robot relative to the camera. Once you have that value, you compute an "error" value relative to the center of the frame, and that error is used to steer your robot so it keeps the tracked object in the middle of the camera's FOV (field of view).

For example, say 80 pixels is the middle of the vision frame. If your camera is mounted on your robot and it sees the object, where is that object in the frame? If it's located at 80 pixels, there is zero error. If it's located at 75 pixels, there is a 5-pixel error. To correct that, you need to find some scalar that converts those 5 pixels into a drive motor value. That starts turning the robot base toward the object and drives the error back to zero, with the object at 80 pixels again. Generally a simple "P" algorithm works well for learning the concepts; then work your way up from there.
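To make that concrete, here is a minimal Java sketch of that "P" idea. Everything in it (the class name, the 160-pixel-wide frame, the kP value) is made up for illustration and would need tuning on your robot.

```java
// Minimal proportional steering sketch: pixel error in, turn command out.
public class CameraSteering {
    static final int FRAME_CENTER_X = 80;   // middle of a 160-pixel-wide frame
    static final double kP = 0.02;          // pixels -> motor output scalar (tune this)

    /**
     * Convert the target's x position (in pixels) into a turn command in [-1, 1].
     */
    public static double steerToward(int targetX) {
        int error = FRAME_CENTER_X - targetX;   // 0 when the target is centered
        double turn = kP * error;               // simple proportional response

        // Clamp to the valid motor output range.
        if (turn > 1.0)  turn = 1.0;
        if (turn < -1.0) turn = -1.0;
        return turn;
    }
}
```

You would call steerToward() each loop with the target's pixel X and feed the result into your drive code (for example, the rotate input of an arcade-style drive call), then adjust kP until the robot turns toward the target without oscillating.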
Thanks for the info.
Looking at Vision Assistant, it seems to be able to create scripts only for LabVIEW and/or C, which is a problem since we use Java.
Are there any solutions for Java teams?
We have netbooks for programming, and this program is not on them. When we do open it, though, it says the resolution of the netbook's screen isn't high enough. We tried connecting it to a larger monitor, but it still read the netbook's screen and wouldn't let the program open. Any ideas?
Alas, I wish it was that easy. Our team uses Java because it is easier to teach it to people.
If you want samples of camera code, there are a few sample projects included in the FRC plugins. I believe they are based on the 2009 game, Lunacy, where robots could track a pink/green target. When you go to create a project, instead of choosing a new project, you can choose a sample project.
Our team is (as I mentioned above) using Java this year. If you want a fast frame rate on the dashboard (greater than 20 fps), you usually have to hook the camera to the switch and then modify the dashboard code (as opposed to hooking the camera to the cRIO). Our team hasn't figured out how to hook the camera to the switch and still process images on the cRIO. If your team does, please tell us.
Every time we have sought to use the Axis camera, we have found that it is such a bandwidth hog that it screws up more than we could ever hope to gain by using it.
I’d like to think that it’s something with us, and not a fundamental flaw in the hardware or software, but we’ve tried just about everything you can imagine to get it to work, to no (well, little) avail.
In 2010 it could find the target and turn our robot (with the gyro) to shoot, but it could not do so fast enough that a bump wouldn't screw the whole thing up, and it caused all kinds of watchdog errors in the process.
Has anyone else had any problems like this? Is there a way to fix it? We’re thinking of using tracking in 2012 if it’s applicable, which it most likely will be.
The Java libraries use wrappers to access the underlying C/C++ implementation of the NIVision libraries. Some of the basic wrappers were built by Sun and WPI, but the set is not complete. I believe it is available on FirstForge so that the community can contribute.
As to the original question. The camera is just a camera and doesn’t track. The vision processing code inspects images using the code you write and can tell you the location or other attributes of the features in the image. As mentioned by others, it is very easy to ask for more work than can be done, and image processing in general is always a challenging task. As mentioned, Vision Assistant is a great way to experiment with the tools to locate and process features in an image. While it is possible to generate code from Vision Assistant for C and LabVIEW, it is typically not that much code and is pretty easy to write by hand once you’ve used the assistant.
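For the Java folks, here is a rough sketch of what that hand-written processing might look like using the WPILibJ image wrappers. I'm assuming the AxisCamera/ColorImage/BinaryImage/ParticleAnalysisReport classes here, and the HSL threshold numbers are placeholders you would get from Vision Assistant, so check the javadocs for your version before relying on any of it.

```java
import edu.wpi.first.wpilibj.camera.AxisCamera;
import edu.wpi.first.wpilibj.image.BinaryImage;
import edu.wpi.first.wpilibj.image.ColorImage;
import edu.wpi.first.wpilibj.image.ParticleAnalysisReport;

public class TargetFinder {
    private final AxisCamera camera = AxisCamera.getInstance();

    /**
     * Returns the x pixel coordinate of the largest blob that passes the
     * color threshold, or -1 if nothing is found. The HSL limits below are
     * placeholders; get real numbers by experimenting in Vision Assistant.
     */
    public int findTargetX() {
        try {
            ColorImage image = camera.getImage();
            BinaryImage thresholded = image.thresholdHSL(100, 150, 100, 255, 100, 255);
            ParticleAnalysisReport[] particles = thresholded.getOrderedParticleAnalysisReports();

            int x = -1;
            if (particles.length > 0) {
                x = particles[0].center_mass_x;   // largest particle first
            }

            // Always free images; they live in native memory, not the Java heap.
            thresholded.free();
            image.free();
            return x;
        } catch (Exception e) {
            e.printStackTrace();
            return -1;
        }
    }
}
```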
The netbooks have a screen resolution smaller than what Vision Assistant was designed for. At one point, the video driver had a "shrink to fit" option that gave it a virtual resolution larger than the physical limits of the screen. The last time I looked on a Classmate, I wasn't able to find it. The other options are to run with an external monitor if you have the hard-shelled Classmate, or to locate another computer to use for Vision Assistant.
If you have other questions about code performance or approach, please ask.
To get images from the camera, I would suggest writing your own code. In my experience, the C++ WPILib functionality for getting images from the camera (the AxisCamera class, if I recall) was sketchy, slow, and didn't always work the last time I tried to use it.
To get images yourself, you can do exactly what your browser (Firefox, IE, Chrome, etc.) does and use HTTP requests. The format is pretty simple, and you can even use Wireshark to see what Firefox sends and just send that exact request if you like. All you need to do is create a new thread, open a TCP socket to the camera, and in a loop send a packet containing your HTTP request, then read the response, which will be an image from the camera.
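Here is a rough sketch of that in plain Java SE, just to show the shape of the request and response. A few assumptions: on the cRIO's Squawk VM you would use the javax.microedition.io connection classes rather than java.net.Socket, the snapshot URL below (/axis-cgi/jpg/image.cgi) is the usual Axis one but should be verified against your camera's documentation, and depending on how the camera is configured you may also need to enable anonymous viewing or add an Authorization header.

```java
import java.io.DataInputStream;
import java.io.OutputStream;
import java.net.Socket;

public class AxisSnapshot {

    /** Fetch a single JPEG frame from the camera over plain HTTP. */
    public static byte[] grabFrame(String cameraIp) throws Exception {
        Socket socket = new Socket(cameraIp, 80);
        try {
            OutputStream out = socket.getOutputStream();
            out.write(("GET /axis-cgi/jpg/image.cgi HTTP/1.1\r\n"
                     + "Host: " + cameraIp + "\r\n"
                     + "Connection: close\r\n\r\n").getBytes());
            out.flush();

            DataInputStream in = new DataInputStream(socket.getInputStream());
            int contentLength = -1;
            String line;
            // Read the HTTP headers until the blank line; remember Content-Length.
            while ((line = readLine(in)).length() > 0) {
                if (line.toLowerCase().startsWith("content-length:")) {
                    contentLength = Integer.parseInt(line.substring(15).trim());
                }
            }
            if (contentLength < 0) {
                throw new Exception("Response had no Content-Length header");
            }
            byte[] jpeg = new byte[contentLength];
            in.readFully(jpeg);   // the response body is the JPEG image
            return jpeg;
        } finally {
            socket.close();
        }
    }

    // Minimal CRLF-terminated line reader (DataInputStream.readLine is deprecated).
    private static String readLine(DataInputStream in) throws Exception {
        StringBuffer sb = new StringBuffer();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c != '\r') {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }
}
```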
You then feed the image into the NI Vision functions of your desire, as Greg mentioned. NI Vision Assistant is an excellent tool for testing out some functions. You can even generate C code with it that you can use. The code it makes is inefficient and clunky (because it has to be, for the modularity required in its case), but you can rip out all the excess and see exactly which function calls it is making.
This method allows you complete control over timing, memory, etc.
Keeping in mind the apparent skill set of the original poster, I don’t think that’s a very helpful suggestion. If there are indeed problems with the provided library functions, it would seem best to find appropriate fixes for those problems.
Given that many teams had great success using the camera, I'm reluctant to accept that the problems you encountered were inherent in the WPILib functions.
In my experience…
Would you care to give us some information about your experience, so we have a basis for evaluating your viewpoint? It’s hard to take an anonymous first-time poster named “moopydoopy” seriously, no matter how reasonable the message appears.
I can’t find the post, but I remember one team figuring out something counterintuitive about the camera last year. If you configure the image size to a small number of pixels in order to try to get frames faster and reduce camera lag, you’re likely to overload the cRIO’s image processing ability and actually end up making the lag worse. Maybe that could have something to do with what you encountered.
Not sure if this is the same issue, but I've seen far more jitter when the camera compression is set to the extremes. Generally I keep it in the center range, from 20 to 70. Also, I found an odd pattern in the vxWorks memory manager. If you have a string buffer that resizes below, then above, the 16 kB mark, the block is relocated to the general heap manager, which seems to be a pretty slow operation. It has been over a year, so I don't remember the amount of time, but it was enough to miss frame(s). Some image size/compression combos aggravate that issue, with a size bouncing above and below that mark.
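For what it's worth, you can also pin the resolution and compression in code so they don't wander into those bad combinations. A quick sketch, assuming the writeResolution/writeCompression/writeBrightness methods on the AxisCamera class; check your WPILibJ javadocs for the exact names in your version.

```java
import edu.wpi.first.wpilibj.camera.AxisCamera;

public class CameraSetup {
    public static void configure() {
        AxisCamera camera = AxisCamera.getInstance();
        // Moderate settings: an image small enough to process quickly,
        // with compression kept out of the extremes mentioned above.
        camera.writeResolution(AxisCamera.ResolutionT.k320x240);
        camera.writeCompression(30);
        camera.writeBrightness(50);
    }
}
```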
As with wheel count or wheel size, it is pretty easy to treat the camera as a mystical creature that you somehow coax into doing as you wish. In reality, it is a pretty complicated and semi-intelligent measurement system with a complex API. I've also struggled with it, trying to understand why the timings changed, only to discover that it was because I wore a dark shirt that day. The camera was facing me, and I'd do the hand wave or finger snap to measure its response time. It didn't occur to me that my shirt was the primary element in the scene and was determining the auto-exposure settings.
Please post issues or questions so that we can all learn more about cameras and image processing.