Recently there has been some debate about whether a vision closed-loop programming technique (where the position of an object in the frame directly drives a motor output) has benefits over vision software that takes a single frame, calculates an error from that frame, and then uses that error as the setpoint for a PID controller driven by some other sensor.
Benefits of Vision Closed Loop:
Little to no need for higher-level arithmetic
Can be used to line up while moving, as the robot constantly updates toward the target
Downsides of Vision Closed Loop:
Needs to run at low latency to be worth it, as it is directly updating the motor controller
Benefits of Vision-fed Encoder/Gyro PID Closed Loop:
Can be run on a slow (<5 fps) processor
Less jitter, as the loop can be run once to hit the target
Downsides of Vision-fed Encoder/Gyro PID Closed Loop:
Requires higher-level arithmetic to calculate the error from the image
If the target moves as a result of robot movement outside of the loop, the loop will target inaccurately.
This is a purely philosophical question rather than an implementation question, so, for a team with a coprocessor vs. one that has to run all its vision on the RoboRIO, which is best? Discuss.
Vision-fed encoder or gyro, hands down. Using vision for feedback requires insanely low latency as well as heavily damped PID loops. It’s much better to just feed the vision into setpoints for encoders or gyros (with some latency compensation), then let the PID loop bring you there.
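To make that concrete, here is a minimal sketch in Java of the “vision sets the setpoint, gyro loop does the work” idea. The class, method names, and gain are placeholders for illustration, not any team’s actual code; it assumes you can read the target’s angular error from a frame and the gyro heading at the time the frame was captured.

```java
// Minimal sketch (placeholder names and gains, not any team's actual code):
// a slow vision result updates a gyro setpoint, and a fast gyro loop chases it.
public class VisionToGyroAim {
    private static final double kP = 0.02;   // proportional gain, tune on the robot
    private double setpointDegrees = 0.0;    // latched gyro heading we want to reach

    /** Call once per processed frame (slow, e.g. 10-30 Hz). */
    public void onNewFrame(double targetErrorDegrees, double gyroHeadingAtCaptureDegrees) {
        // Latency compensation: reference the heading the gyro reported when the
        // frame was captured, not the heading right now.
        setpointDegrees = gyroHeadingAtCaptureDegrees + targetErrorDegrees;
    }

    /** Call every control cycle (fast, e.g. 50-200 Hz). Returns a turn command. */
    public double calculateTurnOutput(double currentGyroHeadingDegrees) {
        double error = setpointDegrees - currentGyroHeadingDegrees;
        return kP * error;   // plain P loop; add I/D terms as the mechanism needs
    }
}
```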
Let’s play in the fantasy world where image processing takes 0 time.
With off-the-shelf webcams, the fastest you’re likely to get is 30 fps. Compare that to the update rate of the NavX, which is configurable between 4 and 200 Hz. The gyro can be a faster source of updates.
I’ve always subscribed to the theory that running control loops on sensors that respond quicker is more successful.
Add to this the fact that processing time will lower my effective update rate, and I feel it’s just much more likely to be successful with the gyro.
EDIT: Encoders on a turret are generally even more accurate, so I’d just rely on them. I’ve tried it based on vision position (2006) and it was functional but not optimal.
Last year we captured through an Axis camera and did our processing on the driver station. This had way too much latency to be usable as a feedback device, even with significant damping. Even with on-board processing, I think the latency will always be too high to use vision exclusively as feedback. The one exception I could see is a computing device with a built-in camera, such as a Nexus. I know 254 used one for Stronghold, and I would be curious to see what kind of update rate they got for vision data back to the Rio.
Trying to do PID on pixels is a recipe for tears and frustration. You will likely not have enough resolution to do proportional control without overshooting, let alone have a good derivative value.
Also, resolution isn’t the only issue - you’ll have some lag in your vision, which will hurt a ton if you want to go at a reasonable speed.
If you have the resources to do vision, you have the resources to do the angle conversion from the vision program to the gyro. It’s not hard - last year we just used an empirically found scaling factor, which took about 20 minutes to find. You need some sort of controller either way, so finding the scaling factor is basically the only “higher level arithmetic” that you need (see the sketch below).
You can get away with doing vision-based feedback if you align slowly, but in general that isn’t an option for us (especially last year - our two ball’s main speed issue was with vision alignment). It’s not significantly harder to do gyro-based alignment, so why not do it?
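For reference, the empirical scaling-factor approach could look something like this sketch. The constant and names are made up for illustration, not the poster’s real numbers: pixels of offset from the vision program become a heading setpoint for an ordinary gyro PID loop.

```java
// Rough sketch of the empirical scaling-factor idea (the constant and names are
// made up for illustration): pixels from the vision program become a gyro setpoint.
public class PixelToAngle {
    // Found empirically: note the pixel offset to a target and how many degrees the
    // robot had to turn to center it, then divide. Re-measure if the camera changes.
    private static final double DEGREES_PER_PIXEL = 0.15;   // example value only

    public static double gyroSetpoint(double currentHeadingDegrees, double targetOffsetPixels) {
        return currentHeadingDegrees + targetOffsetPixels * DEGREES_PER_PIXEL;
    }
}
```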
The Pixy we’re using is tracking at 50 frames per second and spitting out signal via analog output piped straight into the RoboRIO. We’re basically reading it like a potentiometer with another DIO channel used for target verification. Does anyone have any insight as to whether or not we should expect to have to use the gyroscope as described for tracking, or does that sound fast enough to use the analog signal as the input?
Don’t do this. The Pixy makes no guarantee of frame processing time, and you may see spurious detections. I would feed that analog signal into a filter that makes sure it isn’t jumping around too fast (compared to how you know your robot is moving), then dump that signal into a controller based on the gyro or encoders. You are already going to want to tune that loop for any kind of autonomous driving, so it’s not that much more work.
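One possible software version of that filter, sketched in Java with made-up names and thresholds: ignore samples that jump farther than the robot could plausibly have moved since the last reading, and only resync after several rejected readings in a row.

```java
// One possible software filter (made-up names and thresholds): ignore samples that
// jump farther than the robot could plausibly have moved since the last reading,
// and only resync after several rejected samples in a row.
public class TargetJumpFilter {
    private final double maxChangePerSample;   // largest believable change per read
    private double lastGood = Double.NaN;
    private int rejectedInARow = 0;

    public TargetJumpFilter(double maxChangePerSample) {
        this.maxChangePerSample = maxChangePerSample;
    }

    /** Returns the last trusted value; implausible jumps are held off. */
    public double filter(double rawSample) {
        boolean plausible = Double.isNaN(lastGood)
                || Math.abs(rawSample - lastGood) <= maxChangePerSample;
        if (plausible || ++rejectedInARow > 3) {
            lastGood = rawSample;   // accept plausible readings, or resync after
            rejectedInARow = 0;     // several rejections in a row
        }
        return lastGood;
    }
}
```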
Good to know, thank you. We’ve got a solution that so far has done a nice job of tracking only what we want but artifacts do still rarely pop up.
Does it make more sense to do filters on a hardware level or a software level for this application? I don’t know if we’ve ever implemented those before in our code but we do a decent amount of custom circuit soldering.
We had fair performance with a vision closed loop in 2013 and 2014, when the goals were several times wider than they were tall, but it was easily stymied by any sort of nudge from another robot or a collision with an obstacle (pyramid or wall), it wasn’t really fast (most of a second even when you were close), and it didn’t work well for Stronghold. It might work this year for the peg, where you drive right up to the target, but not likely for the high goal.
For the past two years we’ve been using vision to drive wheel encoder loops. We may sanity-check with an inertial system (accelerometer/gyro), though more likely that will be a driver function. In the off season or next year we may make the inertial system an intermediate step: the slow vision loop controls the inertial loop at an intermediate speed, and the inertial loop controls the drive wheel encoders at high speed.
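A rough sketch of that cascaded structure, with placeholder gains, rates, and names (an illustration of the idea, not the team’s code):

```java
// Sketch of the cascaded structure (placeholder gains and rates): slow vision sets a
// heading target, a mid-rate gyro loop turns heading error into a turn-rate command,
// and fast encoder velocity loops track the resulting wheel speeds.
public class CascadedAim {
    private double headingTargetDeg = 0.0;   // written by the slow vision loop

    // ~10-30 Hz: vision corrects the heading target.
    public void visionUpdate(double gyroHeadingDeg, double visionErrorDeg) {
        headingTargetDeg = gyroHeadingDeg + visionErrorDeg;
    }

    // ~50-200 Hz: gyro loop converts heading error into a desired turn rate.
    public double gyroLoop(double gyroHeadingDeg) {
        double kP = 0.05;   // tune on the robot
        return kP * (headingTargetDeg - gyroHeadingDeg);
    }

    // Fastest rate (often on a motor controller): per-side encoder velocity loop that
    // tracks the wheel speed implied by the turn-rate command.
    public double wheelVelocityLoop(double desiredWheelSpeed, double measuredWheelSpeed) {
        double kP = 0.001;  // tune on the robot; usually combined with a feedforward term
        return kP * (desiredWheelSpeed - measuredWheelSpeed);
    }
}
```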
I believe the PS Eye can get 60 fps, but I agree. Unless you have a very large input image (we’re talking 1440p or so), the traditional sensor feedback will also be more precise than the camera. So using a sensor gives you both more updates per second and more accuracy; theoretically, that is a no-brainer.
Getting a camera to run any faster than 30 fps takes special hardware and pretty careful code. If 33 ms is an OK loop rate for your mechanism, it is a valid sensor; otherwise you might want to augment it with something faster and/or with less lag. In many situations, an ultrasonic sensor, gyro, encoder, or pot makes a great complement to the camera’s higher-level control feedback.
It is great that you are asking for advice, but there are teams who will make both solutions work. My advice is to figure out how you can do a quick prototype or test.