I have WPILib robot pose estimation running on a roboRIO v1. It’s slow there, and it might actually be somewhat useful if run on the faster roboRIO v2.
The only use for OpenCV in my code is to display the tag orientation for “debug” purposes. Similarly, I used AdvantageScope to display the robot pose.
I don’t know to what extent, if any, OpenCV is used by WPILib.
EDIT: To clarify (I over-answered OP’s question): I used WPILib for the entire robot pose calculation, from detecting the AprilTag to determining the robot pose from a calibrated camera viewing that tag at its known field location. I believe OpenCV could have been used for much of that process, but it was not.
I’ve forgotten the details of my performance checks, but I think the tag detection itself took nearly half of the CPU time for the entire process (I might have posted that in another CD thread).
WPILib’s fork is here. Neither the upstream repo nor our repo uses OpenCV for the fiducial marker detection.
To be honest, the apriltag library is rather low quality. We’ve talked about rewriting it in OpenCV and Eigen instead. The compiler generates a lot of warnings we have to suppress, it’s really messy C code, and it’s not that performant. For example, the homography solver runs a constant number of iterations instead of exiting early if it converged; we patched that in our fork at least.
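For anyone curious what that patch looks like in spirit, here is a minimal sketch of the early-exit pattern. The real code is C inside the apriltag library; the class name, the functional-interface stand-in for the refinement step, and the tolerance below are all made up for illustration.

```java
import java.util.function.UnaryOperator;

public final class EarlyExitSketch {
  // Illustrative only: 'refineStep' stands in for one refinement update that
  // returns the delta to apply to the current homography parameters.
  static double[] refine(double[] h, UnaryOperator<double[]> refineStep) {
    final int maxIters = 10;   // upstream always runs the full iteration count
    final double eps = 1e-8;   // convergence tolerance (assumed value)
    for (int i = 0; i < maxIters; i++) {
      double[] step = refineStep.apply(h);
      double stepNormSq = 0.0;
      for (int j = 0; j < step.length; j++) {
        h[j] += step[j];
        stepNormSq += step[j] * step[j];
      }
      if (stepNormSq < eps) {
        break; // converged: skip the remaining iterations
      }
    }
    return h;
  }
}
```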
As mentioned, the AprilRobotics one does not use OpenCV’s detector. I’ve heard various opinions on whether the AprilRobotics or OpenCV detector implementation is more performant. There are other implementations as well; e.g., NVIDIA has an AprilTags library with GPU acceleration for their Jetson platform. There’s also a separate Eigen-using C++ implementation that is LGPL-licensed; I don’t know if anyone has tried it to compare performance. We don’t want to use that one for WPILib because we’re BSD/MIT licensed.
AprilTag to robot pose WPILib Java code is here. I wanted my team to have a backup for our LL 2+, or to add a camera driven by roboRIO code, without buying another LL.
My intention was to document what I knew, what I could figure out (with plenty of help from others on CD and from WPILib examples), and what I think is true but didn’t know for sure. To that end I copied in code and comments from other sources, organized the blocks of computations fairly well, and chose sensible variable names - not perfect, but at least it all makes sense to me now. Since this was meant as a training piece, suggestions for improving it are welcome.
The code runs on the roboRIO, and its pose matches the Limelight 2+ pose shown in AdvantageScope as well as my eyeball estimate of the pose.
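For anyone who wants the shape of that roboRIO-side pipeline without digging through the repo, here is a rough sketch built on WPILib’s AprilTag classes. The intrinsics, tag size, and the coordinate-system conversion step are placeholders and assumptions on my part; treat it as a starting point and check it against the linked code and the WPILib docs rather than dropping it in as-is.

```java
import org.opencv.core.Mat;

import edu.wpi.first.apriltag.AprilTagDetection;
import edu.wpi.first.apriltag.AprilTagDetector;
import edu.wpi.first.apriltag.AprilTagPoseEstimator;
import edu.wpi.first.math.geometry.CoordinateSystem;
import edu.wpi.first.math.geometry.Pose3d;
import edu.wpi.first.math.geometry.Transform3d;

public class TagToRobotPoseSketch {
  // Camera intrinsics and tag size are placeholders; use your own calibration.
  private static final AprilTagPoseEstimator.Config kConfig =
      new AprilTagPoseEstimator.Config(
          0.1651,          // tag edge length in meters
          699.0, 699.0,    // fx, fy
          640.0, 360.0);   // cx, cy

  private final AprilTagDetector m_detector = new AprilTagDetector();
  private final AprilTagPoseEstimator m_estimator = new AprilTagPoseEstimator(kConfig);

  public TagToRobotPoseSketch() {
    m_detector.addFamily("tag36h11");
  }

  /**
   * @param gray          8-bit single-channel camera image
   * @param fieldToTag    the known field pose of the tag we expect to see (real code
   *                      should look this up by detection.getId() in an AprilTagFieldLayout)
   * @param robotToCamera where the camera sits on the robot, in WPILib's NWU convention
   * @return the estimated field pose of the robot, or null if no tag was seen
   */
  public Pose3d estimateRobotPose(Mat gray, Pose3d fieldToTag, Transform3d robotToCamera) {
    for (AprilTagDetection detection : m_detector.detect(gray)) {
      // Tag pose in the camera's optical frame (x right, y down, z out of the lens).
      Transform3d camToTag = m_estimator.estimate(detection);

      // Convert to WPILib's NWU convention; double-check this step against the docs.
      Transform3d camToTagNwu =
          CoordinateSystem.convert(camToTag, CoordinateSystem.EDN(), CoordinateSystem.NWU());

      // field->camera = field->tag composed with tag->camera; then back out the
      // camera's mounting transform to get field->robot.
      Pose3d fieldToCamera = fieldToTag.transformBy(camToTagNwu.inverse());
      return fieldToCamera.transformBy(robotToCamera.inverse());
    }
    return null; // no tag in this frame
  }
}
```

The core of it is just the field→tag, tag→camera, camera→robot chain of transforms; everything else is bookkeeping around WPILib’s Pose3d/Transform3d types.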
Would an OpenCV program running on a coprocessor (say, a Raspberry Pi 4) hypothetically run faster than PhotonVision on the same Pi 4? I’m considering writing an OpenCV program specifically for AprilTags, but I don’t want to reinvent the wheel if PhotonVision would be the same speed.
What I’m talking about is a highly specialized OpenCV pipeline using one of their newer APIs. There’s a chance it could be faster, but that depends on a lot of different factors. I don’t think the minor performance gain would be worth reinventing the wheel.
I probably know less about AprilTags than anyone else who cares about them, but here is my simplistic take on what I see from a quick Google search of the topic.
So many people have looked at the detection code that you’d think it would be optimized by now. I also have significant experience with a variety of code produced in academia - by students, faculty, and staff - that isn’t high quality. Where does the code we are using fall in that range?
There are tradeoffs among speed, accuracy, and range. As the old joke goes - pick any two.
Much of the discussion I see in (student?) papers is about improving the fiducial tag itself, besides finding a better algorithm to detect it. Some tag changes are significant in that, although they are minor changes yielding good improvements, they are detector-breaking changes. Small changes with potentially large gains are possible for ad hoc usage where you (FRC) control a non-standard environment. The most obvious simple change, requiring little or no detection-code change, is altering the tag borders: the bigger the white space, the easier the detection. Even better, I would think, is to add a highly contrasting, easily detected border, such as a wide, bright green one, and let relatively quick color thresholding find the likely tag location.
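Just to make that idea concrete, a prefilter along these lines (OpenCV’s Java bindings; the HSV bounds and the minimum-area cutoff are guesses that would need tuning for real lighting) could hand the detector a few candidate regions instead of the whole frame:

```java
import java.util.ArrayList;
import java.util.List;

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.Rect;
import org.opencv.core.Scalar;
import org.opencv.imgproc.Imgproc;

public final class GreenBorderPrefilter {
  /** Returns bounding boxes of bright-green regions that might contain a tag. */
  public static List<Rect> findCandidateRegions(Mat bgrFrame) {
    Mat hsv = new Mat();
    Imgproc.cvtColor(bgrFrame, hsv, Imgproc.COLOR_BGR2HSV);

    // Rough HSV bounds for "bright green"; these need tuning for real lighting.
    Mat mask = new Mat();
    Core.inRange(hsv, new Scalar(45, 100, 100), new Scalar(75, 255, 255), mask);

    List<MatOfPoint> contours = new ArrayList<>();
    Imgproc.findContours(mask, contours, new Mat(), Imgproc.RETR_EXTERNAL,
        Imgproc.CHAIN_APPROX_SIMPLE);

    List<Rect> candidates = new ArrayList<>();
    for (MatOfPoint contour : contours) {
      Rect box = Imgproc.boundingRect(contour);
      if (box.area() > 400) { // ignore tiny specks; threshold is arbitrary
        candidates.add(box);
      }
    }
    return candidates;
  }
}
```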
If you really want a fun, challenging project to attempt to improve the world, I’d suggest asking for specific direction from the several experts that have posted herein.
The library linked above is pretty much as bad as you’d expect from academics who weren’t software engineers and just needed something that worked.
Actually, using color would make the detector slower because it would be operating on more data. Color thresholding operates on three 8-bit channels (R, G, and B), whereas AprilTag detectors typically operate on one 8-bit channel (grayscale). If you use a monochrome camera, you can even avoid an initial RGB-to-grayscale conversion.
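To put rough numbers on that: at 1280×720, a color frame is 1280 × 720 × 3 bytes ≈ 2.8 MB, versus about 0.9 MB for the single grayscale channel the detector actually reads.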
The PhotonVision devs did some pipeline benchmarking, so it would be cool if they shared where the bottlenecks are.
EDIT: Here’s pipeline benchmarking data one of the PhotonVision devs sent me:
| step no. | step name | step duration | cumulative pipeline duration |
| --- | --- | --- | --- |
| 0 | init | 0.006 ms | 0.006 ms |
| 1 | blur/sharp | 0.004 ms | 0.010 ms |
| 2 | threshold | 14.918 ms | 14.928 ms |
| 3 | unionfind | 39.117 ms | 54.045 ms |
| 4 | make clusters | 74.681 ms | 128.726 ms |
| 5 | fit quads to clusters | 34.034 ms | 162.760 ms |
| 6 | quads | 4.620 ms | 167.380 ms |
| 7 | decode+refinement | 4.366 ms | 171.746 ms |
| 8 | reconcile | 0.038 ms | 171.784 ms |
| 9 | debug output | 0.006 ms | 171.790 ms |
| 10 | cleanup | 0.079 ms | 171.869 ms |
Looks like it was run on a Pi, based on the SSH prompt from the perf run, but I don’t know which model. I also couldn’t tell you what optimization flags were enabled, so take the results with a grain of salt.
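For what it’s worth, reading off that table, unionfind plus make clusters alone account for roughly 39.1 + 74.7 ≈ 113.8 ms of the 171.9 ms total, i.e. about two thirds of the pipeline, with thresholding and quad fitting making up most of the rest.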