I’m planning to mount two Limelights on the front of our robot to track AprilTags next season.
I will aim the cameras so the bottom edge of the vertical field of view is parallel to the ground, to maximize how often an AprilTag is in view.
As for the horizontal (X-axis) field of view, I have two options in mind, but I am not sure which one is better.
I’d wait until you see the game and field layout, and have settled on some of the robot design.
The overall goal should be to maximize the number of tags seen by both cameras while your robot is doing high-precision motion.
In 2023, most of this motion was while picking up or placing game pieces. In those cases, there really were only one or two tags nearby, so a common strategy was to be “cross-eyed”: assume there is one tag you’re trying to localize off of, and point both cameras to see that one tag, just from different angles.
On the other hand, if there are a lot more tags this year, or if those tags aren’t right next to the pickup/drop-off locations where high precision is needed, a more “chameleon” technique of having the cameras observe different tags to increase information is useful.
For the 2023 game, the placement of the various scoring locations relative to the AprilTags tended to favor the “cross-eyed” arrangement.
But it all depends on what the field looks like and where, on the robot, you can mount the cameras to have an unobstructed view of the AprilTags in each of the relevant scoring locations.
I had not thought about cross-eyed (but OTOH, we only ran 1 camera this year). I can definitely see that it would provide some better localization.
However, there is one aspect that gives me some concern. My assumption on how to use this would be to take the pose computed by each camera and feed them separately to the odometry.
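To make that concrete, I’m picturing something like this minimal sketch, assuming WPILib’s SwerveDrivePoseEstimator (the `VisionFusion` class and `addCameraMeasurement` method are just placeholder names, not any library API):

```java
import edu.wpi.first.math.estimator.SwerveDrivePoseEstimator;
import edu.wpi.first.math.geometry.Pose2d;

/** Sketch: each camera's field-relative pose is fed to the estimator on its own. */
public class VisionFusion {
  private final SwerveDrivePoseEstimator poseEstimator;

  public VisionFusion(SwerveDrivePoseEstimator poseEstimator) {
    this.poseEstimator = poseEstimator;
  }

  /** Call once per camera, per loop, with that camera's latest field-relative pose. */
  public void addCameraMeasurement(Pose2d fieldToRobot, double captureTimestampSeconds) {
    // The estimator fuses the vision pose against wheel odometry, weighted by
    // the vision standard deviations it was constructed with.
    poseEstimator.addVisionMeasurement(fieldToRobot, captureTimestampSeconds);
  }
}
```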
However, each camera/image will (potentially) have 2 solutions. One can hope that one solution from each camera will roughly overlap and the others will be wildly different, so I guess that would allow selecting them. Is that what teams did? Did anyone study and write about this?
We wrote code for it, but had to use PhotonVision. That was the point of doing the pose disambiguation on the RIO.
Among all the poses from all the targets from all the cameras, half should be tightly clustered together near the last estimate, and the other half will be scattered.
Average the cluster, discard the rest.
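In rough Java, that heuristic looks something like the sketch below (not our actual code; the 0.5 m cluster radius is just an assumed tunable):

```java
import java.util.List;
import java.util.Optional;

import edu.wpi.first.math.geometry.Pose2d;
import edu.wpi.first.math.geometry.Rotation2d;
import edu.wpi.first.math.geometry.Translation2d;

public final class PoseDisambiguator {
  // Candidates farther than this from the last estimate are treated as the
  // "wrong" solution of an ambiguous pair and discarded. Tune on real data.
  private static final double CLUSTER_RADIUS_METERS = 0.5;

  /**
   * Takes every candidate pose (both solutions, from every target on every
   * camera), keeps the ones near the previous estimate, and averages them.
   * Returns empty if nothing clusters.
   */
  public static Optional<Pose2d> disambiguate(List<Pose2d> candidates, Pose2d lastEstimate) {
    double sumX = 0.0;
    double sumY = 0.0;
    double sumCos = 0.0;
    double sumSin = 0.0;
    int kept = 0;

    for (Pose2d candidate : candidates) {
      double distance = candidate.getTranslation().getDistance(lastEstimate.getTranslation());
      if (distance <= CLUSTER_RADIUS_METERS) {
        sumX += candidate.getX();
        sumY += candidate.getY();
        // Average headings via their unit-vector components to avoid wraparound issues.
        sumCos += candidate.getRotation().getCos();
        sumSin += candidate.getRotation().getSin();
        kept++;
      }
    }

    if (kept == 0) {
      return Optional.empty();
    }

    return Optional.of(
        new Pose2d(
            new Translation2d(sumX / kept, sumY / kept),
            new Rotation2d(sumCos, sumSin)));
  }
}
```

If the cluster comes back empty (for example, right after enabling when the last estimate is stale), you can fall back to odometry alone or temporarily widen the radius.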
Limelight usually disambiguates on the hardware, which will work fine in a lot of cases. However, in some multi-camera scenarios, that throws away useful information.