Limelight 2023.1 - MegaTag, Performance Boost, My Mistake

Correcting My Mistake

The default marker size parameter in the UI has been corrected to 152.4mm (down from 203.2mm). This was the root of most accuracy issues. My mistake! While it is sometimes acceptable to measure tags by their outermost border, the Limelight interface expects the detection corner distance (the side length of the black square).
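For reference, here is a small sketch of where those two numbers come from, assuming the 16h5 tag family used in FRC 2023 (a printed 16h5 tag is an 8x8 grid of cells, of which the black square spans 6). The helper name is mine, not part of any Limelight API:

```python
# A printed 16h5 AprilTag is an 8x8 grid of cells: a 4x4 data region,
# a 1-cell black border, and a 1-cell white border. Limelight's marker
# size field measures the black square (6 of the 8 cells), not the
# outer white edge.
CELLS_TOTAL = 8
CELLS_BLACK_SQUARE = 6

def detection_size_mm(outer_size_mm: float) -> float:
    """Convert an outer (white-border) tag width to the black-square
    width that Limelight's marker-size field expects."""
    return outer_size_mm * CELLS_BLACK_SQUARE / CELLS_TOTAL

print(detection_size_mm(203.2))  # 152.4 -- the corrected default
```

So if you measured your printed tag edge-to-edge at 203.2mm, the value the UI wants is 152.4mm.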

Increased Tracking Stability

There are several ways to tune AprilTag detection and decoding. We’ve improved stability across the board, especially in low light / low exposure environments.

Ultra Fast Grayscaling

Grayscaling is 3x-6x faster than before. Teams will always see a gray video stream while tracking AprilTags. While grayscaling was never very expensive, we want to squeeze as much performance out of the hardware as possible.

Cropping For Performance

AprilTag pipelines now have crop sliders. Cropping your image will result in improved framerates at any resolution. AprilTag pipelines also support the dynamic “crop” networktables key. In case you missed it last year, dynamic cropping with the “Crop” NT key was added at the request of one of the best teams in the world in 2022 to improve shoot-on-the-move reliability.
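A minimal sketch of building the value for the dynamic "crop" key, based on my reading of the Limelight docs (an array of [xMin, xMax, yMin, yMax] in normalized -1 to 1 coordinates — verify the ordering against the docs for your version). The validation helper is mine, not part of any Limelight API:

```python
# Build and sanity-check the 4-value array for Limelight's dynamic
# "crop" NetworkTables key. Values are assumed to be
# [xMin, xMax, yMin, yMax], each normalized to the range [-1, 1].

def crop_values(x_min, x_max, y_min, y_max):
    clamp = lambda v: max(-1.0, min(1.0, v))
    vals = [clamp(x_min), clamp(x_max), clamp(y_min), clamp(y_max)]
    if vals[0] >= vals[1] or vals[2] >= vals[3]:
        raise ValueError("min must be less than max on each axis")
    return vals

# On a robot this would be published with something like (pynetworktables):
#   NetworkTables.getTable("limelight").putNumberArray(
#       "crop", crop_values(-0.5, 0.5, -1.0, 1.0))
print(crop_values(-0.5, 0.5, -1.0, 1.0))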
Note the framerate increase from ~55fps to ~80fps.

Easier Filtering

There is now a single “ID filter” field in AprilTag pipelines which filters JSON output, botpose-enabled tags, and tx/ty-enabled tags. The dual-filter setup was buggy and confusing.
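To illustrate what a single ID filter implies for code that consumes the JSON dump, here is a toy filter keyed on the per-fiducial ID field. The structure below is a simplified stand-in for Limelight's actual JSON results schema (I'm using "Fiducial"/"fID" as the field names; check the JSON spec in the docs):

```python
import json

# Filter fiducial results down to an allow-list of tag IDs, the same
# way the pipeline's single "ID filter" field gates JSON output,
# botpose inputs, and tx/ty targeting.

def filter_fiducials(results_json: str, allowed_ids: set) -> list:
    results = json.loads(results_json)
    return [f for f in results.get("Fiducial", []) if f.get("fID") in allowed_ids]

sample = json.dumps({"Fiducial": [{"fID": 1, "tx": 4.2}, {"fID": 7, "tx": -1.0}]})
print(filter_fiducials(sample, {1, 2, 3}))  # keeps only the tag with fID 1
```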

Breaking Change

The NT key “camtran” has been renamed to “campose”.

Limelight MegaTag (new botpose)

My #1 priority has been rewriting botpose for greater accuracy, reduced noise, and ambiguity resilience. Limelight’s new botpose implementation is called MegaTag. Instead of computing botpose with a dumb average of multiple individual field-space poses, MegaTag essentially combines all tags into one giant 3D tag with several keypoints. This has enormous benefits.

The following GIF shows a situation designed to induce tag flipping:
Green Cylinder: Individual per-tag bot pose
Blue Cylinder: 2023.0.1 BotPose
White Cylinder: New MegaTag Botpose

Notice how the new botpose (white cylinder) is extremely stable compared to the old botpose (blue cylinder). You can watch the tx and ty values as well.

Here’s the full screen, showing the tag ambiguity:

Here are the advantages:

  • Botpose is now resilient to ambiguities (tag flipping) if more than one tag is in view.
  • Botpose is now more resilient to noise in tag corners if more than one tag is in view. The farther away the tags are from each other, the better.

This is not restricted to planar tags. It scales to any number of tags in full 3D and in any orientation. Floor tags and ceiling tags would work perfectly.

Here’s a diagram demonstrating one aspect of how this works with a simple planar case. The results are actually better than what is depicted, as the MegaTag depicted has a significant error applied to three points instead of one point. As the 3D combined MegaTag increases in size and in keypoint count, its stability increases.
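The joint-solve idea can be illustrated with a toy 2D sketch. To be clear, this is not Limelight's actual implementation (which solves the full 3D problem from image corners); it just shows the core intuition of fitting one rigid transform to the corners of every visible tag at once, instead of solving each tag separately and averaging:

```python
import math
import random

def fit_pose(field_pts, robot_pts):
    """Least-squares 2D rigid fit (Kabsch-style): find theta, (tx, ty)
    such that field ~= R(theta) @ robot + t over ALL keypoints at once."""
    n = len(field_pts)
    fcx = sum(x for x, _ in field_pts) / n
    fcy = sum(y for _, y in field_pts) / n
    rcx = sum(x for x, _ in robot_pts) / n
    rcy = sum(y for _, y in robot_pts) / n
    s_cos = s_sin = 0.0
    for (fx, fy), (rx, ry) in zip(field_pts, robot_pts):
        ax, ay = rx - rcx, ry - rcy   # centered observation
        bx, by = fx - fcx, fy - fcy   # centered field keypoint
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    tx = fcx - (c * rcx - s * rcy)
    ty = fcy - (s * rcx + c * rcy)
    return theta, tx, ty

# Known field-frame corners of two tags, treated as one big "mega tag".
tag_a = [(4.0, 0.0), (4.0, 0.5), (4.5, 0.5), (4.5, 0.0)]
tag_b = [(4.0, 3.0), (4.0, 3.5), (4.5, 3.5), (4.5, 3.0)]
field_pts = tag_a + tag_b

# Simulate observations from a robot at heading 0.2 rad, position (1, 2),
# with 1 cm of corner noise.
true_theta, true_tx, true_ty = 0.2, 1.0, 2.0
c, s = math.cos(-true_theta), math.sin(-true_theta)
random.seed(0)
robot_pts = []
for fx, fy in field_pts:
    dx, dy = fx - true_tx, fy - true_ty
    robot_pts.append((c * dx - s * dy + random.gauss(0, 0.01),
                      s * dx + c * dy + random.gauss(0, 0.01)))

theta, tx, ty = fit_pose(field_pts, robot_pts)
print(theta, tx, ty)  # close to 0.2, 1.0, 2.0
```

With all eight corners constraining a single pose, an ambiguous flip of one tag's four corners can no longer dominate the solution, which is the intuition behind the stability shown above.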

Neural Net upload is being pushed to 2023.2!


(Not) Benchmarks

What teams need to understand about the AprilTag algorithm is that the content of the image greatly affects performance metrics. If benchmarks are not collected with fixed, repeatable test environments and methodologies, they are meaningless. Here are two very clear examples of what I mean:

There are claims that Limelight2+ gets 15fps tracking AprilTags at 640x480 with zero downscaling using Limelight OS, and 45 fps with an open-source solution.

Here’s how easy it is to get 45fps on a Limelight 2+ with Limelight OS and then cut performance in half with a minor settings adjustment. Watch this video and note how performance drops despite only a minor change in image appearance:

You’ll see this no matter what solution you use.
Here’s another example of performance sensitivity due to simply enlarging the tag:

Here’s a third video just to confirm that there is in fact zero downscaling:

Yes, this works on the old image as well:

Edit: To clarify, these aren’t benchmarks. I’m demonstrating why pure fps benchmarks aren’t very useful.


We are getting an error when we try to download it…

This XML file does not appear to have any style information associated with it. The document tree is shown below.


Fixed. In case the page is cached, here’s the link:




Interesting and cool.


This is extremely good stuff. Multi-tag localization and real performance numbers are awesome. One quick thing: do you know what range you get with that resolution and framerate? There were a lot of things Photonvision tested to increase performance, but some of them came at the cost of the ultimate benchmark of range. For example, there was an Aruco detector branch that processed images 2-3x faster but cut the effective range in half.

As the manufacturer I think it’s nominally expected that performance data is published, just as it was with the older Limelight releases. If REV released a sprocket without dimensions, it wouldn’t be a very useful product. Here’s a set of test data across different resolutions with the last firmware release that differs greatly from your own, which is what I assume you are referencing:

I also talked to someone yesterday who said they saw no noticeable difference in performance between Photonvision and Limelight software on a LL2+. But your numbers look 2x faster than Photonvision. Clearly, there’s a big disconnect.

Hence, my ask about range. I know it’s yet another thing, but ultimately what teams will care about is “how fast can I resolve a target at X range”. I have very detailed numbers correlating camera selection, range, resolution, framerate, and latency on the Beelink Mini S and Orange Pi 5 computers with Photonvision. I would like to see a similar table directly from Limelight, because apparently any data from external sources is untrustworthy. I would never have expected black level offset to make such a dramatic difference, but I also don’t know what impact that setting has on range.


You say this and don’t show the same test with Photonvision? Really? For all we know your setup is really great for any product and you’d get 90fps with Photonvision.

If Limelight is too busy trying to get the LL3 out the door, I am perfectly willing to do all of these tests in the same conditions as my other tests to squeeze the most performance out of a LL2+ or LL3 (if I can get my hands on one) with both Photonvision and Limelight OS. But calling out peoples’ test data when there are no published test methodologies or specs from the manufacturer is strange.


Thanks for all the info. I’m just showing that it’s really easy to make benchmarks do whatever you want them to do with AprilTags. I would never run with images at that brightness level in a match. My point is that existing benchmarks showing that Limelight OS is 3x slower than other solutions are not meaningful at all. I’ll get back to you with how I want to demonstrate real-world capabilities after I finish a few things.

EDIT: I think you may have missed one of the earlier lines in the post where I mentioned that the current benchmark gives 45fps to the other solution. I’m showing that you max out performance by minimizing the meaningful content in the image. E.g., it’s not “black level” - it’s “the image is darker”


I think there’s a disconnect here in general because Retro tracking is robust against basically everything and is not affected by the environment (minus the sun but let’s ignore that). The ambient light level among several other factors will change the settings you use for AprilTags which is why spreadsheets using “settings” as a test methodology don’t make sense.

You’re right that you can make benchmarks say whatever you want depending on settings, but that doesn’t mean that benchmarks are “not meaningful at all”.

The benchmarks that @asid61 linked were acquired after tuning the settings in the stock Limelight software and in PhotonVision for best performance. This was designed to be a best-case test and not some kind of bad faith manipulation. If you think there’s a problem with this methodology then it would be helpful to know, but I don’t see how ‘settings can affect performance’ invalidates this type of test.

The biggest issue with the benchmarks IMO is not whether they’re correct characterizations of best-case performance, but whether or not they’re useful as-is: the benchmarks don’t include max range, and as Anand pointed out, performance at a given max detection range is really what matters for teams. Hopefully that type of data will come out soon.


@Brandon_Hjelstrom This is unrelated, but do you guys have a timeline for updating the docs? No rush on this, just wondering.


The videos posted in the “not benchmarks” section show a max performance that is 3x higher than what was found by whoever collected the data, so the LL was clearly not tuned properly for this type of test. The methodology was flawed for that specific benchmark because it focused on settings and not image appearance, so that specific benchmark which is still being posted is not meaningful.

I did mention the fact that the cause of the discrepancy in the benchmark was likely a darker image a few days ago, but this was ignored.

I agree with the idea that true best-case benchmarks are not useless :slight_smile:

Tonight. Mostly adding gifs and greater detail to the apriltag sections. Are there perhaps any parts of the docs that you would like me to pay extra attention to atm?


Excellent. Keep up the great work!


Don’t mean to drag this out any further than it has already been, but will this be reflected / explained more in the documentation? The docs currently recommend a 0 black level, which the testing was done with, but that doesn’t seem to be the best option for users now.

For completeness’s sake, this message was seen as well, and we asked for clarification on what better settings to use for testing, but I think you missed the message (it came in a reply a few messages further down, so it was easy to miss without a ping).

Don’t mean to come off as confrontational in any way, just hope to clear some things up, I’m a big fan of how Limelight has raised the floor for a lot of teams.


Thanks, and I truly don’t want any confrontation either. The point is that there are no ideal settings. There is an ideal image appearance for benchmarks which in practice is “as much darkness and as little noise as possible.” Settings depend on factors such as your environment and whether you are using an illuminated tag (eg smartphone screen) or not.

To make this even worse, the best image appearance for benchmarks is almost certainly not the best image appearance for robot performance :laughing:. I don’t want to optimize docs for benchmarks!


Very true. This is a good reason to test maximum range vs performance (e.g. consistent detections at X range require settings that give Y FPS and Z latency). At the end of the day, I think the available options already perform at a level that should give teams good results—and that’s what’s important!


Thanks to all of the teams uploading images of game pieces to LIMELIGHT2023FRC - please keep them coming.

We’ve removed the email address form since we have another form of automatic verification. Hopefully this makes the process easier!


Hi Brandon,

Since you are crowdsourcing data from the community for the game pieces in order to create a neural network object detection model, would this model be limited to running only on Limelights, or could others use it on non-Limelight devices? The documentation says the Limelight should be able to support any TensorFlow Lite model, but I wanted to know whether this crowdsourced model you are generating and providing will be limited in use in some way, either by data format or by licensing.


The model will not be limited to Limelight OS. You will be able to run it wherever you want.