Getting camera pose from AprilTag pose

I’m attempting to do something that should be quite simple. I have a system that can detect AprilTags and return their pose (position and orientation vectors) relative to the camera. Since I’m trying to eventually localize the camera on the field, I would like to know the pose of the camera relative to the AprilTag.

The library returns the pose in x-right, y-down, z-forward format. I’ve heard from many places that all I have to do is invert the pose matrix of the AprilTag, but how do I actually do that given only the tag’s position and orientation?

For context, I’m using the isaac_ros_apriltag library running on a Jetson Orin Nano.

1 Like

One potential solution is in WPILib’s Transform3d source code - allwpilib/wpimath/src/main/java/edu/wpi/first/math/geometry/Transform3d.java at main · wpilibsuite/allwpilib · GitHub

You’ll need to get the opposite (unary minus) of both the translation and the rotation, undoing the rotation before undoing the translation.

However, this has some dependency on whether isaac_ros_apriltag’s pose representation matches WPILib’s… which I’m not sure about.
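
For what it’s worth, here is a minimal sketch of that inversion written against WPILib’s geometry classes; it mirrors what Transform3d.inverse() does internally. The frame-conversion helper is an assumption: it takes the tag pose to be in the usual camera optical convention (x right, y down, z forward), which WPILib calls CoordinateSystem.EDN(), so check that against what isaac_ros_apriltag actually publishes before trusting it.

import edu.wpi.first.math.geometry.CoordinateSystem;
import edu.wpi.first.math.geometry.Rotation3d;
import edu.wpi.first.math.geometry.Transform3d;
import edu.wpi.first.math.geometry.Translation3d;

public class TagPoseMath {
  // Invert a camera-to-tag transform by hand: negate the rotation, then
  // rotate the negated translation by that inverse rotation.
  // Equivalent to cameraToTag.inverse().
  public static Transform3d invert(Transform3d cameraToTag) {
    Rotation3d tagToCameraRotation = cameraToTag.getRotation().unaryMinus();
    Translation3d tagToCameraTranslation =
        cameraToTag.getTranslation().unaryMinus().rotateBy(tagToCameraRotation);
    return new Transform3d(tagToCameraTranslation, tagToCameraRotation);
  }

  // Re-express a transform reported in the camera optical frame
  // (x right, y down, z forward = EDN) in WPILib's convention
  // (x forward, y left, z up = NWU).
  public static Transform3d opticalToWpilib(Transform3d cameraToTagOptical) {
    return CoordinateSystem.convert(
        cameraToTagOptical, CoordinateSystem.EDN(), CoordinateSystem.NWU());
  }
}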

2 Likes

Thanks for replying. I ended up installing the Python Transform3d library, constructing a transform from the tag pose, and inverting it via the library, which seemed to do the trick and was a bit easier than doing the math myself.

I wish I had known that library was a thing before I started doing this.

You can keep all the calculations in your Java Robot code to take advantage of the WPILib math utilities. You’ll need the field layout and a filtering mechanism. The code looks like this:

import java.util.Optional;

import edu.wpi.first.apriltag.AprilTagFieldLayout;
import edu.wpi.first.apriltag.AprilTagFields;
import edu.wpi.first.math.geometry.Pose3d;
import edu.wpi.first.math.geometry.Rotation3d;
import edu.wpi.first.math.geometry.Transform3d;

private static final AprilTagFieldLayout fieldLayout = AprilTagFields.kDefaultField.loadAprilTagLayoutField();
private final Transform3d robotToCamera = ... // where the camera is mounted relative to the robot center

public Optional<Pose3d> estimateRobotPoseFromSingleTagObservation(int tagID, double translationX, double translationY, double translationZ, double rotationX, double rotationY, double rotationZ) {
  final Transform3d cameraToTarget = new Transform3d(translationX, translationY, translationZ, new Rotation3d(rotationX, rotationY, rotationZ));
  final Optional<Pose3d> tagPoseOnField = fieldLayout.getTagPose(tagID);

  // if the tag is not in the field layout, there is nothing to estimate
  if (tagPoseOnField.isEmpty())
    return Optional.empty();

  // tag pose on the field -> camera pose on the field -> robot pose on the field
  return Optional.of(tagPoseOnField.get()
        .transformBy(cameraToTarget.inverse())
        .transformBy(robotToCamera.inverse())
  );
}

// Example filtering mechanism
public Optional<Pose3d> filter(Optional<Pose3d> estimation) {
  if (estimation.isEmpty())
    return estimation;

  // correct results should have z smaller than 0.5 m (because robots can't fly)
  boolean heightOK = Math.abs(estimation.get().getZ()) < 0.5;
  // correct results should have a pitch angle smaller than 5 degrees
  boolean pitchOK = Math.abs(estimation.get().getRotation().getY()) < Math.toRadians(5);
  // correct results should have a roll angle smaller than 5 degrees
  boolean rollOK = Math.abs(estimation.get().getRotation().getX()) < Math.toRadians(5);
  if (heightOK && pitchOK && rollOK)
    return estimation;
  else
    return Optional.empty();
}
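
For illustration, here is a hypothetical way to wire those two methods together. The tag ID and measurements are made-up values (assumed to already be converted into WPILib’s coordinate convention), poseEstimator stands in for whatever WPILib pose estimator (e.g. a SwerveDrivePoseEstimator) the drivetrain already maintains, and Timer is edu.wpi.first.wpilibj.Timer.

// Hypothetical periodic usage: estimate, filter, then fuse the accepted pose.
Optional<Pose3d> estimate = filter(estimateRobotPoseFromSingleTagObservation(
    7,                 // tag ID reported by the coprocessor (made-up value)
    1.2, -0.1, 0.4,    // tag translation relative to the camera, meters (made-up values)
    0.0, 0.05, 3.1));  // tag rotation relative to the camera, radians (made-up values)

// Ideally pass the image capture timestamp instead of the current time.
estimate.ifPresent(pose ->
    poseEstimator.addVisionMeasurement(pose.toPose2d(), Timer.getFPGATimestamp()));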



Here are some screenshots from PhotonVision (https://youtu.be/iV2v7F_9GwE?si=Lp_a6qWM02eg5DPQ)

4 Likes

I’d like to do as much as possible on the coprocessor and save the roboRIO some CPU cycles. I think that with multi-tag estimation, filtering, and matrix operations happening many times a second, that work would be best left to a machine with a GPU.

I don’t think performance is a concern here. As long as you keep solvePnP running on the coprocessor, there should be no loop-time issue. CPUs are fast; applying some transformations to a few 3D coordinates would take less than a millisecond, even on that 800 MHz CPU of the Rio. If you keep the pose estimation and filtering on the Rio, you can get Log Replay for your vision. That’s worth the effort, IMO, because you can fix your filter quickly between matches.