Understanding vision processing from a bird's-eye view (45-degree downward-facing camera)

I basically should tag 254 in this question… because I am trying to adopt their very cool vision system.

I have a camera angled downward at the front of the robot to track balls, so it can basically only see a few feet of carpet… The camera is probably angled down about 45 degrees (where 0 would be straight ahead). From my understanding, X is left and right, Y is forward, and Z is vertical… obviously this all depends on the frame of reference… and it looks like in 254's code they use Y as left and right, X as forward (which seems to be held to 1), and Z as the height.

Once I detect a ball, I get its center as (X, Y) in pixels. Translated to 254's convention, my X is really their Y (left/right) and my Y is really their X, since it corresponds to how far out the ball is in the camera's field of view.

I know that since my camera is facing roughly 45 degrees downward, there is some translation I need to do to turn this into a Pose2D and proper translations… so that I can accurately find the angle the robot needs to turn.
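Here is my rough guess at the math in Java, with completely made-up camera numbers (FOV, image size, mounting height) just to show what I think has to happen… not sure if this is even the right approach:

```java
// My own rough sketch of the math I *think* is going on (not 254's actual code),
// assuming I know the camera's FOV, its mounting height, and its 45-degree downward pitch.
// Frame convention as I read it from 254: X forward, Y left, Z up.
public class BallGroundEstimate {
    // Made-up camera constants for illustration
    static final double IMAGE_WIDTH_PX = 320, IMAGE_HEIGHT_PX = 240;
    static final double HORIZONTAL_FOV_DEG = 60.0;
    static final double CAMERA_PITCH_DEG = 45.0;   // pitched down from horizontal
    static final double CAMERA_HEIGHT_IN = 24.0;   // lens height above the carpet

    /** Given the ball center in pixels (origin top-left, +x right, +y down),
     *  returns {forwardInches, leftInches} of the ball on the floor, relative to the camera. */
    static double[] pixelToFloor(double pxBall, double pyBall) {
        // Focal length in pixels from the horizontal FOV (assumes square pixels)
        double focalPx = (IMAGE_WIDTH_PX / 2.0) / Math.tan(Math.toRadians(HORIZONTAL_FOV_DEG / 2.0));

        // Ray through the pixel in the camera frame, with forward held to 1 (like 254 seems to do)
        double xCam = 1.0;
        double yCam = -(pxBall - IMAGE_WIDTH_PX / 2.0) / focalPx;  // +left
        double zCam = -(pyBall - IMAGE_HEIGHT_PX / 2.0) / focalPx; // +up

        // Rotate the ray by the camera's downward pitch to get it into the robot frame
        double pitch = Math.toRadians(CAMERA_PITCH_DEG);
        double xRobot = xCam * Math.cos(pitch) + zCam * Math.sin(pitch);
        double yRobot = yCam;
        double zRobot = -xCam * Math.sin(pitch) + zCam * Math.cos(pitch); // negative if the ray points at the floor

        // Scale the ray until it hits the carpet, which sits CAMERA_HEIGHT_IN below the lens
        double scale = -CAMERA_HEIGHT_IN / zRobot;
        double forward = scale * xRobot;
        double left = scale * yRobot;
        return new double[] {forward, left};
    }

    // The angle to turn toward the ball (before accounting for any camera offset)
    // would then be Math.atan2(left, forward).
}
```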

So the TL;DR of the question: I am basically trying to figure out how to translate the detection onto the floor plane.
Am I understanding how 254 deals with these vectors correctly?

Ok, so first, full disclosure: I have no idea exactly how 254 did theirs, and it's been a while since I have messed with vision processing, programming, or trig, but the fundamentals are fairly basic and we have Google.

There are two ways of doing this: one requires trig and the other doesn't. I'll save the latter for the end.

Now for the first method, I need to make an assumption: your camera is not centered above your robot's center of rotation (i.e., the point your robot pivots around if one side of your drivetrain is in full forward and the other is in full reverse). This is important, since the angle your robot needs to rotate to face a ball will not be the same as the angle required to rotate your camera to face the ball.

Looking at the picture, we can see that theta, the angle from the camera's straight-ahead direction to the center of the ball, is not the same as phi, the angle from the center of rotation's straight-ahead direction to the center of the ball. We want to find phi given theta, so let's look at it in a more trigonometric kind of way.

If you have taken a geometry or maybe a pre-calc class, you may recognize that this triangle can be solved if we know a couple of side lengths and an angle. Remember, we are looking for phi, angle A in the diagram, given theta. Speaking of givens, let's list them.

  1. Theta = 45 deg
    The value of theta will change depending on the ball's position, but let's call it 45 degrees.

  2. c = 12"
    This is the distance from the center of rotation to the center of the camera. Let's say it's 12 inches, but this will depend on your robot.

  3. a = 20"
    The distance from the ball to the camera. We will discuss how to find this later, but let's call it 20 inches.

If you have ever solved triangles before, you may recognize that what we have here is an SAS, or side-angle-side, triangle. To solve this, we will use the Law of Cosines and Law of Sines. If you haven’t heard of these before, don’t worry, they are quite simple to solve in code. Let’s solve it:

  1. Since we know theta, we also know angle B by subtracting theta from 180:

    B = 180 - theta = 180 - 45 = 135 deg

  2. Now for the Law of Cosines to find side b (Note that I switched some variable names for this particular example, however the structure remains the same):

    b^2 = a^2 + c^2 - 2ac cos(B)

    or simplified to find b:

    b = sqrt(a^2 + c^2 - 2ac cos(B))

    b = 29.72"

  3. Now for the Law of Sines to find angle A:

    sin(B) / b = sin(A) / a

    or simplified to find A:

    A = arcsin((a sin(B)) / b)

    A = 28.41 deg

Thus, from theta, we have found A, or phi.
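Here is what that whole solve might look like in code (just a sketch in Java; the variable names mirror the diagram, and the 12" / 20" figures are only the example values from above):

```java
// Converts the camera's angle to the ball (theta) into the robot's angle to the ball (phi = A),
// using the Law of Cosines and Law of Sines from the steps above.
static double cameraAngleToRobotAngle(double thetaDeg, double aBallToCameraIn, double cCameraOffsetIn) {
    double bigB = Math.toRadians(180.0 - thetaDeg);
    // Law of Cosines: b^2 = a^2 + c^2 - 2ac*cos(B)
    double b = Math.sqrt(aBallToCameraIn * aBallToCameraIn
            + cCameraOffsetIn * cCameraOffsetIn
            - 2.0 * aBallToCameraIn * cCameraOffsetIn * Math.cos(bigB));
    // Law of Sines: sin(A)/a = sin(B)/b  ->  A = arcsin(a*sin(B)/b)
    double bigA = Math.asin(aBallToCameraIn * Math.sin(bigB) / b);
    return Math.toDegrees(bigA); // this is phi, measured about the robot's center of rotation
}

// cameraAngleToRobotAngle(45.0, 20.0, 12.0) comes out to roughly 28.4 degrees,
// matching the worked example above.
```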

Now, like I said previously, we still have to find side a. Since you are already doing some vision processing, you will probably be able to measure the ball's perceived diameter and compare it to its known real diameter to find its distance from the camera. I have never done this before, but it can probably be done like this.
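For what it's worth, a rough sketch of that size-to-distance idea (again, I haven't tested this on a robot; focalLengthPx would come from your camera's FOV or a one-time calibration shot of a ball at a known distance):

```java
// Pinhole-camera similar triangles: pixelSize / focalLength = realSize / distance
static double distanceToBallInches(double ballDiameterIn, double ballDiameterPx, double focalLengthPx) {
    return ballDiameterIn * focalLengthPx / ballDiameterPx;
}
```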

Since we have side a, we can find theta. If the center of the camera is facing straight ahead, we measure the horizontal distance (d) from the center of the ball to the center of the camera in pixels and convert it to inches (or whatever) using the processes in the link above. Then it's just a simple matter of arcsin( d / a ) to find the angle theta.
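In code, that last step might look something like this (assuming d has already been converted from pixels to inches and a is the camera-to-ball distance from the size estimate):

```java
// theta = arcsin(d / a), where d is the horizontal offset and a is the camera-to-ball distance
static double thetaDegrees(double dHorizontalOffsetIn, double aBallToCameraIn) {
    return Math.toDegrees(Math.asin(dHorizontalOffsetIn / aBallToCameraIn));
}
```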

And now for the second method. While method one was a lot of fun to figure out, this one may be easier to implement. Basically, if the ball is to the left of the center of your camera's image, rotate the robot counterclockwise until the ball is in the middle of the image. A control loop can also be used to prevent overshooting and stalling.
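A bare-bones version of that might look like the following, where kP and the deadband are made-up numbers you would have to tune on your own drivetrain:

```java
// Simple proportional turn command from the ball's horizontal pixel offset
// (negative offset = ball left of center -> positive/counterclockwise command).
static double turnCommand(double ballOffsetPx) {
    final double kP = 0.005;        // turn power per pixel of error (needs tuning)
    final double deadbandPx = 5.0;  // close enough to center: stop turning
    if (Math.abs(ballOffsetPx) < deadbandPx) {
        return 0.0;
    }
    return -kP * ballOffsetPx;
}
```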

Anyway, hope this helps!

This all makes sense! The only difference is my camera is basically centered on the robot… but not centered in its view horizontally… meaning it's pointed downward.
