SolvePnP seems…sensitive

It’s the off season, and we’re trying to get things done that didn’t happen during the regular season. High on my list is improving autonomous programming and computer vision.

We’re using OpenCV on a JeVois camera. My main task right now is to use the vision targets to guide the robot to the right spot to place hatch panels. Fairly straightforward. Lots of teams have done it, and yet we didn’t get it done this season.

There are a few ways to approach this task, but here’s the one I chose:

  1. Use an LED ring and low sensitivity settings on the camera to make finding the targets easy.
  2. Within the image, find the two largest regions.
  3. Draw an oriented bounding rectangle around each region. This bounding rectangle is assumed to be the location of a piece of tape.
  4. Find the corners of the rectangles in the image.
  5. Associate the corners in the 2-D image with known points in 3-D space, i.e. the known locations of the vision target corners. I’m only using the highest two corners of each rectangle, although I could use more.
  6. Use SolvePnP to determine camera position and orientation with respect to the known point locations. (A rough code sketch of this pipeline follows the list.)
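
Roughly, that pipeline looks like the sketch below. Treat it as an outline rather than my exact code: the HSV threshold range is a placeholder, cameraMatrix and distCoeffs come from the camera calibration, and the image points have to be listed in the same order as the 3-D object points.

    // Rough outline of the pipeline above, not my exact code.
    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <utility>
    #include <vector>

    // The two highest corners (smallest y) of a rotated rect, ordered left to right.
    static std::vector<cv::Point2f> topCorners(const cv::RotatedRect &r)
    {
        cv::Point2f p[4];
        r.points(p);
        std::sort(p, p + 4, [](const cv::Point2f &a, const cv::Point2f &b) { return a.y < b.y; });
        if (p[0].x > p[1].x) std::swap(p[0], p[1]);
        return { p[0], p[1] };
    }

    bool findTargetPose(const cv::Mat &frame, const std::vector<cv::Point3f> &objPoints,
                        const cv::Mat &cameraMatrix, const cv::Mat &distCoeffs,
                        cv::Mat &rvec, cv::Mat &tvec)
    {
        // 1-2. Threshold the LED-ring return and find the two largest regions.
        cv::Mat hsv, mask;
        cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);
        cv::inRange(hsv, cv::Scalar(50, 100, 100), cv::Scalar(90, 255, 255), mask);  // placeholder range
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
        if (contours.size() < 2) return false;
        std::sort(contours.begin(), contours.end(),
                  [](const std::vector<cv::Point> &a, const std::vector<cv::Point> &b)
                  { return cv::contourArea(a) > cv::contourArea(b); });

        // 3. Oriented bounding rectangles, assumed to be the two pieces of tape.
        cv::RotatedRect r0 = cv::minAreaRect(contours[0]);
        cv::RotatedRect r1 = cv::minAreaRect(contours[1]);
        if (r0.center.x > r1.center.x) std::swap(r0, r1);   // left tape first

        // 4-5. The two highest corners of each rectangle, in the same order as objPoints.
        std::vector<cv::Point2f> imagePoints = topCorners(r0);
        std::vector<cv::Point2f> rightPts = topCorners(r1);
        imagePoints.insert(imagePoints.end(), rightPts.begin(), rightPts.end());

        // 6. Camera pose with respect to the known target points.
        return cv::solvePnP(objPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
    }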

I set the origin of the world coordinates to be centered between the two vision targets, at the level of the inner corners at the top of the rectangle.
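
In code that just means the object points handed to SolvePnP are written in inches relative to that origin, with X to the right, Y down, and every tape corner sitting in the Z = 0 plane. A sketch, with the numeric offsets left as placeholders to fill in from the actual target drawings (the tape is tilted, so the outer top corners sit a little lower than the inner ones):

    // Sketch of the 3-D object points, in inches, relative to the origin described
    // above (X right, Y down, all corners in the Z = 0 plane). The numeric offsets
    // are placeholders -- fill them in from the real target drawings.
    const float kInnerHalfGap = 4.0f;  // placeholder: half the gap between the two inner top corners
    const float kOuterDX      = 1.9f;  // placeholder: horizontal offset from an inner to an outer top corner
    const float kOuterDY      = 0.5f;  // placeholder: how much lower the outer top corner sits

    std::vector<cv::Point3f> objPoints = {
        { -kInnerHalfGap - kOuterDX, kOuterDY, 0.0f },  // left tape, outer top corner
        { -kInnerHalfGap,            0.0f,     0.0f },  // left tape, inner top corner
        {  kInnerHalfGap,            0.0f,     0.0f },  // right tape, inner top corner
        {  kInnerHalfGap + kOuterDX, kOuterDY, 0.0f },  // right tape, outer top corner
    };
    // The order here has to match the order of the image points fed to solvePnP.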

My camera is using 320x240 resolution. I placed the camera 36 inches away from the targets, as close to facing straight ahead as I could. (The camera is on the side of the robot, and it’s difficult to place with extreme precision. I got what I think is close enough.)

So, I take some pictures, and find the points of interest in the image. The coordinates of the found points are

(105, 194), (124, 197), (195, 194), (214, 187)

Using the known coordinates of the tops of the tape, and the camera matrix and distortion coefficients provided with the JeVois camera, SolvePnP tells me I am 35.85 inches away. Not bad.
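
(For clarity, the distance I’m quoting comes from the translation vector that SolvePnP returns; I’m treating its length as the distance, and since the object points are in inches, tvec comes back in inches too. Roughly:)

    // Distance to the target origin, taken as the length of the translation vector.
    // Units match objPoints, i.e. inches here.
    cv::Mat rvec, tvec;
    cv::solvePnP(objPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
    double distanceInches = cv::norm(tvec);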

Unfortunately, not every frame comes out the same. The next frame had coordinates

(104, 194), (124, 197), (195, 194), (214, 188)

Those coordinates are nearly identical. Two of the corners have a one pixel difference. With that one pixel difference on two out of the four points, SolvePnP says that the camera is 16.86 inches away, and all the angles are totally wrong.

Is there some trick to using SolvePnP? I know other teams have used it. I’ve used it myself without problems on robots outside of FIRST, and I’ve never noticed this extreme sensitivity to variation. (My experience with it is pretty limited. Just a few sample programs on a Raspberry Pi robot.) I have heard that there is some instability in the algorithm, and that SolvePnPRansac might do better. I’m planning on trying that next, but if anyone has some relevant experience to lend me on the subject, I would be grateful. That much variation due to a one-pixel difference in location would render it pretty useless on a moving robot.
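
For reference, solvePnPRansac is nearly a drop-in replacement, so what I’m planning to try looks roughly like the snippet below. The iteration count and reprojection-error threshold are untuned guesses, and with only four points there isn’t much for RANSAC to throw out, so I’m not expecting miracles.

    // Rough sketch of the RANSAC variant. The iteration count and
    // reprojection-error threshold are untuned guesses.
    cv::Mat rvec, tvec;
    std::vector<int> inliers;
    cv::solvePnPRansac(objPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec,
                       false,   // useExtrinsicGuess
                       100,     // iterationsCount (guess)
                       2.0f,    // reprojectionError, in pixels (guess)
                       0.99,    // confidence
                       inliers);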

When I was initially trying to put together solvePnP on our JeVois, the readings were garbage. When I increased the resolution from 320x240 to 640x480, without changing the algorithm, the readings were nearly spot on. Try running your camera at a higher resolution and see if the values it gives you get any better.

Fundamentally, solving for a full 6-DOF pose from a single camera perspective is a game of resolution. Lower resolutions just can’t provide it. If only shutters were easier to synchronize…

We managed to do this this season, with what sounds like roughly your setup, but it was a little tricky.

  1. You say you are using the top 4 corners of the tape. I suspect that this is your biggest problem. Those 4 corners are nearly in a straight line, so you really are only feeding a 1-dimensional object into the algorithm; it really needs a proper 2-D shape. We used the 4 outer corners and got good results. The outer corners have significant separation in both dimensions of the image. Also, even when a hatch is placed on the target, the outer corners are, most of the time, not obscured. (A rough sketch of pulling out the outer corners follows this list.)
  2. With this setup, we usually got good results for the distance and the “turn angle” of the robot (the angle to turn to point straight at the target). Those two quantities are “easily” tied to the width of the target and the center point in the image. The other angle (rotation of the target w.r.t. the robot), however, was quite sensitive, and we originally had it bouncing +/- 15 degrees. We solved this by using minAreaRect() to “fit” the two regions, instead of approxPolyDP(). It sounds like you are doing this, but it is worth a mention.
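
Roughly, pulling the outer corners out of the two fitted rectangles looks like the sketch below (not our actual code, and you still have to order the points to match your object points):

    // Rough sketch: the four outer corners of the target, given the two fitted
    // rotated rectangles. The returned order must match the 3-D object points.
    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <utility>
    #include <vector>

    std::vector<cv::Point2f> outerCorners(cv::RotatedRect left, cv::RotatedRect right)
    {
        if (left.center.x > right.center.x) std::swap(left, right);

        cv::Point2f lp[4], rp[4];
        left.points(lp);
        right.points(rp);

        // Left strip: the two corners farthest to the left (smallest x).
        std::sort(lp, lp + 4, [](const cv::Point2f &a, const cv::Point2f &b) { return a.x < b.x; });
        // Right strip: the two corners farthest to the right (largest x).
        std::sort(rp, rp + 4, [](const cv::Point2f &a, const cv::Point2f &b) { return a.x > b.x; });

        // Within each pair, put the top (smaller y) corner first.
        auto byY = [](const cv::Point2f &a, const cv::Point2f &b) { return a.y < b.y; };
        std::sort(lp, lp + 2, byY);
        std::sort(rp, rp + 2, byY);

        return { lp[0], lp[1], rp[0], rp[1] };  // left-top, left-bottom, right-top, right-bottom
    }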

I suspected this might be part of the problem. I’m at the build room now, and that was next on my list of things to do, but I thought I would check the thread before I jumped in.

I did find, in some other work, that minAreaRect() tended to be better than approxPolyDP(), so that’s what I’m using now. approxPolyDP() sometimes cut an area too short.

I will also try increasing resolution, but I wanted to avoid that, because I’m planning on doing some more things, and processing time will definitely become an issue. I was just surprised at how extreme the value change was for such a small change in input. A one-pixel difference on one corner changed the value by three inches, but one pixel on two corners made the numbers garbage.

Thanks to everyone who replied. I’ll post the results after I rerun the tests.

Yes, I think that was the root of our problem with the 2nd angle fluctuating dramatically. A 1-2 pixel change on the size of one of the strips would lead to a big change in the angle.

After much effort, I’m getting some pretty good numbers. Adding the bottom corners to the calculation helped immensely. I was still getting wrong numbers, though…until, after checking all sorts of possibilities, I redid my world calculations and realized I had used a tape length of 5 inches when calculating the position of the bottom corners. (The actual tape length is 5.5 inches.)

Now I still have to make it more robust and tie it into the actual motion, but at least I’m getting a good set of stable calculations.

Just when I thought it was safe to drive a robot autonomously………

I was pretty happy when my numbers were good, as I said above, but then I started actually paying attention. The distances were correct, but the angles have issues.

People actually use this function to control their autonomous routines, right? Here’s what’s happening when I try.

I get the tape locations and calculate their positions as world coordinates, and I find the corresponding spots in image coordinates. For both world and image coordinates, I’m using the X axis to the right and the Y axis down, so the Z axis is forward. Now, I put the camera at a 30 degree angle, pointing toward the targets. The algorithm finds the targets and matches image points to object points correctly. What I expect to get out of SolvePnP is a rotation vector of (0, pi/6, 0). (Of course it isn’t exactly that, but it ought to be close.) Instead, what I get is (-pi, 0, pi/6).

The distances, in the returned translation vector, seem very good, within 2 inches of reality, and varying in the correct way if I move the camera around.

I think, but I am not certain, that what is happening is that there are multiple solutions to the SolvePnP problem, and I’m getting an alternate solution with the camera placed “behind” the targets. I can’t explain the swap between the Y and Z axes that way, though. I read about the “multiple solution” problem after I couldn’t figure out why it kept wanting to flip axes around. A professor was explaining it on YouTube and noted that the problem is especially bad for coplanar points, which is exactly what we are dealing with when we use tape targets.

One disturbing thing involves the way I check for accuracy, which is by reprojecting points. In other words, I do this:

    // Solve for the pose, then project the 3-D object points back into the
    // image using that answer; the result should match the original image points.
    cv::solvePnP(objPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
    std::vector<cv::Point2f> reprojectedPoints;
    cv::projectPoints(objPoints, rvec, tvec, cameraMatrix, distCoeffs, reprojectedPoints);

When I check the values of reprojectedPoints using the answer from SolvePnP, they don’t match the original image points. However, if I put in my expected answer of (0, pi/6, 0) with the expected translation values, projectPoints returns points that are almost exactly the values in the input image. It seems like even if I were getting an alternate solution, it would still project back to the correct points.

I’ve tried SolvePnP and SolvePnPRansac, with and without an extrinsic guess. I also upgraded the JeVois to the latest software, which has Python 3.7 and OpenCV 4, hoping that there was something pathological about earlier versions of OpenCV.

We used OpenCV two years ago. One thing to check early on is to verify you are only finding two rectangles. Our robot did much better on the practice field when we closed the window blinds for our Limelight on a sunny day. Seeing something besides the tape will corrupt the position.

It’s finding the right rectangles in the right places. That part is fine. It just doesn’t return what I expect from solvePnP.

One thing I discovered is that I was misinterpreting the output rVec (it’s an axis/angle representation, not Euler angles), although the result in this specific case would be unaffected. Tomorrow, I’ll translate my results to Euler angles and see if that makes a difference. I doubt it will fix the problem, but hope springs eternal. Even if the problem is not fixed, at least I will know a bit more than I did yesterday.

I suggest you don’t try using the “rVec” directly. You should unpack it using the routine Rodrigues() to a 3x3 rotation matrix. That worked correctly for us.

You can see our Python code to go from rVec to rot_matrix to individual angles here:
https://github.com/ligerbots/VisionServer/blob/master/server/rrtargetfinder2019.py
Look in routine compute_output_values().

We also wrote a white paper:
A Step by Step Run-through of FRC Vision Processing
One of the sections explains how we find the angles from the solvePnP output.
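
If you’re working in C++, the same idea looks roughly like this sketch (it’s an illustration of the approach, not a transcription of our Python, so double-check the angle conventions against compute_output_values()):

    // Sketch: distance plus two angles from the solvePnP outputs.
    // rvec and tvec are the (double) Mats returned by cv::solvePnP.
    #include <opencv2/opencv.hpp>
    #include <cmath>

    struct PoseOutputs { double distance, angle1, angle2; };

    PoseOutputs computeOutputValues(const cv::Mat &rvec, const cv::Mat &tvec)
    {
        // Distance and the "turn angle" to the target come straight from tvec
        // (x is right, z is forward in the camera frame).
        double x = tvec.at<double>(0);
        double z = tvec.at<double>(2);
        double distance = std::sqrt(x * x + z * z);
        double angle1 = std::atan2(x, z);       // angle to turn to face the target

        // Unpack rvec into a 3x3 rotation matrix, then express the camera's
        // position in the target's coordinate system to get the second angle
        // (how the target is rotated with respect to the robot).
        cv::Mat rot;
        cv::Rodrigues(rvec, rot);
        cv::Mat rotInv = rot.t();               // transpose == inverse for a rotation
        cv::Mat negTvec = -tvec;
        cv::Mat camInTarget = rotInv * negTvec; // camera position in target coordinates
        double angle2 = std::atan2(camInTarget.at<double>(0), camInTarget.at<double>(2));

        return { distance, angle1, angle2 };
    }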


Thanks. I had figured that out. Is there a way to do anything about the “flip” problem? It’s not truly a problem; it occurs because, with coplanar targets like these, the camera could be in front of them or behind them, and there’s really no way to know which. What I found was that when solvePnP returned the solution with the camera in front of the targets, everything I did worked. When it returned the “mirror” solution, nothing worked, so I had to do Rodrigues and extract the angles.
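
For anyone who hits the same thing: one option I’ve seen but haven’t wired in yet (so treat this strictly as a sketch) is solvePnPGeneric, available in newer OpenCV 4 builds, which returns every candidate pose so you can reproject with each one and keep whichever actually matches the image.

    // Sketch only -- assumes an OpenCV build new enough to have solvePnPGeneric.
    // SOLVEPNP_IPPE is intended for four or more coplanar points (like the tape)
    // and produces both candidate poses; keep the one that reprojects best.
    std::vector<cv::Mat> rvecs, tvecs;
    cv::solvePnPGeneric(objPoints, imagePoints, cameraMatrix, distCoeffs,
                        rvecs, tvecs, false, cv::SOLVEPNP_IPPE);

    int best = -1;
    double bestErr = 1e30;
    for (size_t i = 0; i < rvecs.size(); ++i)
    {
        std::vector<cv::Point2f> reproj;
        cv::projectPoints(objPoints, rvecs[i], tvecs[i], cameraMatrix, distCoeffs, reproj);
        double err = cv::norm(imagePoints, reproj, cv::NORM_L2);  // total reprojection error
        if (err < bestErr) { bestErr = err; best = static_cast<int>(i); }
    }
    // If best >= 0, rvecs[best] / tvecs[best] is the pose that matches the image,
    // i.e. the one with the camera actually in front of the targets.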

Anyway, it’s working pretty well now. I’ll be posting what I did (for the benefit of anyone who finds the thread via a search) Real Soon Now.

I am not sure why you are seeing a “flip” (though I do kind of understand what you are saying). The position of the target relative to the camera is actually indicated by the tVec value, and I think it “knows” that a seen object is always in front of the camera. The rVec value only indicates the rotation between the two coordinate systems. In my experience, it has never been out of whack, but I have always unpacked it with Rodrigues(), so maybe that removes any ambiguity.