Computer Vision Processing - Team 5450

Hello, my name is Brandon Trabucco. I am the lead programmer of FRC Team 5450 SHREC. This is a guide to our team's computer vision algorithm. We are using Java with OpenCV version 3.1.0 on the roboRIO.

Installation Notes

Here are some steps for installing OpenCV on the roboRIO. Be sure to call System.load("path to the OpenCV .so library file") to load the native library that gets installed on the roboRIO file system. You can use an FTP client such as FileZilla to find the file. Our team found the file under "/usr/local/lib/libopencv_java310.so". PuTTY is a good SSH client for issuing the necessary opkg commands to install the library.
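For example, a static initializer near the top of your vision class might look something like the sketch below; the path is the one our team found, and yours may differ depending on how the library was installed.

    static {
        // Load the native OpenCV library before any OpenCV classes are touched.
        // This path is where our team found the file; adjust it for your install.
        System.load("/usr/local/lib/libopencv_java310.so");
    }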

Necessary Setup

As I stated above, be sure to call System.load to load the library before you try to use any OpenCV features. I would highly suggest implementing a thread specifically for vision processing due to the memory-intensive nature of image processing. The rest of this guide will assume that you already have a working understanding of Java, OpenCV, and trigonometry. See here for documentation on OpenCV.
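As a rough sketch of what a dedicated vision thread might look like (the class name and camera index here are placeholders, not our team's exact code):

    import org.opencv.core.Mat;
    import org.opencv.videoio.VideoCapture;

    public class VisionLoop implements Runnable {
        private final VideoCapture camera = new VideoCapture(0); // USB camera 0
        private volatile boolean running = true;

        @Override
        public void run() {
            Mat frame = new Mat();
            while (running) {
                if (camera.read(frame)) {
                    // ... filter, find contours, compute distances here ...
                }
            }
        }

        public void stop() { running = false; }
    }

    // Started from robot code with:
    // new Thread(new VisionLoop()).start();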

Step One, Filtering the Image

In order for the computer to accurately identify the desired target, it is necessary to apply a filter to the image. I would suggest converting the image into HSV format and then having OpenCV save the filtered binary image to a file that can be opened from an FTP client. That way, as you tweak the threshold values, you can see their live effect on the filtered image. Saturation and value thresholds range from 0 to 255; note that OpenCV stores hue in the range 0 to 179 for 8-bit images.
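A minimal sketch of this filtering step is below, assuming "frame" is a BGR image from the camera. The HSV bounds and the output path are placeholders; tune them against your own target and lighting.

    import org.opencv.core.Core;
    import org.opencv.core.Mat;
    import org.opencv.core.Scalar;
    import org.opencv.imgcodecs.Imgcodecs;
    import org.opencv.imgproc.Imgproc;

    public class Filtering {
        public static Mat threshold(Mat frame) {
            Mat hsv = new Mat();
            Mat binary = new Mat();
            Imgproc.cvtColor(frame, hsv, Imgproc.COLOR_BGR2HSV);
            // Keep only pixels whose hue/saturation/value fall inside the bounds.
            Core.inRange(hsv, new Scalar(60, 100, 100), new Scalar(90, 255, 255), binary);
            // Save the binary image so it can be pulled off the roboRIO over FTP.
            Imgcodecs.imwrite("/home/lvuser/filtered.png", binary);
            return binary;
        }
    }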

You will also need to remove some of the remaining noise from your filtered image by using the erode and dilate functions in OpenCV.
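For instance, a quick cleanup pass might look like this sketch; the 3x3 kernel size is just a starting point to tune, not a magic number.

    import org.opencv.core.Mat;
    import org.opencv.core.Size;
    import org.opencv.imgproc.Imgproc;

    public class Cleanup {
        public static void removeNoise(Mat binary) {
            Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(3, 3));
            Imgproc.erode(binary, binary, kernel);   // remove small specks of noise
            Imgproc.dilate(binary, binary, kernel);  // restore the target's original size
        }
    }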

Step Two, Locating the Target

OpenCV comes with its own methods for locating and mapping objects in a camera feed. You can use the findContours() and contourArea() functions to find the outlines and areas of objects that match your threshold. Since findContours() returns a list, our team used logic to select the object with the largest area; however, different selection methods are possible. Afterwards, you can call boundingRect() on the selected contour to get the bounding rectangle and x-y coordinates of the vision target.
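A sketch of that target search is below: find the contours in the binary image, pick the one with the largest area, and take its bounding rectangle. The class name is a placeholder.

    import java.util.ArrayList;
    import java.util.List;
    import org.opencv.core.Mat;
    import org.opencv.core.MatOfPoint;
    import org.opencv.core.Rect;
    import org.opencv.imgproc.Imgproc;

    public class TargetFinder {
        public static Rect findTarget(Mat binary) {
            List<MatOfPoint> contours = new ArrayList<>();
            Imgproc.findContours(binary, contours, new Mat(),
                    Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
            MatOfPoint largest = null;
            double largestArea = 0;
            for (MatOfPoint contour : contours) {
                double area = Imgproc.contourArea(contour);
                if (area > largestArea) {
                    largestArea = area;
                    largest = contour;
                }
            }
            // Returns null if nothing matched the threshold this frame.
            return (largest == null) ? null : Imgproc.boundingRect(largest);
        }
    }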

Step Three, Determining Knowns and Unknowns

By this point, it is important to define the variables that we currently know, including the ones derived from our image search. These are the x and y positions of the target (the high goal tape), the height and width of the target, and the ratio between that height and width. We also need to define some static variables: the actual physical height and width of the goal. Depending on the camera used, you will have a different field of view to work with. Our team uses a Microsoft LifeCam HD-3000. We found the specifications for it here.
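As a sketch, these knowns could be collected into one place like the class below. The per-frame values come from the bounding rectangle; the constants come from the game manual and the camera datasheet. The numbers shown are placeholders, not official dimensions.

    public class VisionKnowns {
        // From the image search (updated every frame).
        double targetX, targetY;               // pixels
        double apparentWidth, apparentHeight;  // pixels
        double apparentRatio;                  // apparentHeight / apparentWidth

        // Static knowns (fill in from the game manual and camera datasheet).
        static final double PHYSICAL_WIDTH_INCHES = 20.0;   // placeholder
        static final double PHYSICAL_HEIGHT_INCHES = 14.0;  // placeholder
        static final double CAMERA_FOV_DEGREES = 60.0;      // placeholder
        static final double IMAGE_PIXEL_WIDTH = 640;
    }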

The goal of the vision processing is to calculate the hypotenuse distance to the goal, the horizontal distance to the goal, and the angle we are viewing the goal from. All of these values can be derived by applying trigonometry to the vision target, as in the next few steps.

Step Four, The Trig

We can now use the variables that we defined to make some calculations. Since we know that the viewing angle affects the apparent x and y dimensions of an object, we can estimate the viewing angle of the vision target. The formula is as follows: viewingAngle = arccos(apparentHeightInPixels / actualHeightInPixels). It is important to note that the apparent height and actual height must both be in pixel units for this to work. You can estimate the actual height based on the apparent width and the known aspect ratio between the physical width and height: actualHeightInPixels = apparentWidth * (physicalHeight / physicalWidth). This method assumes that there is little to no perspective distortion in the apparent x dimension of the vision target, which holds for distances greater than about a meter. It is possible to represent the relationship between the changing apparent height and width as a function, but that is a challenge that I encourage you to try and solve on your own.
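A sketch of that formula in code might look like this; all inputs are in pixels except the two physical dimensions, which only enter as a ratio.

    public class ViewingAngle {
        public static double compute(double apparentWidthPx, double apparentHeightPx,
                                     double physicalWidth, double physicalHeight) {
            // What the height would measure, in pixels, if we were looking straight on.
            double actualHeightPx = apparentWidthPx * (physicalHeight / physicalWidth);
            // Clamp so measurement noise cannot push the ratio above 1 and break acos.
            double ratio = Math.min(apparentHeightPx / actualHeightPx, 1.0);
            return Math.acos(ratio); // radians
        }
    }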

We now know the viewing angle and have compensated for perspective distortion.

With trigonometry, we can also use triangulation to find the distance to the goal from the camera. Be sure to keep all of your measurements in the same units. With a known physical width of the goal and the angle that width subtends as seen by the camera, we can set up a relationship: distanceToGoal = physicalWidth / tan(theta). This is all well and good, but what is theta? We can infer theta from a proportion, since we know the FOV of the camera, the pixel dimensions of the image, and the corrected size of the target. This looks like: theta = k * (apparentWidth / imagePixelWidth) * FOV. Because every camera has some amount of lens distortion, it is necessary to include a constant k to approximate that distortion. This constant can be tuned after the program is fully set up to improve accuracy.
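Here is a sketch of that calculation; the FOV is assumed to be in radians, and k is the lens-distortion fudge factor you tune after everything else works.

    public class DistanceToGoal {
        public static double compute(double apparentWidthPx, double imagePixelWidth,
                                     double horizontalFovRadians, double physicalWidth,
                                     double k) {
            // Angle subtended by the goal's width, by proportion to the full FOV.
            double theta = k * (apparentWidthPx / imagePixelWidth) * horizontalFovRadians;
            // Distance comes back in the same units as physicalWidth.
            return physicalWidth / Math.tan(theta);
        }
    }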

Lastly, the ground distance to the bottom of the vision target can be approximated using the viewingAngle and distance to the goal. This takes the form: groundDistance = distanceToGoal * cos(viewingAngle). This ground distance can be used to determine if your team’s robot is close enough to the goal to fire the boulder using a known maximum shooting distance.
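Putting the last two results together might look like the sketch below; the maximum shooting distance is a placeholder for whatever your shooter can actually make.

    public class GroundDistance {
        static final double MAX_SHOOTING_DISTANCE = 120.0; // placeholder, same units as distanceToGoal

        public static double compute(double distanceToGoal, double viewingAngleRadians) {
            // Project the hypotenuse distance down onto the floor.
            return distanceToGoal * Math.cos(viewingAngleRadians);
        }

        public static boolean inRange(double groundDistance) {
            return groundDistance <= MAX_SHOOTING_DISTANCE;
        }
    }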

What Can I Use These Calculations For?

One good use of these angles and distances is to automatically calculate an ideal angle at which to fire the boulder so that it goes through the goal. Our team accomplished this using a parabolic model to predict the boulder's flight path as a function of the shooting angle theta. We then used Newton's Method to iterate through angles and find the smallest possible angle that would put a boulder into the high goal. I challenge you to create your own innovative methods to accomplish this.
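To show the shape of the idea (not our exact model), the sketch below evaluates a simple parabolic flight path at the goal's horizontal distance and uses Newton's Method with a numeric derivative to solve for the angle where the boulder reaches the goal height. All constants and names here are placeholders.

    public class ShootingAngle {
        static final double G = 386.1; // gravity, in/s^2 (placeholder; match your units)

        // Boulder height when it reaches the goal, for a given launch angle (radians).
        static double heightAtGoal(double theta, double launchSpeed,
                                   double launchHeight, double groundDistance) {
            double c = Math.cos(theta);
            return launchHeight + groundDistance * Math.tan(theta)
                    - G * groundDistance * groundDistance
                      / (2 * launchSpeed * launchSpeed * c * c);
        }

        // Newton's Method on f(theta) = heightAtGoal(theta) - goalHeight.
        public static double solve(double launchSpeed, double launchHeight,
                                   double groundDistance, double goalHeight) {
            double theta = Math.toRadians(30); // a low initial guess biases toward the smaller solution
            for (int i = 0; i < 20; i++) {
                double f = heightAtGoal(theta, launchSpeed, launchHeight, groundDistance) - goalHeight;
                double h = 1e-4; // step for the numeric derivative
                double fPrime = (heightAtGoal(theta + h, launchSpeed, launchHeight, groundDistance)
                        - heightAtGoal(theta - h, launchSpeed, launchHeight, groundDistance)) / (2 * h);
                theta -= f / fPrime;
            }
            return theta;
        }
    }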

I hope this is a help to your robotics team.

Best Regards
Brandon Trabucco
Lead Programmer
FRC Team 5450

Are you using a Windows computer?

We are using Windows 10. We installed OpenCV 3.1.0 onto the C drive of our laptop as well as onto the roboRIO. It is imported into Eclipse as an external library pointing to the compiled JAR and DLL files.

I see you linked the datasheet for the LifeCam, but something I noticed is that it only specifies the diagonal field of view. Did you calculate/find the horizontal and vertical fields of view? If so, how/where?

I think it would be more reliable to calculate distance based on the y position of the goal in the image rather than trying to use its apparent width. It would be much more robust when the camera isn't face-on to the goal, though it assumes a known camera height and angle.

We found the horizontal and vertical fields of view using right triangle trig and proportionality. We can measure the pixel dimensions of the incoming image from the camera. Assuming that the image is not stretched or distorted from the lens by a significant amount, the diagonal, horizontal, and vertical fields of view make a similar triangle to that of the whole image.
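A sketch of that proportionality is below: split the diagonal FOV between the horizontal and vertical directions using the image's pixel dimensions. This is the same approximation described above, not an exact pinhole-camera calculation.

    public class FieldOfView {
        // Returns { horizontalFov, verticalFov } in the same units as diagonalFovDegrees.
        public static double[] fromDiagonal(double diagonalFovDegrees,
                                            double pixelWidth, double pixelHeight) {
            double diagonalPixels = Math.sqrt(pixelWidth * pixelWidth + pixelHeight * pixelHeight);
            double horizontal = diagonalFovDegrees * (pixelWidth / diagonalPixels);
            double vertical = diagonalFovDegrees * (pixelHeight / diagonalPixels);
            return new double[] { horizontal, vertical };
        }
    }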

We considered mounting our camera at a stationary angle; however, we found the camera had a limited range in which it could find the goal, and that range changed depending on how the camera was angled. We found that having it mounted on our pivoting shooter arm was a bit more flexible.

nice writeup!

If anybody wants an actual example of what this code might look like, you can see Team 3019's vision code here. The GitHub project runs on the driver station, but it can still be used for what the OP describes in their post.

What you need to know to calculate distance from the goal:

  • Height of the camera image
  • Vertical FOV
  • Y-distance from the center of the image to some known point on the goal (such as the center of the U)
  • The distance from that point to the ground
  • The distance from the camera to the ground
  • The angle from the camera to the ground

From these, you can calculate the angle of the target point above the horizontal and the distance from the camera to the target point by constructing a right triangle, with the height difference between the camera and the target point as the vertical leg, and the angle you just found as the angle opposite that leg.
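A sketch of that approach is below; the variable names are mine, angles are in radians, and the heights must share one unit.

    public class DistanceFromY {
        public static double groundDistance(double targetYPx, double imageHeightPx,
                                            double verticalFovRadians,
                                            double cameraAngleRadians,
                                            double cameraHeight, double targetHeight) {
            // Angle of the target point above the image center, by proportion to the vertical FOV.
            double pixelOffset = (imageHeightPx / 2.0) - targetYPx; // positive means above center
            double angleAboveCenter = (pixelOffset / imageHeightPx) * verticalFovRadians;
            // Total angle above horizontal: camera tilt plus the in-image offset.
            double angleAboveHorizontal = cameraAngleRadians + angleAboveCenter;
            // Right triangle: the height difference is the leg opposite that angle.
            double heightDifference = targetHeight - cameraHeight;
            return heightDifference / Math.tan(angleAboveHorizontal);
        }
    }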

Sorry about the poor formatting, I’m on mobile