ML Vision for the High Goal

During the offseason, the programming team of 834 was able to get machine learning vision running on basically any Linux system. WPILib recently published documentation about creating machine learning vision, which includes a link to a Supervise.ly dataset of roughly 4,500 images. The members of the team working on annotation have gotten around 1,000 images done. Overall, there are about 2,000 balls (more than enough to get a start) but only about 100 goals. I understand that the images were intended for power cell training, but it would be nice to have goal tracking as well. Are there a lot of goal pictures later in the dataset, or is there anywhere I can find more pictures of the goal? Has anyone else tried to track the high goal with vision?


I believe that most people would rather use the retroreflective tape to track the high goal. I bet the reason they have so many power cell photos to train with is that power cells aren’t reflective.


Hi, I am a part of the WPILib machine learning team.

We chose not to label the inner port, as it’s hard to get a good view of the inner port from many robot orientations. Also, you can use some trigonometry or linear algebra (depending on what kind of vision algorithm you are using) to determine the relative location of the inner port from a detection of the outer port.
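The trigonometry mentioned above can be sketched roughly as follows. This is a minimal illustration, not WPILib code: the 29.25 in recess depth, the function name, and the frame conventions are all assumptions here; check the game manual for the exact field dimensions.

```python
import math

# Assumed depth of the inner port behind the outer port opening
# (29.25 in on the 2020 field); verify against the game manual.
INNER_PORT_DEPTH_M = 29.25 * 0.0254

def inner_port_position(outer_x, outer_y, port_heading_rad):
    """Estimate the inner port's position in the robot frame.

    (outer_x, outer_y): outer port centre relative to the robot, metres.
    port_heading_rad:   direction pointing from the outer port straight
                        into the goal, measured in the robot frame.

    The inner port sits INNER_PORT_DEPTH_M behind the outer port along
    that direction, so we just translate the outer-port point.
    """
    return (outer_x + INNER_PORT_DEPTH_M * math.cos(port_heading_rad),
            outer_y + INNER_PORT_DEPTH_M * math.sin(port_heading_rad))

# Head-on: outer port 3 m straight ahead, goal recessed along +x.
ix, iy = inner_port_position(3.0, 0.0, 0.0)

# Offset 1 m to the side of the same outer port: the bearing you must
# aim at for the inner port is shallower than the bearing to the outer.
jx, jy = inner_port_position(3.0, 1.0, 0.0)
```

In a real pipeline the (outer_x, outer_y) input would come from whatever pose estimate your vision code already produces (e.g. a solvePnP result or a distance/angle pair converted to Cartesian coordinates).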

@GoalkeeperBoss you are correct that we labelled the Power Cells because they aren’t reflective. We also want to see teams possibly add another layer of autonomy to their robots: not only autonomous scoring with “traditional” CV, but also autonomous intake using machine learning.

Hope this is insightful.


Out of curiosity, what are the benefits of using ML vs traditional CV?

Are the compute times faster? Is it less susceptible to lighting changes?

Don’t you need a good GPU for such tasks (CUDA)?

ML/AI is an area I want to get into, but I just don’t know much about it in terms of real-world applications.

I cannot speak to the speed as we are just getting started with ML, but a good accelerator does help. That is what the Google Coral provides: it’s an Edge TPU (a dedicated ML ASIC, not strictly a GPU) that speeds up inference on the video feed on the robot.

Sometimes. Lots of people use a GPU for training a model and a CPU for executing it. It really depends on the code and the task.

Quote from the WPILib blog:

Advantages of using machine learning over other methods are that it is easy to recognize objects without retroreflective tape, work under a variety of lighting conditions without tuning, and recognize multiple types of elements at the same time.
