23,000+ Image Note Dataset

Dataset Colab is software that allows FRC teams to collaborate on a large object detection / vision dataset. It currently hosts a dataset of over 7,000 images that is free to download. This makes it really easy for teams to train custom game object detection models!

Users can upload their Roboflow or COCO datasets, which are then automatically combined into the large community dataset (after review). Dataset Colab is open source, and contributions are welcome on GitHub at Team4169/datasetcolab (a dataset collaboration tool for FRC).


Neat.

Sent you a PR to make creating accounts a bit easier (I get real irked when Chrome doesn’t suggest passwords)


What is the difference between the “note” and “orange-donut” labels?

If they imported the Roboflow Roboflow dataset (not a typo; the first instance refers to the account owner), the notes were tagged as “orange-donut”, not “note”.

Hello,

I saw that you asked a question about the difference between the “note” and “orange-donut” labels. I was wondering whether you were asking generally or whether you saw both labels within the Dataset Colab dataset?

Thank you,
Sean

We saw both. The COCO categories our script saw in the Dataset Colab dataset were orange donut, note, and robot.

I just found what you are referencing, and I am working to resolve it now. Thanks again for bringing it up.


This should be fixed now. The only categories are “note” and “robot”, both within the “objects” supercategory.
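For anyone still holding an older download with the duplicate label, here is a minimal sketch of how the two labels could be merged locally. It assumes the standard COCO annotation layout (`categories` with `id` and `name`, `annotations` with `category_id`). The function names `merge_category` and `label_counts` are made up for illustration, and this is not necessarily how the fix was done on the Dataset Colab side.

```python
from collections import Counter


def merge_category(coco, old_name, new_name):
    """Return a copy of a COCO dict with `old_name` folded into `new_name`.

    Hypothetical helper: assumes the usual COCO keys and does not touch images.
    """
    categories = coco["categories"]
    ids = {c["name"]: c["id"] for c in categories}
    if old_name not in ids:
        return dict(coco)  # nothing to merge

    old_id = ids[old_name]
    if new_name in ids:
        # Both labels exist: drop the old category and repoint its annotations.
        new_id = ids[new_name]
        new_categories = [c for c in categories if c["id"] != old_id]
    else:
        # Only the old label exists: rename it and keep its id.
        new_id = old_id
        new_categories = [
            dict(c, name=new_name) if c["id"] == old_id else c for c in categories
        ]

    new_annotations = [
        dict(a, category_id=new_id) if a["category_id"] == old_id else a
        for a in coco.get("annotations", [])
    ]
    return dict(coco, categories=new_categories, annotations=new_annotations)


def label_counts(coco):
    """Count annotations per category name, handy for spotting stray labels."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(names[a["category_id"]] for a in coco.get("annotations", []))
```

Running `label_counts` on the loaded annotation JSON before and after `merge_category(coco, "orange-donut", "note")` should show whether any stray labels remain.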

Update: Dataset Colab now has over 23,000 images in the public FRC 2024 dataset. Sign up to download this dataset and train custom note detection models! Users can also contribute to this dataset by uploading Roboflow or COCO datasets, which are automatically combined (after review).


This is awesome! Do you think this could be published on https://www.kaggle.com/? It’s a website for deep learning where users can share datasets and use cloud GPUs for free.


We could possibly post on Kaggle. What are the benefits of posting on Kaggle rather than just hosting all the datasets on our website?

Update: Thanks to @GearFox and @goran25, you can now download datasets from Dataset Colab in the COCO, YOLO, and TFRecord formats. In addition, Dataset Colab now has over 25,000 images that can be used to train powerful object detection models!
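For anyone curious how the COCO and YOLO box formats relate, here is a rough sketch of the usual conversion. It assumes the standard conventions: COCO boxes are `[x_min, y_min, width, height]` in pixels, and YOLO label lines are `class x_center y_center width height` normalized to the image size. The function names are made up for illustration and this is not the site's actual export code.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO pixel box to normalized YOLO (x_center, y_center, w, h)."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,  # box center, as a fraction of image width
        (y_min + h / 2) / img_h,  # box center, as a fraction of image height
        w / img_w,
        h / img_h,
    )


def yolo_line(class_index, bbox, img_w, img_h):
    """Format one YOLO label line for a COCO box (hypothetical helper)."""
    values = coco_bbox_to_yolo(bbox, img_w, img_h)
    return " ".join([str(class_index)] + [f"{v:.6f}" for v in values])
```

For example, `yolo_line(0, [100, 50, 200, 100], 400, 200)` describes a box centered in the image that covers half of each dimension.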

Coming soon:

  • Pretrained models will be available to download.
  • Improved dataset viewing with an image grid and other related CSS improvements to the website.
  • We will be setting up a test server so the production server will not be down when we are implementing a new feature.

I think publishing on Kaggle would reduce the barrier to entry for students who would like to experiment with creating deep learning models but don’t have access to GPUs. I may just be projecting here but I don’t have a GPU and the only way I can train models right now is through GPUs on the cloud through websites like Kaggle.

But honestly, I can just use the curl command to download the dataset and use it in a Kaggle notebook, so I guess there may not be much of a benefit to publishing it. It also might be an extra hassle to keep updating the Kaggle dataset when new images are added.

Regardless of if you end up publishing it, thanks for the great work you’ve been doing!

Hello, the YOLO-format label files all look empty.

Hi Gordon, I agree completely that being able to train a model without a GPU is vital. I can look into automatically uploading to Kaggle. I’ll keep you posted. Thank you for the suggestion!

Thank you for letting us know! Everything should be fixed now.
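If you want to double-check a fresh download, a quick sketch like the one below lists the label files that contain no boxes. It assumes the YOLO labels are `.txt` files in a single folder; `find_empty_labels` is a made-up helper name, and the folder path in the usage note is only an example. Keep in mind that in common YOLO tooling an empty label file can be legitimate for an image with no objects, so the warning sign is every file being empty, not a handful.

```python
from pathlib import Path


def find_empty_labels(label_dir):
    """Return the names of .txt label files that have no non-blank content."""
    empty = []
    for path in sorted(Path(label_dir).glob("*.txt")):
        # A file with only whitespace or newlines counts as empty.
        if not path.read_text().strip():
            empty.append(path.name)
    return empty
```

Calling `find_empty_labels("train/labels")` and comparing the length of the result with the total number of label files gives a rough health check.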

Awesome! Thanks!

Came here to ask the same… any preferred training techniques or libraries?

We poked around at this over the past weekend… and quickly found ourselves down a rabbit hole, figuring out how to create some sort of TensorFlow model to run on an Orange Pi/Raspberry Pi.

Hi @arguablybetter and @joemost,

We are currently working on a new page where you will be able to download pretrained models to use directly on the Orange Pi/Raspberry Pi, Google Coral, and Luxonis OAK-D. Along with this, we will be adding a documentation page with more information about how we trained each of these models and the steps to follow. I’ll keep you posted on the progress, but we are hoping to finish it by the end of the week!

Thank you,
Sean


that’d be sick! Thank you!