Object Detection in NVIDIA DIGITS

Hi all,

I’m working through some modeling for object detection using NVIDIA DIGITS and a dataset for the FRC Power Cube (some of DV8’s dataset from last year, with some supplemental images from the web and some that I’ve taken). Right now it’s just learning in preparation for next year, but I’m having problems getting precision above 40-45% and mAP above 20%. I haven’t started any custom coding yet; I’m just working with datasets, the default DetectNet and GoogLeNet models, and the detectnet-camera code on the TX2.

Does anyone have any guidance on how to improve precision, or what would be considered acceptable values? Any thoughts would be appreciated.

Regards,
Rich Meesters

What’s the size of your dataset?

I have 633 images in the train_db and 98 in the val_db

Regards,
Rich.

And the size of your images? And the size of the objects in them? Pretty sure it is designed for a fixed input size, so if yours don’t match that size you might have to resize them when creating your dataset. Lots of info here, for example: https://github.com/NVIDIA/DIGITS/issues/980
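For what it’s worth, the DetectNet example in DIGITS uses a fixed 1248×384 input and KITTI-format labels (class name, then truncation, occlusion, alpha, and the pixel bounding box). Here’s a rough sketch of pre-resizing images and rescaling the matching label boxes before import; the 1248×384 target and the folder layout are just assumptions based on the stock example, so adjust for your setup:

```python
# Sketch: resize images to DetectNet's fixed input and rescale KITTI labels.
# Assumes images and KITTI-format .txt labels share base filenames, and the
# 1248x384 input size from the stock DIGITS DetectNet example.
import os
from PIL import Image

TARGET_W, TARGET_H = 1248, 384

def resize_pair(img_path, label_path, out_img_dir, out_label_dir):
    img = Image.open(img_path)
    sx = TARGET_W / img.width   # horizontal scale factor
    sy = TARGET_H / img.height  # vertical scale factor
    img.resize((TARGET_W, TARGET_H), Image.BILINEAR).save(
        os.path.join(out_img_dir, os.path.basename(img_path)))

    out_lines = []
    with open(label_path) as f:
        for line in f:
            parts = line.split()
            # KITTI columns: class truncated occluded alpha xmin ymin xmax ymax ...
            xmin, ymin, xmax, ymax = (float(v) for v in parts[4:8])
            parts[4:8] = [f"{xmin * sx:.2f}", f"{ymin * sy:.2f}",
                          f"{xmax * sx:.2f}", f"{ymax * sy:.2f}"]
            out_lines.append(" ".join(parts))
    with open(os.path.join(out_label_dir, os.path.basename(label_path)), "w") as f:
        f.write("\n".join(out_lines) + "\n")
```

Note that a plain resize like this squashes the aspect ratio; DIGITS can also pad instead, which keeps the cubes looking square.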

This is awesome, thanks for using the dataset!

You shouldn’t worry too much about all the stats in the DIGITS graph, since they depend on a lot of factors specific to your dataset and model. Just know that if loss is going down and mAP is going up, you are on the right track.

For me, 200 epochs was enough to feel comfortable deploying on the robot, but since you have more images than I used, you should be able to get a good result in fewer. The best thing to do is to try different models and loss functions and just test them out, so you learn what works and what doesn’t. This too is different for every model you train, and is part of the process of training a good model. You can stage training sessions in a queue with DIGITS so that you can train a bunch of differently configured models overnight if you have the power, or just keep DIGITS running for a few days.

Here’s what my training graph looked like, for reference. DIGITS bugged out in the early part where it gets bumpy, but you get the idea:

[training graph screenshot]

Since you are using DIGITS, the images should be automatically resized for you, so there’s no need to set that up. I am working on an easy library for deep learning in FRC, so stay tuned for that as well. Good luck!

Thanks for the reference, I’ll have a look!

Regards,
Rich

Thanks Issac. It was definitely helpful to have a default dataset to start learning with. Once I got over issues with tags, I was able to get object detection going 🙂 I’m seeing similar graphs, and I’ve been able to pull the model down to the TX2 and run it using the detectnet-camera code (provided in the jetson-inference package; two days to a demo). I’m just concerned about false positives in the image, although adding to the dataset and running longer training (more epochs) seems to have addressed some of that.
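One knob for cutting false positives is the detection confidence threshold. The detectnet-camera demo itself is C++, but the jetson-inference Python bindings expose the same network; here’s a minimal sketch using the older camera API. The model paths, the 0.75 threshold, and the exact flag names are assumptions pieced together from the jetson-inference docs, so check them against your version:

```python
# Sketch: run a DIGITS-trained DetectNet on the TX2 with a raised confidence
# threshold to suppress false positives. Paths and the 0.75 threshold are
# placeholders; the blob names match the stock DetectNet deploy prototxt.
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet(argv=[
    "--model=snapshot_iter_XXXX.caffemodel",   # placeholder: your DIGITS snapshot
    "--prototxt=deploy.prototxt",
    "--labels=class_labels.txt",
    "--input_blob=data",
    "--output_cvg=coverage",
    "--output_bbox=bboxes",
    "--threshold=0.75",                        # raise to cut false positives
])

camera = jetson.utils.gstCamera(1280, 720, "/dev/video0")

while True:
    img, width, height = camera.CaptureRGBA()
    detections = net.Detect(img, width, height)
    for d in detections:
        print(f"cube at ({d.Center[0]:.0f}, {d.Center[1]:.0f}) "
              f"confidence {d.Confidence:.2f}")
```

Raising the threshold trades recall for precision, so it’s worth tuning alongside the longer training runs.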

I’ll keep testing. Would love to collaborate a bit as we get closer to the season 🙂

Regards,
Rich.