Is anyone else having issues with the accuracy of a trained model, or is it just me? (Note: I didn't include the Filming Day 1 Images section because of the difference in resolution.) The confidence levels are ranging from 30% to 90%. I also didn't use AWS because of the error; I used Google Cloud Vision to create a TensorFlow Lite model.
Disclaimer: not a programmer, don’t really know what I’m talking about
From my understanding of how machine learning works, the accuracy of an ML model is a function of the quantity and quality of the images you put in. (From my understanding) if you are not getting adequate results, you need to feed your model more images. In particular, figure out what is causing the low-confidence predictions and feed it a lot more images like those.
Again, I’m not sure if what I’m saying is nonsense, as I do a lot more of the mechanical stuff, but this is what makes sense from my understanding of how ML works.
Need more information. How many images did you train your network on? What resolution are those images? What’s your train/test split? What problem are you even trying to solve with ML?
Will also want to know what kind of validation method is being used. Holdout? Cross-validation? Mostly we're looking to learn what size of dataset you're actually working with here. A low volume of test data can also lead to low confidence.
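For anyone unfamiliar with the difference: holdout keeps one fixed slice aside for evaluation, while k-fold cross-validation rotates which slice is held out so every image gets evaluated exactly once. A minimal sketch of a k-fold split in plain Python (the filenames are made up for illustration):

```python
# Minimal k-fold cross-validation split (illustrative only).
# Each image lands in the validation set of exactly one fold.

def kfold_splits(items, k):
    """Yield (train, val) lists for each of k folds."""
    folds = [items[i::k] for i in range(k)]  # round-robin fold assignment
    for i in range(k):
        val = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, val

images = [f"ball_{n}.jpg" for n in range(10)]
for train, val in kfold_splits(images, k=5):
    print(len(train), len(val))  # 8 train, 2 val per fold
```

With only a couple hundred images, cross-validation gives a much less noisy accuracy estimate than a single small holdout set.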
My plan is to label more images, I just would rather have an easier fix.
Apparently you’re in company with Randall Munroe.
- 210 images
- 80% of images are used for training.
- 10% of images are used for hyper-parameter tuning and/or to decide when to stop training.
- 10% of images are used for evaluating the model.
- Detecting the yellow balls
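That 80/10/10 breakdown can be reproduced with a simple shuffled split; a rough sketch in Python (210 made-up filenames, matching the counts above):

```python
import random

# Rough sketch of the 80/10/10 split described above (filenames are made up).
images = [f"img_{n}.jpg" for n in range(210)]
random.seed(0)  # fixed seed so the split is reproducible
random.shuffle(images)

n_train = int(0.8 * len(images))  # 168 images for training
n_val = int(0.1 * len(images))    # 21 images for tuning / early stopping
train = images[:n_train]
val = images[n_train:n_train + n_val]
test = images[n_train + n_val:]   # remaining 21 images for evaluation

print(len(train), len(val), len(test))  # 168 21 21
```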
Try decreasing your training set and increasing your validation set. 10% validation seems way too low in my experience. Your model also might be overfit to a specific situation that most of your dataset's pictures share (same background/lighting, or taken with a specific camera), in which case you will have very bad accuracy for any input image that isn't from that same environment.
I typically use 25% if there’s enough data available. If you can tag more images that would obviously be best…
That was the default for Google Cloud Vision, and that's why I used it. Do you think that 210 images would be enough for a 25% validation split?
I’m also retraining with the entire dataset, but only 31 images have labels, the other 386 have no objects.
That's your problem. You need much more than 31 labeled images. Most if not all of your dataset should contain at least one labeled object.
It was 108 when I properly did it, but for anyone in the future who needs the labeled balls: https://app.supervise.ly/share-links/QGCRJNevlgQwDmBv0QaEVT2f4ITKahti7J8hqI2J0z6x81mQSDAQyalP9NJHFc94
Have you experimented with simpler methods? Neural networks are amazing tools, but many times (especially with small data sets) problems can be tackled as well or better by simpler methods, such as conventional computer vision techniques or (for non-CV tasks) linear/logistic regression. Neural networks can model very complex relationships in data, but present opportunities to overfit, which is when a model memorizes the training data instead of learning the underlying patterns.
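As a concrete example of a conventional approach: for bright yellow balls, a plain color threshold can get surprisingly far with zero training data. A toy sketch with NumPy (in practice you'd likely use OpenCV's HSV thresholding plus contour detection; these RGB cutoffs are just guesses, not tuned values):

```python
import numpy as np

def yellow_mask(rgb):
    """Boolean mask of roughly-yellow pixels: high red and green, low blue.
    Thresholds are illustrative guesses, not calibrated values."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (r > 150) & (g > 150) & (b < 100)

# Synthetic 8x8 image: mostly gray, with a 3x3 "yellow ball" patch.
img = np.full((8, 8, 3), 80, dtype=np.uint8)
img[2:5, 3:6] = [255, 220, 30]

mask = yellow_mask(img)
ys, xs = np.nonzero(mask)
print(mask.sum())                              # 9 yellow pixels found
print(ys.min(), ys.max(), xs.min(), xs.max())  # bounding box: 2 4 3 5
```

A thresholding pipeline like this has no concept of overfitting and runs fast on a coprocessor, at the cost of being sensitive to lighting.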
What is your neural network architecture? It may be worth playing with the number of hidden layers, how many nodes are in each layer, etc. The more layers and nodes you have, the more complex relationships your network can express, which can improve accuracy or introduce overfitting.
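To make the layers/nodes tradeoff concrete, the parameter count (and with it, the capacity to overfit) grows quickly with width and depth. A quick back-of-envelope calculator for a fully-connected network (the layer sizes here are arbitrary examples, not the poster's architecture):

```python
def mlp_param_count(layer_sizes):
    """Total weights + biases for a fully-connected network.
    e.g. [784, 64, 10] = 784 inputs, one hidden layer of 64, 10 outputs."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(mlp_param_count([784, 64, 10]))        # 50890 parameters
print(mlp_param_count([784, 256, 256, 10]))  # 269322 parameters
```

With 210 training images, even the smaller network has hundreds of parameters per training sample, which is why regularization and validation matter so much here.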
How similar is your dataset to the testing set and the actual conditions where the model will be applied? Your testing error is only meaningful if it’s relevant to the application.
I would recommend generating more samples if possible. If you’re working with images, 210 is WAY smaller than the number of features (the number of pixels in the image) you’re working with.
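For scale: even a modest input resolution gives far more features than 210 samples. Assuming a 224x224 RGB input (a common size for exported mobile models; the poster's actual resolution is unknown):

```python
# Features per image at an assumed 224x224 RGB resolution (an assumption,
# not the poster's actual image size).
width, height, channels = 224, 224, 3
features = width * height * channels
samples = 210

print(features)             # 150528 features per image
print(features // samples)  # ~716 features for every training image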
Cool that you’re using machine learning!
I’d recommend using more images to try to get a more comprehensive model.
Yeah, you need a ton more images. We have over 2000 balls labeled for ours, and it works well. We use a different vision system though. The WPILib system seems to have several problems and things that just don’t make sense. I think they rushed development too fast. If it doesn’t turn out for you, please stay open to other solutions.
That’s what I did, but our team labeled the entire dataset. Link
What’s something that doesn’t make sense? I can look into it if you give me a direction.