Plastic Waste Detection using YOLO

I’m now working on a project called “PlasticAI”, which aims to detect plastic waste on beaches. As an experiment, I trained an object detection model on the custom dataset that I collected during Expedition 1 in June. The result of this experiment clearly demonstrates the power of object detection. Seeing is believing, so I’ll show the outcomes first.


the resulting images with predicted bounding boxes

The trained model precisely predicted the bounding box of a plastic bottle cap.

Training Dataset

Until I actually trained the model, I wasn’t sure whether the amount of training data was sufficient, because plastic waste is very diverse in shape and color. As it turned out, as long as the shapes of the objects are similar, this amount of data proved to be enough.

During the last expedition to Makuhari Beach in June, I shot a lot of images. I had no difficulty finding bottle caps on the beach. That is sad in itself, but I wound up with 484 pictures of bottle caps, which is a generous amount of training data for a single class.

The demanding part of preparing training data is annotating bounding boxes on each image file. I used a customised BBox-Label-Tool [1].

a sample of bounding-box annotation
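
For reference, here is a minimal sketch of the conversion from corner-style annotations to the normalized format YOLO expects. It assumes a label file whose first line is the number of boxes and whose remaining lines are "x_min y_min x_max y_max" in pixels, which is the stock BBox-Label-Tool layout; a customised tool may differ.

```python
from PIL import Image  # pip install Pillow


def convert_annotation(label_path, image_path, out_path, class_id=0):
    """Convert absolute corner boxes to YOLO's normalized center format."""
    w, h = Image.open(image_path).size
    lines = open(label_path).read().splitlines()
    n_boxes = int(lines[0])  # first line: number of boxes in the image
    with open(out_path, "w") as out:
        for line in lines[1:1 + n_boxes]:
            x_min, y_min, x_max, y_max = map(float, line.split())
            # YOLO format: class x_center y_center width height, all in [0, 1]
            x_c = (x_min + x_max) / 2.0 / w
            y_c = (y_min + y_max) / 2.0 / h
            bw = (x_max - x_min) / w
            bh = (y_max - y_min) / h
            out.write(f"{class_id} {x_c:.6f} {y_c:.6f} {bw:.6f} {bh:.6f}\n")
```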

Just for convenience, I open-sourced the dataset on GitHub [2] so that other engineers can use it freely.

marine_plastics_dataset
https://github.com/sudamasahiko/marine_plastics_dataset

Training

Training was done on an AWS P2 instance and took about 20 minutes. Over the course of the training process, the validation loss dropped rapidly.
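
The exact command depends on the setup, but with a darknet-based YOLO build the training invocation looks roughly like the sketch below. All paths are placeholders: obj.data points at the train/validation lists and class names, yolo-obj.cfg is the network definition adjusted for one class, and darknet53.conv.74 is the usual pre-trained convolutional weights file.

```python
import subprocess

# Hypothetical paths; adjust to your own data and cfg files.
subprocess.run(
    [
        "./darknet", "detector", "train",
        "data/obj.data",       # dataset description (paths, class names)
        "cfg/yolo-obj.cfg",    # network definition for 1 class
        "darknet53.conv.74",   # pre-trained convolutional weights
    ],
    check=True,
)
```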

One thing I want to note is that there is a spike in the middle of training, which probably means that the network escaped from a local minimum and continued learning.

The average validation loss eventually dropped to about 0.04, although this score alone doesn’t tell me whether the model is good enough to perform the intended detection.
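
To see the curve, I plotted the running average loss that darknet prints at every iteration. The snippet below is a rough sketch that assumes the console output was captured to a file named training.log and follows darknet’s usual per-iteration line format.

```python
import re
import matplotlib.pyplot as plt

# Matches darknet's per-iteration summary line, e.g.
# "123: 0.123456, 0.098765 avg, 0.001000 rate, 3.2 seconds, 7872 images"
LOSS_RE = re.compile(r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+)\s+avg")

iterations, avg_losses = [], []
with open("training.log") as f:  # captured console output (assumed name)
    for line in f:
        m = LOSS_RE.match(line)
        if m:
            iterations.append(int(m.group(1)))
            avg_losses.append(float(m.group(3)))

plt.plot(iterations, avg_losses)
plt.xlabel("iteration")
plt.ylabel("average loss")
plt.savefig("loss_curve.png")
```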

Test

For the test run, I used a couple of images that I had set aside from the training dataset, which means these images were unseen by the trained neural network. Prediction ran quickly, and I got the resulting images.


the resulting images with predicted bounding boxes

It’s impressive that the predicted bounding boxes are so precise.
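
If you’d like to reproduce a test run without building darknet itself, OpenCV’s DNN module can load darknet cfg and weights files directly. Here is a minimal sketch; the file names are placeholders for the trained model, and the 416x416 input size and 0.5/0.4 thresholds are common defaults, not tuned values.

```python
import cv2
import numpy as np

# Placeholder file names for the trained model and a held-out test image.
net = cv2.dnn.readNetFromDarknet("yolo-obj.cfg", "yolo-obj_final.weights")
out_layers = net.getUnconnectedOutLayersNames()

img = cv2.imread("test_image.jpg")
h, w = img.shape[:2]

# YOLO expects a normalized square input blob (416x416 is a common default).
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)

boxes, confidences = [], []
for output in net.forward(out_layers):
    for det in output:
        score = float(det[5:].max())  # best class score for this candidate
        if score > 0.5:
            # det[0:4] holds normalized center-x, center-y, width, height.
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(score)

# Non-maximum suppression drops overlapping duplicate boxes.
indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
for i in np.array(indices).flatten():
    x, y, bw, bh = boxes[i]
    cv2.rectangle(img, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
cv2.imwrite("result.jpg", img)
```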

Recap

The main takeaway of this experiment is that detecting plastic bottle caps just works, and it encourages further experiments.

References

[1] BBox-Label-Tool
https://github.com/puzzledqs/BBox-Label-Tool

[2] marine_plastics_dataset
https://github.com/sudamasahiko/marine_plastics_dataset

Train Object Detection System with 3 Classes

Deep-learning-based object detection is a state-of-the-art and powerful technique, and over the course of the last couple of years a lot of progress has been made in this field. It has momentum and, I think, huge potential for the future.

Now is high time for actual implementation to solve real problems. The project “Microplastic AI” aims to build an AI that can detect plastic debris on beaches, and object detection is going to be a core technology of the project.

As described in the last article, training YOLO with 1 class was a success (Train Object Detection System with 1 Class). To delve even deeper into this system, I extended the training dataset and ended up with 3 classes. Just for your convenience, I open-sourced the training data as follows.

GitHub repository:
jp_coins_dataset

The training data has 524 images in total. In addition, the dataset has text files in which the bounding boxes are annotated, and some config files come with it. At first, I had no idea whether this amount of data carried enough feature information to detect the objects, but the end result was pleasing.
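
Before training, it’s worth sanity-checking how the annotations are distributed across the 3 classes. The snippet below is a quick sketch that counts boxes per class id from YOLO-format label files; the labels directory name is a guess, so adjust it to the repository’s actual layout.

```python
from collections import Counter
from pathlib import Path

# Each YOLO label line starts with a class id, so counting first tokens
# yields the per-class box distribution of the dataset.
counts = Counter()
for label_file in Path("jp_coins_dataset/labels").glob("*.txt"):  # assumed path
    for line in label_file.read_text().splitlines():
        if line.strip():
            counts[int(line.split()[0])] += 1

for class_id, n in sorted(counts.items()):
    print(f"class {class_id}: {n} boxes")
```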

Here’s one of the results.

An interesting takeaway is the comparison between the model trained with 1 class and the one trained with 3 classes. The 3-class model clearly outperforms in prediction accuracy. I think this is because a 1 yen coin and a 100 yen coin are similar in color, and with both classes present in training, the neural network seems to have learned the subtle difference between them.

Put differently: suppose you have an image containing an object that you want to detect, and visually similar objects may appear nearby. In a situation like that, you should train on not only your target object but also the similar ones, because that gives you better detection accuracy.

With this experiment done successfully, the Microplastic AI project has come one step closer to reality.

References:
Darknet official project site:
https://pjreddie.com/darknet/yolo/
GitHub repository:
https://github.com/AlexeyAB/darknet

Expedition #1

At Makuhari Beach, 28th June 2019.

Learning plastic fragments is presumably challenging for an AI in many ways, because plastic waste in general is very diverse in shape and color, which makes it harder to acquire the ability to generalize what plastic waste should look like. So it’ll be a good approach to split the problem up into several stages. For starters, I’ll be focusing on detecting plastic bottle caps.

Both luckily and sadly, the beach that day was full of plastic bottle caps. I took pictures of them with a digital camera and my smartphone, and wound up with about 500 images in total, which should be a generous amount of training data for one class of object.

It’s still hard for me to tell whether this will work until I train the neural network. Let’s give it a shot.