If you're into deep learning, you're probably aware of the state-of-the-art object detection system YOLO. Its name stands for 'You Only Look Once'. Its creators recently released version 3 (YOLOv3).
Ok, so what's different about YOLO?
Well, according to the creators, other detection systems apply a model to the image at multiple locations and at different scales. Regions with higher scores are taken as detections.
YOLO, on the other hand, applies a single neural network to the entire image. The network divides the image into regions and predicts bounding boxes and probabilities for each region. The bounding boxes are then weighted by the predicted probabilities. In their words:
"Our model has several advantages over classifier-based systems. It looks at the whole image at test time so its predictions are informed by global context in the image. It also makes predictions with a single network evaluation unlike systems like R-CNN which require thousands for a single image.
This makes it extremely fast, more than 1000x faster than R-CNN and 100x faster than Fast R-CNN. See our paper for more details on the full system." [source]
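To make the "regions, boxes, and probabilities" idea a bit more concrete, here's a minimal sketch in plain Python. It is not the actual YOLO network or its code: the grid values are made up by hand, and names like `CellPrediction`, `decode_grid`, and the 0.5 threshold are my own assumptions for illustration. It only shows the last step, where each region's box is weighted by its best class probability and low-scoring boxes are dropped. Real implementations also predict several boxes per region and apply non-max suppression, which I've left out.

```python
from typing import List, NamedTuple, Tuple


class CellPrediction(NamedTuple):
    """Hypothetical output for one grid region (one box per region for simplicity)."""
    box: Tuple[float, float, float, float]  # (x, y, w, h), normalized to the image
    objectness: float                       # confidence that the box contains an object
    class_probs: List[float]                # one probability per class label


class Detection(NamedTuple):
    row: int
    col: int
    box: Tuple[float, float, float, float]
    label: str
    score: float


def decode_grid(grid: List[List[CellPrediction]],
                labels: List[str],
                threshold: float = 0.5) -> List[Detection]:
    """Weight each region's box by its best class probability and keep high scores."""
    detections = []
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            # Pick the most likely class for this region.
            best = max(range(len(labels)), key=lambda i: cell.class_probs[i])
            # The box confidence is weighted by the class probability.
            score = cell.objectness * cell.class_probs[best]
            if score >= threshold:
                detections.append(Detection(r, c, cell.box, labels[best], score))
    return detections


# Hand-written 2x2 "network output", standing in for a single forward pass.
labels = ["dog", "bicycle"]
grid = [
    [CellPrediction((0.25, 0.25, 0.4, 0.4), 0.90, [0.8, 0.2]),
     CellPrediction((0.75, 0.25, 0.2, 0.2), 0.10, [0.5, 0.5])],
    [CellPrediction((0.25, 0.75, 0.3, 0.3), 0.60, [0.3, 0.7]),
     CellPrediction((0.75, 0.75, 0.5, 0.4), 0.95, [0.1, 0.9])],
]

detections = decode_grid(grid, labels)
for det in detections:
    print(f"region ({det.row}, {det.col}): {det.label} score={det.score:.2f}")
```

The point is that all regions are scored from one set of outputs, so the image only needs to go through the network once, instead of running a classifier separately at many locations and scales.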
So, in this third version, the authors used a few tricks to improve training and boost the system's performance. The full details, for those interested in the technicalities, are in the paper below. And here's the release video:
YOLOv3 definitely takes computer vision to the next level, if you ask me.
To stay in touch with me, follow @cristi
Cristi Vlad Self-Experimenter and Author
Very interesting! Object detection has gotten quite a bit more sophisticated over the last couple of years.
YOLO, You Only Look Once. Thanks for the information.
That eagle-dog classification is scary xD