Gao and colleagues (2017) recently published a computer vision and pattern recognition paper on object classifiers capable of recognizing 100,000 categories within a single convolutional neural network.
They argue that classifiers at this scale are needed for modern visual search and for mobile applications. Training a network over such a large vocabulary, however, is a challenge: the biggest barriers are slow training speed and very large model sizes.
One solution would be to train multiple networks, each specializing in a single domain, such as cars, pets, or household items. The complexity would then shift to combining these expert networks, which raises issues of scalability, latency, and computational burden.
Their response to this is:
"To address these challenges, we propose a Knowledge Concentration method, which effectively transfers the knowledge from dozens of specialists (multiple teacher networks) into one single model (one student network) to classify 100K object categories." [source]
Three major points in their method are:
- multi-teacher single student framework for knowledge distillation
- self-paced learning allowing the student to learn from multiple teachers at different paces
- an optimized network architecture (structurally connected layers) with a limited parameter budget.
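To make the first point more concrete, here is a minimal NumPy sketch of the multi-teacher distillation idea: the softened predictions of several teacher networks are averaged into a single target distribution that the student is trained to match. The function names, the averaging scheme, and the temperature value are illustrative assumptions on my part, not details taken from the paper.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; a higher T softens the distribution,
    # exposing more of the teacher's "dark knowledge" about similar classes.
    z = logits / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_teacher_distillation_loss(student_logits, teacher_logits_list, T=2.0):
    # Average the softened predictions of all teachers into one target,
    # then compute the cross-entropy between that target and the
    # student's own softened output (lower is better).
    teacher_probs = np.mean([softmax(t, T) for t in teacher_logits_list], axis=0)
    student_log_probs = np.log(softmax(student_logits, T) + 1e-12)
    return float(-(teacher_probs * student_log_probs).sum(axis=-1).mean())
```

In practice this loss would be minimized by gradient descent on the student's parameters; a student whose logits agree with the teachers incurs a lower loss than one that disagrees.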
The paper is technical, and I'd recommend it to practitioners in the field and to anyone with a particular interest in the topic. I wouldn't recommend it to a lay audience, though, as the concepts may be hard to grasp. You can read it in full by following the link below. It's not an easy Sunday read :)
To stay in touch with me, follow @cristi
Cristi Vlad Self-Experimenter and Author