Global smartphone sales exceed one billion units a year, making the smartphone the largest entry point for AI hardware. The voice assistant, one of the key forms in which AI reaches consumers, is now moving into that entry point.
On the one hand, the Internet giants are pushing to spread voice interaction and occupy the largest entry point to intelligent hardware. On the other hand, mobile phone manufacturers are treating AI capabilities as the key to product iteration, innovation, and a better user experience. Driven by these two forces, voice assistants are becoming a standard feature of phones.
Through observation of the mobile voice assistant market and interviews with industry leaders, we found that in a market of a billion mobile phone voice assistants, three kinds of players are actually leading behind the scenes, and together they make up the core strength of mobile voice assistants.
Three categories of voice assistants gather in mobile phones
In 2017, the world's top ten smartphone manufacturers accounted for nearly 70% of the global market. Whether through in-house development or cooperation with third parties, all ten ship voice assistants on their phones, such as Apple's Siri, Samsung's Bixby, and Huawei's Xiao E.
A report from the market research organization IDG shows that global smartphone shipments totaled 1.462 billion units in 2017. On that basis, the world's top ten manufacturers shipped more than one billion units (70% of 1.462 billion is about 1.02 billion), which means that in 2017 alone roughly one billion phones shipped with a voice assistant.
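The estimate above is simple arithmetic, and a short Python sketch makes it explicit. The two constants are the figures quoted in the article; the function name is only illustrative, and because the share is stated as "nearly 70%", the result should be read as approximate.

```python
# Figures quoted in the article. The share is "nearly 70%",
# so the result is an approximation, not an exact count.
GLOBAL_SHIPMENTS_2017 = 1.462e9  # smartphone units shipped worldwide in 2017
TOP_TEN_SHARE = 0.70             # market share of the top ten manufacturers


def estimate_top_ten_units(total_units: float, share: float) -> float:
    """Approximate units shipped by manufacturers holding `share` of the market."""
    return total_units * share


top_ten_units = estimate_top_ten_units(GLOBAL_SHIPMENTS_2017, TOP_TEN_SHARE)
print(f"{top_ten_units / 1e9:.2f} billion units")  # prints "1.02 billion units"
```

Since all ten manufacturers ship a voice assistant, this figure is also the rough number of voice-assistant-equipped phones shipped that year.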
Some smaller mobile phone brands, such as Meizu, 360, and Sharp, also have voice assistants. Luo Yonghao, the founder of Smartisan, has also vigorously promoted the upcoming launch of a new generation of voice interaction system. Voice assistants have been shipping on the major phone brands for some time and have become a standard feature.
At present, voice assistants in mobile phones can be roughly divided into three categories:
The first category is the system-level voice assistant, which the handset manufacturer builds directly into the system. Apple's Siri, for example, can link with commonly used apps in addition to answering everyday queries. This category is the most capable.
The second category is the third-party voice assistant app (outside the system layer), such as the degree secret assistant and the aurora rhyme assistant. Users generally have to install these manually, after which they can get life services, encyclopedia answers, news, and other information by voice.
The third category is the functional voice assistant, which is often built into other high-traffic apps. For example, Taobao, Jingdong, and Baidu have built-in voice assistants that help users reach services or information conveniently. In addition, Microsoft's Xiaoice (Xiao Bing) has appeared in WeChat and Weibo, where its main function is chat.
With the spread of voice technology and voice interaction, voice assistants will become increasingly important on mobile phones, and phone manufacturers will increasingly take the lead by building the voice portal into the system as a whole. Non-system third-party voice assistant apps are likely to turn into the first or third category, because they are at a disadvantage both in wake-up and in scheduling other mobile applications. Functional voice assistants will continue to appear in more apps, giving users more convenient and varied ways to interact and improving the app experience.
Of these three types, this article focuses on the system-level voice assistant, which is dominated by mobile phone manufacturers and will be the mainstream form of mobile voice assistance. A phone voice assistant can be roughly broken down into three parts: voice technology, content services, and system optimization. Voice technology, including telephony, voice recognition, semantic understanding, and speech synthesis, is mostly implemented by technology companies. Content and services, such as search, weather, and news, are mostly supplied by content service providers. System-layer optimization, which uses the voice portal to open up and link more and more apps, is mostly done by the phone manufacturers.
Although the major mobile phone manufacturers have introduced their own voice assistants under a variety of names (Xiao E, Xiao Ai, Xiaoxi, Xiaoou, and so on), the voice technology behind them comes from a different set of suppliers.
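The three-part breakdown can be illustrated with a toy Python sketch. Everything here is hypothetical: the function names, the keyword-matching "semantic understanding", and the placeholder replies are assumptions made for illustration, and text stands in for audio. It only shows how voice technology, content services, and the system layer might hand work to one another, not how any real assistant is built.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Intent:
    name: str  # what the user wants, e.g. "weather"
    slot: str  # the argument of the request, e.g. a city


# --- Voice technology (toy stand-ins; text is used in place of audio) ---
def recognize_speech(audio: str) -> str:
    """Stand-in for speech recognition: normalizes already-transcribed text."""
    return audio.strip().lower()


def understand(text: str) -> Intent:
    """Stand-in for semantic understanding: simple prefix matching."""
    if text.startswith("weather in "):
        return Intent("weather", text[len("weather in "):])
    if text.startswith("open "):
        return Intent("open_app", text[len("open "):])
    return Intent("unknown", text)


def synthesize(reply: str) -> str:
    """Stand-in for speech synthesis: tags the reply as spoken output."""
    return f"[spoken] {reply}"


# --- Content service (hypothetical provider) ---
def weather_service(city: str) -> str:
    return f"(placeholder forecast for {city})"


# --- System layer (hypothetical app-launch hook owned by the phone maker) ---
def launch_app(app: str) -> str:
    return f"launching {app}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "weather": weather_service,
    "open_app": launch_app,
}


def handle_utterance(audio: str) -> str:
    """Run one request through recognition, understanding, dispatch, synthesis."""
    intent = understand(recognize_speech(audio))
    handler = HANDLERS.get(intent.name, lambda _: "sorry, I did not understand")
    return synthesize(handler(intent.slot))


print(handle_utterance("Weather in Beijing"))
print(handle_utterance("open Taobao"))
```

Keeping the handlers in a lookup table mirrors the division of labor described above: a new content service or system hook can be registered without touching the voice technology functions.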
The players behind mobile voice assistants
By observing the world's top ten mobile phone manufacturers in 2017, along with some other manufacturers in China, we found that only a handful of Internet giants build phones on voice technology they developed themselves. Most mobile phone manufacturers adopt a cooperative model and use third-party voice technology for their mobile voice assistants.
Currently, the voice technology behind the major mobile phone manufacturers comes mainly from three types of vendors. The first is Internet giants such as Google, Amazon, and Baidu, which typically put a complete set of voice interaction technologies and content services on the phone. Abroad, for example, competition between Amazon and Google over voice portals is fierce, and the mobile phone is an important battleground. At present, Amazon's Fire Phone, MOTO, Coolpad, and Huawei all have access to Alexa. Google is not far behind: its own Pixel, LG phones, and others have access to Google Assistant.
Among domestic players, Baidu's DuerOS has moved fastest on mobile voice portals. Baidu has announced partners such as HTC, vivo, Huawei, and Xiaomi. According to a person in charge at vivo, the company is working with Baidu DuerOS on voice interaction for in-car scenarios.
The second category is mature voice technology vendors such as Nuance, iFlytek, and Sogou. In the early days, Apple, Samsung, and others all used Nuance's speech recognition technology. However, as deep learning matured and these companies placed more emphasis on voice technology, each went its own way.
In China, the two largest providers of voice technology are iFlytek and Baidu, and iFlytek started earlier in speech technology. As early as 2010, iFlytek was providing voice technology to BAT (Baidu, Alibaba, and Tencent). An iFlytek staff member responsible for internal communications told Zhidongxi that about 80% of the voice technology in domestic mobile phones is currently provided by iFlytek. Major customers include Huawei, vivo, OPPO, Meizu, and Gionee. Cooperation mainly takes the form of technology licensing, and the resulting products are mainly voice assistants and voice-controlled camera functions.
The third category is start-up companies working on voice interaction. As voice interaction spreads and mobile phone manufacturers attach more importance to voice assistants, these companies have quietly begun to push their own core technologies. They include Orion Star, AISpeech, and Sanjiaoshou (Trio). For example, in the just-released Xiaomi MIX 2S, the "Xiao Ai Tongxue" assistant uses Orion Star's speech synthesis technology. A technical leader at Orion Star told Zhidongxi that the company is currently negotiating with other mobile phone manufacturers, but the details have not yet been disclosed.
Looking at the technology providers behind these three types of mobile voice assistants, we can see how each is responding to the rise of voice interaction and the huge entry point that phones offer. The Internet giants are accelerating delivery of complete voice interaction stacks to seize the service entry point. The established voice technology providers are relying on their strengths in speech recognition, speech synthesis, and related areas to secure the technology entry point. Startups are trying to use their own strengths in speech recognition, semantic understanding, or speech synthesis to win a share of this huge market.
The spring of mobile phone voice assistants
Since Apple introduced its mobile assistant Siri in 2011, the major mobile phone manufacturers have launched voice assistants of their own, which can be woken with a dedicated key set at the system level or opened through a voice assistant app, but usage has been low. One reason is that earlier assistants were not very intelligent: they were closer to a simple overlay of voice recognition and search, with poor interaction and user experience. The other is that their functions were limited. If you wanted to listen to music, for example, the most the assistant could do was open the music app, which is no better than a single tap on the screen, so the feature was of little value.
In the past two years, as voice interaction has become relatively mature and smart speakers have taken off, the industry has come to favor voice interaction and is deploying it as the next generation of human-computer interaction. Internet giants such as Amazon and Google have started a battle over voice entry points, and the mobile phone is one of the most important. At present, Amazon Alexa has landed on phones such as the Fire Phone, MOTO, Huawei, and Coolpad, while Google has put Google Assistant on phones such as the Pixel and LG and has opened access to phones running Android 5.0 or later.
In a relatively saturated mobile phone market, where demand is soft and product innovation is lacking, AI has become the key to product iteration and a better user experience. On the one hand, mobile phone manufacturers are bringing AI vision to phones, launching features such as object recognition and smart beautification. On the other hand, they are building more intelligent voice assistants and bringing voice interaction to phones.
Driven by these two forces, many mobile phone manufacturers have been building new mobile voice assistants. For example, Samsung introduced a new Bixby last year with voice, visual, and reminder functions.
The MIX 2S that Xiaomi released at the end of March this year is also equipped with the Xiao Ai voice assistant. A person in charge at vivo also disclosed to Zhidongxi that the company is developing a new generation of voice assistant and is working with DuerOS on voice interaction for in-car scenarios, which is expected to launch soon. In addition, Smartisan will release a voice interaction system in May this year, which Luo Yonghao has been promoting for half a year.
Mobile phone manufacturers are clearly embracing voice interaction with enthusiasm, and voice assistants are enjoying another spring. In Zhidongxi's view, the new mobile voice assistants differ from those of previous years in several major ways:
The first is the further strengthening of the voice portal. Voice is no longer just one application within the phone's interface; it is gradually becoming an entry point on the same level as the interface itself. Users can complete some services more conveniently through this portal, such as finding a restaurant and then hailing a car with one tap.
The second is greater intelligence. Voice assistants can now be woken directly by voice, and voice interaction goes deeper: Samsung's Bixby, for example, can send WeChat messages and post to Moments by voice, although users continue to complain about the experience. On the visual side, support mostly covers intelligent recognition and intelligent translation.
The third is richer content services. The core purpose of a voice assistant is to give users more convenient services, not to repeat the simple open-an-app operation of earlier assistants. Current voice assistants cover high-frequency applications such as Didi, Alipay, WeChat, Weibo, Taobao, and QQ, making it easier to link across scenarios by voice.
The fourth is deep optimization and customization for specific scenarios. Operating a phone by touch while driving, for example, is very dangerous. In driving mode, the Vivo X21 shows navigation and incoming calls in a floating window; adding voice interaction and deeper optimization for the in-car environment would make the experience better still.
Conclusion: Voice interaction is fully occupying mobile phones
The rise of smart speakers has polished voice interaction technology to a relatively mature state. While smart speakers capture markets around the world, voice assistants are also arriving in household appliances, vehicles, and lighting. Now they are entering the entry point of billions of mobile phones as well, and voice vendors face the best opportunity of the era. Internet giants and voice technology vendors alike are embracing the mobile phone entry point and seizing the opportunity.
On the other hand, as an important representative of AI technology, the voice assistant shows an increasingly obvious Matthew effect. Behind the voice assistants of the major mobile phone manufacturers, Amazon, Google, Baidu, and other Internet giants are everywhere. The voice assistant has become a contest among giants.