
Alice, the making of: Behind the scenes with the new AI assistant from Yandex

Did you ever wonder what it’s like to build an AI personal assistant, or to bridge the language gap? Hint: There’s big data and machine learning involved.
Today, another AI assistant is joining the party with Alexa, Google Assistant, Siri, Viv, and the gang. Her name is Alice, and she comes from Russia. Yandex, the Russian internet giant, has big plans for the future and Alice is a key part of those.
Recently, Yandex celebrated its 20th anniversary in Moscow, and the celebration was an opportunity to visit Yandex HQ, converse with some of its top minds, and get the lowdown on what’s cooking and how things work behind the scenes.
Like most high-tech vendors these days, Yandex is using tons of data and advanced machine learning (ML) to develop its products and services. In this case, there is the additional twist of locality and language that Yandex has to cater for, and looking at how Google and Microsoft alumni do it at Yandex illuminates the state of the art.
When ZDNet visited Moscow in late September, it was not just another day in the office for the nearly 3,000 people working in the massive Yandex HQ. It’s not every day that a company celebrates 20 years, and Yandex is alive and kicking, dominating in a market that includes Russia and its peripheral countries.
Those were the days of the final sprint for Alice’s release, too, but Yandex people were feeling confident enough about her already to showcase her to a special guest: the Russian President Vladimir Putin. Admittedly, that is more than the usual stress of releasing new products, but it all worked out in the end.
If anything, as Misha Bilenko jokingly commented during our chat the next day, you don’t want to miss a product release deadline after you’ve promised it to Putin. Bilenko joined Yandex as head of Machine Intelligence and Research (MIR) after a long stint at Microsoft, and has been heavily involved in the making of Alice, among other things.
His involvement speaks volumes about the building blocks used to create Alice, but Bilenko was definitely not the only one involved. Currently, Alice integrates Yandex services such as Search, News, Weather, Music, and Maps. Alice is available in the Yandex Search app on iOS and Android. There is a beta version for Windows, and Yandex Browser and other Yandex products will soon follow.
Denis Filippov, head of Yandex Speech Technologies, said that Alice provides advanced digital functionality to accomplish tasks with a single tool by centralizing a number of market-leading products. Filippov is in charge of SpeechKit, Yandex’s proprietary speech recognition toolkit that Alice’s voice recognition and synthesis rely on.
Filippov points out that Alice is built on a stack of search, speech, and dialogue technologies: voice activation, speech recognition, text-to-speech, natural language understanding, entity recognition, dialog management, contextual support, search, object answers, and others.
Most of these technologies are based on deep learning (DL), so a vast amount of training data is needed to bring them to superior quality. For Filippov, however, this is not a problem: “We have a great source of data since Yandex has the most popular search, geo, and taxi services in Russia and dozens of other mobile apps. We also use Yandex Toloka, our crowdsourcing platform, to collect training data.”
As for the future? Ultimately, Yandex wants Alice to become a basic platform for organizing interaction between people and devices on all possible surfaces, such as smartphones, desktops, smart homes, cars, and any others, Filippov said. Sounds Google-ish. But what about voice data retention and processing: will that be Google-ish, too?
Filippov said that voice requests to Alice are processed by Yandex servers in the cloud: “We retain some of them to widen our training set data to provide our users with better speech recognition quality. It is crucial for us to provide the highest level of privacy to our users, so we retain completely anonymous voice data without any associations with users’ accounts.”
Filippov added that Alice works as a part of other Yandex apps and can’t be exploited to control any other smartphone features, in response to concerns about recently uncovered vulnerabilities in other assistants. Ultimately, though, how does Alice rank compared to other assistants? How does one even begin to make such comparisons?
One way of doing that would be to come up with a way of measuring IQ, as was recently done in a comparative test. Filippov, however, seems to favor another metric: WER (Word Error Rate). “We wanted Alice to interact with users more like a human, so that users don’t need to adapt their requests,” he said.
“In developing Alice, we leveraged our speech technologies, which currently provide the world’s most accurate Russian language recognition. Based on WER measurements, Alice demonstrates near-human levels of speech recognition accuracy. Alice uses a hybrid dialog technology with context support; it is a mix of goal-oriented and general conversation models.
“For general conversation, Alice was trained not only with predefined answers, which is a common approach for virtual assistants at the moment, but also we went further and rolled out a neural network conversation model, which was trained on tremendously huge amounts of text dialogs from the internet.
“Alice, as all Yandex products, is a user-centric product, so we use some general set of metrics to evaluate it on the high level like daily active audience, user retention, requests per user, and others.
“The Russian language offers a unique set of challenges with its grammatical complexities and reliance on tone of voice. Yandex’s focus and expertise in the Russian language allowed us to train Alice to have a superior understanding of users and their various accents.”
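For readers unfamiliar with the metric: WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer’s output, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. The function name and the sample sentences are made up for this example, and nothing here is taken from Yandex or SpeechKit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name); words are split on whitespace
    with no text normalization, which real evaluations usually add.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One dropped word ("the") and one wrong word ("light") against 5 reference words.
print(word_error_rate("turn on the kitchen lights", "turn on kitchen light"))
```

A lower score is better, and a perfect transcript scores 0. Because insertions count as errors, WER can exceed 1 when the output is much longer than the reference.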
That’s all well and good, but if you’re not a Russian speaker, what’s it to you? At this point, not much, admittedly. But that may well change in the not so distant future, if David Talbot and his team have anything to say about that. People are already using Yandex’s translation combined with OCR, for example, so making the voice connection does not seem far-fetched.
Talbot is leading Yandex’s Machine Translation unit (MTU), after having spent about a decade working on machine translation at Google. His team’s work at this point is not focused on the spoken word, but things like natural language processing and entity recognition are both core to their work and part of Alice’s building blocks.
So, if you’re hoping to use Alice in English in the future, Talbot’s team may have to be even bigger and busier than it already is now. Thirty people may sound like a lot, like a startup within a corporation, but getting to know their herculean work may leave you wondering whether that’s even enough.
Talbot and his team had just returned from an international workshop on machine translation when we had our discussion, and that was a good opportunity to get an inside view on the latest developments in the field as well as what is used in practice.
Talbot only joined Yandex recently, but the team he now leads has been at work since 2011, initially on fixing misspelled user queries. Since that is in some sense close to being a translation problem, they were the ones to call when Yandex decided to make the non-Russian web transparent for its users.
This is MTU’s mission: to enable Yandex users to interact transparently with any part of the web in their native language. English-born Talbot, a Russian speaker himself, said that although this gives their translation work a specific focus, MTU uses the same techniques as everyone else in the field. MTU takes pride in claiming to be the best around when it comes to translation to and from Russian.
And what might these techniques be? A whole lot of ML and DL, basically. Talbot explains there have been huge changes in the field in the last couple of years, owing mostly to progress in DL: “Statistical models have dominated for a long time, but now neural machine translation is the thing.