Technology, AI & Jeff Bezos, Amazon

How Amazon Won Voice-Based Search with the Amazon Echo

Jeff Bezos, inspired by the original Star Trek series, came up with the idea of talking to a device and having the device talk back. He assembled a team and gave them 6 months to pull off the product.

Saranjeet Singh

--

Voice-based search is on the rise, and so far Amazon is killing it with over 40 million users (as of 2018). This number is estimated to rise to 130 million by the end of 2025. The Amazon Echo accounts for about 70% of smart speakers in use. Its journey from inception to finished product is an interesting story in its own right.

In 2010, Jeff Bezos started working with the engineers at Lab126, Amazon’s Silicon Valley R&D subsidiary. This is where Amazon developed its first gadget, the Kindle. The Kindle was given the name Project A, the Fire Phone was Project B, and a desk-lamp-shaped device designed to project holograms onto a table was Project C (it never launched). During the development of the Kindle, Bezos insisted on keeping the microphone.

Bezos had a Star Trek vision: he imagined people talking to a device and the device talking back. That inspired him to create the Amazon Echo, which speaks with Alexa’s voice. The project was given the name Project D. He made Greg Hart, his personal tech advisor, head of Project D. He emailed his AWS executives about his latest idea: a device with a brain that is controlled entirely by voice but costs only $20. Later, he assembled the team, funded the project with hundreds of millions of dollars, and attended meetings every other day to discuss progress.

Greg Hart, head of the project, started looking for prospective hires, sending emails with the title “How would you design a Kindle for the blind?” John Thimsen, an Amazon executive, signed on as director of engineering and gave the project the name Doppler (for Project D). Bezos wanted to launch the Echo in 6–12 months. On Oct 4, 2011, Apple announced the iPhone 4s with Siri. Hart was both relieved, because it validated the idea, and anxious, because Siri got very mixed reviews. A feeling started to settle in that their product was most probably going to fail.

Apple had licensed speech technology from Nuance, a Boston-based company, to build Siri. Nuance, in turn, had bought many big and small startups to create its voice-based products. The Doppler team started looking for other startups in the US and in Europe. The first company Amazon bought, for $25 million, was Yap, from North Carolina. It built tools to convert human speech to text. Amazon discarded most of the tech but kept the engineering talent, which helped Doppler convert users’ speech into text.

The project was so secret that when Yap’s engineers went to Italy for a conference, they were asked to pretend they didn’t know any Amazon executives, so that no one would discover Amazon’s interest in speech technology. The second acquisition was a Polish company, Ivona, which generated computer-synthesized speech that resembles a human voice. It was founded by a computer science student, Lukasz Osowski, who created a text-to-speech product for the visually impaired.

By 2006, the company had expanded by adding 20 other languages, and in 2013, Amazon bought it for $20 million. Bezos wanted a single empathetic voice with warmth and trustworthiness, qualities more commonly associated with a female voice. Amazon has kept the identity of Alexa’s voice a secret to this day, though rumors spread that it was Nina Rolle, a Boulder-based singer. She is still not allowed to talk about giving her voice to Alexa, and Amazon has refused every request to speak to her.

Bezos wanted the device to do more than just play music by voice. This left the team very frustrated, and they blamed Bezos’s lack of taste in music. From the beginning, Bezos wanted it to be a voice-based computer, a two-way system that answers the user’s questions. Many Amazon employees got the opportunity to test the beta. They were asked to sign a confidentiality agreement promising not to share it with their friends or relatives. However, the reviews were unanimous: “slow and dumb”.

Then came the third acquisition, an England-based AI company called Evi. Its founder, William Tunstall-Pedoe, had created an Android and iOS app where users ask questions by speech. Unlike Siri, which read answers from the web, it used a technology called a knowledge graph. It gave Alexa a new brain. Amazon also hired Rohit Prasad, from Ranchi, India, to work on the Doppler project. He argued against the knowledge graph approach and pushed for deep learning models as the foundation of Alexa’s brain. The problem was that Amazon lacked data, even though it had plenty of hardware.

Bezos was furious about the speed of the project. In a heated argument, Bezos asked Prasad whether he was really being told that making the product successful would take 20 years instead of 40. He ended the meeting by saying, “You guys are not serious about the product.” By 2014, Amazon’s speech data had increased 10,000X. That made Bezos happy, and he said in a meeting, “Now I know you guys are serious.” The product was almost ready, yet no one had come up with a name. Everyone settled on Amazon Flash, but just before the launch, the name was scrapped.

At last, Echo was the name everyone agreed upon, but the decision came a little late for the product launch, so the first few units were delivered without any label. The Echo was a huge success, unlike the Fire Phone. A whole class of speech apps came into existence because of Alexa, and it made Amazon a strong player in an industry that Amazon itself had created. During the 2015 holidays, 1 million Echo devices were sold.

--

Saranjeet Singh

I write tutorials on Python, JavaScript, React, and Django. I write for InPlainEnglish and TheStartUp.