Speech recognition

Six 'Speech recognition' stories May 2012 - June 2017

Google’s speech recognition is now almost as accurate as humans

Abner Li Jun 1 2017 - 9:56 am PT

At I/O 2017, Sundar Pichai noted that computers are getting better at understanding voice input, with Google having achieved “significant breakthroughs” in speech recognition. In fact, Google’s machine learning systems are now nearly on par with humans.


Google posts 8 minute video showing how far we’ve come with speech recognition

Mark Hearn Oct 17 2014 - 12:47 pm PT


https://www.youtube.com/watch?v=yxxRAHVtafI&feature=youtu.be

Behind the Mic: The Science of Talking with Computers

Language. Easy for humans to understand (most of the time), but not so easy for computers. This is a short film about speech recognition, language understanding, neural nets, and using our voices to communicate with the technology around us.

Just in case you didn’t get the memo, Google is really big on voice search. The company’s voice command-friendly Google Now tech is available across multiple platforms, and according to some recent research, teenagers are crazy about talking to their smartphones. But how does it all work?

Speaking to your mobile devices is becoming more commonplace, but there’s a lot of behind-the-scenes work that goes into developing speech recognition.

Domino’s Pizza launches voice ordering in mobile apps powered by Nuance

Jordan Kahn Jun 16 2014 - 9:34 am PT


Domino’s announced today that it’s launching a new voice ordering feature in its iPhone and Android apps, powered by Nuance’s Nina Mobile speech recognition, speech synthesis and natural language understanding technology. The company says the feature will provide “a human-like, conversational customer service experience that allows users to speak an order and quickly add items to their cart.” Imagine opening the app and placing your order by saying, for example, “I’d like a large pizza with extra cheese, pepperoni and onions” or “I’ll take a 14-piece order of Hot Wings”.

“There will be a day where typing on keyboards or with thumbs on mobile devices will come to a close; we want to be the ones who continue to advance the technology experience – hand-in-hand with our customers,” said Patrick Doyle, Domino’s Pizza president and CEO. “Our mobile app users who are a part of this launch are truly helping set the foundation for the innovations of today, that will soon enough become the standards of tomorrow.”

The platform, in partnership with Nuance, will redefine technology convenience – and puts Domino’s at the forefront of an intuitive ordering method that is a true first within both traditional and e-commerce retail.

With the updated app rolling out today, you’ll also be able to browse menus and coupons and navigate through the app using your voice. The feature launches in beta and is available now in the updated Domino’s Pizza app for iOS and Android.

NYT: X Lab Googlers built a ‘brain’ that identifies cats in YouTube videos

Élyse Betters Jun 26 2012 - 2:25 pm PT


Google X Laboratory scientists have worked on a simulation of the human brain for the last few years, and now they are using it to identify cats.

According to The New York Times, Google researchers created “one of the largest neural networks for machine learning by connecting 16,000 computer processors, which they turned loose on the Internet to learn on its own.” More specifically, Google turned the “brain” to 10 million images found in YouTube videos about cats:

The neural network taught itself to recognize cats, which is actually no frivolous activity. This week the researchers will present the results of their work at a conference in Edinburgh, Scotland. The Google scientists and programmers will note that while it is hardly news that the Internet is full of cat videos, the simulation nevertheless surprised them. It performed far better than any previous effort by roughly doubling its accuracy in recognizing objects in a challenging list of 20,000 distinct items.

The research is representative of a new generation of computer science that is exploiting the falling cost of computing and the availability of huge clusters of computers in giant data centers. It is leading to significant advances in areas as diverse as machine vision and perception, speech recognition and language translation.

Google’s brain eventually constructed a digital patchwork of a cat by cropping general features from the millions of images that it identified. The method could eventually prove useful in image search, speech recognition, and language translation. The Googlers maintained caution, however, about whether their research is, as The New York Times put it, “the holy grail of machines that can teach themselves.”

The research project is no longer part of the Google X laboratory; it now falls under Google’s search business and related services.


Nuance overhauls Swype for Android with word prediction, letter tracing, and Dragon voice-recognition [Video]

Élyse Betters Jun 20 2012 - 8:45 am PT


http://www.youtube.com/watch?v=BCTjgbEtYKY&feature=player_embedded#

Nuance just released a significant update to Swype, the keyboard software it acquired last year.

The entire UI is revamped. Users can still quick-swipe for input, but the keyboard now offers multimodal support with the option to press keys and initiate Nuance’s “Next Word Prediction” technology. Swype’s built-in dictionary learns over time and crops words from emails and texts for easier communication. Users can also handwrite or trace letters, words, or symbols, or tap the Dragon Dictation button to launch integrated voice recognition.

The latest version of Swype is now in beta, but it is not compatible with all Android devices.

Visit beta.swype.com for more information.


Samsung blocking Vlingo-powered ‘S-Voice’ feature for non-Galaxy S III users

Jordan Kahn May 21 2012 - 7:08 am PT


Shortly after “S-Voice,” the Siri-like voice assistant on Samsung’s new Galaxy S III, made its way to other Android devices via an available APK, reports claimed Samsung began to block non-S III users from accessing the service. XDA Developers community members confirmed (via TNW) that Vlingo, the company behind the voice recognition technology used in S-Voice, is now blocking users from trying to access its servers with devices other than the S III.

In December of last year, Nuance, the company currently powering speech in Apple’s Siri app on the iPhone 4S, acquired Vlingo. Samsung previously collaborated with Vlingo for the Voice Commander feature for the Galaxy S II. We expect Nuance has improved Vlingo since the acquisition, and Apple’s relationship with Nuance is not stopping Samsung from using the Vlingo technology. In an interview in October, Norman Winarsky, co-founder of Siri, told us Vlingo was originally used in Siri when it was first developed, but noted it is rather easy for apps like Siri to implement new speech recognition technology if it comes along.
