Google’s Cloud AutoML enters beta w/ custom translation & natural language neural networks

Abner Li | Jul 24 2018 - 10:10 am PT

Earlier this year, Google launched Cloud AutoML Vision in alpha, after unveiling AutoML at I/O 2017 as a way for neural networks to design better neural nets. At Cloud Next 18, Google is launching AutoML Natural Language and AutoML Translation in beta, along with other updates to core services.

The AutoML suite aims to make the benefits of artificial intelligence available to all developers, even those who don’t have a machine learning background.

A significant gap exists between the extremes of what’s currently possible with machine learning. At one end, experienced practitioners such as data scientists use tools like TensorFlow and Cloud ML Engine to build custom solutions from the ground up. At the other end, pre-trained machine learning models like Cloud Vision API deliver immediate results with minimal investment and technical proficiency.

Google is targeting the middle ground of developers, noting that “Many have needs beyond what’s available with pre-trained models, but don’t have the skills or resources to build their own custom solutions.”

AutoML Natural Language helps you automatically predict custom text categories specific to the domains customers need.
AutoML Translation supports the upload of translated language pairs to train your own custom translation model.

Together with AutoML Vision, that puts all three AutoML products in public beta.

Meanwhile, after Google announced the third generation of Cloud TPUs at I/O 2018, developers can now access the new liquid-cooled processors in alpha. These Tensor Processing Unit pods have eight times the performance of the prior generation, with speeds reaching 100 petaflops.

Other updates to Google’s core machine learning APIs include:

Cloud Vision API now recognizes handwriting, supports additional file types (PDF and TIFF) and product search, and can identify where an object is located within an image.
Cloud Text-to-Speech improvements include multilingual access to voices generated by DeepMind WaveNet technology and the ability to optimize for the type of speaker from which your speech is intended to play.
Cloud Speech-to-Text can identify which language is spoken and distinguish different speakers in a conversation. It also adds word-level confidence scores and multi-channel recognition, so you can record each participant separately in multi-participant recordings.
