Google explains how Recorder Speaker Labels work, plans to use Tensor TPU to save power

As part of December’s Pixel Feature Drop, Google’s excellent Recorder app gained Speaker Labels that can identify multiple people. As with previous editions, the team behind it is out with an explanation of how the feature came to be.

Speaker Labels are powered by Turn-to-Diarize, Google’s new speaker diarization system. There are three main components to it that “run fully on the device”:

  • Speaker turn detection model that detects a change of speaker in the input speech
  • Speaker encoder model that extracts voice characteristics from each speaker turn
  • Multi-stage clustering algorithm that annotates speaker labels to each speaker turn in a highly efficient way

In Google’s words: “Our speaker diarization system leverages several highly optimized machine learning models and algorithms to allow diarizing hours of audio in a real-time streaming fashion with limited computational resources on mobile devices.”
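To make the last of those three components more concrete, here is a minimal sketch of how per-turn voice embeddings could be grouped into speaker labels. It is not Google’s implementation: it swaps in a simple greedy online clustering for the multi-stage clustering algorithm, it assumes the turn detector and speaker encoder have already produced one embedding vector per speaker turn, and the function names and the `threshold` value are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_turns(turn_embeddings, threshold=0.8):
    """Assign a speaker index to each turn, in streaming order.

    Each turn joins the most similar existing speaker if the similarity
    reaches `threshold` (an assumed value); otherwise it starts a new speaker.
    """
    centroids = []  # running sum of embeddings per speaker
    labels = []
    for emb in turn_embeddings:
        best, best_sim = None, threshold
        for idx, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            # No known speaker is close enough: open a new cluster.
            centroids.append(list(emb))
            labels.append(len(centroids) - 1)
        else:
            # Cosine similarity ignores scale, so a running sum serves
            # as the centroid direction without renormalizing.
            centroids[best] = [c + e for c, e in zip(centroids[best], emb)]
            labels.append(best)
    return labels


# Toy 2-D "embeddings" for five turns from two speakers.
turns = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95], [1.0, 0.05]]
print(label_turns(turns))
```

Because each turn is labeled as it arrives, a sketch like this fits the streaming setting Google describes, though a production system would also need to revisit earlier decisions as more audio comes in.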

Google notes that audio recordings from the Recorder app can be “as long as up to 18 hours,” and that more audio means greater “confidence on predicted speaker labels.” As such, Recorder will “occasionally make corrections to previously predicted low-confidence speaker labels,” while users can manually make edits and split the transcript. 

The current system mostly runs on Tensor’s CPU, with both the first-generation Tensor and Tensor G2 supported across the Pixel 6, 6 Pro, 6a, 7, and 7 Pro. Going forward, Google is “working on delegating more computations to the TPU block, which will further reduce the overall power consumption of the diarization system.” At the moment, Recorder 4.2 contains warning text noting that Speaker Labels will not work if your “Device is too hot.”

Google adds: “Another future work direction is to leverage multilingual capabilities of speaker encoder and speech recognition models to expand this feature to more languages.”

Author

Abner Li

Editor-in-chief. Interested in the minutiae of Google and Alphabet. Tips/talk: abner@9to5g.com