It emerged yesterday that Google hires people to transcribe Assistant queries from Home and other smart devices. The company today described the practice as “critical” to bringing Assistant to other languages, while promising an investigation into the leak of customer data.
According to Google, “language experts” who transcribe a “small set of queries” are a “critical part of the process of building speech technology.” These human transcribers for Google Assistant have a better grasp of the “nuances and accents of a specific language,” and review only 0.2% of audio snippets.
Audio snippets are not associated with user accounts as part of the review process, and reviewers are directed not to transcribe background conversations or other noises, and only to transcribe snippets that are directed to Google.
Beyond the anonymized nature of the short recordings, yesterday’s report detailed how smart devices often picked up other people speaking. Google says that it’s only transcribing the intended question or command, and not background noise.
Additionally, Google noted a “number of protections” to prevent “false accepts” — noise or words in the background that get interpreted as the “Hey Google” hotword to enable the microphone.
According to the company, it takes a “wide range of safeguards to protect user privacy throughout the entire review process.” However, it’s clear that those policies aren’t stringent enough given that over 1,000 recordings were leaked to the Belgian publication behind yesterday’s story by a transcriber working on Google Assistant.
Google confirmed the breach of its data security policies and the leak of “confidential Dutch audio data,” but not its scale. Its internal teams are investigating this specific case, and safeguards are being reviewed to “prevent misconduct” in the future.
“We just learned that one of these language reviewers has violated our data security policies by leaking confidential Dutch audio data. Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again.”