In the days since I/O 2018, criticism has primarily been leveled at Google Duplex. There are problems with the feature, but to me, at least, it’s clear that the company thought about some of the societal impacts of putting a voice nearly indistinguishable from a human’s out in the world.
However, I think another feature Google announced deserves a similar level of criticism due to its broad impact. Enter John Legend.
At I/O, Sundar Pichai announced that WaveNet’s advancements towards more natural speech are already powering the Assistant’s six new voices. As this DeepMind technology can shorten the studio recording time required, Google began exploring whose “amazing voice” they could capture.
Google and John Legend have had a close relationship beginning at CES 2018 where he played at a Made by Google event. That was followed by him headlining an Oscars ad, with the artist then using the Pixel 2 to record his latest music video.
As such, Legend’s selection as that “amazing voice” is not all that surprising. What followed at the keynote was a whimsical video showing Legend recording various phrases in a reduced amount of time thanks to WaveNet.
This feature is undeniably whimsical and sits right at Google’s intersection of technological advancement and fun. It speaks to a geeky fantasy of talking computers, which we’ve had for a few years now, and immense customization, be it HAL 9000 or Knight Rider’s KITT.
However, after that initial amusement, which will definitely provide users with that bit of intended whimsy, some thought should be given to how a readily available voice generator is ripe for abuse and doctored speech. For example, imagine some nefarious party passing off a Legend response from Assistant as a supposedly “leaked” voicemail to a tabloid.
As with Duplex, I think the team developing this effort foresaw this. The Google CEO distinctly noted that Legend’s voice would be used to reply to users “in certain contexts.”
Onstage, Sundar only showed it being used for Assistant’s good morning Routine and a briefing about the day ahead, while another example alluded to wishing somebody a happy birthday in the singer’s voice.
An easy and obvious safeguard would be not allowing Legend’s voice to be used with the “Custom response” Routine feature where users can essentially program Assistant to say anything.
However, I can already foresee how the most basic reverse engineering could have Legend say anything. In the example involving Sundar’s calendar, just change the name of the appointment to whatever users want and keep doing so until the desired phrase can be recorded and stitched together.
I’m sure that additional safeguards, like a system that analyzes what’s being said ahead of time, could be put into place before launch in response to wider post-I/O feedback, but this trouble warrants asking whether this specific feature is even worth it.
There are a host of other fictional characters that Google could tap into instead to create other whimsical voices for Assistant. In fact, those voices could be used in unlimited contexts throughout the system if a user so desires.
Looking even further into the future, this technology raises questions about being able to capture the likeness of a person and in turn license it. One recent example was in the Star Wars film Rogue One, where Carrie Fisher’s Princess Leia and Peter Cushing’s Grand Moff Tarkin were not just visually recreated, but also had spoken lines.
This is not the first time Google has captured a likeness for use in a commercial product. Waze has long created voice packs for navigation from famous individuals. However, Assistant is much broader in scope and more widely available, with an effectively unlimited vocabulary.
This — along with Duplex — is the antithesis of Google’s stated goal of “responsibility” at this year’s I/O. Allowing Duplex to make calls on people’s behalf is making people’s lives easier. Google has yet to fully tap or even discuss the accessibility benefits of this feature for those with speech impediments and social anxiety.
However, with celebrity-based Assistant voices, I fail to see how this is anything beyond whimsy or, more cynically, an attempt to draw consumers in by turning more advanced underlying technology into a competitive advantage.
I get how fun is often at the core of Google, but I think that, in this case, the coolness factor is overshadowing the welcome focus on responsibility.