With the Pixel 2, Google introduced Now Playing, an impressive feature that uses on-device neural networks to continuously recognize music playing in the background. Google is now improving Assistant’s and the Google app’s native Sound Search feature with the same underlying technology.
Introduced recently, the new version of Sound Search is faster and more accurate than before on every Android device. Available as a widget in the Google Search app and through Assistant when you ask “Hey Google, what’s playing?”, Sound Search works server-side, with every snippet of audio sent to the cloud for analysis.
In comparison, Now Playing runs completely offline, using convolutional neural networks to turn a few seconds of audio into a unique “fingerprint.”
This fingerprint is then compared against an on-device database holding tens of thousands of songs, which is regularly updated to add newly released tracks and remove those that are no longer popular.
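The fingerprint-and-lookup idea described above can be sketched roughly as follows. This is an illustrative stand-in, not Google’s implementation: random unit vectors play the role of the CNN-produced embeddings, and the database size, embedding dimension, and similarity threshold are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

def normalize(v):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Stand-in "database": tens of thousands of songs, one 96-dim fingerprint each.
# A real system would derive these embeddings from audio with a neural network.
db = normalize(rng.standard_normal((10_000, 96)))

def match(query_fp, database, threshold=0.7):
    """Return the index of the best-matching song, or None if nothing
    in the database is similar enough to the query fingerprint."""
    scores = database @ query_fp          # cosine similarities (unit vectors)
    best = int(np.argmax(scores))
    return best if scores[best] >= threshold else None

# A slightly noisy copy of song 123's fingerprint should still match it.
query = normalize(db[123] + 0.05 * rng.standard_normal(96))

# Unrelated audio produces a fingerprint that falls below the threshold.
noise = normalize(rng.standard_normal(96))
```

Updating the database then amounts to appending rows for new releases and deleting rows for songs that fall out of rotation, which is why the on-device catalog can stay small and current.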
Bringing Now Playing’s fingerprinting technology to Sound Search poses the challenge of recognizing tens of millions of songs rather than tens of thousands. As such, the fingerprinting technology had to be scaled up; fortunately, running on a server removes the on-device processing and storage constraints.
Google was able to boost the initial matching phase — where the algorithm finds good candidates — by leveraging more neural networks.
In the second matching phase, a detailed analysis of each candidate is performed to find the correct one. For Sound Search, Google increased the density of embeddings — the projected representations of the musical fingerprint — from one per second to one every 0.5 seconds to improve matching. Google also lowered the matching threshold for popular songs, which allows more obscure tunes to be added to the database without slowing down recognition.
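The two-phase search described above can be sketched as a cheap candidate shortlist followed by a detailed per-candidate score. Everything concrete here is an assumption for illustration: random vectors stand in for real embeddings, each song stores one embedding per 0.5 s of audio, and the popularity-weighted threshold is a made-up formula capturing the idea that popular songs get a lower matching bar.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # hypothetical embedding dimension

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Stand-in catalog: each song is a sequence of 20 embeddings (one per 0.5 s).
songs = [normalize(rng.standard_normal((20, DIM))) for _ in range(1_000)]
# Coarse index for phase one: a single averaged embedding per song.
coarse = normalize(np.stack([s.mean(axis=0) for s in songs]))
popularity = rng.random(1_000)  # 0..1 popularity scores (hypothetical)

def search(query_seq, top_k=50):
    """Phase 1: shortlist candidates cheaply. Phase 2: score each
    candidate in detail against the full embedding sequences."""
    q = normalize(query_seq.mean(axis=0))
    candidates = np.argsort(coarse @ q)[-top_k:]   # top_k coarse matches
    best, best_score = None, -1.0
    for idx in candidates:
        # Compare every query embedding against every song embedding,
        # then average each query frame's best similarity.
        sims = songs[idx] @ query_seq.T            # (song frames, query frames)
        score = sims.max(axis=0).mean()
        # Popular songs get a lower threshold (illustrative weighting).
        threshold = 0.6 - 0.1 * popularity[idx]
        if score >= threshold and score > best_score:
            best, best_score = idx, score
    return best

# A noisy excerpt of song 42 should survive both phases.
query = normalize(songs[42][5:15] + 0.05 * rng.standard_normal((10, DIM)))
# Unrelated audio should be rejected by the phase-two threshold.
random_query = normalize(rng.standard_normal((10, DIM)))
```

Densifying the embeddings (here, 20 per song instead of 10) gives phase two more frames to align against, which is the same lever the article describes Google pulling when it halved the embedding interval.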
Moving forward, Google is working to improve matching in both very quiet and very noisy environments, while also making the system faster.