From detailed voice guidance in Maps to Android 10’s upcoming Live Caption capability, Google has a slew of accessibility features. The latest in Chrome can automatically create descriptions for images on the web that lack any identifying labels.
Those who are blind or have other vision impairments use screen readers to get spoken feedback or Braille output when reading online. While there is an increased push for sites to label images, there are still many pictures on the web that lack alt text. As a result, screen readers, like ChromeVox, just say “image,” “unlabeled graphic,” or the file name.
Chrome’s new solution sends unlabeled images to Google servers, where several machine learning models — including ones that look for text, identify objects, and capture the main idea — analyze the image.
Some models look for text in the image, including signs, labels, and handwritten words. Other models look for objects they’ve been trained to recognize—like a pencil, a tree, a person wearing a business suit, or a helicopter. The most sophisticated model can describe the main idea of an image using a complete sentence.
The outputs are ranked, with Google only returning annotations that are useful and descriptive. In most cases, the simplest answer is provided to the user’s screen reader. If the ML models cannot describe an image accurately and confidently, “No description available” is returned instead.
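The ranking-and-threshold step described above can be sketched roughly as follows. This is a hypothetical illustration, not Google’s implementation: the function name `pick_description`, the `(description, confidence)` pair format, and the cutoff value are all assumptions for the example.

```python
# Hypothetical sketch of the ranking step: candidate descriptions from
# several models are filtered by confidence, and the best surviving
# answer is returned; otherwise the fallback string is used.

CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff; the real value is not public


def pick_description(candidates):
    """candidates: list of (description, confidence) pairs from the models."""
    usable = [c for c in candidates if c[1] >= CONFIDENCE_THRESHOLD]
    if not usable:
        # No model was confident enough to describe the image.
        return "No description available"
    # Return the description the models were most confident about.
    return max(usable, key=lambda c: c[1])[0]
```

For example, given candidates `[("a tree", 0.9), ("text: STOP", 0.6)]`, only the first clears the assumed threshold and is returned; if every candidate falls below it, the user’s screen reader hears the fallback string.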
Image descriptions automatically generated by a computer aren’t as good as those written by a human who can include additional context, but they can be accurate and helpful.
This “Get Image Descriptions from Google” feature has been in testing for the past several months, and the company has already created 10 million descriptions, with hundreds of thousands added daily. Full instructions on how to enable Chrome image descriptions are available here.