Google has trained a neural network, named PlaNet, to work out where a photo was taken, down to the city and sometimes even the street. The network needs only the photo’s pixels to accomplish the task.
Specifically, PlaNet is “able to localize 3.6 percent of the images at street-level accuracy and 10.1 percent at city-level accuracy.” It does better at coarser scales, identifying the correct country for 28.4 percent of photos and the correct continent for 48 percent. Those figures come from running 2.3 million geotagged photos from Flickr through the network.
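As a rough illustration of how accuracy figures like these can be computed, the sketch below measures the great-circle distance between each predicted and true location and counts how many fall within a cutoff for each level. The cutoff distances, the function names, and the sample coordinates are assumptions made for illustration; the article does not say how the team defined each level.

```python
import math

EARTH_RADIUS_KM = 6371.0

# Hypothetical distance cutoffs per level (assumed, not taken from the article).
THRESHOLDS_KM = {"street": 1, "city": 25, "country": 750, "continent": 2500}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def accuracy_by_level(predictions, truths):
    """Fraction of predictions whose error is within each level's cutoff."""
    errors = [haversine_km(p[0], p[1], t[0], t[1]) for p, t in zip(predictions, truths)]
    return {
        level: sum(e <= cutoff for e in errors) / len(errors)
        for level, cutoff in THRESHOLDS_KM.items()
    }


# Illustrative data: one exact hit, one Berlin guess for a Paris photo, one far miss.
predictions = [(48.8584, 2.2945), (52.5200, 13.4050), (0.0, 0.0)]
truths = [(48.8584, 2.2945), (48.8566, 2.3522), (0.0, 90.0)]
print(accuracy_by_level(predictions, truths))
```

Because each coarser level uses a larger cutoff, its score can never be lower than that of a finer level, which matches the pattern in the reported numbers.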
The work was done by a team of Googlers led by Tobias Weyand. They first divided the world into a grid of 26,000 squares of varying size based on how many photos were taken in the area, so heavily photographed places get smaller squares. 91 million images with location data were then used to teach the network which square each photo belongs in. A further 34 million photos were used to validate the neural network. Surprisingly, although it was built from 126 million images with Exif location data, the model itself takes up only 377MB.
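The idea of a grid whose squares shrink where photos are dense can be sketched with a simple recursive subdivision: start with one cell covering the whole map and keep splitting any cell that holds too many photos. This is a simplified illustration over plain latitude and longitude, not the team's actual partitioning scheme; `build_grid`, `find_cell`, `max_photos`, and `max_depth` are hypothetical names and parameters.

```python
def build_grid(points, max_photos=2, max_depth=20):
    """Split the world into cells until each holds at most max_photos points.

    points: iterable of (lat, lon) in degrees.
    Returns non-empty leaf cells as (south, west, north, east, count).
    """
    leaves = []

    def split(south, west, north, east, pts, depth):
        if not pts:
            return  # empty regions produce no cell
        if len(pts) <= max_photos or depth == max_depth:
            leaves.append((south, west, north, east, len(pts)))
            return
        mid_lat = (south + north) / 2
        mid_lon = (west + east) / 2
        # Group the points by quadrant; points on a midline go north/east.
        quadrants = {}
        for lat, lon in pts:
            quadrants.setdefault((lat >= mid_lat, lon >= mid_lon), []).append((lat, lon))
        for (is_north, is_east), sub in quadrants.items():
            split(
                mid_lat if is_north else south,
                mid_lon if is_east else west,
                north if is_north else mid_lat,
                east if is_east else mid_lon,
                sub,
                depth + 1,
            )

    split(-90.0, -180.0, 90.0, 180.0, list(points), 0)
    return leaves


def find_cell(leaves, lat, lon):
    """Return the index of the cell containing (lat, lon), or None.

    Cells are half-open, so a point exactly at lat 90 or lon 180 is not matched.
    """
    for i, (south, west, north, east, _count) in enumerate(leaves):
        if south <= lat < north and west <= lon < east:
            return i
    return None


# Illustrative data: five photos clustered around Paris and one in Sydney.
photos = [
    (48.85, 2.35), (48.86, 2.29), (48.87, 2.33), (48.80, 2.40), (48.90, 2.30),
    (-33.87, 151.21),
]
grid = build_grid(photos)
for cell in grid:
    print(cell)
```

With this sample data the lone Sydney photo ends up in a cell covering a quarter of the map, while the Paris cluster is divided into much smaller cells. The index returned by `find_cell` plays the role of the label a classifier would be trained to predict for each photo.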
The team created a game, available to the public, that takes a random place from Google Street View and has users drop a pin on a world map. It then tells users how close their guess is to the image’s actual location. When pitted against 10 humans, PlaNet won 28 of 50 rounds.