Today’s addition of 13 new languages brings the total number of languages Google Translate understands to 103. The update allows 120 million more people around the world to use the service and will be rolling out over the next couple of days to the web and mobile apps.
This year marks the 10th anniversary of Google Translate's launch. According to Google, the 103 languages the service understands cover 99% of the online population. The new languages are: Amharic, Corsican, Frisian, Kyrgyz, Hawaiian, Kurdish (Kurmanji), Luxembourgish, Samoan, Scots Gaelic, Shona, Sindhi, Pashto and Xhosa.
While machine learning does much of the work of learning new languages, Google also relies on the Translate Community to improve current languages and add new ones. Over 3 million community members have translated approximately 200 million words. Google has listed the locations of the new languages and some fun factoids about them:
- Amharic (Ethiopia) is the second most widely spoken Semitic language after Arabic
- Corsican (Island of Corsica, France) is closely related to Italian and was Napoleon’s first language
- Frisian (Netherlands and Germany) is the native language of over half the inhabitants of the Friesland province of the Netherlands
- Kyrgyz (Kyrgyzstan) is the language of the Epic of Manas, which is 20 times longer than the Iliad and the Odyssey put together
- Hawaiian (Hawaii) has lent several words to the English language, such as ukulele and wiki
- Kurdish (Kurmanji) (Turkey, Iraq, Iran and Syria) is written with Latin letters, while the other two varieties of Kurdish are written with Arabic script
- Luxembourgish (Luxembourg) completes the list of official EU languages Translate covers
- Samoan (Samoa and American Samoa) is written using only 14 letters
- Scots Gaelic (Scottish Highlands, UK) was introduced by Irish settlers in the 4th century AD
- Shona (Zimbabwe) is the most widely spoken of the hundreds of languages in the Bantu family
- Sindhi (Pakistan and India) was the native language of Muhammad Ali Jinnah, the “Father of the Nation” of Pakistan
- Pashto (Afghanistan and Pakistan) is written in Perso-Arabic script with an additional 12 letters, for a total of 44
- Xhosa (South Africa) is the second most common native language in the country after Zulu and features three kinds of clicks, represented by the letters x, q and c