AI tools, love them or hate them, have been a big deal in coding and app development, and Google is now actively testing out what the best tools are for Android app development – here’s the full list.
The new “Android Bench” is a leaderboard of the best AI models to use for making Android apps. Google evaluates leading LLMs against a set of tests designed to measure how well they handle building Android apps. Google says that it looks at how the models work with Jetpack Compose for UI, Coroutines and Flows for asynchronous programming, Room for persistence, and Hilt for dependency injection. Other points include “navigation migrations, Gradle/build configurations, or the handling of breaking changes across SDK updates,” while Google says that it also measures how these tools work with both core and more niche parts of Android, such as camera, system UI, media, foldable adaptation, and more.
Google says that its goal is to show which AI models work best for Android app development, as existing benchmarks don’t cover the challenges a developer might face while working on Android apps.
“AI-assisted software engineering has seen the emergence of several benchmarks to measure the capabilities of LLMs. Android developers face specific challenges that aren’t covered by existing benchmarks, so we created one that focuses on Android development.”
With the methodology out of the way, what is the best AI model for Android app development?
In what shouldn’t be a surprise, Google says that Gemini 3.1 Pro Preview is the top of the class with a score of 72.4% in the benchmark. Second was Claude Opus 4.6, followed by OpenAI’s GPT-5.2 Codex. The lowest score came from Gemini 2.5 Flash, at just 16.1%.
Best AI for Android app development, according to Google
- Gemini 3.1 Pro Preview: 72.4%
- Claude Opus 4.6: 66.6%
- GPT-5.2 Codex: 62.5%
- Claude Opus 4.5: 61.9%
- Gemini 3 Pro Preview: 60.4%
- Claude Sonnet 4.6: 58.4%
- Claude Sonnet 4.5: 54.2%
- Gemini 3 Flash Preview: 42%
- Gemini 2.5 Flash: 16.1%
Google says that, by publishing these numbers and rankings, it hopes to “encourage LLM improvements for Android development” while also helping developers be “more productive” and, ultimately, deliver “higher quality apps across the Android ecosystem.”