Oh no, another AI article? Well, yes, and a useful one.
Like it or not, we are now living in the age of industrial-grade AI, and if you are building mobile products, it's something you will probably have to deal with hands-on. AI is already inside the apps on your phone, shaping how they work, what they know about you, and how fast they respond.
So yes, no one is passing the AI topic by, and neither is our content manager. Today we are talking about what LLMs are actually doing in mobile, in the real apps people use every day.
What has changed in 2025? What works in production? Where is the real value? And what does it take to make it work on-device?
The Numbers (Still) Don’t Lie
Let’s start here: people want smart apps, and that is as clear as the fact that the Earth is round.
As of mid-2025, downloads and usage of mobile apps with integrated AI features keep growing. The hype might have cooled a bit, but the real use hasn’t.
Apps with AI in their metadata were downloaded over 17 billion times in 2024 (like, wow), and momentum hasn’t slowed in 2025, especially in education, health, finance and productivity categories. But don’t let the buzzwords fool you. Not every app with “AI” in its metadata is meaningfully powered by LLMs. The fact is: users are seeking out smarter apps, and companies are rushing to meet that demand.
So, How Are LLMs Actually Used?
There are three common patterns developers follow:
1. Cloud-based APIs (most common):
Apps connect to services like the OpenAI, Gemini, Claude or Mistral APIs. It is fast, flexible and constantly improving, and it powers everything from ChatGPT’s 1+ billion daily queries to AI-backed email tools, document editors and customer support chat.
2. On-device models:
Smaller models like Gemma 2B and Llama 3 8B Instruct, along with Apple’s on-device Apple Intelligence models, are optimized to run locally on mobile chips. Google’s AI Edge Gallery lets developers deploy models directly to Android phones, with no round trip to the cloud required.
3. Hybrid systems (cloud + edge):
This is where things get interesting. Lightweight tasks (autocomplete, summarization, basic classification) happen on-device. Heavier lifting (like RAG pipelines or multi-step reasoning) happens in the cloud. This setup gives you better battery life, lower latency and more privacy control without losing firepower; a rough sketch of the routing logic follows this list.
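To make the hybrid pattern less abstract, here is a minimal routing sketch in Kotlin. The OnDeviceModel and CloudModel interfaces and the task categories are hypothetical stand-ins for whatever local runtime and cloud API an app actually uses; treat it as a shape, not a production implementation.

```kotlin
// Hypothetical interfaces: plug in your real on-device runtime and cloud client.
interface OnDeviceModel { suspend fun complete(prompt: String): String }
interface CloudModel { suspend fun complete(prompt: String): String }

enum class TaskKind { AUTOCOMPLETE, SUMMARIZE, CLASSIFY, RAG_QUERY, COMPLEX_REASONING }

class HybridLlmRouter(
    private val local: OnDeviceModel,
    private val cloud: CloudModel
) {
    // Lightweight work stays on the device; heavy lifting goes to the cloud.
    suspend fun run(kind: TaskKind, prompt: String): String = when (kind) {
        TaskKind.AUTOCOMPLETE,
        TaskKind.SUMMARIZE,
        TaskKind.CLASSIFY -> local.complete(prompt)
        TaskKind.RAG_QUERY,
        TaskKind.COMPLEX_REASONING -> cloud.complete(prompt)
    }
}
```

In practice the routing signal can also include battery level, network state and prompt length, but the decision point stays the same: keep the cheap, frequent calls local and reserve the cloud for what the device genuinely can't do.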
Real Implementations in the Wild
Educational apps are leading the way.
Duolingo, for example, uses LLMs to generate content dynamically, adapt difficulty and power roleplay conversations using OpenAI models. Not to mention spaced-repetition logic fine-tuned with AI. That’s hard to replicate manually at scale.
Productivity and enterprise apps aren’t far behind. Grammarly embeds AI suggestions straight into mobile keyboards. Notion lets you rephrase or summarize notes with one tap.
On the internal enterprise side:
- Instacart uses an internal LLM called Ava for dev workflows;
- Grab automates reporting with RAG-powered assistants;
- Royal Bank of Canada built a RAG system called Arcane to make complex investment policies and procedures easier to find and interpret.
These are ideas that deliver results.
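If “RAG-powered” sounds abstract, the core loop is short: embed the question, retrieve the most relevant internal documents, and fold them into the prompt before calling the model. Here is a minimal sketch; the Embedder, VectorStore and LlmClient interfaces are hypothetical stand-ins for whatever embedding model, vector database and LLM API a given company actually runs.

```kotlin
// All three interfaces are hypothetical placeholders for real components.
interface Embedder { suspend fun embed(text: String): FloatArray }
interface VectorStore { suspend fun topK(query: FloatArray, k: Int): List<String> }
interface LlmClient { suspend fun complete(prompt: String): String }

class RagAssistant(
    private val embedder: Embedder,
    private val store: VectorStore,
    private val llm: LlmClient
) {
    suspend fun answer(question: String): String {
        val queryVector = embedder.embed(question)     // 1. embed the question
        val passages = store.topK(queryVector, k = 5)  // 2. retrieve relevant documents
        val prompt = buildString {                     // 3. ground the model in those passages
            appendLine("Answer using only the context below.")
            passages.forEach { appendLine("- $it") }
            appendLine("Question: $question")
        }
        return llm.complete(prompt)
    }
}
```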
Another thing: companies investing in AI see revenue growth of 3-15%. LLM integrations and tools like ChatGPT could increase customer service productivity by 30-45%. Looks good, right?
What About the Tech Limits?
Yes, LLMs are computational beasts. But compression and quantization are changing the game. A quantized Gemma 3 1B model fits in under 600MB and pushes around 2,585 prefill tokens per second on modern mobile GPUs. That’s practical.
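For a sense of what running a model locally looks like in code, here is a rough sketch built around Google’s MediaPipe LLM Inference task for Android. The model path and option values are placeholders, and the exact API surface can shift between releases, so check the current MediaPipe docs rather than copying this verbatim.

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

// Sketch only: assumes a quantized Gemma model file has already been
// downloaded or pushed to the device at the (placeholder) path below.
fun runLocalPrompt(context: Context, prompt: String): String {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma-1b-it-int4.task")
        .setMaxTokens(512) // cap output length to keep latency predictable
        .build()

    val llm = LlmInference.createFromOptions(context, options)
    return llm.generateResponse(prompt) // synchronous, fully on-device generation
}
```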
And on the cloud side, cost and latency are the new constraints. Devs are leaning on prompt caching, switching models dynamically and offloading tasks based on complexity just to stay within API budgets and still deliver a fluid UX.
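The cheapest of those tricks is caching: if the same prompt comes in twice, the second answer shouldn’t cost a second API call. A tiny, hypothetical wrapper (the class name, normalization and cache size are illustrative, not a recommendation):

```kotlin
import android.util.LruCache

// Hypothetical wrapper that memoizes responses for repeated prompts, so the app
// only pays (in latency and API spend) for genuinely new requests.
class CachingLlmClient(
    private val upstream: suspend (String) -> String, // the real cloud API call goes here
    maxEntries: Int = 256
) {
    private val cache = LruCache<String, String>(maxEntries)

    suspend fun complete(prompt: String): String {
        val key = prompt.trim().lowercase() // naive normalization; tune for your traffic
        cache.get(key)?.let { return it }   // cache hit: no network round trip
        val response = upstream(prompt)     // cache miss: call the model
        cache.put(key, response)
        return response
    }
}
```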
The UX Impact
Let’s zoom out. What are users getting out of this?
In a word: personalization. LLM-backed apps can actually adapt. Your fitness app remembers how you train. A shopping app predicts what you will reorder. A language app adjusts the tone of a lesson if you’re struggling.
And users respond. Apps that power personalization see 40% higher revenue, and ads in particular can be up to 5x more effective.
Crucially, this isn’t about replacing human interaction. It’s about enhancing it. For instance, 67% of users think they could benefit from receiving medical guidance from an AI. That doesn’t mean your kid gets to skip medical school on the way to becoming a surgeon!
What’s Next
LLMs are getting multimodal, multilingual and a lot more agent-like. The latest generation of models handles not just text but voice, images, PDFs and spreadsheets, all in a single interface. The UI implications here are massive.
This year, 95% of customer interactions are expected to involve AI in some form, and by 2028, a third of enterprise apps will include autonomous agents, up from less than 1% in 2024.
Plus, there are agent workflows: not just suggesting next steps, but autonomously carrying them out (with your consent). Think drafting and sending emails, creating slide decks, or booking travel. With your phone.
Maybe now is the time to plan for shifts that could happen in the blink of an eye.
Final Thoughts
So, are LLMs “changing everything”? Not in some flashy sci-fi way.
But yes, they are reshaping how apps work, how features are built and what users expect, especially in mobile where speed and utility matter most.
And yes, some of this tech is still rough around the edges. But with billions of downloads behind it, rapidly evolving infrastructure and tangible business results…
It’s safe to say: smart apps are here, in the same room with us. And they’re powered by LLMs you probably don’t even notice.