Apple hasn’t talked too much about AI so far — but it’s been working on stuff. A lot of stuff.
It would be easy to think that Apple is late to the game on AI. Since late 2022, when ChatGPT took the world by storm, most of Apple’s competitors have fallen over themselves to catch up. While Apple has certainly talked about AI and even released some products with AI in mind, it seemed to be dipping a toe in rather than diving in headfirst.
But over the last few months, rumors and reports have suggested that Apple has, in fact, just been biding its time, waiting to make its move. There have been reports in recent weeks that Apple is talking to both OpenAI and Google about powering some of its AI features, and the company has also been working on its own model, called Ajax.
If you look through Apple’s published AI research, a picture starts to develop of how Apple’s approach to AI might come to life. Now, obviously, making product assumptions based on research papers is a deeply inexact science: the line from research to store shelves is winding and full of potholes. But you can at least get a sense of what the company is thinking about, and how its AI features might work when Apple starts to talk about them at its annual developer conference, WWDC, in June.
Smaller, more efficient models
I suspect you and I are hoping for the same thing here: Better Siri. And it looks very much like Better Siri is coming! There’s an assumption in a lot of Apple’s research (and in a lot of the tech industry, the world, and everywhere) that large language models will immediately make virtual assistants better and smarter. For Apple, getting to Better Siri means making those models as fast as possible — and making sure they’re everywhere.
In iOS 18, Apple plans to have all its AI features running on an on-device, fully offline model, Bloomberg recently reported. It’s tough to build a good multipurpose model even when you have a network of data centers and thousands of state-of-the-art GPUs — it’s drastically harder to do it with only the guts inside your smartphone. So Apple’s having to get creative.
In a paper called “LLM in a flash: Efficient Large Language Model Inference with Limited Memory” (all these papers have really boring titles but are really interesting, I promise!), researchers devised a system for keeping a model’s data, which is usually held in your device’s RAM, on the SSD instead. “We have demonstrated the ability to run LLMs up to twice the size of available DRAM [on the SSD],” the researchers wrote, “achieving an acceleration in inference speed by 4-5x compared to traditional loading methods in CPU, and 20-25x in GPU.” By taking advantage of the cheapest and most plentiful storage on your device, they found, the models can run faster and more efficiently.
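To make the general idea a little more concrete, here is a minimal Python sketch. It is not the paper's method, just an illustration of keeping weights on storage and pulling in only the pieces a computation needs. The matrix sizes, the file name, the function names, and the assumption that only a few rows are "active" for a given input are all hypothetical, chosen to keep the demo small.

```python
import os
import tempfile

import numpy as np

# Hypothetical layer size, picked only so the demo runs quickly.
ROWS, COLS = 1024, 256


def write_weights(path, rows=ROWS, cols=COLS, seed=0):
    """Write a random float32 matrix to disk, standing in for one model layer."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((rows, cols), dtype=np.float32)
    weights.tofile(path)
    return weights


def sparse_matvec_from_disk(path, x, active_rows, rows=ROWS, cols=COLS):
    """Compute only the selected rows of W @ x, reading just those rows from storage."""
    # np.memmap maps the file instead of loading it all into RAM;
    # the operating system fetches pages from storage as they are touched.
    w = np.memmap(path, dtype=np.float32, mode="r", shape=(rows, cols))
    idx = np.sort(np.asarray(active_rows))
    out = np.zeros(rows, dtype=np.float32)
    # Fancy indexing copies only the requested rows into memory.
    out[idx] = w[idx] @ x
    return out


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "layer.bin")
    write_weights(path)
    x = np.ones(COLS, dtype=np.float32)
    result = sparse_matvec_from_disk(path, x, active_rows=[3, 10, 500])
    print(np.count_nonzero(result), "of", ROWS, "outputs computed")
```

The point of the sketch is the trade-off: the full matrix lives on storage, and RAM only ever holds the handful of rows a given input needs. How a real system decides which rows to fetch, and how it arranges reads to suit flash hardware, is the hard part the researchers address.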
Source: https://www.theverge.com/2024/5/5/24147995/apple-siri-ai-research-chatbot-creativity