Apple’s AI strategy
I am old enough to remember when virtually every graphic artist chose Macintosh workstations over the less sophisticated Windows PCs. Now, Apple’s M5 series Macs are the AI machines to beat.
MacBook Pro M5 Max could change AI economics
Source: newsletter@thedeepview.co
Why this interests me.
In December 2025, I bought a new Mac Mini M4 series computer, a separate 27-inch LG monitor, and a 1 TB SanDisk external SSD.
Lately, I have been preparing to offload nearly 600 GB of media and document files from cloud storage and to restructure and distribute them across the Mac Mini’s internal drive and the SanDisk external SSD. It’s a lot of work 🥺
Why bother?
With a properly structured information library, I can set up a local LLM system on this hardware at home. It will become my personal research assistant when I begin rewriting my second ebook, Digital Direct Democracy.
After nearly three years of using up to six different LLM tools across a variety of use cases, I am convinced this investment will increase my creativity and productivity for this project and many others. I already use AI for medical and health-related advice, an essential resource in this era of less-than-ideal access to a family physician or to reliable, trustworthy information.
Besides, working with AI is fun, and it is infinitely more convenient than scheduling meetings with “the experts” 🤓
After reading the article below, I will likely upgrade from the M4 to an M5 Mac Mini later this year. I consider Apple’s strategy brilliant. It may be a good time to buy Apple shares. 👀😁
Jason Hiner offers his experience and thoughts below.
I’ve been testing Apple’s MacBook Pro M5 Max for the past week, and the real story isn’t how crazy fast it is, but how it’s going to change the game for AI inference.
The model I’ve been testing is a 16-inch MacBook Pro with an 18-core CPU and 40-core GPU M5 Max chip, and 128GB of unified memory. This is a $6,000 machine. Previously, that would have sounded nuts, unless you were a video or audio producer or a 3D graphic artist.
But the AI boom has provided a new reason and a new clientele for a machine like this.
It’s all about running AI inference locally. That’s become especially urgent recently because AI inference costs have been spiraling out of control, threatening to derail the AI revolution by making it very difficult to run AI profitably. This gained additional attention this week when entrepreneur Chamath Palihapitiya reported that AI inference costs at his startup have tripled over the past three months, while productivity and profitability have not increased.
Apple did two things to make the M5 Max MacBook Pro a monster at running AI models locally:
Expanded memory bandwidth means the GPU can now access all of the onboard memory, so you can run much larger models (up to 70B parameters)
The new “Neural Accelerators” in every GPU core will speed up the performance of local LLMs
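To put the 128GB of unified memory in perspective, here is a rough back-of-envelope sketch (my own, not from the article) of how much memory a model’s weights alone require. Real runtimes also need room for the KV cache and activations, so treat these as lower bounds:

```python
# Rough estimate of the memory needed just to hold model weights.
# Assumption: weights dominate; KV cache and activations add more on top.

def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight footprint in gigabytes."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

for params in (8, 32, 70):
    for bits in (16, 4):  # full precision vs. 4-bit quantized
        print(f"{params}B model @ {bits}-bit: ~{weight_memory_gb(params, bits):.0f} GB")
```

A 70B model at 16-bit precision needs about 140GB and would not fit, but at 4-bit quantization it needs roughly 35GB, leaving plenty of the 128GB of unified memory for context and the rest of the system.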
This MacBook Pro could break one of the key pillars of the current AI ecosystem: paying for inference tokens for models hosted in the cloud. That’s what Palihapitiya was waving the red flag about. These token costs get insanely expensive very fast. Entrepreneurs running AI agents like OpenClaw say that using it to build software and apps can end up costing more in monthly token fees than the salary of a human developer.
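To make “insanely expensive” concrete, here is a toy comparison with purely hypothetical numbers (the token volume and per-token price are my assumptions, not figures from the article; only the $6,000 machine price comes from above):

```python
# Toy comparison: recurring cloud token fees vs. a one-time local machine.
# TOKENS_PER_DAY and PRICE_PER_MILLION are hypothetical illustrations.

TOKENS_PER_DAY = 50_000_000      # a heavily used coding agent (assumed)
PRICE_PER_MILLION = 10.0         # blended $ per 1M tokens (assumed)
MACHINE_COST = 6_000.0           # MacBook Pro M5 Max price cited above

daily_cost = TOKENS_PER_DAY / 1_000_000 * PRICE_PER_MILLION
monthly_cost = daily_cost * 30
print(f"Cloud inference: ~${monthly_cost:,.0f}/month")
print(f"Local break-even: ~{MACHINE_COST / monthly_cost:.1f} months")
```

At those assumed rates, the cloud bill runs about $15,000 a month and the laptop pays for itself in under a month; your real numbers will differ, but the shape of the trade-off is the point.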
On the MacBook Pro M5 Max, you can run very powerful models locally for free using several different options:
MLX is Apple’s lesser-known open-source machine-learning framework for Apple silicon; its mlx-lm package lets you run language models directly from the command line or from Python
LM Studio is a popular GUI tool for downloading the latest models and running them from your own computer
Ollama is the preferred tool for developers to manage various models, and it’s free when you run it on your local system
All of these platforms allow you to freely download and locally run the latest open-source models, including Qwen, DeepSeek, GPT-OSS (OpenAI), Gemma (Google), Llama (Meta), Nemotron (Nvidia), MiniMax, GLM (Z.ai), Mistral, and more.
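As a concrete illustration, here is a minimal sketch of what local inference looks like using Ollama’s Python client (assumes you have installed the Ollama app, pulled a model with a command like `ollama pull llama3.1`, and run `pip install ollama`; the model name is just an example):

```python
# Minimal local-inference sketch using the Ollama Python client.
# No cloud API key, no per-token fees: the model runs entirely on this machine.

import ollama

response = ollama.chat(
    model="llama3.1",  # any locally pulled open-source model works here
    messages=[
        {"role": "user", "content": "Summarize the benefits of running LLMs locally."}
    ],
)
print(response["message"]["content"])
```

LM Studio and mlx-lm offer similar workflows, with LM Studio adding a point-and-click interface for downloading models and chatting with them.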
Beyond the cost savings, running these models locally also has important benefits for security, privacy, and data sovereignty.
Since the end of 2025, I’ve been hearing from executives and leaders about the out-of-control costs of AI inference and their efforts to reduce them by using smaller models, domain-specific models, and open-source models. Leaders reported that, when applied smartly to specific use cases, this approach can deliver better performance, fewer hallucinations, and lower costs. I now fully expect that running models locally on machines like the MacBook Pro M5 Max will become another part of the answer. I’ll continue testing models locally on this machine and share what I’m learning. You can find me on X/Twitter at x.com/jasonhiner to follow my updates in real time.