Running Kimi K2.5 on my desk.
Runs at 24 tok/sec on 2 x 512GB M3 Ultra Mac Studios connected over Thunderbolt 5 (RDMA), using @exolabs with the MLX backend.
Yes, it can run clawdbot.
M4 Mac Mini AI Cluster
Uses @exolabs with Thunderbolt 5 interconnect (80Gbps) to run LLMs distributed across 4 M4 Pro Mac Minis.
The cluster is small (iPhone for reference). It’s running Nemotron 70B at 8 tok/sec and scales to Llama 405B (benchmarks soon).
AGI at home
Running DeepSeek R1 across my 7 M4 Pro Mac Minis and 1 M4 Max MacBook Pro.
Total unified memory = 496GB.
Uses @exolabs distributed inference with 4-bit quantization.
Next goal is fp8 (requires >700GB).
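A rough sketch of the memory arithmetic behind these numbers. It assumes the 671B parameter count quoted elsewhere in these posts for DeepSeek R1 and counts raw weight bytes only; quantization scales, KV cache and runtime overhead are ignored, which is presumably why the fp8 target is stated as >700GB rather than 671GB. The function name is made up for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for raw model weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

PARAMS = 671e9     # DeepSeek R1 parameter count (from the other posts)
CLUSTER_GB = 496   # total unified memory of the cluster in this post

for label, bits in [("4-bit", 4), ("fp8", 8)]:
    need = weight_memory_gb(PARAMS, bits)
    fits = need < CLUSTER_GB
    print(f"{label}: ~{need:.1f} GB of weights, fits in {CLUSTER_GB} GB: {fits}")
```

Under these assumptions the 4-bit weights fit with room left for context, while fp8 weights alone exceed the 496GB available.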
🚀 DeepSeek-R1 is here!
⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!
🌐 Website & API are live now! Try DeepThink at chat.deepseek.com today!
🐋 1/n
Market close: $NVDA: -16.91% | $AAPL: +3.21%
Why is DeepSeek great for Apple?
Here's a breakdown of the chips on the market now that can run DeepSeek V3 and R1:
NVIDIA H100: 80GB @ 3TB/s, $25,000, $312.50 per GB
AMD MI300X: 192GB @ 5.3TB/s, $20,000, $104.17 per GB
Apple M2
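The per-GB figures for the first two chips are simply price divided by memory. A minimal sketch that reproduces them from the numbers quoted above (the Apple entry is cut off in the post, so it is left out here rather than guessed):

```python
# Memory size and price exactly as quoted in the post.
chips = {
    "NVIDIA H100": {"memory_gb": 80, "price_usd": 25_000},
    "AMD MI300X": {"memory_gb": 192, "price_usd": 20_000},
}

def usd_per_gb(price_usd: float, memory_gb: float) -> float:
    """Price of one GB of accelerator memory."""
    return price_usd / memory_gb

for name, spec in chips.items():
    cost = usd_per_gb(spec["price_usd"], spec["memory_gb"])
    print(f"{name}: ${cost:.2f} per GB")
# NVIDIA H100: $312.50 per GB
# AMD MI300X: $104.17 per GB
```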
Running DeepSeek R1 on my desk
Uses @exolabs with Thunderbolt 5 interconnect (80Gbps) to run the full (671B, 8-bit) DeepSeek R1 distributed across 2 M3 Ultra 512GB Mac Studios (1TB total Unified Memory).
Runs at 11 tok/sec. Theoretical max is ~20 tok/sec.
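One plausible way to arrive at a ceiling of roughly 20 tok/sec is a memory-bandwidth bound: each generated token has to read every active weight once. The sketch below is a guess at that reasoning, not the author's stated method. It assumes about 37B active parameters per token (DeepSeek R1 is a mixture-of-experts model), 1 byte per parameter at 8-bit, and about 819 GB/s of memory bandwidth per M3 Ultra. It also assumes the layers are split across the two machines and run one after the other, so the second machine's bandwidth does not add to the first.

```python
def decode_tok_per_sec(active_params: float,
                       bytes_per_param: float,
                       bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when reading weights is the bottleneck."""
    bytes_per_token = active_params * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

ACTIVE_PARAMS = 37e9     # assumption: active parameters per token
BYTES_PER_PARAM = 1.0    # 8-bit weights
BANDWIDTH_GB_S = 819.0   # assumption: M3 Ultra memory bandwidth

bound = decode_tok_per_sec(ACTIVE_PARAMS, BYTES_PER_PARAM, BANDWIDTH_GB_S)
measured = 11.0          # tok/sec reported in the post
print(f"bound: ~{bound:.1f} tok/sec, measured is {measured / bound:.0%} of it")
```

With these inputs the bound lands slightly above 20 tok/sec, so the measured 11 tok/sec is around half of the bandwidth ceiling. Interconnect latency and KV cache reads are not modeled.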
"Somebody got one of the small versions of Llama to run on Windows 98...We could've been talking to our computers in English for the last 30 years" - @pmarca
It was me! I got Llama running on a Pentium II machine with 128MB RAM running Windows 98. Details below.
Apple have given me early access to 2 maxed out M3 Ultra 512GB Mac Studios ahead of the public release.
I will run the full DeepSeek R1 (8-bit) using @exolabs or die trying.
The 1TB(!!) of Unified Memory should be enough for all 671B parameters + context.
If you’re a talented engineer affected by the H-1B changes, come build with us in London @exolabs
- SF-level comp (270K-360K base + equity)
- Best talent from Europe
- Hardcore build culture
- Build something important with massive distribution
Email jobs at exolabs dot net