Alex Cheema (@alexocheema) / X

5,716 posts

Alex Cheema

@alexocheema

building @exolabs | prev @UniOfOxford We're hiring: exolabs.net

github.com/exo-explore/exo

Joined October 2017

Pinned
Alex Cheema
@alexocheema
Jan 28
Running Kimi K2.5 on my desk. Runs at 24 tok/sec with 2 x 512GB M3 Ultra Mac Studios connected with Thunderbolt 5 (RDMA) using @exolabs / MLX backend. Yes, it can run clawdbot.
3M
Alex Cheema
@alexocheema
Nov 9, 2024
M4 Mac Mini AI Cluster
Uses @exolabs with Thunderbolt 5 interconnect (80Gbps) to run LLMs distributed across 4 M4 Pro Mac Minis. The cluster is small (iPhone for reference). It’s running Nemotron 70B at 8 tok/sec and scales to Llama 405B (benchmarks soon).
3.5M
Alex Cheema
@alexocheema
Jan 21, 2025
AGI at home
Running DeepSeek R1 across my 7 M4 Pro Mac Minis and 1 M4 Max MacBook Pro. Total unified memory = 496GB. Uses @exolabs distributed inference with 4-bit quantization. Next goal is fp8 (requires >700GB)
1.9M
Alex Cheema
@alexocheema
Jan 20, 2025
I will run AGI at home or die trying. DeepSeek R1 should run fast on these macs. They have a total of 896GB unified memory @ 3557GB/s
DeepSeek
@deepseek_ai
Jan 20, 2025
🚀 DeepSeek-R1 is here!
⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!
🌐 Website & API are live now! Try DeepThink at chat.deepseek.com today!
🐋 1/n
1.4M
Alex Cheema
@alexocheema
Jan 27, 2025
Market close: $NVDA: -16.91% | $AAPL: +3.21%
Why is DeepSeek great for Apple?
Here's a breakdown of the chips that can run DeepSeek V3 and R1 on the market now:
NVIDIA H100: 80GB @ 3TB/s, $25,000, $312.50 per GB
AMD MI300X: 192GB @ 5.3TB/s, $20,000, $104.17 per GB
Apple M2
1.2M
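The per-GB figures in the post above follow from dividing each quoted price by the quoted memory capacity. A minimal sketch of that arithmetic, using only the two complete rows from the post (the names `chips` and `usd_per_gb` are illustrative, not from the source):

```python
# Figures quoted in the post: (memory in GB, price in USD).
chips = {
    "NVIDIA H100": (80, 25_000),
    "AMD MI300X": (192, 20_000),
}


def usd_per_gb(memory_gb, price_usd):
    """Return price divided by memory capacity, rounded to cents."""
    return round(price_usd / memory_gb, 2)


cost_per_gb = {name: usd_per_gb(mem, price) for name, (mem, price) in chips.items()}

for name, cost in cost_per_gb.items():
    print(f"{name}: ${cost:.2f} per GB")
```

This reproduces the $312.50 and $104.17 per GB quoted in the post.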
Alex Cheema
@alexocheema
Mar 12, 2025
Running DeepSeek R1 on my desk
Uses @exolabs with Thunderbolt 5 interconnect (80Gbps) to run the full (671B, 8-bit) DeepSeek R1 distributed across 2 M3 Ultra 512GB Mac Studios (1TB total Unified Memory). Runs at 11 tok/sec. Theoretical max is ~20 tok/sec.
992K
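The memory figures in these posts can be sanity-checked with a rough weight-size estimate: parameter count times bits per parameter. The sketch below assumes decimal gigabytes and counts weights only; it ignores KV cache, activations, and quantization metadata, so real requirements are higher (consistent with the ">700GB" the Jan 21 post quotes for fp8). The helper name `weight_memory_gb` is illustrative.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in decimal GB (1 GB = 1e9 bytes).

    Weights only: excludes KV cache, activations, and quantization scales.
    """
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 671e9  # parameter count quoted in the posts (671B)

four_bit_gb = weight_memory_gb(PARAMS, 4)
eight_bit_gb = weight_memory_gb(PARAMS, 8)

# Memory totals quoted in the posts, in GB.
MAC_MINI_CLUSTER_GB = 496   # 7 M4 Pro Mac Minis + 1 M4 Max MacBook Pro
MAC_STUDIO_PAIR_GB = 1000   # 2 M3 Ultra Mac Studios, quoted as "1TB"; nominally 2 x 512GB

print(f"4-bit weights: {four_bit_gb:.1f} GB, fits in 496GB: {four_bit_gb < MAC_MINI_CLUSTER_GB}")
print(f"8-bit weights: {eight_bit_gb:.1f} GB, fits in 496GB: {eight_bit_gb < MAC_MINI_CLUSTER_GB}")
print(f"8-bit weights fit in 1TB: {eight_bit_gb < MAC_STUDIO_PAIR_GB}")
```

Under these assumptions the 4-bit weights fit in the 496GB cluster while the 8-bit weights do not, which matches the move to the 1TB Mac Studio pair for the 8-bit run.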
Alex Cheema
@alexocheema
Jan 15, 2025
"Somebody got one of the small versions of Llama to run on Windows 98...We could've been talking to our computers in English for the last 30 years" - @pmarca It was me! I got Llama running on a Pentium II machine with 128MB RAM running Windows 98. Details below.
766K
Alex Cheema
@alexocheema
Jun 28, 2025
fml linkedin is unusable
689K
Alex Cheema
@alexocheema
Nov 12, 2024
Backdoor attempt on @exolabs through an innocent looking PR. Read every line of code. Stay safu.
1.2M
Alex Cheema
@alexocheema
Dec 25, 2024
Raspberry Pi 5 is a bit smaller than NVIDIA Jetson Orin Nano Super
474K
Alex Cheema
@alexocheema
Jun 13, 2024
Connecting a bunch of iPhones, iPads and MacBooks together over a local network to make one big GPU. Uses Apple’s open source ML library, MLX
Mohamed Baioumy
@mo_baioumy
Jun 13, 2024
One more Apple announcement this week: you can now run your personal AI cluster using Apple devices @exolabs_ h/t @awnihannun
756K
Alex Cheema
@alexocheema
Jul 24, 2024
2 MacBooks is all you need. Llama 3.1 405B running distributed across 2 MacBooks using @exolabs_ home AI cluster
1.3M
Alex Cheema
@alexocheema
Mar 11, 2025
Apple have given me early access to 2 maxed out M3 Ultra 512GB Mac Studios ahead of the public release. I will run the full DeepSeek R1 (8-bit) using @exolabs or die trying. The 1TB(!!) of Unified Memory should be enough for all 671B parameters + context.
Alex Cheema
@alexocheema
Mar 11, 2025
Package acquired.
1.2M
Alex Cheema
@alexocheema
Sep 20, 2025
If you’re a talented engineer affected by the H-1B changes, come build with us in London @exolabs
- SF-level comp (270K-360K base + equity)
- Best talent from Europe
- Hardcore build culture
- Build something important with massive distribution
Email jobs at exolabs dot net
612K