Modal
1,529 posts
AI infrastructure that developers love 💚
Run inference, sandboxes, batch processing, training, and many other things on Modal
- Our new Auto Endpoints feature is powered by a new Modal primitive: Modal Servers. In this blog post, we walk through the design principles and detailed architecture: @EnvoyProxy, a @googlecloud Spanner config store, and a @Cloudflare Pingora-based custom proxy.
- Modal reposted: Still don't think people fully appreciate how big dflash can be for inference latency/throughput. Genuine game changer for latency-sensitive workloads. (Quoted post: Modal Auto Endpoints provide state-of-the-art open source inference perf with a click. Learn how we developed our low latency inference playbook with @DecagonAI, delivering responses 60ms faster than the best proprietary provider. modal.com/blog/achieve-s…)
- Modal reposted: The no-longer-secret ingredient is DFlash by @zhijianliu_ and @jianchen1799. If you train a custom DFlash speculator on your data, you can get to lower latencies than any generic inference API can achieve. That's the benefit of owning your inference! (Quoted post: Modal Auto Endpoints provide state-of-the-art open source inference perf with a click. Learn how we developed our low latency inference playbook with @DecagonAI, delivering responses 60ms faster than the best proprietary provider. modal.com/blog/achieve-s…)
- Modal Auto Endpoints provide state-of-the-art open source inference perf with a click. Learn how we developed our low latency inference playbook with @DecagonAI, delivering responses 60ms faster than the best proprietary provider. modal.com/blog/achieve-s…
- Modal reposted: You no longer have to pick between the performance of a black box API and the flexibility and control of @modal. Auto Endpoints give you both. We're unlocking frontier performance for everyone without having to talk to sales or an FDE. More cooking here, stay tuned.
- Modal reposted: Managed private LLM endpoints, now available for everyone in @modal. Deploy in a few clicks with the UI or a few keystrokes with our CLI. The coolest thing is that these are not black boxes: customers have full access to the code underneath.
- 📢 We're partnering with @modal to offer a new development and exhibition opportunity for artists with sustained engagements in artificial intelligence and the arts. This global open call seeks proposals for creative projects that demonstrate the intentional use of AI to further
- Sandbox startup latency and scaling can make or break your RL training run. Great post breaking this down, shown using Modal Sandboxes. (Linked article: "RL Systems Mind the Gap: Matching Trainer and Generator Throughput". Topics: RL Training Infrastructure, GRPO, PipelineRL, Async RL, Policy Staleness, RL Sandbox Infra, CPU Requirements, TCO Analysis, Thinking Machines Tinker. newsletter.semianalysis.com/p/rl-systems-m…)
- Modal reposted: Our sandbox team has been on a crusade against every millisecond of latency and it's paying off. More cool results coming very soon!