Baseten (@baseten) / X

Baseten

2,481 posts

@baseten

Inference is everything.

San Francisco and New York

Joined March 2021

Pinned
Baseten
@baseten
May 13
Intelligence should be defined by the people closest to the work. Intelligence should be owned by all of us. Let’s build a many model future!
Tuhin Srivastava
@tuhinone
May 13
Article
A many model future
Obsessives have always moved the world forward. They are responsible for our most beloved products, proudest scientific achievements, most moving art, the greatest leaps in what we're capable of....
35K
Baseten
@baseten
6h
Excited to power GLM-5.2 on @cline! How to use it in about 10 seconds:
3.1K
Baseten reposted
Madison Kanna
@Madisonkanna
Jun 24
With the launch of GLM 5.2 this week, I see everyone asking "have open models caught up to closed models?" The more interesting question that's getting missed: what can you do with an open model that you can't do with a closed one? You can specialize them. And when you do, the
Video (27:58)
Madison Kanna
@Madisonkanna
Mar 26
What is AI inference engineering, why is it such an in-demand skill, and how do you break into the field? With author of Inference Engineering @philipkiely and head of training at Baseten @oneill_c
0:00: What is inference?
2:47: History of inference
4:59: Downstream effects
11K
Baseten
@baseten
Jun 24
"Frontier models for the hardest general intelligence and post-trained open source for high-volume and specialized workloads... Many specialized models, serving many specialized workflows, inside many specialized products." Thank you, Apoorv, for taking the time to write about
Apoorv Agrawal
@apoorv03
Jun 22
Article
Why we are doubling down on Baseten
We backed Baseten in Q4 2025, and I wrote up the thesis then. Six months on, it has only gotten more obvious to us, and faster. By the end of Q1, Baseten had already surpassed the full-year CY26...
4.1K
Baseten reposted
Alex Ker 🔭
@thealexker
Jun 24
Article
How to run GLM-5.2 in any harness
GLM-5.2 is this year’s DeepSeek moment. It’s already shifting the trajectory of how we interact with and consume intelligence. As we and our agents continue to tokenmax, tokenonomics and performance...
24K
Baseten
@baseten
Jun 24
You can now access our GLM-5.2 API through the Merge Gateway! GLM-5.2 matches frontier model intelligence while running 4x+ faster and at 1/5th the cost. Try it out: merge.dev/gateway
4.3K
Baseten
@baseten
Jun 23
"That's when they come to open-source models, that's when they come to Baseten, that's when they come to post-train models on Baseten, to be able to do it better, faster, and cheaper. That's when you get both intelligence everywhere and unit economics that make sense for your
Tuhin Srivastava
@tuhinone
Jun 23
Thanks to @EdLudlow for having us on Bloomberg Tech yesterday to talk about our latest fundraise and the growing number of companies owning their open and specialized models.
4.4K
Baseten
@baseten
Jun 23
Excited to be a day 0 launch partner for BioNeMo, NVIDIA's new, fully-open agent toolkit for scientific workflows! All 10 BioNeMo NIMs are available in our model library. Learn more in our announcement: baseten.co/blog/nvidia-bi…
00:35
NVIDIA Healthcare
@NVIDIAHealth
Jun 23
Science is entering a new era - one where AI agents can do scientific work. 🧬 Today NVIDIA is launching the BioNeMo Agent Toolkit - an open, agent-ready toolkit that gives any AI agent callable tools for protein structure prediction, molecular docking, generative chemistry,
4.5K
Baseten reposted
Philip Kiely
@philipkiely
Jun 23
Article
How we built the world’s fastest API for GLM-5.2
GLM-5.2 is the biggest news in open models since DeepSeek-R1. It’s easy to see why. GLM-5.2 delivers comparable performance to GPT 5.5 and Opus 4.8 at a fraction of the cost, generally 70-80% less...
516K
Baseten reposted
Alex Ker 🔭
@thealexker
Jun 22
Tutorial on how to use GLM-5.2 in Claude Code (bookmark this). ~4.5x faster and ~5x cheaper than Opus 4.8!
1. Install the latest Claude Code: npm install -g @anthropic-ai/claude-code
2. Create an account at baseten.co.
3. Grab an API Key from
40K
Baseten reposted
Amir Haghighat
@amiruci
Jun 22
We have the fastest GLM-5.2 deployment on the market: >280 tok/s and <0.8s ttft, according to Artificial Analysis. This same performance carries across all post-trained variants. These aren’t vanity metrics. Optimizations like these save our customers tens of millions of dollars
Amir Haghighat
@amiruci
Jun 22
We closed our Series F today at a $13B valuation. Our inference business grew 20x in the last year. I want to explain why: The growth comes from a shift I think is permanent: companies want to own their intelligence layer. Instead of relying exclusively on closed models, teams
92K
Baseten reposted
Tuhin Srivastava
@tuhinone
Jun 22
The GLM moment is going to be bigger than the DeepSeek moment. Baseten has the fastest inference on the best open-weight model: >280 TPS and <0.8s TTFT.
Tuhin Srivastava
@tuhinone
Jun 22
Article
Announcing our Series F
Today, we are thrilled to announce Baseten’s $1.5B Series F, led by Altimeter Capital, Conviction Partners, and Spark Capital, co-led by Sands Capital and Wellington Management, with participation...
23K
Baseten
@baseten
Jun 22
The best open model with the best performance: GLM-5.2 runs at >280 TPS and <0.8s TTFT on Baseten. Try it here: baseten.co/library/glm-52/
30K
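
Several posts above quote GLM-5.2 at >280 TPS (tokens per second) and <0.8s TTFT (time to first token). As a rough illustration of what those two numbers imply together, the sketch below estimates end-to-end response time for a given output length. It assumes a simplified model that is not from the posts: the first token arrives at the TTFT, and output then streams at a constant rate. The function name and defaults are hypothetical, and because the quoted figures are bounds, the result is best read as an approximate upper estimate.

```python
def estimated_response_seconds(
    output_tokens: int,
    tokens_per_second: float = 280.0,  # quoted as ">280 TPS" in the posts
    ttft_seconds: float = 0.8,         # quoted as "<0.8s TTFT" in the posts
) -> float:
    """Estimate total response time as TTFT plus steady-state decode time.

    Simplifying assumption: a constant decode rate after the first token.
    Real deployments vary with prompt length, batching, and load.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} tokens: ~{estimated_response_seconds(n):.1f}s")
```

Under these assumptions, a 1,000-token response would take roughly 4.4 seconds, with TTFT dominating short replies and decode throughput dominating long ones.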