Comet (@Cometml) / X

3,487 posts

Comet

@Cometml

Comet provides an end-to-end model evaluation platform for AI developers, with best-in-class LLM evaluations, experiment tracking, and production monitoring.

New York, NY

Joined October 2017

Comet
@Cometml
Jun 9
You're spending ~30% of your coding agent tokens on misconfiguration. Bloated context, unused skills, idle MCPs. We just launched Cost Intelligence in Opik — cuts that waste 20-30% with one click. Native to Claude Code + Codex 🔗globenewswire.com/news-release/2…
192
Comet reposted
Rajesh M
@Rajesh7113
May 21
AI agent debugging is a COMPLETE mess right now. You fix one issue… and another workflow randomly breaks. You change a prompt. Tool calls start behaving differently. You improve latency. Accuracy drops somewhere else. Most teams are basically duct taping evals, traces,
850
Comet
@Cometml
May 12
Our Head of Research Doug Blank headed to Boston for his 3rd annual talk at @MITDeepLearning. He took Asimov's laws of robotics & applied them to agentic AI -- proposing his own three laws of AI and sharing how we're thinking about AI safety at Comet.
451
Comet
@Cometml
May 8
We're hiring across the team 🎉 If you know any rockstars (or are one yourself), we'd love to chat with you! 🔗 comet.com/site/about-us/…
246
Comet reposted
Paul Iusztin
@pauliusztin_
May 6
I just interviewed the former CTO at IBM and Chairperson of NodeJS. Here's what I learned: Michael @maximilien spent 12 months shipping production RAG to multiple customers. In our discussion, he told me that nothing on a leaderboard can predict what works until you evaluate
682
Comet
@Cometml
May 2
"Until you evaluate on your data, nothing else matters."
Paul Iusztin
@pauliusztin_
May 1
I’ve spent the last week interviewing @maximilien, former CTO at IBM and Chairperson of NodeJS Foundation, who has shipped production RAG to multiple customers over the past year. The lesson he kept circling back to is that until you evaluate on your customer’s data, nothing else
673
Comet reposted
Gideon M
@gidim
Apr 23
As your agent matures, something shifts. You stop writing code, and start editing prompts, tweaking params, trying new tools, etc. The tooling for this phase sucks. Today, we’re fixing that. Announcing Agent Configuration + Agent Playground in Opik. 🧵
29K
Comet reposted
Gideon M
@gidim
Apr 22
Shared by a customer. Ollie just made their Slack bot 52% faster and 98% cheaper. With test suites, no regressions either.
311
Comet
@Cometml
Apr 23
Third and final day of "What we've been building" launch week: Agent Playground. Your agent isn't just one prompt. It's a complex system of models and parameters working together. It's time to have a workflow that treats it as such.
152
Comet
@Cometml
Apr 23
We're launching the Agent Playground so you can test your full agent configuration from the UI. Tweak prompts and swap models without touching your code. See how the entire agent responds and only save what works.
Introducing the Opik Agent Playground
From comet.com
128