Autoscaled RL data for frontier agents

Post-training data that scales without armies of human annotators

We are an applied research lab working on the next generation of data harvesting techniques. Hand-built environments don't scale, and every task and reward costs valuable human hours.

As agents take on longer-horizon tasks, these costs grow even faster.

To push the frontier, we need environments that can be scaled autonomously and targeted to specific capabilities.

We provide that data.

Unsupervised environment design

We let environments adapt to the agent, generating tasks at the frontier of its current ability. This follows a curriculum learning design that discovers difficulty instead of guessing it, in the lineage of PAIRED and regret-based UED.
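As a rough illustration of regret-based selection, the sketch below scores each task the way PAIRED defines regret, as the best antagonist return minus the mean protagonist return, and keeps the highest-scoring tasks. The function names, the task ids, and the toy returns are all hypothetical, and this is not a description of our production pipeline.

```python
from statistics import mean


def paired_regret(protagonist_returns, antagonist_returns):
    """PAIRED-style regret estimate: best antagonist return minus mean protagonist return."""
    return max(antagonist_returns) - mean(protagonist_returns)


def select_frontier_tasks(rollouts, top_n):
    """Rank tasks by estimated regret and return the top_n task ids.

    rollouts maps a task id to (protagonist_returns, antagonist_returns).
    """
    scored = {
        task: paired_regret(protagonist, antagonist)
        for task, (protagonist, antagonist) in rollouts.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:top_n]


# Toy data (hypothetical): returns are in [0, 1].
rollouts = {
    "too_easy": ([1.0, 1.0], [1.0, 1.0]),    # both agents solve it: regret 0.0
    "unsolvable": ([0.0, 0.0], [0.0, 0.0]),  # neither solves it: regret 0.0
    "frontier": ([0.0, 0.5], [1.0, 0.5]),    # solvable but not yet mastered: regret 0.75
}

print(select_frontier_tasks(rollouts, top_n=1))
```

Tasks that are trivially easy or impossible both score zero regret under this estimate, so the selection concentrates on tasks that are solvable but not yet mastered.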

Open-endedness

We're building toward environments that keep producing novel, learnable challenges rather than saturating human-curated benchmarks. An agent should never run out of things to learn.

Self-evolving benchmarks

We use coding agents as world-builders that construct environments, tasks, and their verifiers, then rebuild them as models improve. This lets a benchmark grow alongside the models it measures, instead of being outgrown by them.

Modalities

Tool-Use

Agents that call APIs and chain tools (e.g., ITSM, healthcare, finance, customer support)

Computer Use

Agents that navigate real UIs end-to-end (e.g., e-commerce, web research, SaaS workflows)

14K

Evaluations

Born from Ragas

Built by the team behind Ragas, the open-source evals framework used by 80% of the Fortune 100.

Research

Tau2-Infinity

A tool-use miner that harvests tasks inside a target model's pass@k window, with verifiers and environments
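One way to read "inside a pass@k window" is to keep only tasks whose estimated pass@k falls between a lower and an upper bound, so that tasks the model always solves or never solves are dropped. The sketch below uses the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), for n samples of which c are correct. The window bounds (0.1 and 0.9), the harvest helper, and the sample counts are assumptions for illustration, not Tau2-Infinity's actual settings.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n samples, c of which are correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples are incorrect,
    # which makes the estimate exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def harvest(results, k, low=0.1, high=0.9):
    """Keep task ids whose pass@k lies inside [low, high] (assumed window).

    results maps a task id to (n samples, c correct).
    """
    return [
        task
        for task, (n, c) in results.items()
        if low <= pass_at_k(n, c, k) <= high
    ]


# Toy data (hypothetical): 10 samples per task.
results = {
    "trivial": (10, 10),    # always solved: pass@5 is 1.0
    "impossible": (10, 0),  # never solved: pass@5 is 0.0
    "learnable": (10, 2),   # sometimes solved: pass@5 is about 0.78
}

print(harvest(results, k=5))
```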

Ecom-Bench

An adversarial miner surfacing hard web-agent e-commerce tasks on live storefronts, with drift-proof cart verifiers

Cloning Bench

A benchmark testing how well coding agents visually clone real web apps through iterative pixel-diff feedback
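To make the idea of pixel-diff feedback concrete, here is a minimal sketch that compares a reference screenshot with a candidate render and reports the fraction of pixels that differ. Images are represented as nested lists of RGB tuples to keep the example dependency-free. The function name, the tolerance parameter, and the choice of metric are assumptions, since the exact scoring Cloning Bench uses is not specified here.

```python
def pixel_diff_ratio(reference, candidate, tolerance=0):
    """Return the fraction of pixels whose channels differ by more than tolerance.

    Both images are lists of rows, each row a list of (r, g, b) tuples.
    """
    same_shape = len(reference) == len(candidate) and all(
        len(ref_row) == len(cand_row)
        for ref_row, cand_row in zip(reference, candidate)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    total = sum(len(row) for row in reference)
    if total == 0:
        raise ValueError("images must contain at least one pixel")

    differing = sum(
        1
        for ref_row, cand_row in zip(reference, candidate)
        for ref_px, cand_px in zip(ref_row, cand_row)
        if any(abs(a - b) > tolerance for a, b in zip(ref_px, cand_px))
    )
    return differing / total


# Toy 2x2 images (hypothetical): one of four pixels differs.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
reference = [[WHITE, WHITE], [BLACK, BLACK]]
candidate = [[WHITE, WHITE], [BLACK, WHITE]]

print(pixel_diff_ratio(reference, candidate))  # 0.25
```

In an iterative setup, a score like this could be handed back to the coding agent after each render so it can revise the clone and try again.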

Vibrant Labs is proudly backed by
