A Very Big Video Reasoning Suite

We are betting that video reasoning will be the next fundamental paradigm of intelligence after language reasoning, one in which spatiotemporal, embodied experience of the world can be captured more naturally.

Data Engines

hit_target_after_bounce

Knowledge training set

Prompt

The scene shows a ball with an arrow indicating its initial direction, and several empty target positions (hollow circles) on the right side. Simulate the ball moving along this direction and bouncing off walls following the law of reflection (the angle of reflection equals the angle of incidence). The ball will follow a complete trajectory and eventually align exactly with and completely overlap one of the target positions.
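The law of reflection in this prompt can be illustrated with a small simulation. This is a minimal sketch, not the suite's actual data engine: it assumes a point-sized ball inside an axis-aligned rectangular box, and the names `reflect` and `simulate` are hypothetical.

```python
def reflect(velocity, normal):
    """Mirror a 2D velocity about a wall with unit normal n: v' = v - 2 (v . n) n."""
    vx, vy = velocity
    nx, ny = normal
    dot = vx * nx + vy * ny
    return (vx - 2 * dot * nx, vy - 2 * dot * ny)


def simulate(pos, vel, width, height, steps, dt=1.0):
    """Step a point ball inside [0, width] x [0, height], bouncing off the walls.

    Assumption: each step is shorter than the box, so the ball crosses
    at most one wall per axis per step.
    """
    x, y = pos
    vx, vy = vel
    path = [(x, y)]
    for _ in range(steps):
        x += vx * dt
        y += vy * dt
        # On crossing a wall, fold the overshoot back inside and reflect the velocity.
        if x < 0:
            x = -x
            vx, vy = reflect((vx, vy), (1.0, 0.0))
        elif x > width:
            x = 2 * width - x
            vx, vy = reflect((vx, vy), (-1.0, 0.0))
        if y < 0:
            y = -y
            vx, vy = reflect((vx, vy), (0.0, 1.0))
        elif y > height:
            y = 2 * height - y
            vx, vy = reflect((vx, vy), (0.0, -1.0))
        path.append((x, y))
    return path
```

For axis-aligned walls the reflection simply flips one velocity component, which is why the angle of reflection equals the angle of incidence. Comparing the last point of the path with each hollow circle would then identify the target the ball ends on.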

First Frame

Last Frame

Video

sequence_completion

Abstraction in-domain test set

Prompt

The scene shows a color_cycle sequence. Elements are arranged horizontally from left to right. The last position contains a question mark (?) indicating a missing element. Observe the pattern: the colors follow a cyclic order that repeats after a certain number of elements. Determine the element that should replace the question mark to complete the sequence according to the established pattern.
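One way to reason about the cyclic pattern is to find the shortest period consistent with the visible elements and then read off the next one. The sketch below assumes the sequence is given as a list of color names; `next_in_cycle` is a hypothetical helper, not part of the suite.

```python
def next_in_cycle(visible):
    """Return the element that continues a repeating sequence.

    Finds the smallest period p such that every visible element equals
    the one p positions before it, then returns the element p positions
    before the missing slot.
    """
    if not visible:
        raise ValueError("need at least one visible element")
    n = len(visible)
    for period in range(1, n + 1):
        if all(visible[i] == visible[i - period] for i in range(period, n)):
            return visible[n - period]


# Usage: the slot after red, green, blue, red, green
answer = next_in_cycle(["red", "green", "blue", "red", "green"])
```

If no repetition is visible at all, the sketch falls back to treating the whole visible sequence as one cycle and returns its first element.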

First Frame

Last Frame

Video

grid_avoid_red_block

Spatiality training set

Prompt

The scene shows a 10x10 grid with a blue start square (containing a yellow circular agent), a green end square, and multiple red filled rectangles indicating obstacles. Starting from the blue start square, the agent can move to adjacent cells (up, down, left, right). The goal is to move the agent to the green end square along the shortest path without entering any cells containing red filled rectangles.
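A shortest obstacle-avoiding path on a grid with four-directional moves can be found with breadth-first search. The sketch below is illustrative only: the `(row, col)` cell encoding and the `shortest_path` function are assumptions, not the suite's own solver.

```python
from collections import deque


def shortest_path(size, start, goal, blocked):
    """Breadth-first search on a size x size grid.

    Cells are (row, col) tuples; `blocked` is a set of cells the agent
    may not enter. Returns the list of cells from start to goal, or
    None if the goal cannot be reached.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            in_bounds = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if in_bounds and nxt not in blocked and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None
```

Because every move has the same cost, the first time breadth-first search reaches the goal it has done so along a path with the fewest moves. For the task above, `size` would be 10 and `blocked` would hold the cells covered by red rectangles.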

First Frame

Last Frame

Video

symbol_edit

Transformation out-of-domain test set

Prompt

The sequence currently has 1 of symbol △. Constraint: at least 4 of symbol △. Insert 3 △ symbols at positions 1, 3, and 7 to satisfy the constraint.
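The edit described in this prompt can be sketched as a series of list insertions. The prompt does not say how positions are counted, so the sketch assumes they are 1-indexed positions in the final sequence and applies them in ascending order; the starting sequence and the `insert_at_positions` helper are hypothetical.

```python
def insert_at_positions(sequence, symbol, positions):
    """Insert `symbol` so that it occupies each given 1-indexed position.

    Assumption: positions refer to the final sequence, so inserting in
    ascending order leaves earlier insertions in place.
    """
    result = list(sequence)
    for pos in sorted(positions):
        result.insert(pos - 1, symbol)
    return result


# Hypothetical starting sequence containing exactly one triangle
before = ["○", "△", "□", "◇"]
after = insert_at_positions(before, "△", [1, 3, 7])
```

Under these assumptions the result contains four △ symbols, which satisfies the "at least 4" constraint.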

First Frame

Last Frame

Video

animal_size_sorting

Perception out-of-domain test set

Prompt

Animal faces of different sizes are scattered randomly on the canvas. Sort them by size from smallest to largest and align them horizontally at the bottom baseline.
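The target layout for this prompt can be sketched as a sort followed by left-to-right placement. This is an illustration under several assumptions: each face is described by a single size in pixels, the y axis grows downward, and the row is centered with a fixed gap. `layout_sorted` and its parameters are hypothetical.

```python
def layout_sorted(sizes, canvas_width, baseline_y, gap=10):
    """Place items in ascending size order with their bottoms on one baseline.

    `sizes` maps an item name to its side length in pixels. Returns a
    list of (name, x, y) tuples giving each item's top-left corner.
    """
    ordered = sorted(sizes.items(), key=lambda item: item[1])
    total_width = sum(size for _, size in ordered) + gap * (len(ordered) - 1)
    x = (canvas_width - total_width) / 2  # center the row horizontally
    placed = []
    for name, size in ordered:
        # The top edge sits `size` above the baseline so that bottoms align.
        placed.append((name, x, baseline_y - size))
        x += size + gap
    return placed


# Usage with made-up sizes
row = layout_sorted({"cat": 40, "mouse": 20, "bear": 60}, canvas_width=200, baseline_y=180)
```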

First Frame

Last Frame

Video

Inference Results

Task Domains

Domino Chain Prediction: Knowledge, in-domain test set

Draw Next Sized Shape: Abstraction, out-of-domain test set

LEGO Construction: Spatiality, in-domain test set

2D Geometric Transform: Transformation, out-of-domain test set

Nearest Square Rectangle: Perception, out-of-domain test set

Prompt

Ground Truth

First

Final

Model Outputs

VBVR-Wan2.2

CogVideoX 1.5

Kling 2.6

LTX-2

Runway Gen-4

Sora 2

Veo 3

Wan 2.2 I2V

Hunyuan I2V

Seedance 2.0

Prompt

Ground Truth

Model Outputs

VBVR-BAGEL

BAGEL

SenseNova-U1

VBVR-ThinkMorph

ThinkMorph

GPT Image 2

Nano Banana

Leaderboard

Modality

Split

Type