A Very Big Video Reasoning Suite

We are betting that video reasoning will be the next fundamental paradigm of intelligence after language reasoning, one in which spatiotemporal, embodied experience of the world can be captured more naturally.

Data Engines

hit_target_after_bounce
Knowledge training set
The scene shows a ball with an arrow indicating its initial direction, and several empty target positions (hollow circles) on the right side. Simulate the ball moving along this direction and bouncing off walls following the law of reflection (the angle of reflection equals the angle of incidence). The ball will follow a complete trajectory and eventually align exactly with and completely overlap one of the target positions.
First Frame
Last Frame
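The reflection rule in this task can be sketched in a few lines. The example below is a minimal illustration, not the suite's actual data engine: it assumes an axis-aligned rectangular box, a point-sized ball, and targets given by their center coordinates. The names fold, ball_position, and first_target_hit, as well as the step and tol parameters, are hypothetical. It uses the mirror-unfolding trick: reflecting off a wall is equivalent to continuing in a straight line and folding the coordinate back into the box.

```python
import math

def fold(u, length):
    """Mirror a coordinate back into [0, length], as if the walls were mirrors."""
    period = 2.0 * length
    u = u % period
    return period - u if u > length else u

def ball_position(start, direction, distance, width, height):
    """Position of a point ball after travelling `distance` with perfect reflections."""
    norm = math.hypot(direction[0], direction[1])
    dx, dy = direction[0] / norm, direction[1] / norm
    return (fold(start[0] + dx * distance, width),
            fold(start[1] + dy * distance, height))

def first_target_hit(start, direction, targets, width, height,
                     step=0.01, tol=0.05, max_distance=200.0):
    """Index of the first target whose center the ball reaches, else None."""
    for k in range(int(max_distance / step) + 1):
        x, y = ball_position(start, direction, k * step, width, height)
        for i, (tx, ty) in enumerate(targets):
            if math.hypot(x - tx, y - ty) <= tol:
                return i
    return None

# Hypothetical scene: 10x10 box, ball at (2, 5) heading up-right, two targets.
hit = first_target_hit((2, 5), (1, 1), [(9, 2), (9, 8)], 10, 10)
```

In this made-up scene the ball bounces off the top wall at (7, 10) and then passes through (9, 8), so hit is 1. The fixed-step scan is a simplification; an exact solver would intersect the unfolded ray with mirrored copies of each target.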
sequence_completion
Abstraction in-domain testset
The scene shows a color_cycle sequence. Elements are arranged horizontally from left to right. The last position contains a question mark (?) indicating a missing element. Observe the pattern: the colors follow a cyclic order that repeats after a certain number of elements. Determine the element that should replace the question mark to complete the sequence according to the established pattern.
First Frame
Last Frame
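One way to reason about a cyclic color pattern is to find the shortest period that explains every visible element, then read off the element the cycle predicts for the missing slot. The sketch below assumes the sequence is a plain list of color names with the question mark removed; smallest_period and next_element are hypothetical helper names, not part of the suite.

```python
def smallest_period(seq):
    """Smallest p such that seq[i] == seq[i % p] for every visible element."""
    if not seq:
        raise ValueError("sequence must contain at least one element")
    n = len(seq)
    for p in range(1, n + 1):
        if all(seq[i] == seq[i % p] for i in range(n)):
            return p
    return n  # unreachable: p == n always satisfies the check

def next_element(seq):
    """Element that continues the cycle at the position after the last one."""
    p = smallest_period(seq)
    return seq[len(seq) % p]

# Hypothetical example: red, green, blue repeating, with the 8th element missing.
visible = ["red", "green", "blue", "red", "green", "blue", "red"]
answer = next_element(visible)
```

Here the shortest period is 3, so answer is "green". If the visible part never repeats, the period equals the sequence length and the prediction wraps around to the first element, which may or may not match the intended rule.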
grid_avoid_red_block
Spatiality training set
The scene shows a 10x10 grid with a blue start square (containing a yellow circular agent), a green end square, and multiple red filled rectangles indicating obstacles. Starting from the blue start square, the agent can move to adjacent cells (up, down, left, right). The goal is to move the agent to the green end square along the shortest path without entering any cells containing red filled rectangles.
First Frame
Last Frame
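A shortest path under four-directional moves with blocked cells is a textbook breadth-first search. The sketch below assumes cells are (row, column) tuples on a square grid and that obstacles are given as a collection of blocked cells; shortest_path and its parameters are hypothetical names, and nothing here is claimed to be how the suite generates its ground truth.

```python
from collections import deque

def shortest_path(start, goal, obstacles, size=10):
    """Breadth-first search; returns the list of cells from start to goal, or None."""
    blocked = set(obstacles)
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk parent links back to the start
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in blocked and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None                      # goal is walled off

# Hypothetical scene: one red block directly between start and goal.
path = shortest_path((0, 0), (0, 2), {(0, 1)})
```

Because every move has the same cost, the first time BFS reaches the goal it has found a path with the fewest steps. In this example the agent detours through the second row, visiting 5 cells in total.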
symbol_edit
Transformation out-of-domain testset
The sequence currently contains 1 △ symbol. Constraint: it must contain at least 4 △ symbols. Insert 3 △ symbols at positions 1, 3, and 7 to satisfy the constraint.
First Frame
Last Frame
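The edit in this task can be checked mechanically. The sketch below makes two assumptions the prompt does not state: positions are 1-based indices into the final sequence, and insertions are applied in ascending order so that earlier insertions are not shifted by later ones. The starting sequence and the helper names insert_symbols and satisfies_min_count are hypothetical.

```python
def insert_symbols(sequence, symbol, positions):
    """Insert `symbol` so it occupies each 1-based position in the final sequence."""
    result = list(sequence)
    for pos in sorted(positions):
        if not 1 <= pos <= len(result) + 1:
            raise ValueError(f"position {pos} is out of range")
        result.insert(pos - 1, symbol)
    return result

def satisfies_min_count(sequence, symbol, minimum):
    """True when `symbol` appears at least `minimum` times."""
    return sequence.count(symbol) >= minimum

# Hypothetical starting sequence containing exactly one triangle.
before = ["○", "△", "□", "○", "□"]
after = insert_symbols(before, "△", [1, 3, 7])
ok = satisfies_min_count(after, "△", 4)
```

With this starting sequence, after has 8 symbols, with △ at positions 1, 3, and 7 plus the original one, so the constraint of at least 4 holds.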
animal_size_sorting
Perception out-of-domain testset
Animal faces of different sizes are scattered randomly on the canvas. Sort them by size from smallest to largest and align them horizontally at the bottom baseline.
First Frame
Last Frame
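The target layout for this task can be computed directly: sort by size, then place each item left to right with its bottom edge on a shared baseline. The sketch below assumes each face is described by a name and the side length of its bounding square, and that y grows downward as in image coordinates. The function layout_sorted and the start_x and gap parameters are hypothetical.

```python
def layout_sorted(items, baseline_y, start_x=0.0, gap=10.0):
    """Return (name, x, y) top-left corners, smallest item first, bottoms aligned."""
    placed = []
    x = start_x
    for name, size in sorted(items, key=lambda item: item[1]):
        # The top edge sits `size` above the baseline so every bottom edge matches.
        placed.append((name, x, baseline_y - size))
        x += size + gap
    return placed

# Hypothetical canvas with three faces and a baseline at y = 100.
faces = [("cat", 40), ("mouse", 10), ("dog", 25)]
layout = layout_sorted(faces, baseline_y=100)
```

For these made-up sizes the order is mouse, dog, cat, placed at x = 0, 20, and 55, and each item's y plus its size equals 100. Python's sorted is stable, so faces of equal size keep their input order.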

Inference Results

Domino Chain Prediction - Samples
Task Domains
Domino Chain Prediction
Knowledge in-domain testset
Draw Next Sized Shape
Abstraction out-of-domain testset
LEGO Construction
Spatiality in-domain testset
2D Geometric Transform
Transformation out-of-domain testset
Nearest Square Rectangle
Perception out-of-domain testset
Prompt
Ground Truth
First Frame
Final Frame
Model Outputs
VBVR-Wan2.2
CogVideoX 1.5
Kling 2.6
LTX-2
Runway Gen-4
Sora 2
Veo 3
Wan 2.2 I2V
Hunyuan I2V
Seedance 2.0

Leaderboard

Modality
Split
Type
Category
2026-04-28