We just trained the DeepSeek-V3 671B benchmark in 2 minutes.
671 billion parameters. 8,192 @nvidia Blackwell Ultra GPUs. A 2-minute time-to-train, the fastest DeepSeek-V3 run ever recorded in MLPerf®.
Check out the final results. utm.io/uqDci