Junyi Zhang
228 posts
Junyi Zhang
@junyi42
CS Ph.D. Student @Berkeley_AI. B.Eng. @SJTU1896 CS. previous with @GoogleDeepMind, @MSFTResearch. Vision, generative model, robotics.
junyi42.com
Joined July 2022
559
Following
2,829
Followers

  • Pinned
    Junyi Zhang
    @junyi42
    Mar 9
    One memory can’t rule them all. We present LoGeR, a new hybrid memory architecture for long-context geometric reconstruction. LoGeR enables stable reconstruction over up to 10k frames / kilometer scale, with
    561K
  • Junyi Zhang
    @junyi42
    Oct 7, 2024
    Excited to share MonST3R, a simple way to estimate geometry from unposed videos of dynamic scenes. We achieve competitive results on several downstream tasks (video depth, camera pose) and believe this is a promising step toward feed-forward 4D reconstruction: monst3r-project.github.io
    132K
  • Junyi Zhang
    @junyi42
    Feb 11, 2025
    MonST3R has been accepted to ICLR'25 as a Spotlight! We have also added a fully feed-forward reconstruction mode that runs in real time on video input (samples at: monst3r-paper.github.io/page0.html). More details here: github.com/Junyi42/monst3…
    22K
  • Junyi Zhang
    @junyi42
    Apr 21, 2025
    Introducing St4RTrack!🖖 Simultaneous 4D reconstruction and tracking in world coordinates, in a feed-forward manner, just by changing the meaning of two pointmaps! st4rtrack.github.io
    52K
  • Junyi Zhang
    @junyi42
    Oct 21, 2024
    Code for inference, visualization, training, and evaluation is released!
    Junyi Zhang
    @junyi42
    Oct 7, 2024
    Excited to share MonST3R, a simple way to estimate geometry from unposed videos of dynamic scenes. We achieve competitive results on several downstream tasks (video depth, camera pose) and believe this is a promising step toward feed-forward 4D reconstruction: monst3r-project.github.io
    GitHub - Junyi42/monst3r: Official Implementation of paper "MonST3R: A Simple Approach for Estima...
    From github.com
    22K
  • Junyi Zhang
    @junyi42
    May 21, 2025
    Very impressive! At VideoMimic.net, we already learn from 3rd-person human videos + RL for locomotion. Excited to see where this path goes next!
    Milan Kovac
    @_milankovac_
    May 21, 2025
    One of our goals is to have Optimus learn straight from internet videos of humans doing tasks. Those are often 3rd person views captured by random cameras etc. 

We recently had a significant breakthrough along that journey, and can now transfer a big chunk of the learning
    18K
  • Junyi Zhang
    @junyi42
    May 7, 2025
    Humanoids need to perceive the environment in the real world. Using 4D reconstruction techniques, we turn casual human videos into training data for an environment-aware humanoid policy. Super excited to share: VideoMimic.net
    Arthur Allshire
    @arthurallshire
    May 7, 2025
    our new system trains humanoid robots using data from cell phone videos, enabling skills such as climbing stairs and sitting on chairs in a single policy (w/ @redstone_hong @junyi42 @davidrmcall)
    11K
  • Junyi Zhang
    @junyi42
    Jun 11, 2025
    Just arrived in Nashville for #CVPR25! 🥰 I'll present St4RTrack tomorrow morning (10:30–12:30) at the 4D Vision Workshop, poster #137 in Hall 104 B. Feel free to come and chat!
    Junyi Zhang
    @junyi42
    Apr 21, 2025
    Introducing St4RTrack!🖖 Simultaneous 4D reconstruction and tracking in world coordinates, in a feed-forward manner, just by changing the meaning of two pointmaps! st4rtrack.github.io
    8.9K
  • Junyi Zhang
    @junyi42
    Mar 21, 2024
    🚀 Introducing “Telling Left from Right” at #CVPR2024
    - 🔍 Identify the problem of geometry-aware semantic correspondence (SC)
    - 📐 Evaluate foundation model features’ geometric awareness
    - 🏆 Achieve SOTA with a lightweight post-processor
    🔗 (w/ code!): telling-left-from-right.github.io
    9.6K
  • Junyi Zhang
    @junyi42
    Jun 16, 2024
    On my way to Seattle ✈️ for my first ever #CVPR! Excited to meet old and new friends. 😄 I'll be presenting our work telling-left-from-right.github.io on Wed. (19th) morning at #284. If you're interested in how a plug-in processor can enhance the Geo-aware SC of SD+DINO, please stop by.
    7.1K
  • Junyi Zhang
    @junyi42
    Apr 24, 2025
    I'll be presenting MonST3R at ICLR! 🇸🇬 Friday 25th, 10am-12:30pm, Hall 3+2B, #97. Come by if you are interested!
    Junyi Zhang
    @junyi42
    Feb 11, 2025
    MonST3R has been accepted to ICLR'25 as a Spotlight! We have also added a fully feed-forward reconstruction mode that runs in real time on video input (samples at: monst3r-paper.github.io/page0.html). More details here: github.com/Junyi42/monst3…
    3.1K
  • Junyi Zhang
    @junyi42
    Nov 28, 2024
    The results are so cool! 4D reconstruction is a very challenging task - I tried to explore it before MonST3R but couldn't make it work. I'm thrilled to see MonST3R contributing a part to this reconstruction pipeline!
    Rundi Wu
    @ChrisWu6080
    Nov 28, 2024
    🚀 Introducing CAT4D! 🚀 CAT4D transforms any real or generated video into dynamic 3D scenes with a multi-view video diffusion model. The outputs are dynamic 3D models that we can freeze and look at from novel viewpoints, in real-time!
Be sure to try our interactive viewer!
    5.2K
  • Junyi Zhang
    @junyi42
    Oct 7, 2024
    Replying to @junyi42
    Hard to see the details in the figure? Check it out for yourself 😍: monst3r-project.github.io/page1.html We’ve created an interesting 4D online demo that you can easily explore!
    7K
  • Junyi Zhang
    @junyi42
    Mar 31, 2025
    Nice work! Very cool results from carefully designed generative inpainting on MonST3R's partial pointmaps. Glad to see MonST3R and dynamic 3D reconstruction playing an important role.
    Tianqi Liu
    @TianqiLiu664
    Mar 30, 2025
    🔥Free4D creates explicit 4D Gaussian scene representations from a single image, enabling high-quality, controllable, and real-time rendering. 👉Project (with interactive demo): free4d.github.io Paper: arxiv.org/abs/2503.20785 Code (open-sourced): github.com/TQTQliu/Free4D
    5.2K