jzh15/README.md

🌟 Jian Zhang | 张舰

Homepage Google Scholar CV Email


Jian Zhang

🎓 Graduate Student
Xiamen University

🚀 Research Vision

My long-term vision centers on comprehensive 3D spatial understanding: building 3D vision-language models, semantic reconstruction systems, and embodied agents that can reason about and interact with the physical world.

🎯 Current Focus Areas

  • 🔬 3D Spatial Understanding
  • 🧭 3D Vision-Language Models
  • 🤖 Embodied Intelligence Agents

🎓 Education

  • M.S. Information & Communication Engineering | Xiamen University (Sept 2023 - Present)
    Advisors: Prof. Yue Huang & Xinghao Ding
  • B.S. Data Science and Big Data Technology | Nanchang University (Sept 2019 - June 2023)
    Advisor: Prof. Li Zhu

💼 Experience

  • Remote Research Assistant | PHAI Lab, Texas A&M University (May 2025 - Aug 2025)
    3D Vision & Embodied Intelligence
  • Remote Research Assistant | VITA Group, University of Texas at Austin (Jan 2024 - May 2025)
    3D Semantic Reconstruction

📚 Featured Publications

🔥 Recent Highlights

🧭 SpatialStack: Layered Geometry-Language Fusion for 3D VLM Spatial Reasoning

CVPR 2026 | Jian Zhang*, Shijie Zhou*, Bangya Liu*, Achuta Kadambi, Zhiwen Fan

Fuses layered geometry-language features for 3D spatial reasoning.

Paper Code Project Model Data


🌟 VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction

CVPR 2026 | Zhiwen Fan*, Jian Zhang*, Renjie Li, Junge Zhang, Runjin Chen, Hezhen Hu, Kevin Wang, Huaizhi Qu, Shijie Zhou, Dilin Wang, Zhicheng Yan, Haozhe Xu, Jan Theiss, Tianlong Chen, Junyi Li, Zuxuan Tu, Zhangyang Wang, Rakesh Ranjan

Aligns VLMs with 3D reconstruction for spatial-temporal reasoning.

Paper Code Project Demo


🧠 Thinking in Dynamics: Multimodal Reasoning in Physical 4D Worlds

CVPR 2026 | Yuzhi Huang*, Kairun Wen*, Rongxin Gao*, Dongxuan Liu, Yibin Lou, Jie Wu, Jing Xu, Jian Zhang, Zheng Yang, Yunlong Lin, Chenxin Li, Panwang Pan, Junbin Lu, Jingyan Jiang, Xinghao Ding, Yue Huang, Zhi Wang

Benchmarks MLLMs on dynamic 4D physical reasoning.

Paper Code Project


🌍 DynamicVerse: Physically-Aware Multimodal Modeling for Dynamic 4D Worlds

NeurIPS 2025 | Kairun Wen*, Yuzhi Huang*, Runyu Chen, Hui Zheng, Yunlong Lin, Panwang Pan, Chenxin Li, Wenyan Cong, Jian Zhang, Junbin Lu, Chenguo Lin, Dilin Wang, Zhicheng Yan, Haozhe Xu, Jan Theiss, Yue Huang, Xinghao Ding, Rakesh Ranjan, Zhiwen Fan

Builds physical-scale 4D world modeling data from real videos.

Project Paper Code Demo


🏆 Large Spatial Model: End-to-end Unposed Images to Semantic 3D

NeurIPS 2024 | Zhiwen Fan*, Jian Zhang*, Wenyan Cong, Peihao Wang, Renjie Li, Kairun Wen, Shijie Zhou, Achuta Kadambi, Zhangyang Wang, Danfei Xu, Boris Ivanovic, Marco Pavone, Y. Wang

Maps unposed images directly to semantic 3D representations.

Paper Code Project


⚡ InstantSplat: Sparse-view Gaussian Splatting in Seconds

arXiv 2024 | Zhiwen Fan*, Wenyan Cong*, Kairun Wen*, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Y. Wang

Reconstructs sparse-view scenes with fast pose-free Gaussian splatting.

Paper Code Project



🌟 Open for Opportunities

🧭

3D Vision-Language Models
Reasoning over geometry and language

🔬

3D Spatial Understanding
Developing comprehensive 3D perception

🤝

Research Collaborations
Building the future of 3D AI together

I am particularly interested in opportunities that bridge cutting-edge research with real-world applications.


📫 Contact

Email


Thanks for visiting!

Building the future of 3D AI, one breakthrough at a time

Pinned

  1. VITA-Group/VLM-3R (Public)

    [CVPR 2026] VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction

    Python · 409 stars · 28 forks

  2. NVlabs/LSM (Public)

    [NeurIPS'24] Large Spatial Model: End-to-end Unposed Images to Semantic 3D

    Python · 234 stars · 9 forks

  3. NVlabs/InstantSplat (Public)

    InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds

    Python · 1.7k stars · 152 forks

  4. SpatialStack (Public)

    [CVPR 2026] SpatialStack: Layered Geometry-Language Fusion for 3D VLM Spatial Reasoning

    Python · 27 stars · 3 forks