Vincent Sitzmann (@vincesitzmann) / X

946 posts

Vincent Sitzmann

@vincesitzmann

Building AI that learns by interacting with the world. Associate Professor @ MIT, leading the Scene Representation Group (scenerepresentations.org).

Cambridge, Massachusetts

Joined February 2016

Pinned
Vincent Sitzmann
@vincesitzmann
Jun 8
Introducing MilliVid, our new method for long-context video generation! MilliVid creates videos that are consistent over long time spans, without using retrieval heuristics or 3D maps! (1/n) davidcharatan.com/millivid/#
Vincent Sitzmann
@vincesitzmann
Jun 7, 2021
In personal news, I’m thrilled to announce that I’ll be joining @MIT as a tenure-track assistant professor in July 2022! My lab will investigate neural scene representations, inverse graphics, neural rendering, and their applications in vision, graphics, robotics, and AI! (1/n)
Vincent Sitzmann
@vincesitzmann
Jun 18, 2020
Excited to share our work on "Implicit Neural Representations with Periodic Activations" vsitzmann.github.io/siren We show how to fit complex signals, such as room-scale SDFs, video, & audio, and supervise implicit reps via their gradients to solve boundary value problems! (1/n)
Vincent Sitzmann
@vincesitzmann
Jun 24, 2020
We released the code for SIREN! vsitzmann.github.io/siren We also wrote a comprehensive Colab notebook with a no-frills implementation that reproduces the image, audio, and Poisson experiments, and explores initialization and shift-invariance properties!
Colab notebook: explore_siren.ipynb (colab.research.google.com)
Vincent Sitzmann
@vincesitzmann
Apr 24, 2024
Introducing “FlowMap”, the first self-supervised, differentiable structure-from-motion method that is competitive with conventional SfM like COLMAP! cameronosmith.github.io/flowmap/ IMO this solves a major missing piece for internet-scale training of 3D Deep Learning methods. 1/n
Vincent Sitzmann
@vincesitzmann
Dec 10, 2021
Introducing “Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation”! yilundu.github.io/ndf/ (w/ video!) NDFs are an object representation for robotic manipulation enabling imitation of pick-and-place tasks with pose generalization guarantees (1/n)
Vincent Sitzmann
@vincesitzmann
Dec 28, 2020
Implicit neural representations have recently gotten a lot of attention. I have compiled a reading list that I give students to get started in this area, inspired by the awesome-computer-vision list with extra commentary & notes. Check it out!
GitHub - vsitzmann/awesome-implicit-representations: A curated list of resources on implicit neural...
From github.com
Vincent Sitzmann
@vincesitzmann
Jun 7, 2021
Introducing "Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering"! vsitzmann.github.io/lfns (w/ video!) LFNs are the first fully implicit neural scene representation with real-time rendering, without post-processing / hybrid data-structures! (1/n)
Vincent Sitzmann
@vincesitzmann
Nov 2, 2021
I am hiring graduate students for my new lab at MIT, where I will start as faculty in July 2022! If you want to push what's possible with neural scene representations & inverse graphics please apply under: gradapply.mit.edu/eecs/apply/log… Deadline is Dec 15th!
Vincent Sitzmann
@vincesitzmann
Jul 3, 2024
Introducing Diffusion Forcing, a new way of training sequence generative models that unifies next-token prediction (think LLM) and full-sequence diffusion (think video diffusion models)! I’m super excited about this - it has a number of unique skills! (1/n)
Boyuan Chen
@BoyuanChen0
Jul 3, 2024
Introducing Diffusion Forcing, which unifies next-token prediction (eg LLMs) and full-seq. diffusion (eg SORA)! It offers improved performance & new sampling strategies in vision and robotics, such as stable, infinite video generation, better diffusion planning, and more! (1/8)
Vincent Sitzmann
@vincesitzmann
Aug 8, 2024
Introducing Neural Jacobian Fields, robot 3D kinematic models learned only from vision! They can model & control robots from just a single RGB camera, even those w/ intractable kinematics & no embedded sensors such as soft, 3D-printed pneumatic hands! sizhe-li.github.io/publication/ne… 1/n
Vincent Sitzmann
@vincesitzmann
Jun 8, 2023
Introducing “FlowCam: Training Generalizable 3D Radiance Fields w/o Camera Poses via Pixel-Aligned Scene Flow”! We train a generalizable 3D scene representation self-supervised on datasets of raw videos, without any pre-computed camera poses or SfM! cameronosmith.github.io/flowcam 1/n
Vincent Sitzmann
@vincesitzmann
Aug 25, 2023
Introducing “Diffusion with Forward Models”, a model that can generate diverse, real 3D scenes from a single image, trained with images w/o any 3D data! …ffusion-with-forward-models.github.io 1/n
Vincent Sitzmann
@vincesitzmann
Jun 2, 2022
NeRFs will transform computer graphics. But we need to be able to edit them! In “Decomposing NeRF for Editing via Feature Field Distillation” we use Image and Image/Language foundation models for easy, query-based editing via language- and patch queries! pfnet-research.github.io/distilled-feat…