Jie-Ying Lee 李杰穎

Research Assistant @ NYCU CS
Software Engineer @ Google Pixel Camera

Email: [email protected]

CV / Scholar / LinkedIn / GitHub / X / Threads / Blog

About Me

I’m a research assistant in Computer Science at National Yang Ming Chiao Tung University, working with Prof. Yu-Lun Liu, and a Software Engineer on Google’s Pixel Camera Team. I work on 3D scene synthesis, generative models for vision, and embodied AI, particularly focusing on Neural Radiance Fields, 3D Gaussian Splatting, vision-language navigation, and on-device perception.

I received my B.S. in Computer Science from National Yang Ming Chiao Tung University, with an exchange semester at ETH Zurich. My industry experience includes internships at Google (Pixel Camera Team), Microsoft, and Appier.

I am actively seeking research collaborations. If you are interested in working with me, don’t hesitate to reach out.

Google: SWE (2025 - Present)
NYCU: Research Assistant (2025 - Present); B.S. in Computer Science (2021 - 2025)
ETH Zurich: Exchange Student (2024 - 2025)

News

Publications

Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery
We present Skyfall-GS, a framework that synthesizes photorealistic, city-block scale 3D urban scenes from satellite imagery using diffusion models, eliminating the need for expensive 3D scanning and manual annotation while enabling real-time exploration.
BRDFusion: Physics Meets Generation for Urban Scene Inverse Rendering
We present BRDFusion, an inverse rendering framework for urban scenes that unifies physics-based rendering with generative diffusion priors to recover explicit material and lighting properties from video, enabling high-quality relighting and simulation.
Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion
We present Pantheon360, a controllable 360° video diffusion framework for digital twin generation that synthesizes high-fidelity videos from sparse 360° inputs, letting the diffusion model focus on photorealistic texture refinement while a 3D Cache enforces global geometric consistency.
LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal
We present LightsOut, a diffusion-based outpainting framework that enhances lens flare removal by reconstructing off-frame light sources. Our approach combines a multitask regression module with LoRA fine-tuned diffusion models to produce realistic and physically consistent results.
See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation
We present See, Point, Fly (SPF), a training-free framework for aerial vision-and-language navigation. By leveraging vision-language models and reformulating navigation as a 2D spatial grounding task, SPF enables universal unmanned aerial navigation without task-specific training.
AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting
We introduce AuraFusion360, a reference-based 360° scene inpainting method with three key innovations: depth-aware occlusion identification, Adaptive Guided Depth Diffusion for zero-shot point placement, and SDEdit-based enhancement for multi-view coherence.
SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes
We present SpectroMotion, the first 3D Gaussian Splatting method capable of reconstructing photorealistic dynamic specular scenes. By combining 3DGS with physically-based rendering and deformation fields, we achieve high-quality synthesis of challenging real-world dynamic reflective surfaces.
BoostMVSNeRFs: Boosting MVS-based NeRFs to Generalizable View Synthesis in Large-scale Scenes
We present BoostMVSNeRFs, a method that enhances rendering quality for MVS-based NeRFs in large-scale scenes. Our approach addresses key limitations including restricted viewport coverage and artifacts from limited input views, enabling generalizable view synthesis in complex environments.

Service

Misc.

Beyond research, I’m passionate about staying active through badminton and hip-hop dance. I also enjoy capturing moments through photography.

Music-wise, I’m into Taiwanese indie and hip-hop, frequently listening to artists like Gummy B, 草東沒有派對 (No Party For Cao Dong), and 國蛋 GorDoN.