AI Efficiency · Robustness · Temporal Reasoning
Ph.D. Student · UMD GAMMA Lab
I build AI systems that stay reliable under the pressures of real-world deployment: distribution shift, quantization constraints, and temporal complexity. My current focus is on quantization and temporal understanding in Vision Language Models. I'm a Ph.D. student at UMD's GAMMA Lab, working with Prof. Ming Lin.
I build AI systems that stay reliable when they leave the lab: on edge devices, under distribution shift, and under the compute constraints of real-world deployment. I approach this through three connected threads:
Models trained with standard empirical risk minimization learn shortcuts: features that correlate with labels in the training distribution but break on out-of-distribution inputs, causing confident failures on the inputs that matter most. My work targets this with methods for mitigating spurious correlations that recover worst-group accuracy without group annotations (Decompose-and-Compose, CVPR 2024; Annotation-Free Group Robustness, OOD-CV @ ICCV 2023).
The same gap appears in trajectory forecasting: training datasets reflect typical, average driving behavior, so models perform poorly on rare or out-of-distribution driving styles. In autonomous-vehicle (AV) deployment, a system that has not learned to anticipate aggressive or erratic drivers will fail to account for them on the road, which makes this a safety problem, not just an accuracy problem.
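For readers unfamiliar with the metric, here is a minimal sketch of how worst-group accuracy differs from average accuracy. It illustrates the evaluation metric only, not the methods in the papers cited above; the toy predictions and the group names `majority` and `minority` are made up for illustration, and group labels are used here only at evaluation time.

```python
from collections import defaultdict


def group_accuracies(preds, labels, groups):
    """Return a dict mapping each group to its accuracy."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        total[g] += 1
        correct[g] += int(p == y)
    return {g: correct[g] / total[g] for g in total}


def worst_group_accuracy(preds, labels, groups):
    """Accuracy of the group on which the model performs worst."""
    return min(group_accuracies(preds, labels, groups).values())


# Hypothetical toy data: the model is right on every "majority" sample
# and wrong on every "minority" sample.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
groups = ["majority", "majority", "majority", "minority",
          "majority", "majority", "majority", "minority"]
preds = [0, 0, 0, 1, 1, 1, 1, 0]

average_acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
worst_acc = worst_group_accuracy(preds, labels, groups)

print(f"average accuracy:     {average_acc:.2f}")
print(f"worst-group accuracy: {worst_acc:.2f}")
```

In this toy case the average looks acceptable while the minority group fails completely, which is the kind of failure that average accuracy hides.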
Quantization converts full-precision model weights to lower-bit representations for efficient inference on constrained hardware. But compression is not neutral: models compressed with post-training quantization (PTQ) often preserve clean-input accuracy while losing robustness to distribution shift, a gap Recti-Q (IROS 2026) targets with a lightweight feature-space rectification adapter. AI's growing energy demand [NYT ↗, Bloomberg ↗] makes efficient models more than an engineering preference.
Beyond robustness-aware quantization, I'm interested in smarter compression strategies more broadly and in applying quantization to Vision Language Models, where inference costs are especially high and edge deployment most pressing.
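As a concrete picture of what converting weights to lower-bit representations means, the sketch below shows generic symmetric, per-tensor, round-to-nearest quantization to signed 8-bit integers. It is a textbook baseline, not Recti-Q or any method from the work above; the function names and the toy weight values are assumptions chosen for illustration.

```python
def quantize_symmetric(weights, num_bits=8):
    """Map floats to signed integers in [-qmax, qmax] with one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level, then clamp to the valid range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from integer codes."""
    return [v * scale for v in q]


# Hypothetical toy weights.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_symmetric(weights)
recovered = dequantize(q, scale)

# Each weight moves by at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, recovered))
print("integer codes:", q)
print("scale:", scale)
print("max reconstruction error:", max_error)
```

The small weight `0.003` collapses to zero here: rounding error is bounded per weight, but how those perturbations interact with shifted inputs is not visible from clean-input accuracy alone.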
Edge devices, from phones to autonomous vehicles, must run complex AI under strict constraints on memory, latency, and power. The hardest challenge in this space is temporal understanding in Vision Language Models (VLMs): reasoning about events, sequences, and causality across video frames.
This is my current primary research focus. VLMs are already expensive to run; genuine temporal reasoning makes the compute demands more acute. Making this viable at the edge requires quantization-aware design from the ground up, connecting directly to the efficiency and robustness directions above.
These three threads are tightly connected: reliable temporal reasoning demands computational precision that standard quantization can undermine, making robustness-aware efficiency a central challenge in my work.
* Equal contribution
TAPSI is an Iranian ride-hailing company, similar to Uber, and one of Iran's largest and most technologically advanced companies. Notable projects I contributed to include: