I am a research scientist at Bosch Research (Sunnyvale, USA), where I work on computer vision problems for autonomous driving. I’m broadly interested in model interpretability and the “science” of deep learning — figuring out what deep models are actually doing, and why they work as well as they do.
I completed my PhD at EPFL / Idiap with François Fleuret, and a postdoc at Harvard with Hima Lakkaraju. Some representative papers:
- splice — CLIP representations can be decomposed into human-readable concept vectors, useful for auditing and editing vision-language models
- perceptually aligned gradients — explains why robust models produce cleaner saliency maps, connecting two seemingly unrelated phenomena
- forgetting data contamination — benchmark contamination in LLM pre-training is often simply forgotten during training, and may not inflate scores as commonly assumed
For more, please see my research themes and publications. I’m always happy to chat about interpretability or the science of deep learning — feel free to reach out!
News
| May 2026 | "Explainability Research Must Prioritize Foundations over Ad-hoc Methods" accepted at the ICML 2026 position paper track. PDF coming soon! |
| Mar 2026 | Serving as Area Chair at NeurIPS 2026 |
| Mar 2026 | "WayPoint: Interactive Natural Language Querying for Spatio-Temporal Video Events" accepted at the SIGMOD 2026 demo track |
| Jan 2026 | "Evaluating Adversarial Robustness of Concept Representations in Sparse Autoencoders" accepted at EACL 2026 |
| May 2025 | "How Much Can We Forget about Data Contamination?" accepted at ICML 2025 |
| Nov 2024 | Joined Bosch Research, Sunnyvale as Research Scientist |