I am a research scientist at Bosch Research (Sunnyvale, USA), where I work on computer vision for autonomous driving. I’m broadly interested in model interpretability and the “science” of deep learning — figuring out what these deep models are actually doing, and why they work as well as they do.

I did my master's at IISc Bangalore with Prof. Venkatesh Babu, my PhD at EPFL / Idiap with François Fleuret, and a postdoc at Harvard with Hima Lakkaraju, each time gravitating toward the same question: what are neural networks actually doing?

Some representative papers:

  • SpLiCE: CLIP representations can be decomposed into sparse combinations of human-readable concept vectors, which is useful for auditing and editing vision-language models
  • perceptually aligned gradients: explains why adversarially robust models produce cleaner saliency maps, connecting robustness and interpretability, two seemingly unrelated phenomena
  • FullGrad saliency: ReLU network outputs can be exactly decomposed into an input-gradient term plus per-layer bias-gradient terms (the identity is sketched just after this list), yielding a principled saliency method
  • forgetting data contamination: benchmark data that leaks into LLM pre-training is often simply forgotten over the course of training, and may not inflate evaluation scores as commonly assumed

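(If you're curious, the core FullGrad identity is short enough to state here. For a ReLU network $f$ with bias parameters $b_\ell$ in each layer, the output decomposes exactly as

$$ f(x) \;=\; \nabla_x f(x)^\top x \;+\; \sum_{\ell} \nabla_{b_\ell} f(x)^\top b_\ell, $$

i.e. an input-gradient term plus per-layer bias-gradient terms. The notation here is my shorthand rather than the paper's; see the paper for the precise statement and for the aggregation step that turns this identity into a saliency map.)
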
For more, see my research themes and publications.

I’m always happy to chat about interpretability or the science of deep learning — feel free to reach out!