I am a research scientist at Bosch Research (Sunnyvale, USA), where I work on computer vision problems for autonomous driving. My research focuses on developing interpretability tools for model understanding, debugging, and editing.
Some representative methods:
- SpLiCE: A dictionary-learning-style method for interpreting CLIP models
- Discriminative feature attributions: A method for training discriminative models whose saliency maps are faithful by design
- FullGrad saliency: Layer-wise saliency maps for ReLU neural nets with a cool mathematical property called completeness (sketched below the list)
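To give a flavor of the completeness property mentioned above: for a ReLU network $f$ with biases $b_l$ at layer $l$, the output decomposes exactly into an input-gradient term plus layer-wise bias-gradient terms (the notation below is my shorthand, not a verbatim statement from the paper):

$$
f(x) \;=\; \nabla_x f(x)^\top x \;+\; \sum_{l}\sum_{c} \big[\nabla_{b_l} f(x) \odot b_l\big]_c
$$

The layer-wise bias-gradient terms are the ingredients that get aggregated into the FullGrad saliency map.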
I am also interested in the “science” of deep learning, i.e., systematic investigations of deep learning phenomena. Examples include studying forgetting dynamics in LLM training and explaining observed links between robustness and gradient interpretability. For more information, please see my research themes and publications.
I was previously a postdoctoral research fellow with Hima Lakkaraju at Harvard University. I completed my PhD with François Fleuret at Idiap Research Institute & EPFL, Switzerland.
I am or have been an organizer of:
- the Theory of Interpretable AI online seminar series
- the XAI in Action: Past, Present and Future workshop at NeurIPS 2023
- the Interpretable AI: Past, Present and Future workshop at NeurIPS 2024
- the Interpretable ML course at Harvard, Spring 2023
Note: If you are looking for mentorship / research collaborations on interpretability, feel free to reach out!