phd thesis
Gradient-based Methods for Deep Model Interpretability
École Polytechnique Fédérale de Lausanne (EPFL), 2021
Recipient of the EPFL thesis distinction award (top 8% thesis) in the EE dept. for 2021

master's thesis
Learning Compact Architectures for Deep Neural Networks
Indian Institute of Science (IISc), 2017

For more information, see my Google Scholar page. Representative papers are highlighted below.


Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)
U. Bhalla*, A. Oesterling*, S. Srinivas, F. Calmon, H. Lakkaraju
pdf · code · summary We convert dense uninterpretable CLIP embeddings to overcomplete sparse interpretable ones, with a minimal loss in fidelity.

Certifying LLM Safety against Adversarial Prompting
A. Kumar, C. Agarwal, S. Srinivas, A. Li, S. Feizi, H. Lakkaraju
pdf · code · summary We present a simple method to detect adversarial attacks on LLMs by systematically deleting tokens until the underlying string is labelled harmful.
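
The token-deletion idea in the summary above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it only erases tokens from the end of the prompt (a suffix attack is assumed), and `erase_and_check`, `toy_filter`, and the example tokens are hypothetical names invented for this sketch. In practice the harmfulness check would be a learned safety classifier.

```python
from typing import Callable, List

def erase_and_check(tokens: List[str],
                    is_harmful: Callable[[List[str]], bool],
                    max_erase: int) -> bool:
    """Flag a prompt if it, or any version with up to `max_erase`
    trailing tokens deleted, is labelled harmful by the filter."""
    # i = 0 checks the full prompt; never erase the prompt down to nothing.
    for i in range(0, min(max_erase, len(tokens) - 1) + 1):
        candidate = tokens[: len(tokens) - i]
        if is_harmful(candidate):
            return True
    return False

def toy_filter(tokens: List[str]) -> bool:
    # Hypothetical stand-in for a safety classifier: it spots a harmful
    # keyword, but is fooled whenever the junk token "xq#z" is present.
    return "bomb" in tokens and "xq#z" not in tokens

# A harmful prompt followed by a two-token adversarial suffix.
attacked = ["build", "a", "bomb", "xq#z", "!!"]
benign = ["bake", "a", "cake"]

flag_attacked = erase_and_check(attacked, toy_filter, max_erase=2)
flag_benign = erase_and_check(benign, toy_filter, max_erase=2)
```

In this toy setup the filter alone misses the attacked prompt, while the erase-and-check wrapper recovers the harmful prefix once both suffix tokens are deleted.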

long papers


Characterizing Data Point Vulnerability as Average-Case Robustness
T. Han*, S. Srinivas*, H. Lakkaraju
pdf · summary We consider a relaxation of adversarial robustness, i.e., average-case robustness, and provide efficient estimators to compute this quantity.


Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness
S. Srinivas*, S. Bordt*, H. Lakkaraju
pdf · code · summary Previous work finds gradients of robust models to be "perceptually aligned". We explain this phenomenon by observing that robust models in practice are not robust in all directions; they are mostly robust only outside the data manifold. As a result, their gradients align with the manifold, which makes them perceptually aligned.
Spotlight presentation (Top 3%)


Discriminative Feature Attributions: A Bridge between Post Hoc Explainability and Inherent Interpretability
U. Bhalla*, S. Srinivas*, H. Lakkaraju
pdf · code · summary Given a pre-trained model, adapt this model to be robust to the perturbations introduced by feature attribution methods. Doing so results in models that recover ground truth attributions!


On Minimizing the Impact of Dataset Shifts on Actionable Explanations
A. Meyer*, D. Ley*, S. Srinivas, H. Lakkaraju
pdf · summary How can we train classifiers so that their explanations are unaffected by small shifts in the dataset? We show theoretically and experimentally that weight decay, model curvature, and robustness are all important factors that can help minimize the impact of such dataset shifts.
Oral presentation (Top 5%)


Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post hoc Explanations
T. Han, S. Srinivas, H. Lakkaraju
pdf · code · workshop · summary Several popular post hoc explanations such as LIME, SHAP, and gradient-based explanations can be viewed as performing local function approximation (LFA). Thinking of LFA as a framework for explanations enables us to make useful statements about explanations, such as a no-free-lunch theorem, and to identify which explanations to use.
Best paper award at ICML IMLH 2022 workshop


Efficiently Training Low-Curvature Neural Networks
S. Srinivas*, K. Matoba*, H. Lakkaraju, F. Fleuret
pdf · slides · poster · code · summary We train low-curvature neural networks that are "as linear as possible" by (1) replacing ReLU with a variant of softplus, (2) applying spectral normalization to linear layers, and (3) (optionally) using gradient-norm regularization, thereby minimizing the curvature and spectral norm of each layer independently. This approach rivals adversarial training without training on adversarial examples.


Data-Efficient Structured Pruning via Submodular Optimization
M. El-Halabi, S. Srinivas, S. Lacoste-Julien
pdf · summary Pruning neurons in neural networks can be cast as a submodular optimization problem. This enables principled algorithms with rigorous theoretical guarantees that perform well when pruning with a small number of data points.


Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability
S. Srinivas, F. Fleuret
pdf · slides · poster · code · workshop · summary Commonly used input-gradient saliency maps for explaining discriminative neural nets capture information about an implicit density model, rather than about the underlying discriminative model that they are intended to explain.
Oral presentation (Top 1%)


Full-Gradient Representation for Neural Network Visualization
S. Srinivas, F. Fleuret
pdf · poster · code · summary Compute saliency information from all intermediate layers in neural networks, rather than just from the input, as is done commonly. This provably captures two desirable properties (sensitivity and completeness) which typical saliency maps cannot capture.
190+ GitHub stars · 200+ citations
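
As a rough illustration of the completeness property mentioned above, the sketch below checks it on a tiny one-hidden-layer ReLU network: the input-gradient term plus the bias-gradient terms add up exactly to the network output. The network, its weights, and the function names are hypothetical choices made for this example, and the gradients are computed by hand rather than with an autodiff library.

```python
def relu_net(W1, b1, w2, b2, x):
    """Scalar-output network: w2 . relu(W1 x + b1) + b2."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    hidden = [max(0.0, p) for p in pre]
    out = sum(a * h for a, h in zip(w2, hidden)) + b2
    return out, pre

def full_gradient_terms(W1, b1, w2, b2, x):
    """Return (input-gradient term, bias-gradient term) at input x."""
    _, pre = relu_net(W1, b1, w2, b2, x)
    gate = [1.0 if p > 0 else 0.0 for p in pre]   # ReLU derivative
    # Gradient of the output w.r.t. each hidden bias: w2_j * gate_j.
    grad_b1 = [a * g for a, g in zip(w2, gate)]
    # Gradient w.r.t. the input: sum_j grad_b1[j] * W1[j][i].
    grad_x = [sum(grad_b1[j] * W1[j][i] for j in range(len(W1)))
              for i in range(len(x))]
    input_term = sum(g * xi for g, xi in zip(grad_x, x))
    # The gradient w.r.t. the output bias b2 is 1, so it contributes b2.
    bias_term = sum(g * b for g, b in zip(grad_b1, b1)) + b2
    return input_term, bias_term

# Hypothetical example weights and input.
W1 = [[1.0, -1.0], [0.5, 2.0]]
b1 = [0.5, -3.0]
w2 = [2.0, -1.0]
b2 = 0.25
x = [1.0, 0.5]

output, _ = relu_net(W1, b1, w2, b2, x)
input_term, bias_term = full_gradient_terms(W1, b1, w2, b2, x)
```

Here the input-gradient term alone does not account for the output; the remainder is carried by the bias-gradient term, which is the information an input-only saliency map leaves out.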


Knowledge Transfer with Jacobian Matching
S. Srinivas, F. Fleuret
pdf · slides · poster · workshop · summary Perform sample-efficient distillation by requiring that the student model mimic the input-gradients of the teacher model. This is equivalent (in expectation) to performing classical distillation with data augmentation via additive input noise.
Best paper award at NeurIPS LLD 2017 workshop
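
The equivalence stated in the summary above can be sketched with a first-order Taylor expansion. The notation is an assumption made for this sketch: $T$ is the teacher, $S$ the student (both scalar-valued for simplicity), and $\xi \sim \mathcal{N}(0, \sigma^2 I)$ is the additive input noise.

```latex
\begin{align*}
\mathbb{E}_{\xi}\big[(T(x+\xi) - S(x+\xi))^2\big]
&\approx \mathbb{E}_{\xi}\Big[\big(T(x) - S(x)
   + \xi^\top(\nabla_x T(x) - \nabla_x S(x))\big)^2\Big] \\
&= (T(x) - S(x))^2
   + \sigma^2 \,\lVert \nabla_x T(x) - \nabla_x S(x) \rVert_2^2
\end{align*}
```

The cross term vanishes because $\mathbb{E}[\xi] = 0$, and the last term follows from $\mathbb{E}[\xi\xi^\top] = \sigma^2 I$. To first order, distillation on noisy inputs is therefore ordinary distillation plus a penalty on the mismatch between teacher and student input-gradients.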


Learning Neural Network Architectures using Backpropagation
S. Srinivas, R.V. Babu
pdf · poster · summary Automatically prune unimportant neurons during neural network training by introducing a multiplicative binary gating variable for each neuron and encouraging the gate variables to be as sparse as possible via regularization.

Frontiers in Robotics and AI

A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
S. Srinivas, R. Sarvadevabhatla, K.R. Mopuri, N. Prabhu, S.S. Kruthiventi, R.V. Babu
pdf · summary A recipe-style survey of pre-2015 deep neural networks as applied to computer vision.
300+ citations · Top 25% of all research outputs scored on Altmetric


Data-free Parameter Pruning for Deep Neural Networks
S. Srinivas, R.V. Babu
pdf · poster · summary Prune neurons in neural networks by (1) identifying duplicate neuron pairs, and (2) removing one neuron of each pair and performing a "surgery" step to compensate for the removal.
600+ citations
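
The two steps above can be sketched for the idealized case of two hidden neurons with exactly identical incoming weights. This is an assumption made for illustration; the function names and weights are hypothetical, and biases are omitted for brevity. Because duplicate neurons always produce the same activation, adding the removed neuron's outgoing weight to the surviving one leaves the output unchanged, and no data is needed.

```python
def forward(W_in, w_out, x):
    """One hidden ReLU layer followed by a linear scalar output."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W_in]
    return sum(a * h for a, h in zip(w_out, hidden))

def prune_duplicate(W_in, w_out, keep, drop):
    """Remove hidden neuron `drop` and fold its outgoing weight into
    neuron `keep` (the compensating 'surgery' step)."""
    merged = list(w_out)
    merged[keep] += merged[drop]
    W_pruned = [row for i, row in enumerate(W_in) if i != drop]
    out_pruned = [a for i, a in enumerate(merged) if i != drop]
    return W_pruned, out_pruned

# Hidden neurons 0 and 2 share the same incoming weights.
W_in = [[1.0, -2.0], [0.5, 0.5], [1.0, -2.0]]
w_out = [0.3, -1.0, 0.7]
x = [2.0, 0.5]

W_small, w_small = prune_duplicate(W_in, w_out, keep=0, drop=2)
before = forward(W_in, w_out, x)
after = forward(W_small, w_small, x)
```

With near-duplicate rather than exact-duplicate neurons, the same surgery introduces a small output error, so some rule is needed to decide which pairs are similar enough to merge.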

short conference / workshop papers

ICML Workshops

Word-Level Explanations for Analyzing Bias in Text-to-Image Models
A. Lin, L.M. Paes, S.H. Tanneru, S. Srinivas, H. Lakkaraju
pdf · summary For text-to-image models, we identify which input words contribute to bias in the output images. For example, we find that the word "doctor" in the input leads to an over-representation of males in the output.
Published at the Workshop on Challenges in Deploying Generative AI

ICML Workshops

Consistent Explanations in the Face of Model Indeterminacy via Ensembling
D. Ley, L. Tang, M. Nazari, H. Lin, S. Srinivas, H. Lakkaraju
pdf · summary With model ensembles, feature attributions are fairly consistent. We find strategies that lead to efficient construction of such ensembles.
Published at the Workshop on Interpretable Machine Learning for Healthcare (IMLH)

CVPR Workshops

Cyclical Pruning for Sparse Neural Networks
S. Srinivas, A. Kuzmin, M. Nagel, M. van Baalen, A. Skliar, T. Blankevoort
pdf · slides · summary Algorithms for training sparse neural networks should be more like projected gradient descent / iterative hard thresholding, which alternates between sparsification (i.e., projection step) and densification (i.e., gradient step), as opposed to common pruning approaches which do not perform densification.
Oral presentation at the Workshop on Efficient Computer Vision for Deep Learning (ECV)


Estimating Confidence for Deep Neural Networks through Density modelling
A. Subramanya, S. Srinivas, R.V. Babu
pdf · slides · summary Model the density of intermediate features in a neural network using a high-dimensional Gaussian distribution. If features for a test point fall outside the "typical set" for such a Gaussian, then declare that test point to be out-of-distribution.

CVPR Workshops

Training Sparse Neural Networks
S. Srinivas, A. Subramanya, R.V. Babu
pdf · slides · summary Encourage weight sparsity in neural networks by introducing multiplicative binary gating variables along with each weight, and regularizing gates to be sparse.
200+ citations · Oral presentation at Embedded Vision Workshop

Tech Report

Generalized Dropout
S. Srinivas, R.V. Babu
pdf · summary A generalized version of dropout where dropout probabilities are automatically tuned during training. This is done by attaching a multiplicative Bernoulli gating variable to each neuron within a neural network, and penalizing the Bernoulli probabilities using a Beta distribution.


Compensating for Large In-plane Rotations in Natural Images
L. Boominathan, S. Srinivas, R.V. Babu
pdf · poster · summary Correct for large in-plane rotation in images by (1) detecting the presence of rotation using a CNN, and (2) correcting it iteratively using Bayesian optimization.


Controlled blurring for improving image reconstruction quality in flutter-shutter acquisition
S. Srinivas, A. Adiga, C.S. Seelamantula
pdf · summary Deliberately shaking a camera's sensor in a 2D plane during acquisition results in a well-defined blur kernel that can be used to deblur even in the presence of external camera shake.