NeurIPS 2022
This paper tackles post-hoc interpretability for audio processing networks. Our
goal is to interpret decisions of a network in terms of high-level audio objects that are
also listenable for the end-user. To this end, we propose a novel interpreter design that
incorporates non-negative matrix factorization (NMF). In particular, a carefully regularized
interpreter module is trained to take hidden layer representations of the targeted network as
input and produce time activations of pre-learnt NMF components as intermediate outputs.
Our methodology allows us to generate intuitive audio-based interpretations that explicitly
enhance the parts of the input signal most relevant to the network’s decision. We demonstrate our
method’s applicability on popular benchmarks, including a real-world multi-label classification
task.
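
To make the role of the NMF components more concrete, the following is a minimal sketch of how time activations of a fixed, pre-learnt dictionary can be turned into a filtered version of the input. It is not the interpreter proposed here: in the paper the activations are predicted by a trained module from the network's hidden representations, whereas this sketch assumes they are estimated directly from a magnitude spectrogram with standard multiplicative updates. The function names (`nmf_activations`, `component_mask_reconstruction`), the random toy dictionary, and the choice of selected components are all hypothetical and serve only as illustration.

```python
import numpy as np

def nmf_activations(X, W, n_iter=200, eps=1e-10, seed=0):
    """Estimate non-negative activations H with X ~= W @ H, keeping W fixed.

    X: (n_freq, n_frames) non-negative magnitude spectrogram.
    W: (n_freq, n_components) pre-learnt non-negative dictionary.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], X.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update for the Euclidean (Frobenius) objective;
        # it keeps H non-negative by construction.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return H

def component_mask_reconstruction(X, W, H, selected, eps=1e-10):
    """Soft-mask X so that only energy attributed to `selected` components remains."""
    full = W @ H + eps                       # model of the whole input
    part = W[:, selected] @ H[selected, :]   # model of the chosen components
    return X * (part / full)

# Toy data standing in for a spectrogram and a pre-learnt dictionary (assumption).
rng = np.random.default_rng(0)
W = rng.random((64, 8))
H_true = rng.random((8, 50))
X = W @ H_true

H = nmf_activations(X, W)
# Hypothetical choice of "relevant" components; in the paper this relevance
# would come from the interpreter, not from a hand-picked index list.
X_sel = component_mask_reconstruction(X, W, H, selected=[0, 3])
```

In an audio setting, the masked spectrogram `X_sel` would be combined with the phase of the input and inverted to a waveform, which is what would make such an interpretation listenable.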