14/09/2023

Tackling Interpretability in Audio Classification Networks with Non-negative Matrix Factorization

Under submission at IEEE/ACM TASLP

Abstract

This paper tackles two major problem settings for the interpretability of audio processing networks: post-hoc and by-design interpretation. For post-hoc interpretation, we aim to interpret the decisions of a network in terms of high-level audio objects that are also listenable for the end-user. We then extend this approach to present an inherently interpretable model with high performance. To this end, we propose a novel interpreter design that incorporates non-negative matrix factorization (NMF). In particular, an interpreter is trained to generate a regularized intermediate embedding from the hidden layers of a target network, learnt as the time-activations of a pre-learnt NMF dictionary. Our methodology allows us to generate intuitive audio-based interpretations that explicitly enhance the parts of the input signal most relevant for a network’s decision. We demonstrate our method’s applicability on a variety of classification tasks, including multi-label data for real-world audio and music.
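To make the phrase "time-activations of a pre-learnt NMF dictionary" concrete, the following is a minimal sketch and not the paper's interpreter. It assumes a fixed non-negative dictionary W (frequency bins by components) and a non-negative magnitude spectrogram X, and estimates activations H such that X ≈ WH using the standard Lee-Seung multiplicative update for the Euclidean objective. In the paper, the activations are instead predicted by a trained interpreter from the hidden layers of the target network. The function name `nmf_activations`, the toy data, and the soft-mask step at the end are all illustrative assumptions.

```python
import numpy as np

def nmf_activations(X, W, n_iter=200, eps=1e-10, seed=0):
    """Estimate non-negative activations H with X ~ W @ H, keeping W fixed.

    Hypothetical helper for illustration only; uses Lee-Seung
    multiplicative updates for the Euclidean (Frobenius) objective.
    """
    rng = np.random.default_rng(seed)
    # Strictly positive random initialisation, shape (components, frames)
    H = rng.random((W.shape[1], X.shape[1])) + eps
    for _ in range(n_iter):
        # The update multiplies by a non-negative ratio, so H stays non-negative
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return H

# Toy data standing in for a magnitude spectrogram (assumption, not real audio)
rng = np.random.default_rng(1)
W = rng.random((64, 4))        # "pre-learnt" dictionary: 64 bins, 4 components
H_true = rng.random((4, 50))   # ground-truth activations over 50 frames
X = W @ H_true

H = nmf_activations(X, W)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Emphasise the part of X explained by one component via a soft mask
k = 2
X_k = np.outer(W[:, k], H[k])      # contribution of component k
mask = X_k / (W @ H + 1e-10)       # values in [0, 1]
X_enhanced = mask * X
```

Here each row of `H` describes how strongly one dictionary component is active over time, and the soft mask shows one simple way a chosen component could be emphasised in the input spectrogram. Which components matter for a given decision is what the paper's interpreter is trained to determine; the choice of `k` above is arbitrary.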

Jayneel Parekh, Sanjeel Parekh, Pavlo Mozharovskyi, Gaël Richard, Florence d’Alché-Buc (2023). “Tackling Interpretability in Audio Classification Networks with Non-negative Matrix Factorization”. arXiv:2305.07132 (under submission).
