Layered Explanations: Interpreting Neural Networks with Numerical Influence Measures
Deep learning currently receives considerable attention from the machine learning community due to its predictive power, but its lack of interpretability raises numerous concerns. As neural networks are deployed in high-stakes domains, stakeholders expect acceptable, human-interpretable explanations of their decisions. We explain the decisions of neural networks using layered explanations: influence measures assign a numerical value to each layer, and these layerwise scores identify the layers that carry the most explanatory power, which we then use to generate explanations.
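The abstract does not define the influence measure itself. As a minimal sketch, one could assume a gradient-based measure in which a layer's influence is the mean gradient norm of its activations, and rank layers by that score; the model, the hook-based bookkeeping, and the scoring rule below are illustrative assumptions, not the paper's method:

```python
import torch
import torch.nn as nn

# Hypothetical small network; the paper does not specify an architecture.
model = nn.Sequential(
    nn.Linear(16, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 2),
)

x = torch.randn(8, 16)          # a batch of inputs
y = torch.randint(0, 2, (8,))   # class labels

# Record each layer's activations so gradients w.r.t. them are retained.
activations = {}
hooks = []
for name, module in model.named_children():
    def save(mod, inp, out, name=name):
        out.retain_grad()
        activations[name] = out
    hooks.append(module.register_forward_hook(save))

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
for h in hooks:
    h.remove()

# One possible layerwise influence measure (an assumption, not the paper's
# definition): the mean gradient norm of each layer's output activations.
influence = {
    name: act.grad.norm(dim=-1).mean().item()
    for name, act in activations.items()
}

# Rank layers by influence to pick those with the most explanatory power.
for name, score in sorted(influence.items(), key=lambda kv: -kv[1]):
    print(f"layer {name}: influence {score:.4f}")
```

Under this reading, the highest-scoring layers would be the ones selected to generate explanations; any other numerical influence measure could be substituted into the `influence` computation without changing the surrounding pipeline.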