MPhil-PhD transfer presentation
When: Wed, 16th Oct 2019, 12.00 noon
Where: A108 (1st Floor, College Building)
Who: Benedikt Wagner; City, University of London
Title: Reasoning about what has been learned: Knowledge Extraction from Neural Networks
Abstract: Machine Learning-based systems, including Neural Networks, have become increasingly popular in recent years. A weakness of these models, which rely on complex representations, is that they are considered black boxes with respect to explanatory power. In the context of current regulatory initiatives and societal discussions regarding the desire for transparency and corresponding accountability of automated decision systems, attempts to develop more interpretable or explainable methods and systems in Artificial Intelligence and Machine Learning are ongoing. As a result, a plethora of methods has been introduced in recent years, resulting in a large mixture of approaches and steps towards a better understanding of the behaviour of a model. We have therefore developed a taxonomy that provides a holistic view of the topic and gives it structure. We further investigate in more depth three promising methods, which are based on Counterfactuals, Concept Activation Vectors, and Knowledge Extraction approaches. Concept Activation Vectors target the hidden representation as a useful basis for explanations based on conceptual sensitivities. The tree-structured Knowledge Extraction methods, on the other hand, aim at a global representation in a constrained architecture that illustrates how a decision was made while achieving reasonable predictive performance. We emphasise potential benefits and weaknesses of the methods before providing an outlook on promising directions for future research.