There is no accepted definition of interpretability and explainability, although many different methods have been proposed to explain or interpret how an opaque AI System works. Sometimes the two terms are used interchangeably in the broad sense of understandability. Some researchers prefer interpretability; for others, the term holds no agreed meaning. Still others posit that interpretability alone is insufficient to trust black-box methods and that we need explainability. Even the EU makes a circular case for explainable AI, identifying why some form of interpretability in AI systems might be desirable.
In this blog, I prefer to follow the computer scientist Cynthia Rudin and make a clear distinction between the two terms:
Should we prefer explainability or interpretability, or both? Once again, Cynthia Rudin warns:
“trying to explain black-box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practices and can potentially cause catastrophic harm to society.”
That makes perfect sense in our discussion of the relationship between knowledge and AI. In fact, what is usually reported as the Black Box problem means that “AI doesn’t show how it works. It does not explicitly share how and why it reaches its conclusions”, which draws criticism of the technology and undermines trust in it for high-stakes decisions.
However, Polanyi [blog#2], in his investigations of tacit knowledge, clearly showed that tacit knowledge encompasses all the things we know how to do but cannot articulate in words, and therefore cannot explain. Thus, a kind of “Black Box” problem exists in the everyday experience of each of us whenever we interact with other humans. Why should it be a problem with a machine? Is there a double standard here? The philosopher John Zerilli is convinced that is the case: “The effect is to perpetuate a double standard in which machine tools must be transparent to a degree that is in some cases unattainable, in order to be considered transparent at all, while human decision-making can get by with reasons satisfying the comparatively undemanding standards of practical reason.”
The charge of opacity against Deep Learning originates from looking in the wrong place: overvaluing explicit knowledge (propositional knowledge) and overlooking tacit knowledge (procedural knowledge). We have seen that knowledge operates on different planes [blog#1, blog#2]. As Polanyi (1966) put it, “tacit knowledge can be possessed by itself, explicit knowledge must rely on being tacitly understood and applied.” We have also seen how a machine can learn tacit knowledge from data for tasks we (humans) cannot explain, building a “rich and useful internal representation, computed as a composition of learned features and functions” (Y. Bengio), where the “rich and useful internal representation” is ontologically equivalent to tacit knowledge that we can store in a machine in a bottom-up process.
The fact that the internal representation of an ANN is inaccessible is an interesting property shared with human tacit knowledge, which deserves more attention from scholars. Tacit knowledge in humans is as inaccessible as tacit knowledge in a machine, but of a very different kind and materiality. Far from being a problem, opacity is a property that a complex cognitive system shares with humans. However, it does not mean a machine will be identical to a human being. The correspondence between human minds and artificial neural networks fails because it suffers from the connectionism bias, which makes it blind to the Collective Tacit Knowledge of human societies [blog#2].
Humans cannot explain their subjective tacit knowledge to other humans. Geoff Hinton famously expressed this concept in an interview: “People can’t explain how they work, for most of the things they do” (Wired 2018). It is therefore unrealistic to expect an AI System to explain its internal logic. How can we expect to find explanations in a machine’s internal representation (its tacit knowledge)? By looking at the blueprints? At the internal neural-network connections? That is equivalent to asking a neurologist to take fMRI images while a person is, for example, watching a cat, and to treat those images as a record of the subject’s internal mental processes: “the facts remain that to see a cat differs sharply from the knowledge of the mechanism of seeing a cat. They are knowledge of quite different things” (Polanyi 1968). Sometimes, providing deep explanations of an AI system’s internal algorithmic logic, although technically correct, produces the opposite effect on a less skilled audience, leading to a lack of trust and poor social acceptance of the technology.
“I’m sorry, my responses are limited. You must ask the right question.”
– Dr. Lanning’s Hologram – I, Robot (2004)
The only possible way to escape this puzzle is to convert tacit knowledge into explicit knowledge in a way that another human peer can interpret, for example, by asking the right question.
A professional algorithm auditor may ask the right questions of an AI System, but never looks at the blueprints of the DL model! The auditor’s job is to interrogate algorithms to ensure they comply with pre-set standards without inspecting the internals of the DL model. This is also the approach used with counterfactual explanations, for people who want to understand the decisions made by an AI system without opening the black box: “counterfactual explanations do not attempt to clarify how decisions are made internally. Instead, they provide insight into which external facts could be different to arrive at a desired outcome” (Wachter 2018). Therefore, it should not be surprising that a DL model can be interpretable without being explainable. We should stop asking of a machine what we never ask of a human; our cultural biases have led us to adopt a double standard.
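To make the query-only stance concrete, here is a minimal sketch of a counterfactual explanation in the spirit of Wachter’s description. The toy loan model, its threshold, and the function names (`black_box_approve`, `counterfactual_income`) are all illustrative assumptions, not a real system or library; the point is that we only probe the model’s outputs, never its internal representation.

```python
def black_box_approve(income, debt):
    """Toy opaque model: approves a loan if disposable margin clears a threshold.

    We treat this function as a black box: the auditor never reads its body,
    only queries its output."""
    return income - 1.5 * debt > 20_000


def counterfactual_income(income, debt, step=500, max_income=1_000_000):
    """Find the smallest income raise (in `step` increments) that flips a denial.

    A counterfactual explanation: "had your income been X, the loan would
    have been approved" -- obtained purely by querying the model."""
    candidate = income
    while candidate <= max_income:
        if black_box_approve(candidate, debt):
            return candidate
        candidate += step
    return None  # no counterfactual found within the search range


# Example: an applicant with income 30,000 and debt 10,000 is denied;
# the counterfactual tells them which external fact would change the outcome.
denied = black_box_approve(30_000, 10_000)          # False
needed = counterfactual_income(30_000, 10_000)      # 35,500
```

Note that the explanation is actionable for the applicant even though it says nothing about how the decision is computed internally, which is exactly the property the auditor analogy relies on.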
The DL black box problem is a misplaced source of criticism that can be mitigated by considering the interplay of tacit knowledge (non-articulated) and explicit knowledge (articulated). I posit that a DL system is inexplicable in the sense that a human brain is epistemically inexplicable.
“If a lion could talk, we could not understand him.”
– Ludwig Wittgenstein – Phil. Inv. (1953)
One source of this misconception is identifying human knowledge with internal brain processes while conflating machine (tacit) knowledge with (explicit) algorithms. A robot can be explicitly programmed with engineers’ explicit knowledge (the algorithm) to perform a specific task, e.g., riding a bike. Still, the action is performed with the robot’s internal knowledge, i.e., its inaccessible tacit knowledge, and not with the engineer’s explicit knowledge. The robot does not know the algorithm, but it knows how to run it and how to ride a bike. Even if the robot could explain how it works, we (humans) would not understand it.
Ludwig Wittgenstein, the great philosopher of mind, alluded to this fact when he remarked, “if a lion could talk, we could not understand him.” Wittgenstein’s point is that the meaning of words is not conveyed by words alone. Lions perceive the world differently; they have different experiences, motivations, and feelings than we do. We might grasp a first level of meaning in what a lion says. However, we could never comprehend (verstehen) the lion as a unique individual, with frames of reference we do not share at all. By analogy, there is little we can share with a machine. Even if the machine in question explained how it works in perfect English or any other human language, we could grasp only that first level, without fundamental understanding. Alternatively, we can do better. Supported by sociological research on tacit knowledge, we can employ counterfactual explanations to explain the predictions of individual instances. That is a great way to distill knowledge from data without opening the black box, while gaining trust.
In the next blog, we will see how to capture the Tacit Knowledge of experts.
Personal views and opinions expressed are those of the author.