Although Deep Learning (DL) has been the dominant field in Artificial Intelligence for more than a decade, its history can be traced back to the 1940s. The term ‘Deep Learning’ was introduced to the machine learning community by Rina Dechter in 1986. Later, Yann LeCun and his colleagues demonstrated a famous application of DL, recognizing handwritten ZIP codes, a feat that took 3 days of training. The “big bang” of DL occurred in 2009 when Nvidia’s GPUs were enlisted to train DL models, an approach that reduced training time by a factor of many hundreds. The community was enthralled by the prospect of a fully AI-powered future, which still hasn’t arrived. In the years leading up to August 2021 (when this post was written), GPUs have become more affordable, DL models and training algorithms have advanced tremendously, and training data sets have become more available. Still, it seems that the DL big bang has not revolutionized our lives as predicted.
In February 2021, the International Data Corporation (IDC) forecast the value of the global AI market at $327.5 billion for 2021 and $500 billion for 2024. Similarly, Gartner estimates the business value of the AI market at $2.9 trillion as of 2021. The numbers look good in the reports, but not so much in reality. According to Gartner, only 20% of all AI projects reach deployment, and only 60% make an actual profit. Despite the abundance of technology, data, and salaries for AI professionals, and the determination of both academia and business, DL has not lived up to its promise. This failure has become the subject of academic research, whose findings suggest two main reasons: a lack of understanding of the business context, and data-related problems such as low-quality training data or a lack of access to it. Nevertheless, in this post, I would like to focus on the lack of enthusiasm for Natural Intelligence (NI), which is the opposite of AI. Broadly speaking, the main goal of AI is to achieve a system, like the Universal Turing Machine, that is comparable to human-level intelligence. Neuroscience, Cognitive Science, Psychology, Linguistics, Philosophy, and even Animal Cognition have been investigating questions and approaches related to NI that can shed light on the paths toward this main goal of AI.
Let’s start with Kahneman’s two systems of thinking.
In his award-winning book, Thinking, Fast and Slow, Daniel Kahneman summarizes his decades-long research on human reasoning. Kahneman posits that there are two modes of thinking: “System 1” is fast, intuitive, automatic, emotional, parallel, and experience-based; “System 2” is slower, deliberate, conscious, and serial. For example, driving a car on an empty road, recognizing your mother’s voice, and calculating 2+2 mostly involve System 1, whereas counting the number of people with eyeglasses in a meeting, recalling and dialing your significant other’s phone number, calculating 13×17, and filling out a tax form depend on System 2.
Having two systems allows NI to minimize effort and optimize performance. System 1 is good at making short-term predictions because it constantly models similar situations based on experience. Its initial reactions are apt because it aims to “save the day” as your “gut feeling.” System 1 arguably contributes to zero-shot learning. On the other hand, when an error occurs or the solution to a problem involves memory-intensive processes and complex computations, System 2 kicks in. If one hears a gunshot, they will likely try to identify its source and proximity to assess the level of danger. In the context of a footrace, on the other hand, a sprinter’s leg muscles start moving around 150ms after the crack of the starting gun (literally half the time it takes to blink an eye). When System 1 makes systematic errors, System 2 can take control and actively learn the solution, and even remodel what caused System 1 to fall short. Take the famous example of System 1 being activated to drive a car on an empty road. In fact, new drivers most probably rely on System 2, and the task shifts to System 1 only with experience. There are countless such examples in the literature showing that both systems work in coordination, not competition. As demonstrated in various experiments, and despite the common belief, System 2 is not superior to System 1. Instead, the two systems cooperate to optimize performance and resource utilization. System 1’s ability to use short-term predictions empowers the brain to generalize across similar problems. Whenever System 2 solves a new problem, the process becomes a new experience to be applied by System 1 in future similar situations.
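The hand-off between the two systems can be loosely illustrated in code. The sketch below is only a toy analogy, not a model of cognition: the class `DualProcessSolver`, the helper `long_multiplication`, and the route labels are all hypothetical names invented for this illustration. A fast lookup plays the role of System 1, a slow step-by-step routine plays System 2, and every problem solved the slow way is stored as experience for the fast path.

```python
from typing import Callable, Dict, Hashable, Tuple


class DualProcessSolver:
    """Toy analogy: a fast lookup ("System 1") backed by a slow solver ("System 2")."""

    def __init__(self, slow_solver: Callable[[Hashable], object]) -> None:
        self.slow_solver = slow_solver                 # deliberate, expensive path
        self.experience: Dict[Hashable, object] = {}   # answers learned so far

    def solve(self, problem: Hashable) -> Tuple[object, str]:
        # System 1: answer immediately if a matching experience exists.
        if problem in self.experience:
            return self.experience[problem], "system1"
        # System 2: work the problem out, then store the result as experience.
        answer = self.slow_solver(problem)
        self.experience[problem] = answer
        return answer, "system2"


def long_multiplication(problem: Tuple[int, int]) -> int:
    """Hypothetical slow path: multiply by repeated addition."""
    a, b = problem
    total = 0
    for _ in range(b):
        total += a
    return total


solver = DualProcessSolver(long_multiplication)
first = solver.solve((13, 17))    # new problem: handled by the slow path
second = solver.solve((13, 17))   # same problem again: answered from experience
print(first, second)
```

The analogy is deliberately limited: the lookup only recognizes problems it has seen exactly, whereas System 1 as described above generalizes across similar situations.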
AI is traditionally divided into statistical and rule-based approaches, although the field has been dominated by the statistical ones. Probabilistic models have shown high accuracy without acquiring a semantic understanding of their training datasets. This produces not only inductive biases with poor performance in the wild but also a lack of explainability in AI. Rule-based approaches, in turn, require “a set of rules” and “a set of facts,” which do not scale and demand an “expert” to handcraft them. What is missing is the collaboration and interaction between the two, which would combine and accentuate the power of each approach while compensating for the other’s weakness. Kahneman’s work can help AI researchers to pursue an approach that combines aspects of System 1 and System 2. This blended approach can be considered a framework because it goes beyond a single model and would act as a Universal Turing Machine. In this framework, System 1 creates, trains, and tests DL models. At the same time, it is System 1’s responsibility to semantically represent the “lesson learned” from System 2, generalize the knowledge, and predict the near future to “save the day.” The idea of combining symbolic and sub-symbolic approaches, also known as the neuro-symbolic approach, is not new. Many researchers have been working on the neural-symbolic cycle, which translates symbolic knowledge into neural networks and vice versa. We need a compiler for neural networks because symbols, relations, and rules should have counterparts in the sub-symbolic space. Moreover, such a framework needs a symbol-manipulation mechanism that preserves the structural relations between the two systems without losing the correspondences.
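To make the idea of the two approaches compensating for each other more concrete, here is a minimal sketch under strong assumptions. It is not the neural-symbolic cycle itself, since nothing is translated between symbolic and sub-symbolic representations. It only shows a handcrafted rule and a statistical guess working side by side. The `Rule` class, the `statistical_guess` function with its made-up weights, and the bird example are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Rule:
    """A handcrafted symbolic rule: if `applies(facts)` holds, conclude `conclusion`."""
    name: str
    applies: Callable[[Dict[str, object]], bool]
    conclusion: str


def statistical_guess(facts: Dict[str, object]) -> Tuple[str, float]:
    """Hypothetical stand-in for a trained model; the weights are made up."""
    score = 0.8 * float(bool(facts.get("has_wings"))) \
        + 0.2 * float(bool(facts.get("has_feathers")))
    label = "can_fly" if score >= 0.5 else "cannot_fly"
    confidence = score if label == "can_fly" else 1.0 - score
    return label, confidence


RULES: List[Rule] = [
    Rule("penguins_do_not_fly", lambda f: f.get("species") == "penguin", "cannot_fly"),
]


def hybrid_predict(facts: Dict[str, object]) -> Tuple[str, str]:
    # Symbolic path first: a matching rule yields an answer with a named reason.
    for rule in RULES:
        if rule.applies(facts):
            return rule.conclusion, f"rule:{rule.name}"
    # No rule matched: fall back to the statistical guess.
    label, confidence = statistical_guess(facts)
    return label, f"model:confidence={confidence:.2f}"


sparrow = hybrid_predict({"species": "sparrow", "has_wings": True, "has_feathers": True})
penguin = hybrid_predict({"species": "penguin", "has_wings": True, "has_feathers": True})
print(sparrow)
print(penguin)
```

On its own, the statistical stand-in would label the penguin as able to fly, because its features look like any other bird's. The rule corrects that case and names the reason, while the statistical path covers every species nobody wrote a rule for.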
One may ask whether AI needs to involve NI as a KPI. The answer is yes and no. No, because NI, a result of billions of years of evolution, is full of imperfections and mistakes. Yes, because NI is the best way known to help organisms survive across countless generations. NI shows astounding scalability, generalizability, and performance. Therefore, an interdisciplinary approach to understanding the details of NI will help to fully realize the promise of AI, at least until there is an AI comparable to NI.