Computers with common sense – a holy grail
Spike Narayan
Spike Narayan is a seasoned high-tech executive who manages frontier science and technology research at IBM.
The world of artificial intelligence (AI) is upon us, and it is impossible not to hear about it every day in a multitude of contexts, from start-ups to medical use cases to mobile applications. But very few of these AI applications are actually “intelligent”. They are mostly applications trained on data for a narrow scope of use, and they can be used effectively within that space. So when will we see truly smart applications? In reality, it will probably be a continuum, but there are very clear differences between how today’s AI engines work and what I believe is needed to achieve general intelligence, or what we would call the common sense that every computing engine and algorithm lacks today. Let’s explore this quest for a more human intelligence and the challenges it presents.
Even just two years ago, any use case that merely massaged big data or helped visualize large amounts of data was touted as using AI. This is partly due to the history of AI. As we know, the AI field went through the period known as the “AI winter” and woke up about fifteen years ago, fueled by the availability of relatively cheap and abundant computing power, access to massive amounts of data (big data) on which machines could train, and powerful new algorithms. Since big data, in many ways, enabled this AI renaissance, all uses of big data have been loosely called AI-infused. Many applications that sought to find patterns in data use an architecture that goes by the name of a neural network, which we will discuss later in this article. A few important things have changed over the past two years, however. AI applications have escaped from academic labs globally, and industrial use cases are gaining market adoption. More importantly, the many different flavors of neural network architectures that these use cases employ appear stable enough to be called hardened, or mature enough for enterprise use. This will greatly accelerate the adoption of AI in the near future.
According to the 2022 AI Index, an annual study on the impact and progress of AI from the Stanford Institute for Human-Centered Artificial Intelligence (HAI), led by an independent, interdisciplinary group of experts from academia and industry, the field of artificial intelligence (AI) is at a critical crossroads. The report highlights these two trends among many others. First, private investment in AI has more than doubled since 2020, in part due to larger funding rounds: in 2020, there were four funding rounds worth $500 million or more; in 2021, there were 15. Second, AI has become more affordable and more capable. The cost of image classification training has decreased by 63.6%, and training times have improved by 94.4% since 2018.
This brings us to the technology we call AI and what it will take to evolve it into a more versatile technology closer to human capabilities. The AI Index mentioned above indicates that the cost of training has dropped dramatically. This notion of training is at the heart of existing AI technologies. The most common AI systems use what are called neural networks. Although the name may suggest an architecture inspired by the mammalian brain, the similarity with our brain ends there. These neural networks are simply algorithms represented by nodes called neurons that are connected to other nodes arranged in sequential layers.
By carefully adjusting (a mathematically tedious process) the strength (or weight) of the connection between neighboring neurons, the network becomes able to recognize patterns in large datasets. The reader can turn to many sources to dig deeper into this type of network. Since these architectures are made up of neurons and connections sometimes called synapses, they tend to be referred to as brain-inspired, but as I said earlier, the parallels end there. These neural networks are used in what we call machine learning (ML), which takes these networks and trains them to recognize patterns in large amounts of data.
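As a rough illustration of the structure just described, here is a minimal sketch of such a network in plain Python. The layer sizes, the activation functions, and the random weights are arbitrary choices for the example, not anything prescribed by the article; a real network would be vastly larger and its weights would be set by training rather than at random.

```python
import math
import random

random.seed(0)

# Toy feedforward network: neurons arranged in sequential layers,
# each neuron connected to the next layer by weighted links
# (the "synapses" mentioned above). Sizes are illustrative only.
n_in, n_hidden = 4, 3
W1 = [[random.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_in)]  # input -> hidden weights
W2 = [random.gauss(0, 1) for _ in range(n_hidden)]                         # hidden -> output weights

def forward(x):
    """Propagate one input vector through the layers."""
    # Each hidden neuron sums its weighted inputs, then squashes with tanh.
    hidden = [math.tanh(sum(x[i] * W1[i][j] for i in range(n_in)))
              for j in range(n_hidden)]
    # A single output neuron applies a sigmoid, giving a score in (0, 1).
    z = sum(hidden[j] * W2[j] for j in range(n_hidden))
    return 1 / (1 + math.exp(-z))

score = forward([0.5, -1.0, 0.25, 2.0])
```

Training, discussed next, is the process of adjusting `W1` and `W2` so that the output score becomes meaningful for a given task.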
As a concrete example, one can train such a network by exposing it to thousands of images of cats, which establishes the connection strengths, or synaptic weights, in the neural network and further refines those weights as it sees more pictures of cats. Training is considered complete when the network recognizes a previously unseen cat image as a cat with an inference accuracy greater than a predefined minimum. Typically, inference accuracies in the high 90s (percent) are common.
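The train-until-accurate-enough loop described above can be sketched as follows. To keep the example self-contained, the task here is separating 2-D points by a line rather than classifying images; the labels stand in for “cat”/“not cat”, and the update rule (a simple perceptron), the threshold value, and all names are illustrative assumptions, not the actual setup of any production ML system.

```python
import random

random.seed(1)

# Each example is (features, label); the label stands in for "cat"/"not cat".
def make_point():
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    return (x, y), 1 if x + y > 0 else 0

train = [make_point() for _ in range(200)]      # the "thousands of cat images"
held_out = [make_point() for _ in range(100)]   # unseen examples for inference accuracy

w = [0.0, 0.0]   # synaptic weights, refined as more examples are seen
b = 0.0

def predict(p):
    return 1 if w[0] * p[0] + w[1] * p[1] + b > 0 else 0

def accuracy(data):
    return sum(predict(p) == label for p, label in data) / len(data)

TARGET = 0.95    # predefined minimum inference accuracy
for epoch in range(100):
    for p, label in train:
        err = label - predict(p)     # nudge weights whenever the prediction is wrong
        w[0] += 0.1 * err * p[0]
        w[1] += 0.1 * err * p[1]
        b += 0.1 * err
    if accuracy(held_out) >= TARGET: # training "complete" once unseen data is classified well
        break
```

The stopping criterion mirrors the article's point: training ends not after a fixed effort, but once the network generalizes to examples it has never seen with the required accuracy.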
Once training is complete, the network can be deployed and should correctly identify new patterns with the required accuracy. Such ML engines trained for narrow use cases (narrow AI), be it visual images, text, or voice, will generate a lot of revenue over the next few years as real industrial applications benefit greatly. However, these ML use cases are hardly what we would call intelligent when we compare them even to a child’s abilities, for a number of reasons.
Most importantly, no child needs to see thousands of pictures of cats to be able to identify a cat in real life. Literally a handful of selected images will do. Let’s also not forget that ML engines require kilowatts or megawatts of power to learn a task, while the human brain operates at around 20 watts. To the credit of the AI/ML community, these engines have demonstrated better-than-human inference accuracies in a number of use cases. Still, my earlier comment about their intelligence stands.
So what should we look for in an “intelligent” machine? Although we may try to define this narrowly in each area of use, there needs to be a broader, general-purpose definition or benchmark. This is where the problem lies. Computer performance metrics have been around for decades, and they were very tangible descriptors: chip frequency, transistor sizes, the number of circuits in a 1 cm x 1 cm chip, or the very commonly used FLOPS (floating-point operations per second). These are easy to quantify, and their improvement over time is easy to plot. Measuring intelligence is a whole different beast.
Society as a whole cannot even agree on what is a good measure of intelligence in humans, let alone machines. Let’s take reading a paragraph of text as an example. Although computers have long done well at reading scanned text, we never really expected the computer to understand the content in any way. There has been progress in this space more recently with Microsoft’s work on machine reading comprehension, which uses a neural network architecture called ReasoNet (Reasoning Network); the team was able to mimic the inference process of human readers. Tasks such as understanding context or intent are human-like traits, and depending on the domain of use, we can determine which ability would be considered “human”. You can begin to see the difficulty of measuring intelligence in a general-purpose “intelligent” machine.
Human common sense is still a leap beyond anything we’ve seen in the AI/ML space. To illustrate this, my colleague Dr. Wilcke at IBM often poses this question in his talks: Do clouds pay taxes? This is a very interesting example, because every adult will answer it without any thought or hesitation. Yet I will argue that not a single one of them was explicitly taught, in school or anywhere else, that clouds don’t pay taxes. How then is it so easy to answer? If we ask this question of any ML engine, it will be unable to answer with any degree of confidence unless someone has explicitly programmed it as a rule. This is because humans, from an early age, develop a model of the world through learning and, above all, through observation.
Another vitally important difference is the temporal (or sequential) aspect of how we learn. A child can easily recite the English alphabet from A to Z, but if we ask a question like “what comes before U?”, the child will have to replay parts of the sequence to find the answer. The same goes for the lyrics of a song. You will begin to see that much of our learning is tied to sequences in time or space, indicating an entirely different learning process.
Much of the work today still involves neural networks, due to their immediate monetization potential. There have been advances in moving from narrow AI to broader AI with the evolution of transfer learning, ad hoc learning, and many more, but none of them comes convincingly close to common sense.
So, in summary, can these neural networks, over time, achieve this broader inference capability, or will it take an entirely different architecture to learn and use common sense? This is an ongoing debate, and time will tell. The fact remains that neural networks need far too much power to learn to compete with the 20-watt human brain.