Is Google’s Chatbot Sentient? “Logical” Reasons to Disagree

“Chatbot” from Wikimedia Commons

I have some strong reasons for weighing in on the recent drama around a Google ethics engineer’s declaration that the LaMDA product has reached sentience. Slate has a great article about this that may bring you up to speed if you’re interested.

Slate’s position on why the assignment of the sentient label to the chatbot was misguided revolves around LaMDA’s complete reliance on human inputs and foundational language models. My assessment extends their position in a different direction and I’ll explain why.

Deductive Logic

All of these foundational models rely on two different types of logic that are common in the software community. The more common of the two is deductive logic, in which the software evaluates whether multiple assertions are true or false to determine which actions to take. This is a pretty high level explanation (forgive me) which summarizes a significant body of research and work in deduction, but in general, deduction describes the application of rules and logic.
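To make that concrete, here is a minimal sketch of deduction in software. The function name, the two assertions, and the 30-degree threshold are all made up for illustration; the point is only that fixed rules applied to true-or-false assertions fully determine the action.

```python
def decide_action(door_open: bool, temperature_c: float) -> str:
    """Apply fixed rules to asserted facts and return the action they entail."""
    # Hypothetical threshold: turn a sensor reading into a true/false assertion.
    too_hot = temperature_c > 30.0
    if door_open and too_hot:
        return "close door and start cooling"
    if too_hot:
        return "start cooling"
    if door_open:
        return "close door"
    return "do nothing"


print(decide_action(door_open=True, temperature_c=35.0))
# prints: close door and start cooling
```

Nothing here is learned from data. Given the same assertions, the rules always produce the same answer.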

Inductive Logic

Inductive logic is useful for drawing inferences or conclusions from historical observations. If you draw eight black marbles out of the bag in your first eight tries, you might infer from this data that the bag is full of black marbles. Induction has experienced a resurgence in software thanks to the recent interest in deep learning and other machine learning techniques. Machine learning is a form of inductive logic where models are trained on historical data so that the machine can infer likely outcomes from currently sensed parameters. So in the example of my weather data systems, I have sampled parameters like temperature, pressure, humidity, luminosity, etc., for years. I have also captured when rain occurs (easy to do in Tucson… it doesn’t happen much) and can then label every example of weather data with “rain” or “not-rain”. THEN, if I want to predict whether it will rain at some time in the near future, I run inference on my trained weather-rain model using the current values of the weather sensors. This is a very simple description of how machine learning works.
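Here is a rough sketch of the weather example. It is not my actual weather system: the six readings are invented, and a simple nearest-neighbor vote stands in for a properly trained model. It shows the inductive pattern of labeling history and then inferring a label for a new reading.

```python
import math
from collections import Counter

# Hypothetical labeled history: (temperature_c, pressure_hpa, humidity_pct) -> label
HISTORY = [
    ((38.0, 1012.0, 10.0), "not-rain"),
    ((35.0, 1010.0, 15.0), "not-rain"),
    ((33.0, 1011.0, 20.0), "not-rain"),
    ((27.0, 1002.0, 70.0), "rain"),
    ((25.0, 1000.0, 80.0), "rain"),
    ((26.0, 1001.0, 75.0), "rain"),
]


def predict(reading, history=HISTORY, k=3):
    """Label a new reading by majority vote among its k nearest past readings."""
    nearest = sorted(history, key=lambda item: math.dist(item[0], reading))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(predict((26.5, 1001.5, 72.0)))  # prints: rain
print(predict((36.0, 1011.0, 12.0)))  # prints: not-rain
```

A real model would scale the features and train on years of data, but the logic is the same: the prediction comes from past observations, not from a rule someone wrote down.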

Combinations of Logic

Much current software combines the two: traditional deductive logic decides when to incorporate inductive inferences in order to solve problems most effectively. I always imagine traditional software logic that is evaluating and connecting hundreds and hundreds of small trained machine learning models. This, very simply stated, is what the large foundational models like LaMDA and GPT-3 are doing. The difference is that they generally pair deductive logic rules with VERY LARGE trained models. Most of these foundational models are so large and computationally expensive that most normal people don’t have much ability to use them in any format other than the toy applications provided by Google or OpenAI. The very large body of language used by these foundational models allows them to do incredible inference based on language created by real humans. All the text in Wikipedia is an example of some of the language used to train these models. Running inference on these models with questions from humans (such as the Google employee) can yield surprising, even spooky results. Deductive logic rules can eliminate ridiculous or meaningless responses.
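A toy sketch of that combination, under loud assumptions: `rain_probability` is a made-up curve standing in for a trained model, and the two rules are hypothetical. This says nothing about how LaMDA or GPT-3 are actually built. It only illustrates deductive rules deciding when an inductive inference is allowed to run.

```python
def rain_probability(humidity_pct: float) -> float:
    """Stand-in for a trained model: a made-up curve, not a real fit."""
    return min(1.0, max(0.0, (humidity_pct - 20.0) / 60.0))


def forecast(sensor_ok: bool, humidity_pct: float) -> str:
    # Deductive rule 1: never run inference on data from a faulty sensor.
    if not sensor_ok:
        return "no forecast (sensor fault)"
    # Deductive rule 2: discard physically meaningless readings.
    if not 0.0 <= humidity_pct <= 100.0:
        return "no forecast (invalid reading)"
    # Inductive step: consult the model only after the rules allow it.
    probability = rain_probability(humidity_pct)
    return "rain likely" if probability >= 0.5 else "rain unlikely"


print(forecast(sensor_ok=True, humidity_pct=80.0))   # prints: rain likely
print(forecast(sensor_ok=True, humidity_pct=30.0))   # prints: rain unlikely
print(forecast(sensor_ok=False, humidity_pct=80.0))  # prints: no forecast (sensor fault)
```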

What’s missing? (Who knows what lies in the heart of a machine?)

Despite the fact that these foundational models can be VERY useful, they’re missing something major that prevents them from truly understanding language. How can I say this with confidence?

Abductive Logic is What’s Missing!

It is easy to point out that machines do not (and will not in the near future) have the capacity for abductive logic. Abduction describes the human ability to make an observation Q and conclude that some general principle P must be the reason that Q is true. Notice that this is quite different from deduction and induction. The complexity of the various principles in the world makes abduction very difficult to perform. Sherlock Holmes was a renowned expert in abduction: he would see, for instance, a wedding ring that was shinier on the inside than the outside and conclude, with no further information, that a ring removed frequently might have that appearance. Machines are not able to make these kinds of intuitive “leaps”. Our current, modern view states that science itself is an example of abduction. We seek hidden principles or causes from which we can deduce the observable facts: “Frequently removing a ring might explain why it is shiny and clean on the inside but not the outside.”

There is plenty of research out there telling us that machines cannot perform abductive logic. Part of the reason is that in abduction, a likely hypothesis needs to be inferred from a nearly infinite set of explanations. Something in the human brain protects us from getting locked in the infinite loop required to evaluate all these explanations. It is likely to be some mashup of intuition and mental models of rules and value systems that we use to jump to the most likely causes to explain the data. To go deeper, Mindmatters has a great discussion of all these concepts here. They also have a three part series on “The Flawed Logic Behind Thinking Computers”: Part 1, Part 2, and Part 3. There are many more articles out there that explain this gap in machine intelligence, including this one from VentureBeat.
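To get a feel for the size of that search space, here is a small sketch. It assumes, purely for illustration, that an explanation is any non-empty combination of candidate causes. The three causes listed for the ring are hypothetical. Even under this simplification the number of explanations to evaluate is 2^n - 1 for n causes.

```python
from itertools import combinations


def candidate_explanations(causes):
    """Yield every non-empty combination of candidate causes."""
    for size in range(1, len(causes) + 1):
        yield from combinations(causes, size)


def count_explanations(n_causes: int) -> int:
    """Number of non-empty combinations of n candidate causes."""
    return 2 ** n_causes - 1


# Hypothetical causes for a ring that is shinier inside than outside.
causes = ["removed often", "recently polished", "worn under a glove"]
print(len(list(candidate_explanations(causes))))  # prints: 7

# Brute-force enumeration stops being practical very quickly.
for n in (10, 50, 100):
    print(n, count_explanations(n))
```

With only a handful of causes you can enumerate everything. With the number of principles at play in the real world, you cannot, which is why jumping straight to a likely cause matters so much.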

Abduction and Natural Language

There is a growing body of work that indicates that abductive reasoning is part of the reason why humans can understand language (Neurips Proceedings link). Some of this is due to the need to interpret and decode errors in language. A famous example comes from Don Quixote, where Sancho Panza, Don Quixote’s assistant, says: “Señor, I have educed my wife to let me go with your worship wherever you choose to take me.” Don Quixote, immediately identifying the improper usage, replies, “INDUCED, you would say, Sancho. Not EDUCED.” By our definition of abduction, we can see that here Don Quixote uses abductive logic when he adopts the hypothesis that “induced” is the intended word, given the context and the similarity between the two words. According to Donald Davidson, this kind of abductive interpretation can occur in natural language understanding when:

  1. Applying a hypothesis to understand new names or labels
  2. Revising prior beliefs or interpretations about particular phrases
  3. Altering interpretations of predicates or other grammatical constructs to fit the context

Conclusion

In light of the growing number of applications of Machine Learning, there has been much more discussion of deductive and inductive reasoning than there was even ten years ago. It’s likely you’ve seen some of this.

It does appear, however, that the understanding of abductive logic is lagging. Though there have been efforts to simulate abduction in machines, it has yet to be accomplished and, for legitimate processing tractability reasons, is unlikely to be accomplished on traditional (non-quantum) computing. This severely limits a machine’s capacity for true natural language understanding, which any sentient being would need in order to understand language and communicate. This also applies to chatbots and explains why they are just examples of the Chinese Room (or a human-language-speaking parrot), neither of which demonstrates understanding of the language emanating from it.

Organizing for AI&ML Success – from Conway’s Law to the CDAO

Here’s a topic that I have given a great deal of thought to after observing lots of examples of how companies organize to identify, sense, collect, and use their business data. In a nutshell, HOW a company chooses to organize its data strategy and teams determines how successful it will be in delivering business value through data. Why is this? Conway’s Law gives us the reasons…

Conway’s Law

In short, in 1967, Melvin Conway, a computer programmer, proposed that organizations design systems that mirror their own communication structure. This sounds very simple, but I’ll give some examples of why this provides really great insight into the power of architecting organizations around desired business outcomes.

First, why does this make sense?

Conway suggested that the architecture of products built by organizations that are divided into functional competencies will tend to reflect those functions. For instance, a firm with four functions (mechanical engineering, electrical engineering, software engineering, and signal processing) will develop applications with distinct modular capabilities that reflect those functions. A module that manages thermal loads, center of gravity, control systems, structural sensing, and power will emerge and be developed by the mechanical engineering group. This module will connect to another module that contains embedded processing and memory through interfaces that carry power and sensors that provide data. This second module, of course, will be developed by the electrical engineering team. The software engineering team will develop a module that is loaded into the electrical engineering team’s processing system through some programming interface, receives signals from sensors as well as from elements within the mechanical engineering module, and uses logic to make decisions. The signal processing team will also develop a module that is triggered by signals from the software engineering module and provides outputs that interface with control modules in the mechanical engineering module. Phew! See below for a very high level visualization of how this might occur. Note how each department “owns” its own content and then someone (hopefully a systems architect or systems engineer) manages the interfaces.

Very high-level block diagram demonstrating Conway’s Law – Tod Newman, 2022

Conway’s Law and Data Science / AI&ML

I have seen Conway’s Law borne out over and over with regard to Data Strategy in an organization. Organization one (let’s say Mechanical Engineering) understands their business function well and is intent on optimizing for this function. They develop a strategy around data collection, storage, and analysis that helps them achieve their goals. Organization two (Finance, we’ll say) does the same thing. Then Organization three follows suit, and so on. Eventually what we have is 10-15 different data silos, each of which works relatively well for the owner (but each of which requires attention and sustainment, something that’s not always present). However, in traditional organizations (companies not named Uber or Google or SpaceX or similar) there is rarely a central figure like the systems architect who designed the complete business data system and who manages the interfaces. Therefore, Conway’s Law results in the isolation of multiple, locally-valuable data sources. Frequently, because these organizations design their data strategy to their own unique needs, there’s not even a clear way to connect these data stores!

Are there Solutions?

There are lots of examples of companies that have avoided the bulk of this negative effect by designing a centralized data strategy up front. As I alluded to earlier, these companies are often data firms that offer a service, like Google or Uber. They were born as data companies and developed from the ground up. If you’re not lucky enough to be a company that was born a data firm, however, there may be some possibilities, but I think they might be difficult and involve culture change management.

  1. Centralize the Data Strategy and Empower an Owner: This role has traditionally been called the Chief Data Officer, and these days I’m noticing a positive trend towards redefining this role as the Chief Data and Analytics (or AI) Officer. Here’s a good explanation of the difference. Creating this role tells the organization that data is now seen as a central business asset vs. simply a local asset. As the Harvard Business Review states, the trend towards naming CDOs or CDAOs “reflects a recognition that data is an important business asset that is worthy of management by a senior executive” and that it is “also an acknowledgement that data and technology are not the same and need different management approaches.” Note that redefining and centralizing the organization can leverage the positive aspects of Conway’s Law towards the goal of integrated, aligned data sources.
  2. Identify “low-hanging fruit” in your existing data silos for integration. You may be lucky and have a common key (employee number, part number, etc.) between two data silos that enables the data to be joined. This assumes that you can get permission from the silo owner to see the data, however, which might be a large assumption. Regardless, a demonstration of the power of integrated data could make the case for the difficult decisions and culture shifts (from local to collective ownership of data) that follow.
  3. Make a mandate. Jeff Bezos (legendarily) made his API Mandate at Amazon, which required all data and functionality to be exposed across Amazon through a defined interface called an Application Programming Interface (API). This interface managed both access to the data and insight into the structure of the data. It is said that this mandate changed the company and enabled their future high-value Amazon Web Services business.
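The “low-hanging fruit” idea in item 2 can be sketched in a few lines. Everything here is hypothetical: two invented silos (an HR system and a training system) that happen to share employee number as a key, and a helper, `join_on_key`, that merges the records found in both.

```python
# Hypothetical silo 1: HR records keyed by employee number.
hr_silo = {
    101: {"name": "Ana", "department": "Mechanical Engineering"},
    102: {"name": "Ben", "department": "Finance"},
    103: {"name": "Cy", "department": "Software Engineering"},
}

# Hypothetical silo 2: training records keyed by the same employee number.
training_silo = {
    101: {"courses_completed": 4},
    103: {"courses_completed": 7},
    104: {"courses_completed": 1},  # no matching HR record
}


def join_on_key(left, right):
    """Inner join: keep only keys present in both silos and merge their fields."""
    shared = sorted(left.keys() & right.keys())
    return {key: {**left[key], **right[key]} for key in shared}


joined = join_on_key(hr_silo, training_silo)
for employee_number, record in joined.items():
    print(employee_number, record)
```

Only employees 101 and 103 appear in the result, because they are the only keys both silos share. The unmatched records (102 and 104) are a useful finding in their own right, since they show where the silos disagree about who exists.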

Conclusion

If you’ve made it this far, then you probably have the gist of my argument. If you’ve skipped to the conclusion, here’s what I’d want you to know:

  1. Organizations that build a Data Strategy from scratch will fall into the Conway’s Law trap and are unlikely to have the ability to understand data interfaces.
  2. Conversely, a carefully-architected Data Strategy (everything from the design of the information to be sensed to the sensing approach, collection, and application of Data Science) can be a surprisingly powerful lever for gaining business value. Some of the largest returns on internal investment in process improvement I’m aware of inside large firms involve joining previously-unconnected data sources and gaining a new valuable insight for decisions, risk management, or even better understanding of the flow of business value from suppliers to the hands of the customer.
  3. It is hard to apply a new Data Strategy to an existing business culture. Unless you are leading an amazing business culture, it will require change management techniques (like John Kotter’s 8 steps) to succeed.
  4. An empowered role like the CDO, or better, the CDAO, may help this culture change and can make the kinds of “Bezos API mandates” that might be needed for success. It can also help with the next challenge, Sustaining the Data Business.