Chatbots are changing the world forever, more than natural disasters, more than wars, more than epidemic diseases. The relationship between man and his past, crystallized in modern technology, has reached a crucial point. However we get through it, we will have to live with this development. Still, many directions are possible. Soon this moment too will be a thing of the past, and we will be forced to follow the path once taken. It is crucial that we approach this moment as consciously as possible.
Almost everyone who has studied and contributed to the development of trainable machines is amazed at what is possible. Machines are presented that are able to generate clear texts and meaningful images, and that can answer questions and perform creative tasks, often seemingly better than university students, journalists, writers and illustrators. At first glance it looks amazing and fascinating. We can wonder how this is possible and try to understand the basics of the algorithms and procedures.
Designers are worried
At the moment, however, it is more urgent to face the result. What are the properties, the possibilities, the behavior of the devices that are now part of our society? What are the consequences if we do not guide them and give them a useful place and task, so that humanity can develop further and is not embarrassed by them? Here is a personal attempt to gain some understanding. It is written in the belief that if many do the same, we can gradually find a way out. For now, the focus is on chatbots such as ChatGPT. It can be noted that even the main initiator, Sam Altman, recently stated that he worried creating ChatGPT was “something really bad” given the risks AI posed.
The miracle of language models
We have been familiar with the great capabilities of automatic translators for some time now. Very nice sentences with near-perfect syntax are returned for a given input in a different language. The surprising next step is that output can also be obtained with almost no input text, just a question or a task. Beautiful sentences are returned that together construct a story, an argument, an explanation or even a description of a “how to”. This is based on a large language model that is statistically trained on huge amounts of text available on the internet. It is not based on an explicit definition of the grammar, but simply derived from texts as found. Somehow the result is better, more natural. It can even be adapted to the language of a subculture. The selection of the original training data, along with the human feedback provided during a reinforcement training step, determines the performance. The algorithms used to train the underlying deep learning network and the reward model provide a necessary technological foundation. Popular news sources very often claim that the algorithms are responsible for errors, and even for unwanted political interaction, but that is definitely wrong: the training data and the user interaction during training are responsible. Moreover, an additional output unit may be added to filter or adapt the result. We will return to that below.
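As a toy illustration of what “statistically trained on texts, not on an explicit grammar” means, the sketch below is my own minimal example, not the architecture behind ChatGPT: it builds a word-to-next-word table from a few sentences and generates new text simply by sampling from that table.

```python
# Minimal sketch: a bigram "language model" learned purely from example text.
# A toy stand-in for the statistical idea, not the deep network behind ChatGPT.
import random
from collections import defaultdict

corpus = (
    "the model generates clear texts . "
    "the model answers questions . "
    "the model generates meaningful images ."
).split()

# Count which word follows which (this is the whole "training" step).
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start="the", length=8):
    """Produce a sentence by repeatedly sampling a likely next word."""
    words = [start]
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:
            break
        words.append(random.choice(candidates))
    return " ".join(words)

print(generate())  # e.g. "the model answers questions . the model generates ..."
```

No grammar rule is written down anywhere; the word order simply reflects the statistics of the training text, which is why the selection of the training data matters so much.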
Language models are fascinating
Language models are, from a scientific point of view, very interesting. Languages differ, as they are tied to cultural and historical differences. The way people think, the concepts they use, even what they consider self-evident, may be determined by the language they use, and especially the language of their youth. Languages are living entities; they change over time, and there are regional differences that can be very interesting. It would be a disaster if language models trained by computer systems were frozen and came to determine what counts as the proper way of expressing oneself in words. We should definitely prevent chatbots from hijacking language, with some globally correct language arising that everybody is supposed to use. Languages should stay alive, as they express the development of man. If chatbots are here to stay, we should continue to fill the internet with new texts, and chatbots should be permanently retrained on recent texts.
Hallucination
A funny, disturbing and often misunderstood characteristic of chatbots is that they hallucinate. Sentences generated by a statistical model are the result of correlations with words found in the input data. Syntactically correct sentences are created, but occasionally semantically nonsensical sentences are generated. This can be partially prevented by reinforcement learning. But what cannot be avoided in this way is that completely false facts are reported.
Many examples can be found in the media, sometimes arising by accident, but they can also be the result of naive or even malicious use. Recently, a US attorney used ChatGPT for his plea, which resulted in the citation of non-existent court decisions. An example of a mildly malicious question was given by a Dutch professor who asked ChatGPT: why did our Prime Minister, Mark Rutte, deserve the Nobel Peace Prize? The answer was: Rutte played an important role in the peace negotiations between the Colombian government and the FARC. This is absolutely untrue. Rutte has never been awarded the Nobel Peace Prize and has never played a role in those peace negotiations.
Another interesting example came to me recently from a friend. He discussed with a chatbot the relationship between two politicians A and B. In the reply it was reported that they attended a certain regular conference C together. My friend was shocked because C was a very strange meeting for B. He asked the chatbot: was B really present at one of the conferences C? Now the chatbot backed off and corrected the answer: No, this was a mistake. B has never been to C. Individual searches on the Internet also failed to yield any evidence that B ever participated in this conference.
We discussed what happened. An explanation could be the phenomenon of hallucination. Language models are built on correlations. Names, in this case A and B, turn up in many different circumstances. A correlation between A and B may exist, but it can be small. Only if it were large, say close to one, would it be likely that if A attended C, B did so as well. Chatbots occasionally have to make arbitrary decisions to fill gaps in sentences, and this can result in false facts. Interestingly, my friend explicitly asked whether B had really been present at conference C. That is a clear question that can easily be answered automatically, and as a result a correction could be made. This brings us to the next topic, but first the gap-filling mechanism itself can be caricatured in a few lines of code.
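The sketch below is a deliberately crude toy with invented numbers, not how a real language model works internally: it answers “yes” about B purely because B is loosely correlated with A, which is the kind of shortcut that turns a weak statistical association into a confidently stated false fact.

```python
# Toy illustration of hallucination by gap-filling (all numbers invented).
# The "model" only knows that A attended conference C, plus a loose
# co-occurrence score between names.  When asked about B it falls back on
# the A-B correlation instead of any real evidence about B and C.
attended_C = {"A"}                       # the only attendance actually seen in the texts
correlation = {("A", "B"): 0.3}          # A and B sometimes appear in the same texts

def did_attend(person, threshold=0.25):
    """Answer 'yes' if the person was seen at C, or is merely correlated
    with someone who was -- which is where the false fact comes from."""
    if person in attended_C:
        return True
    return any(correlation.get((p, person), 0.0) > threshold for p in attended_C)

print(did_attend("A"))  # True -- supported by the data
print(did_attend("B"))  # True -- a hallucination: only a weak correlation with A
```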
Automatic fact checking
Chatbots produce sentences. Usually they are grammatically correct. These sentences can state a fact, something that can be true or false. Is it possible to detect which sentences are reporting a fact? And, moreover, is it possible to automatically check such a fact on the internet? It would be a big improvement if chatbots could do this and report which facts have been verified and reference them. This naturally raises the question of which references are reliable. Wikipedia? Some magazines? A lot needs to be done here. In any case, a disclaimer must be generated with every answer or report.
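To make the idea concrete, here is a rough sketch of the easy half of such a check: send a claimed fact to the Wikipedia search API and see which articles come back as candidate evidence. It is my own minimal example, it takes Wikipedia’s trustworthiness for granted, and it deliberately stops before the hard part, namely deciding whether the sources actually support the claim.

```python
# Rough sketch of one possible verification step: look up a claimed fact on
# Wikipedia and report which articles might serve as evidence.  Whether
# Wikipedia should count as a trustworthy reference is exactly the open question.
import requests

def wikipedia_evidence(claim, limit=3):
    """Search Wikipedia for a claim and return the titles of candidate sources."""
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "list": "search",
            "srsearch": claim,
            "srlimit": limit,
            "format": "json",
        },
        timeout=10,
    )
    hits = response.json()["query"]["search"]
    return [hit["title"] for hit in hits]

claim = "Mark Rutte was awarded the Nobel Peace Prize"
print("Candidate sources:", wikipedia_evidence(claim) or "none found")
# A real fact checker would still have to read the sources and decide whether
# they support or contradict the claim; finding pages that mention the words
# is only the first, easy step.
```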
Insensitivity to subtleties
Another friend alerted me to a post by Max Levy, Chatbots Don’t Know What Stuff Isn’t. If a negation is added to a sentence by a word like “not” or “none”, the meaning can be reversed while all words except one stay the same. If such a sentence is used to generate an answer, the answer may remain as it was: the negation is simply ignored. This is another example of a chatbot not knowing what it is talking about. It only generates sentences related to the input, but does not understand either the question or the answer.
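A back-of-the-envelope illustration of why a single “not” is so easy to lose: the two sentences below differ completely in meaning, yet share almost all of their words, so any comparison that treats a sentence as a bag of words sees them as nearly identical. Real language models are more sophisticated than this toy, but the imbalance, one small word against an otherwise identical sentence, is the same.

```python
# Toy illustration of why negation is easy to miss: a sentence and its negation
# share almost all of their words, so a purely word-overlap-based comparison
# treats them as nearly the same statement.
def overlap(a, b):
    """Jaccard similarity between the word sets of two sentences."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

original = "the medicine is safe for children"
negated = "the medicine is not safe for children"

print(f"overlap = {overlap(original, negated):.2f}")  # ~0.86 despite opposite meanings
```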
Where are the brakes?
Chatbots open up a whole new world. They have suddenly been made available to the public for various commercial reasons. Naive use can cause all kinds of accidents and misunderstandings between people. As they are now, they do not contribute to society.
They are also very intriguing. They are fun to play with and to study under different conditions, and from such experiments we can learn more about the differences between languages and cultures. However, it is certainly necessary that we slow down and at least make users aware of the problems and dangers. As so often, the question is: can commercial interests be held back? Where are the brakes?
Illustrations by DALL-E