AI naysayers often disparage chatGPT by dismissing it as glorified autocomplete: supposedly it just computes probabilities for the next word, over and over, and picks one at random according to those weights.
I don’t believe it.

Here, for example, is part of a statement issued by ETH, the MIT of Switzerland:
GPT (Generative Pre-trained Transformer) is a neural network that generates plausible fiction. The “generative” algorithm works like the autocomplete function for text messages on smartphones, which suggests likely next words to the user – when typing “I will arrive” the phone might suggest “at”, “in”, and “tomorrow.” As an example, always clicking on the first suggested word would generate the following sentence: “I will arrive at home office in about an hour to get to the university,” which is almost grammatically correct, but non-sensical
ETH Switzerland
Here is Stephen Wolfram, creator of Mathematica:
The first thing to explain is that what ChatGPT is always fundamentally trying to do is to produce a “reasonable continuation” of whatever text it’s got so far, where by “reasonable” we mean “what one might expect someone to write after seeing what people have written on billions of webpages, etc.”
And the remarkable thing is that when ChatGPT does something like write an essay what it’s essentially doing is just asking over and over again “given the text so far, what should the next word be?”—and each time adding a word.
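For concreteness, here is the loop the critics are describing, as a toy sketch in Python: look up a weighted distribution over possible next words, pick one at random, append it, and repeat. The `next_word_distribution` table is an invented stand-in for the neural network; this illustrates the “autocomplete” description, not how GPT is actually implemented.

```python
import random

def next_word_distribution(context):
    # Stand-in for the neural network: given the text so far, return
    # candidate next words with weights. A toy lookup table, nothing more.
    table = {
        "I will arrive": [("at", 0.4), ("in", 0.35), ("tomorrow", 0.25)],
        "I will arrive at": [("home", 0.5), ("noon", 0.3), ("the", 0.2)],
    }
    return table.get(context, [("<end>", 1.0)])

def generate(prompt, max_words=10):
    text = prompt
    for _ in range(max_words):
        candidates = next_word_distribution(text)
        words, weights = zip(*candidates)
        # "Select the next word at random based on weighted probabilities."
        word = random.choices(words, weights=weights, k=1)[0]
        if word == "<end>":
            break
        text += " " + word
    return text

print(generate("I will arrive"))
```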
For those who aren’t aware of Richard Stallman, he is the founding father of the GNU Project, the FSF, and the Free/Libre Software Movement, and the author of the GPL.
Here he is on chatGPT:
I can’t foretell the future, but it is important to realize that ChatGPT is not artificial intelligence. It has no intelligence; it doesn’t know anything and doesn’t understand anything. It plays games with words to make plausible-sounding English text, but any statements made in it are liable to be false. It can’t avoid that because it doesn’t know what the words mean.
Even Noam Chomsky gets into the act, decrying the “False Promise” of chatGPT and commenting that
Given the amorality, faux science and linguistic incompetence of these systems, we can only laugh or cry at their popularity.
Chomsky is definitely crying, most likely because the highly multilingual chatGPT doesn’t use the formal grammar theory that Chomsky spent his career elaborating.
The accomplishments of chatGPT
If you listened to these critics you’d conclude that chatGPT is useless and spends its time wandering from one random word to the next, producing grammatically ‘almost correct’ nonsense.
But that’s not what the record shows.
For a start, it is astoundingly multilingual. If you ask it, it will reply in French, German, Russian, Hindi, Japanese … the list goes on.
Once I tried to find out how far the list went and asked whether it knew a series of increasingly obscure languages. Farsi (Persian)? Yep! Turkish? Yep! Telugu? Yep!
This is hard to explain using the word-to-word model. When I asked “Telugu?”, the next word it uttered was “yes”. Why on earth would “yes” be a likely follow-up to “Telugu”? Are there millions of sentences out there where “yes” follows “Telugu”?
(A commenter pointed out that “Do you speak Telugu?” would appear many times, followed by “yes”. But it would appear followed by “no” more often. Why would “yes” be chosen?)
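To pin down what the question turns on: in a pure word-to-word (bigram) model, the probabilities really would depend only on the literal previous word, whereas GPT’s probabilities are conditioned on the entire prompt, “Do you speak Telugu?” included. A toy contrast with entirely made-up numbers, just to show what the two views condition on:

```python
# Entirely made-up probabilities, purely to contrast the two views.

# Word-to-word (bigram) view: the next word depends only on the previous word.
bigram = {
    "Telugu": {"is": 0.30, "speakers": 0.20, "script": 0.15, "yes": 0.01},
}

# Whole-context view: the next word depends on everything typed so far.
contextual = {
    "Do you speak Telugu?": {"Yes": 0.55, "No": 0.25, "I": 0.10},
}

def most_likely(dist):
    # Return the highest-probability next word in a distribution.
    return max(dist, key=dist.get)

print(most_likely(bigram["Telugu"]))                    # "is"
print(most_likely(contextual["Do you speak Telugu?"]))  # "Yes"
```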
Passing the tests
GPT-4 has quite a record when it comes to passing tests. It scored in the top 10% on the US Uniform Bar examination and aced the LSAT.
It also passed the math SAT and most of the AP exams, including biology.
In a typical multiple-choice exam you have a complex question ending in a big word like “anthropomorphic”, and the correct answer is “C”. What makes it think “C” would be a likely follow-up to “anthropomorphic”?
One caution: its training data may have included test questions and answers, and GPT may have ‘remembered’ them. Yet how can word-to-word probabilities make this possible?
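For what it’s worth, there is a standard way to take a multiple-choice test with a next-word model that doesn’t rely on “C” following “anthropomorphic”: score each full question-plus-answer text by the total probability the model assigns to it, and pick the highest. Here is a sketch of that technique using a small open model (GPT-2, purely to illustrate; this is not necessarily how the exam results above were obtained):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# A small open model stands in for GPT-4, purely to illustrate the technique.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def log_prob(text):
    # Total log-probability the model assigns to `text`, summed over its tokens.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = log_probs[torch.arange(ids.shape[1] - 1), ids[0, 1:]]
    return token_lp.sum().item()

question = "Which word describes attributing human traits to non-human things?"
choices = {"A": "photosynthesis", "B": "metamorphosis", "C": "anthropomorphism"}

# Score each question-plus-answer text and pick the most probable choice.
best = max(choices, key=lambda c: log_prob(f"{question} Answer: {choices[c]}"))
print(best)
```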
Even poetry
GPT is no slouch at the arts. It can write poems in various formats and styles and generate scripts.
I once asked it to write a haiku about the binomial theorem, and on the first try it came up with:
Binomial rule
Expands powers with ease
Mathematical bliss
Not perfect, but not bad!
Then I asked it to explain the mean value theorem in pirate language. Well, mateys, it produced:
The Mean Value Theorem be sayin’ that, at some point ‘long yer journey, yer speed be exactly equal to the average speed of the entire trip. In fancy speak, that means if ye have a continuous and smooth function on an interval [a, b], there exists a point c in that interval (a < c < b), where the derivative (slope) of the function at point c equals the average rate of change between a and b. So, it be tellin’ us that at some point, we be findin’ a tangent line with the same slope as the secant line between a and b.
Arrr, so there ye have it, me hearty! The Mean Value Theorem in pirate lingo, helpin’ ye navigate the seas of calculus.
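Stripped of the pirate lingo, the theorem being paraphrased is the standard one:

```latex
% Mean Value Theorem, standard statement
\textbf{Theorem.} If $f$ is continuous on $[a,b]$ and differentiable on $(a,b)$,
then there exists a point $c \in (a,b)$ such that
\[
  f'(c) = \frac{f(b) - f(a)}{b - a},
\]
i.e. the tangent slope at $c$ equals the slope of the secant line from $a$ to $b$.
```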
Again, I have to wonder how a one-word-at-a-time algorithm can produce a phrase like “navigate the seas of calculus”.
GPT as Gyro Gearloose
GPT is quite the inventor (Gyro Gearloose is the eccentric inventor in Disney’s Scrooge McDuck comics).
GPT-3, with some prodding, invented a Sudoku-like game and produced code to implement it. It called the game “Sumplete”. Unfortunately, someone had already invented it and called it “Summer”. Perhaps GPT encountered Summer during training.
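As I understand the rules of Sumplete (you are given a grid of numbers plus a target sum for each row and column, and you delete cells until the remaining cells match every target), checking a solution takes only a few lines. This sketch is my own reconstruction of the rules, not GPT’s code:

```python
def check_sumplete(grid, keep, row_targets, col_targets):
    # grid: matrix of numbers; keep: same-shape matrix of booleans,
    # True meaning the cell was kept, False meaning it was deleted.
    rows_ok = all(
        sum(v for v, kept in zip(row, flags) if kept) == target
        for row, flags, target in zip(grid, keep, row_targets)
    )
    cols_ok = all(
        sum(grid[r][c] for r in range(len(grid)) if keep[r][c]) == col_targets[c]
        for c in range(len(grid[0]))
    )
    return rows_ok and cols_ok

# Tiny example: delete the 5 so the rows sum to (3, 7) and the columns to (4, 6).
grid = [[3, 5],
        [1, 6]]
keep = [[True, False],
        [True, True]]
print(check_sumplete(grid, keep, row_targets=[3, 7], col_targets=[4, 6]))  # True
```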
More impressive is the programming language GPT-4 invented. It’s a mix of TypeScript and Python. GPT documented it and expertly justified its design decisions:
In this new language, called “TenetLang,” we’ll combine a simple syntax inspired by Python with some features from functional and object-oriented languages. Here is an overview of some design choices:
- Indentation-based scoping, similar to Python.
- Strong typing with type inference, inspired by TypeScript and Kotlin.
- First-class functions and closures, similar to JavaScript.
- Immutable data structures by default, with optional mutable counterparts.
- A concise lambda syntax for anonymous functions.
- Pattern matching and destructuring, inspired by Haskell and Rust.
- Built-in support for concurrency using async/await and lightweight threads.
- Interoperability with other languages using a Foreign Function Interface (FFI).
Checkmate
It turns out that GPT-4 is a decent chess player. Strong, but not unbeatable. But how can you play chess by generating one word after another? Choosing words at random will surely produce a quick defeat.
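On the mechanics: the moves of a chess game are themselves just text (algebraic notation like “e4” or “Nf3”), so a text generator can in principle “play” by continuing a game transcript one move at a time. A minimal sketch of wiring that up, using the python-chess library, with a hard-coded placeholder where the model’s suggested move would go:

```python
import chess  # the python-chess library

def model_suggests(transcript):
    # Placeholder for asking a language model to continue the game
    # transcript; here a plausible reply is simply hard-coded.
    return "Nc6"

moves_so_far = ["e4", "e5", "Nf3"]
suggestion = model_suggests("1. e4 e5 2. Nf3 ")

board = chess.Board()
for san in moves_so_far:
    board.push_san(san)

try:
    board.push_san(suggestion)  # raises a ValueError on an illegal move
    print("Playable continuation:", suggestion)
except ValueError:
    print("The model suggested an illegal move:", suggestion)
```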
There is so much else GPT can do – like summarizing documents – that it is absurd to think that it is simply generating one word at a time.
If you want to understand the marvel that is GPT, you have to get beyond the autocomplete myth.
> Are there millions of sentences out there where “yes” follows “Telugu”?
No, but there are millions of internet sentences of the form “Do you speak X?” where X is one of a few dozen alternatives.
ChatGPT uses a window of 30K words in order to select the next one… not just the last one.
ChatGPT can only repeat (with minor adjustments) what an intelligent human has previously published on the net.
How does it play chess?
Same. There are plenty of pages of the form Qxb2. 1. d4 e5 that intelligent humans have put out there for LLMs to search for and regurgitate.
“the highly multilingual chatGPT doesn’t use the formal grammar theory that Chomsky spent his career elaborating” umm. umm.
“The haiku is a Japanese poetic form that consists of three lines, with five syllables in the first line, seven in the second, and five in the third.”