ChatGPT: A Mental Model
Since ChatGPT was launched at the end of 2022, I’ve been struggling to find the right framing for the technology. And, so has the rest of the world with countless articles of Doom and Gloom: fears of paperclip maximizers, fears of jobs lost, fears of economies reshaped, fears of AI hallucination, fears of further-accelerated misinformation, fears of students cheating, etc etc etc.
It’s exhausting.
And, as an engineer, I get asked by non-engineers about my opinion on the matter. So here it is..
The Case for Restraint
I’ve been through a number of technology hype-cycles at this point and my modus operandi has always been, and remains: “Keep Calm and Carry On!”.
To jog your memory, these happened:
- In the 1990s we finally found the “Ark of the Covenant” and it was called “Java Object-Oriented Programming”. We were going to rewrite everything, even the Operating Systems. And today Linux is… oh wait… it’s still C
- In the late 1990s and early 2000s we all understood that the World-Wide-Web was so revolutionary that What Does The Company Do? was secondary to Do They Do It Online? And the NASDAQ totally didn’t crash nor did it take 15 years to recover the same price levels..
- After the Great Recession of 2008, Satoshi Nakamoto completely superseded the world financial system built on flaky humans trusting each other. With “trust” no longer required for anything, bitcoin ushered in a new era of computerized money, prosperity, and freedom. Instability in the financial sector was no more. And, black markets completely failed to function in the now digitized world. All rejoiced. Unfortunately many worthless pieces of archaic fiat currency paper still exist and thus, as a service to the world, this author began a charitable collection service (email me)
- In 2022, after the spot-on 5-year predictions came to fruition, the US Department of Transportation outlawed manual driving of automobiles, stating "It's clear that Level 5 self-driving is far superior to human drivers and Today is a Landmark Day for Public Safety." Argo AI stock tripled on market open. But, for some reason, I can't seem to reach the argo.ai website.. Hmmmm..
- 2023: ChatGPT makes the world into one big paperclip factory, killing all humans in the process. RIP Humanity.
Honorable mentions for your next game of Buzzword Bingo: Everything is Big Data, Everything is Microservices, Everything is Agile, Everything is a Service-Oriented Architecture, Everything should be JavaScript, Everything can be done with No Code, Everything should be in the Cloud, Everything should be On-Prem, Everything can be modelled with Machine Learning & Data Science, ...
Glib jokes aside, there is a sense that ChatGPT is a bit different, and honestly I don’t disagree (read on). But, there is an awful tendency of the human brain to grab onto change and either project too much excitement or too much fear. The truth lives somewhere in the middle.
Enter Stage Left: Rodney Brooks
Recently IEEE Spectrum published an interview with Rodney Brooks, the well-regarded roboticist, entitled Just Calm Down About GPT-4 Already. In it, Rodney Brooks poses a framing that I've Felt from the beginning but didn’t have the correct words for:
it doesn’t have any underlying model of the world
And in what is nearly a Zen Koan on Theory of Mind, he says:
What the large language models are good at is saying what an answer should sound 'like', which is different from what an answer should 'be'
And this captured my Feeling precisely.
Let me explain.
Interviewing ChatGPT
When ChatGPT was launched in late 2022, friends immediately raved to me about it. They said, “I want it right next to me, like a pair programmer”. So, naturally, I sought to evaluate such a bold claim.
I asked it the same interview questions I’d ask a candidate. If it’s going to work with me, it ought to pass the interview, right? It didn’t. In fact, it failed miserably. And it failed in all the ways a normal candidate fails (which is remarkable in its own way).
How did it fail? It simply didn’t have an underlying Mental Model of the World. On reflection, this is what my interview questions have always been about. I’m not interested in trivia knowledge. I’m not interested in Tools Used. I’m not interested in a few properly composed buzz-words.
But, I am interested in seeing someone reason through a problem based on some underlying model of reality. I like to probe at the edge-cases of that model. I like to throw rare unexpected “curve-balls”. I like to get people to think about sub-problems they’ve never considered before. It is as if I want to say “let’s together go to the edge of our collective understanding and then try to keep going”. However, to get there, we usually have to first consider and dispense with the “standard” or “average” answers. ChatGPT can produce those standard answers, but it does not show the ability to keep going past them.
Expert Test-Taking and World Model Building
Back in my school days, I would occasionally meet a super good test taker. I mean the kind of person who doesn’t learn the actual material. Instead they think about how the test-maker constructed the test. E.g. how often did they make “(a)” the answer to a multiple-choice question? I’ve met people who never actually learned algebra because they could just “pass the test”. There’s a part of me that’s in Absolute Awe of such a skill. It’s one I don’t have. I have a bad memory. I’m a bad actor. My ability to “read people” is almost certainly below average. I’ve always relied on building and exploring an ever-elaborate model of the world as a crutch to navigate this complicated world.
It’s long been my estimation that all others do the same: Build a World Model. Is this true? I don’t know. The long line of job candidates that are particularly skilled at producing “facts on-demand” would seem to argue otherwise. However, it’s clear to me that ChatGPT certainly doesn’t build one.
But this cuts another way too.
Knowing an Average Amount of … Everything!
My current mental model of ChatGPT is that it’s akin to a “Maximum Likelihood Estimator for the Entirety of Human Knowledge”. There are two very different ways to interpret that: (1) Meh, it’s just a silly stats trick and (2) Holy F***ing Shit!!
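To make the “maximum likelihood” framing concrete, here is a toy sketch in Python. It only illustrates the analogy and is not how ChatGPT is actually built: it is a simple bigram counter that, given a word, returns whichever follower it saw most often in its training text. The tiny corpus and the function names are made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical three-sentence "training set", invented for illustration.
TOY_CORPUS = [
    "the sky is blue",
    "the sky is blue today",
    "the sky is green",
]

def train_bigram_counts(corpus):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probability(counts, word, candidate):
    """Maximum likelihood estimate of P(candidate | word): count / total."""
    followers = counts.get(word.lower(), Counter())
    total = sum(followers.values())
    return followers[candidate] / total if total else 0.0

def most_likely_next(counts, word):
    """Return the most frequently seen follower, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

counts = train_bigram_counts(TOY_CORPUS)
print(most_likely_next(counts, "is"))               # blue
print(next_word_probability(counts, "is", "blue"))  # 2 of the 3 followers of "is"
print(most_likely_next(counts, "purple"))           # None: never seen in the corpus
```

The sketch answers “blue” because that is what the answer most often sounds like in its corpus, not because it has any model of the sky. Ask it about a word it has never seen and it has nothing to say at all.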
Have you ever met a person who seemed to know a little something about everything? Perhaps that person also had a large and diverse social circle? Perhaps you would seek this person out if you had a question about something and needed someone to point you in the right direction? A person with extreme breadth.
In my experience, that person doesn’t have the most depth of knowledge. Or perhaps they even give you some wrong answers. Maybe those wrong answers were even given quite confidently. And maybe you feel slighted by them leading you on… But, maybe you stick around anyways because you appreciate their breadth. After all, it’s only occasionally that they are disastrously wrong (shrug).
Now that person goes off to GPU training camp for about 1000 years and comes back as ChatGPT.
How can we not be impressed? To know and have access to the “standard” or “average” answer of … well … Everything. Wow.
But in that millennium of training, the core structure didn’t change. It’s the same old friend you always had that will err in the same ways, occasionally make you feel slighted, and sometimes leave you disappointed at their lack of depth.
So where does that leave us?
ChatGPT is Unreasonably Effective and Valuable
An entrepreneur friend recently told me that they use it every day, constantly. This makes sense: being an entrepreneur means that you need to wear a “different hat” constantly. Success favors one who can manage and leverage a large breadth. And quickly!
I myself managed to learn and implement an RSS feed for this website with Zero Background in ~1 hour with ChatGPT. There were a couple mistakes it made, but they were easy to correct. I'm certain it would have taken much longer with Google alone.
At this point, Google seems so SEO-gamed that it can be hard to find “maximum likelihood” average information quickly. You have to wade through lots of clickbait and ads and fluff that is more about “brand-building” than “education” to find the real gems. ChatGPT is simply a time win. Google is scared and they should be.
So, what does the future look like?
Will Large Language Models transform the global economy? Probably. But it will take some time. The internet took some time. So did mobile phones. So did most new technologies.
Will a significant number of human jobs cease to exist? Probably not.
Instead what you have is a remarkable tool to take creativity and ingenuity to new heights. I would expect to see people combining disparate knowledge sets (breadth) in novel ways. It’s a massive boon for multi-disciplinary projects.
And if you’re afraid of job losses, consider this: we once had actual people who were called “Computers” (e.g. see underrated movie: Hidden Figures) and who were replaced by machines. Did those jobs go away? No, they got majorly restructured and then growth absolutely exploded! We now just call them “computer programmers” and as of 2023 there are >25 million of them (stats).
It’s hard to believe that it’ll be significantly different this time. For some reason, every time humanity invents 1 new innovative tool, we seem to immediately find about 100 new things to do with it that were never practical before. This is the story of the history of humanity. And, there’s surely some Philosophy of the Human Mind buried in there somewhere, but I’m not going to attempt to unpack that today.
Can change be scary? Yes. Absolutely. And I feel sorry for you if you’re unlucky enough to have to restructure your life and career as a result. But this type of change is essential. An average person today has it significantly better than even the richest humans of just a handful of generations ago. And it’s precisely because of this type of change.
It’s a remarkable time to be alive.
Keep Calm and Carry On!