ChatGPT: A Mental Model

Since ChatGPT was launched at the end of 2022, I’ve been struggling to find the right framing for the technology. And, so has the rest of the world with countless articles of Doom and Gloom: fears of paperclip maximizers, fears of jobs lost, fears of economies reshaped, fears of AI hallucination, fears of further-accelerated misinformation, fears of students cheating, etc etc etc.

It’s exhausting.

And, as an engineer, I get asked by non-engineers about my opinion on the matter. So here it is.

The Case for Restraint

I’ve been through a number of technology hype-cycles at this point and my modus operandi has always been and remains: “Keep Calm and Carry On!”.

To jog your memory, these happened:

Honorable mentions for your next game of Buzzword Bingo: Everything is Big Data, Everything is Microservices, Everything is Agile, Everything is a Service-Oriented Architecture, Everything should be JavaScript everything, Everything can be done with No Code, Everything should be in the Cloud, Everything should be On-Prem, Everything can be modelled with Machine Learning & Data Science, ...

Glib jokes aside, there is a sense that ChatGPT is a bit different, and honestly I don’t disagree (read on). But, there is an awful tendency of the human brain to grab onto change and either project too much excitement or too much fear. The truth lives somewhere in the middle.

Enter Stage Left: Rodney Brooks

Recently IEEE Spectrum published an interview with Rodney Brooks, a well-regarded roboticist, entitled Just Calm Down About GPT-4 Already. In it, Brooks offers a framing that I've Felt from the beginning but didn’t have the right words for:

it doesn’t have any underlying model of the world

And in what is nearly a Zen Koan on Theory of Mind, he says:

What the large language models are good at is saying what an answer should sound 'like', which is different from what an answer should 'be'

And this captured my Feeling precisely.

Let me explain.

Interviewing ChatGPT

When ChatGPT was launched in late 2022, friends immediately raved to me about it. They said, “I want it right next to me, like a pair programmer”. So, naturally, I sought to evaluate such a bold claim.

I asked it the same interview questions I’d ask a candidate. If it’s going to work with me, it ought to pass the interview, right? It didn’t. In fact, it failed miserably. And it failed in all the ways a normal candidate fails (which is remarkable in its own way).

How did it fail? It simply didn’t have an underlying Mental Model of the World. On reflection, this is what my interview questions have always been about. I’m not interested in trivia knowledge. I’m not interested in Tools Used. I’m not interested in a few properly composed buzz-words.

But, I am interested in seeing someone reason through a problem based on some underlying model of reality. I like to probe at the edge-cases of that model. I like to throw rare unexpected “curve-balls”. I like to get people to think about sub-problems they’ve never considered before. It is as if I want to say “let’s go together to the edge of our collective understanding and then try to keep going”. However, to get there, we usually have to first consider and dispense with the “standard” or “average” answers. By contrast, ChatGPT does not show this ability.

Expert Test-Taking and World Model Building

Back in my school days, I would occasionally meet a super good test taker. I mean the kind of person who doesn’t learn the actual material. Instead they think about how the test-maker constructed the test. E.g. how often did they make “(a)” the answer to a multiple-choice question? I’ve met people who never actually learned algebra because they could just “pass the test”. There’s a part of me that’s in Absolute Awe of such a skill. It’s one I don’t have. I have a bad memory. I’m a bad actor. My ability to “read people” is almost certainly below average. I’ve always relied on building and exploring an ever-more-elaborate model of the world as a crutch to navigate this complicated world.

It’s long been my estimation that all others do the same: Build a World Model. Is this true? I don’t know. The long line of job candidates that are particularly skilled at producing “facts on-demand” would seem to argue otherwise. However, it’s clear to me that ChatGPT certainly doesn’t.

But this cuts another way too.

Knowing an Average Amount of … Everything!

My current mental model of ChatGPT is that it’s akin to a “Maximum Likelihood Estimator for the Entirety of Human Knowledge”. There are two very different ways to interpret that: (1) Meh, it’s just a silly stats trick and (2) Holy F***ing Shit!!
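To make the “maximum likelihood” framing concrete, here is a toy sketch in Python. It is only an illustration of the mental model, not a description of how ChatGPT is actually built: the tiny corpus and the function names (train_bigram_counts, most_likely_next) are made up for this example. It counts which word follows which, then always answers with the most frequent follower, i.e. the “average” answer.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count how often each word follows each other word in the text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus, standing in for "the Entirety of Human Knowledge".
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigram_counts(corpus)

print(most_likely_next(model, "the"))      # the most common word after "the"
print(most_likely_next(model, "sat"))      # the most common word after "sat"
print(most_likely_next(model, "unicorn"))  # never seen, so no answer at all
```

Note what the sketch does and doesn’t have: it holds a table of what usually comes next, and nothing resembling a model of cats, mats, or fish.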

Have you ever met a person who seemed to know a little something about everything? Perhaps that person also had a large and diverse social circle? Perhaps you would seek this person out if you had a question about something and needed someone to point you in the right direction? A person with extreme breadth.

In my experience, that person doesn’t have the most depth of knowledge. Or perhaps they even give you some wrong answers. Maybe those wrong answers were even given quite confidently. And maybe you feel slighted by them leading you on… But, maybe you stick around anyways because you appreciate their breadth. After all, it’s only occasionally that they are disastrously wrong (shrug).

Now that person goes off to GPU training camp for about 1000 years and comes back as ChatGPT.

How can we not be impressed? To know and have access to the “standard” or “average” answer of … well … Everything. Wow.

But in that millennium of training, the core structure didn’t change. It’s the same old friend you always had that will err in the same ways, occasionally make you feel slighted, and sometimes leave you disappointed at their lack of depth.

So where does that leave us?

ChatGPT is Unreasonably Effective and Valuable

An entrepreneur friend recently told me that they use it constantly, every day. This makes sense: being an entrepreneur means that you need to wear a “different hat” constantly. Success favors one who can manage and leverage a large breadth. And quickly!

I myself managed to learn and implement an RSS feed for this website with Zero Background in ~1 hour with ChatGPT. There were a couple mistakes it made, but they were easy to correct. I'm certain it would have taken much longer with Google alone.

At this point, Google seems so SEO-gamed that it can be hard to find “maximum likelihood” average information quickly. You have to wade through lots of clickbait and ads and fluff that is more about “brand-building” than “education” to find the real gems. ChatGPT is simply a time win. Google is scared and they should be.

So, what does the future look like?

Will Large Language Models transform the global economy? Probably. But it will take some time. The internet took some time. So did mobile phones. So did most new technologies.

Will a significant number of human jobs cease to exist? Probably not.

Instead what you have is a remarkable tool to take creativity and ingenuity to new heights. I would expect to see people combining disparate knowledge sets (breadth) in novel ways. It’s a massive boon for multi-disciplinary projects.

And if you’re afraid of job losses, consider this: we once had actual people who were called “Computers” (e.g. see the underrated movie Hidden Figures) and who were replaced by machines. Did those jobs go away? No, they got majorly restructured and then growth absolutely exploded! We now just call them “computer programmers” and as of 2023 there are >25 million of them (stats).

It’s hard to believe that it’ll be significantly different this time. For some reason, every time humanity invents 1 new innovative tool, we seem to immediately find about 100 new things to do with it that were never practical before. This is a story of the history of humanity. And, there’s surely some Philosophy of the Human Mind buried in there somewhere, but I’m not going to attempt to unpack that today.

Can change be scary? Yes. Absolutely. And I feel sorry for you if you’re unlucky enough to have to restructure your life and career as a result. But this type of change is essential. An average person today has it significantly better than even the richest humans of just a handful of generations ago. And it’s precisely because of this type of change.

It’s a remarkable time to be alive.

Keep Calm and Carry On!