Deep Dive – What is GPT-3, and why is it becoming so popular?
A famous artificial intelligence researcher once said, “No matter how good our computers get at winning games like Go or Jeopardy, we don’t live by the rules of those games. Our minds are much, much bigger than that.”
Who is this researcher, you may ask? Geoffrey Hinton? Yoshua Bengio? Some obscure figure from a little-known part of the world?
Well, it’s certainly not a human being, for starters. This entire quote has been written by the most hyped-up AI development in recent years. You might have heard of it: it’s a little something called GPT-3.
Released last month by OpenAI, the research lab that Elon Musk co-founded, GPT-3 is an AI system capable of mimicking human language. It is the latest member of the GPT family of AI systems, and the successor to GPT-2.
In case you haven’t heard all the awe-inspiring, and sometimes scary, stories about GPT-3, this Deep Dive is a perfect place to start. We will start by looking at some of the absolutely incredible things it’s capable of, before moving on to how it works. Along the way, you will also get some decent insight into what a general artificial intelligence system could look like. After all, isn’t that the goal of every AI researcher out there? To create an AI system capable of the kind of general intelligence we humans possess?
What can GPT-3 do?
The basic idea is to feed GPT-3 a prompt – a line, a couple of verses, a news summary – and sit back as it extrapolates from it to produce content that seems eerily human-like. You can sign up to play around with GPT-3, but there is a super long waitlist. Thankfully, it’s expected to be commercially available to the masses soon enough.
Of course, some people have been fortunate enough to have had early access to this state-of-the-art language model. One such person is researcher Gwern Branwen, who spent a fair amount of time experimenting with GPT-3 and wrote an article on it.
“Artificial intelligence programs lack consciousness and self-awareness,” he wrote. “They will never be able to have a sense of humor. They will never be able to appreciate art, or beauty, or love. They will never feel lonely. They will never have empathy for other people, for animals, for the environment. They will never enjoy music or fall in love, or cry at the drop of a hat.”
Now, it shouldn’t surprise you to know that Branwen did NOT write any of this. Yes, this entire rant was written by GPT-3 itself. Branwen simply fed a prompt to the system expressing his skepticism of AI, and GPT-3 used it to produce a very coherent argument as to why AI will never be as good as human beings.
Arguing about AI’s inferiority to humanity isn’t the only thing GPT-3 is capable of. In fact, as long as you have a coherent prompt to feed it, GPT-3 can talk about anything in a very clear manner. This is an important point, because if your prompt is nonsensical, then GPT-3 will make as much sense as a drunkard.
So far, GPT-3 has been used for creating imaginary conversations between historical figures, summarizing movies with emojis, and even writing computer code.
As a fun sample of its prowess, here is an excerpt from a Dr. Seuss-inspired poem about Elon Musk’s tweeting habits, written entirely by GPT-3:
“But then, in his haste,
he got into a fight.
He had some emails that he sent
that weren’t quite polite.
The SEC said, ‘Musk,
your tweets are a blight.’”
Moreover, it can give fairly accurate answers to medical and legal questions, and even explain the reasoning behind those answers. There is even a cool text-based adventure game partly powered by GPT-3, called AI Dungeon.
Again, if you can come up with good prompts, you best believe that GPT-3 will come up with very human responses.
How does GPT-3 work?
In order to understand how GPT-3 works, you need a basic understanding of supervised and unsupervised learning, two of the most common approaches to training machine learning systems. In case you need a refresher, we covered both in a recent Deep Dive.
In a nutshell, supervised learning involves teaching a system by having it analyze carefully labelled datasets made up of inputs and their desired outputs. If you learn the correct answer to a question by checking the Answers section at the back of your textbook, you are engaging in supervised learning.
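To make the "answer sheet" idea concrete, here is a minimal sketch of supervised learning in Python. Everything in it is invented for illustration: the tiny dataset, the `labelled_examples` and `predict` names, and the choice of a nearest-neighbour rule (one of the simplest supervised methods, and not something GPT-3 uses). The point is only that every training input arrives with its correct answer attached.

```python
# Toy supervised learning: every training input comes with its correct answer.
# The data and all names here are made up for illustration.

labelled_examples = [
    # (hours of daylight, temperature in Celsius) paired with a season label
    ((15.0, 28.0), "summer"),
    ((14.0, 25.0), "summer"),
    ((9.0, 3.0), "winter"),
    ((8.5, -1.0), "winter"),
]


def predict(features):
    """Return the label of the closest labelled example (1-nearest neighbour)."""

    def squared_distance(example):
        known_features, _label = example
        return sum((a - b) ** 2 for a, b in zip(known_features, features))

    _closest_features, label = min(labelled_examples, key=squared_distance)
    return label


print(predict((13.5, 24.0)))  # summer
print(predict((9.5, 1.0)))    # winter
```

Take away the season labels and this approach has nothing to learn from, which is exactly the limitation the next few paragraphs are about.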
Now, human beings don’t exactly use supervised learning to acquire skills and knowledge. For one, we don’t have an answer sheet for everything in life. If you have ever tried to learn how to ride a bicycle, you know that it’s just a lot of trial and error, a lot of trusting your gut, and taking help from other human beings.
Basically, since we hardly have a labelled dataset for learning in life, and we end up making inferences based on observations and experience, we do a lot of unsupervised learning.
For AI systems, unsupervised learning is widely seen as a promising way to generalize across tasks, and it is easier to scale because it doesn’t need labelled datasets. Many researchers expect that any system that hopes to emulate general intelligence will have to learn largely without supervision.
Like its predecessors, GPT-3 is an unsupervised learner (strictly speaking, a self-supervised one: the text itself supplies the “answer”, namely whichever word comes next). Everything it knows has come from unlabeled data, from which it has learnt to form connections and recognize patterns. Researchers fed the system a huge slice of the Internet, including web pages linked from popular Reddit posts, Wikipedia articles, news stories, and fan fiction. In fact, the entire collection of English Wikipedia articles makes up just 0.6 percent of GPT-3’s training data!
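How can a system learn from text that nobody has labelled? One way to picture it is that the text labels itself: every run of words can be paired with the word that follows it. The sketch below shows that idea in Python. The function name `next_word_pairs`, the three-word context, and the sample sentence are all assumptions made for illustration. GPT-3 itself works on sub-word tokens and far longer contexts, so treat this as a cartoon of the idea rather than the real pipeline.

```python
def next_word_pairs(text, context_size=3):
    """Turn raw, unlabelled text into (context, next word) training pairs."""
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])  # the words seen so far
        target = words[i]                           # the word to be predicted
        pairs.append((context, target))
    return pairs


raw_text = "the cat sat on the mat"
pairs = next_word_pairs(raw_text)

for context, target in pairs:
    print(context, "->", target)
# ('the', 'cat', 'sat') -> on
# ('cat', 'sat', 'on') -> the
# ('sat', 'on', 'the') -> mat
```

Nobody had to annotate that sentence by hand. Scale the same trick up to a large slice of the Internet and you get a practically endless supply of training examples.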
It’s like teaching a child as much about the world as you can, provided that the child can retain all of that information. For computers, of course, retaining information is a simple task.
GPT-3 uses this wealth of information to guess what words are likely to come next, given a prompt. If we want GPT-3 to write a news story for TechJuice on itself, we could feed it something like this: “GPT-3 is making waves as a learning model capable of producing content based on prompts.” The system will then end up producing an entire coherent news story, derived from this single statement.
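The "guess the next word, then repeat" loop can be sketched in a few lines of Python. To be clear, this is a toy: it counts which word follows which in a three-sentence corpus (a so-called bigram model), whereas GPT-3 is a neural network with 175 billion parameters. The corpus and the names `train_bigram_counts` and `continue_prompt` are invented for this example.

```python
from collections import Counter, defaultdict

# A tiny made-up "training set"; GPT-3 saw hundreds of billions of words.
corpus = (
    "gpt-3 is a language model . "
    "a language model predicts the next word . "
    "the next word depends on the prompt ."
)


def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def continue_prompt(counts, prompt, max_new_words=5):
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.split()
    if not words:
        return prompt
    for _ in range(max_new_words):
        candidates = counts.get(words[-1])
        if not candidates:
            break  # this word never appeared mid-corpus, so there is no guess
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


counts = train_bigram_counts(corpus)
print(continue_prompt(counts, "the", max_new_words=2))  # the next word
```

This toy only ever looks at the single previous word and always picks the most frequent follower. Real language models take the whole prompt into account and usually sample from the likely candidates instead of always choosing the top one, which is part of why their output feels far less mechanical.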
GPT-3 and General AI
So, has GPT-3 achieved general intelligence, the Holy Grail of all artificial intelligence? Not quite. You see, the nature of AI systems means that they can never achieve the kind of true, conscious “understanding” that we humans are capable of.
If you can speak Pashto, it’s because you took the time to understand what each term in Pashto means, along with the essence of its grammar and syntax. If a machine were to learn Pashto, it would never understand what the terms mean, but with enough data at its disposal, it will be able to recognize Pashto phrases and replicate them based on the situation.
This is the point to consider here: GPT-3 does not do a good job of actually understanding something. However, it does an excellent job of mimicking understanding. And at the end of the day, that near-flawless replication in itself is pretty impressive. Fake it till you make it, as the famous phrase goes, and GPT-3 follows it to the letter.
As a final note, let’s see what GPT-3 has to say about itself. Note that even though the following words look incredibly self-aware and insightful, they were made possible thanks to some excellent prompting and the system using whatever it has learnt from the Internet to give an impressive response:
“I can have incorrect beliefs, and my output is only as good as the source of my input, so if someone gives me garbled text, then I will predict garbled text. The only sense in which this is suffering is if you think computational errors are somehow ‘bad.’”