Last year, OpenAI released the third version of its Generative Pretrained Transformer model (GPT-3), to much excitement amongst the tech and business communities – so much, in fact, that OpenAI’s CEO tweeted “the hype is way too much.” GPT-3 has astonished observers with groundbreaking examples of code, news articles, translations and even poetry which evaluators have difficulty distinguishing from human-written output. Fundamentally, it simply autocompletes: give it a prompt, and it’ll predict what comes next. But the enormous dataset it was trained on, along with the sheer complexity of its architecture, has enabled it to achieve the best results yet. So, how exactly does this technology work, and where could it take us?
How does the technology behind GPT-3 work?
GPT-3 has improved upon its predecessors in a few key ways. Firstly, all of the GPT models are – as it says in the name – ‘transformer’ models. This architecture makes use of the ‘attention mechanism’, which allows a system to judge all previous states in the input at once (rather than processing each state sequentially, such that at any given state only the immediately preceding state is accessible). In practice, this means that if GPT-3 is processing a word within a text, it will have access to all previous words. Over time, as it encounters more texts, it will learn a way to weight each word in a given text according to how relevant it is to the overall task (i.e. how much attention to pay to it). Intuitively, the ability of GPT models to come up with their own relevance weightings means that they can process every word in a text at the same time without becoming overloaded, since the model is able to ‘prioritise.’
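To make the idea concrete, here is a minimal sketch of scaled dot-product attention, the core of the mechanism described above, using NumPy. The matrices `Q`, `K` and `V` are illustrative random stand-ins for learned word representations, and for simplicity the sketch omits the causal mask GPT models apply to hide future words:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: turns raw scores into weights summing to 1.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Every query scores every key at once -- no sequential scan over positions.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # per-word relevance weighting over all words
    return weights @ V, weights

# 4 words, each represented by an 8-dimensional vector (random for illustration)
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))

out, weights = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one updated vector per word
```

Each row of `weights` is exactly the learned ‘prioritisation’ described above: a distribution over all words saying how much attention the model pays to each.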
Another significant improvement made by GPT-3 is its sheer scale. While GPT-2 contained 1.5 billion parameters, GPT-3 contains a mindblowing 175 billion (these are the weights of the connections in a neural network, and the number of parameters describes the scale of the model). Scaled-up models can develop accurate pattern-recognition capacities and learn information from context much more quickly. The graph below, taken from the original paper on GPT-3, shows how accurately models of different sizes can respond to a task given no examples (zero-shot), one example (one-shot) or a few examples (few-shot):
Finally, this method of providing a task description for each task along with varying numbers of examples, as opposed to fine-tuning the model using a large dataset specific to a certain task, is much more efficient. Collecting a large dataset each time the model needs to adapt to a new task is difficult and limits the model’s applicability; instead, GPT-3 can use its general pattern-recognition and language-processing capacities to adapt to a wide variety of tasks without the need for a new round of fine-tuned training.
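The few-shot approach described above amounts to assembling a single prompt out of a task description and a handful of examples. A minimal sketch of that assembly (the `=>` formatting and the helper name are illustrative choices, not a fixed convention):

```python
def build_few_shot_prompt(task_description, examples, query):
    """Assemble a few-shot prompt: task description, worked examples,
    then the new input for the model to complete."""
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("bread", "pain")],
    "apple",
)
print(prompt)
```

Because the model only ever sees this prompt text, switching tasks means switching prompts; no retraining or task-specific dataset is needed.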
What results has this model already achieved?
The elimination of the need for fine-tuning has made GPT-3 the most versatile system yet. It can complete a wide range of natural language processing tasks extremely well: to name a few, it can complete sentences (‘fill in the blank’ style) with up to 86.4% accuracy, answer questions requiring broad factual knowledge with up to 74% accuracy, and complete stories by selecting the best ending with 79% accuracy. Despite 93% of its enormous dataset being English-only, it gives results comparable to neural machine translation models (such as the one which powers Google Translate) with only a few examples of accurately-translated sentences. Perhaps the most widely-reported result is the incredibly realistic news articles GPT-3 has produced: in one evaluation, only 12% of human evaluators correctly guessed that an article generated by GPT-3 was not in fact written by a human.
Although the source code is private, hundreds of companies are already using the OpenAI API to automate a variety of tasks requiring the comprehension or production of language. One example is Viable, a customer service analytics tool. It uses GPT-3’s information extraction capacity to sort through hundreds of helpdesk tickets, survey responses, reviews, and other feedback sources extremely quickly and generate concise summaries of positive and negative feedback and recommendations for what to prioritise. GPT-3’s speed and accuracy make it incredibly easy to stay on top of customer feedback even when large volumes of new material are being submitted every day. Since it is also adept at producing its own texts, its summaries are clear and comprehensive. This illustrates one of the key strengths of technology like GPT-3 in this data-driven era: with so much information at our fingertips and such rapidly-evolving customer preferences and trends, it is crucial to develop tools to deal with these large information loads effectively.
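A feedback-summarisation workflow of the kind described above boils down to folding the raw feedback into a single prompt and sending it to the API. The sketch below is purely illustrative: the `tickets` data and prompt wording are invented, and the API call (shown commented out, using the completions endpoint as it existed at the time of writing) requires an OpenAI API key:

```python
# Hypothetical helpdesk feedback -- invented for illustration.
tickets = [
    "The new dashboard is much faster, love it.",
    "Export to CSV keeps failing on large reports.",
    "Search results feel irrelevant since the last update.",
]

prompt = (
    "Summarise the positive and negative themes in this customer feedback, "
    "then recommend one priority:\n\n"
    + "\n".join(f"- {ticket}" for ticket in tickets)
    + "\n\nSummary:"
)

# With an API key configured, the summary could be requested like so:
# import openai
# response = openai.Completion.create(engine="davinci", prompt=prompt, max_tokens=120)
# print(response.choices[0].text)

print(prompt)
```

The heavy lifting happens inside the model; the integration work is largely about formatting inputs and routing outputs.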
GPT-3: an AI brainstorming partner?
There are a few limitations to consider when envisaging what impacts this technology could have in the future. Firstly, it requires huge amounts of computational power, making it costly to run. Training a model with 175 billion parameters is estimated to have cost millions of dollars in compute alone, putting it far beyond the reach of most organisations. Currently, as well as being inaccessible to some companies on cost grounds, GPT-3’s use is also kept under tight control by its parent company, OpenAI – so even if millions of businesses worldwide decided to transform their workflow overnight by adding a GPT-3-backed tool, they couldn’t. This will make the impact of GPT-3 in the near future largely dependent on what OpenAI want the technology to be used for, and whether they decide to radically open up access any time soon.
The technology itself also has some limitations. As obvious as it seems, GPT-3 is not an artificial general intelligence: it does not have reasoning capacity of its own. What’s more, it inherently lacks a large amount of context about the world: specifically, context grounded in physical experience. This is why it has trouble answering questions such as “If I put cheese in the fridge, will it melt?” Since GPT-3 is fundamentally a predictive tool, all it really does is take in large amounts of information and suggest what could come next, albeit with impressive accuracy and nuance – and there’s no guarantee that framing every problem in natural language processing as a prediction problem is the most effective approach.
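The ‘prediction, not reasoning’ point can be made vivid with a deliberately tiny model. The bigram predictor below simply counts which word follows which in its training text and suggests the most frequent continuation; GPT-3 does something conceptually similar at vastly larger scale and sophistication (the corpus and helper names here are invented for illustration):

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for each word, which words have followed it in training text."""
    follows = defaultdict(Counter)
    words = corpus.split()
    for current_word, next_word in zip(words, words[1:]):
        follows[current_word][next_word] += 1
    return follows

def predict_next(follows, word):
    # Suggest the most frequent continuation seen in training, if any.
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

model = train_bigrams("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # "cat" -- the most common word after "the"
print(predict_next(model, "sat"))  # "on"
```

Like GPT-3, this model has no understanding of cats or mats: it only knows what tends to come next. The gap between this toy and GPT-3 is one of scale and architecture, not of kind.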
GPT-3 may not be able to think, but it can complement and speed up every stage of the human brainstorming process by summarising information and suggesting additions and edits to text (or even code) far more quickly than another human could. Even the simple idea of a text prediction model can take us very far indeed.
This blog was brought to you by TypeGenie. TypeGenie is an auto-complete product for customer service agents. If you are looking to improve your customer service speed and quality: Learn more about TypeGenie >>