OpenAI and ChatGPT
I had two weeks off at the end of 2022. Telesign closed its operations for the last week of 2022 to give employees well-deserved time off after a year of hard work, and I took another week off in addition to that. Like many technologists, I was captivated by OpenAI’s release of ChatGPT, and I spent much of those two weeks exploring what OpenAI has to offer.
ChatGPT is a chatbot developed by OpenAI. It has been widely recognized for its impressive capabilities, such as imitating human writing, transforming plain English into code, and making glaringly stupid mistakes. The underlying technology of ChatGPT is a machine learning model known as GPT-3, which is designed to predict how a human might continue a previous piece of text. For example, given what I’ve written so far, GPT-3 predicted what the rest of this blog post would look like:
I was intrigued by the potential of ChatGPT, and I wanted to see how I could use it to create something unique. After some experimentation, I managed to create a chatbot that could generate a personalized poem for any given topic. The poem generator uses a combination of OpenAI’s GPT-3 and my own custom code to generate poems that are tailored to the user’s input.
The results of my experiment were quite fascinating. The poem generator was able to generate poems that were surprisingly good and unique. Some of the generated poems were even able to capture the emotion of the topic, which was something I didn’t expect.
I believe that the success of my experiment is due to the combination of OpenAI’s GPT-3 and my own custom code. With the help of OpenAI’s LLMs, I was able to quickly develop a prototype that could generate personalized poems in a matter of minutes. Furthermore, the use of my own custom code enabled me to fine-tune the model and create a personalized poem generator that is tailored to each user’s needs.
I am truly grateful for the opportunity to be able to explore OpenAI’s technologies and to create something unique with it. I am confident that OpenAI’s technologies will continue to open up new possibilities and opportunities for technologists all over the world.
For reasons I don’t fully understand, GPT-3’s ability to predict human writing gives it emergent capabilities to write essays and code. It is as if GPT-3 is so good at mimicry that it can perform simple tasks, such as summarization and coding, mechanically, without understanding what it’s doing. It is, in a way, the ultimate example of “fake it till you make it.” Yet these emergent capabilities can be incredibly powerful tools. In the last two weeks, I’ve used it to draw ASCII art for my technical designs, turn draft designs into markdown, and predict how an editor would improve my writing.
ChatGPT, a research model, is temporarily free to a limited audience. If you want to build a product using the same technology, you will need to use OpenAI’s Completion API directly. The Completion API offers you access to the “text-davinci-003” model, which is a vanilla version of GPT-3 without the specialized fine-tuning ChatGPT received. (We’ll come back to fine-tuning later.) An example of such a product is a copy-editor that generates several possible improvements to a piece of writing:
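As a rough illustration, here is a minimal sketch of how such a copy-editor might call the Completion API, using the `openai` Python library as it existed in early 2023. The prompt wording, parameter values, and the `copy_edit` helper are my own assumptions for this sketch, not the code behind any actual product:

```python
import openai

openai.api_key = "sk-..."  # your OpenAI API key

def copy_edit(text: str, n_suggestions: int = 3) -> list[str]:
    """Ask text-davinci-003 for several possible rewrites of `text`."""
    prompt = (
        "Rewrite this to be clear, assuming an educated audience:\n\n"
        f"{text}\n\n"
    )
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=256,   # cap on the length of each rewrite
        temperature=0.7,  # some randomness so the suggestions differ
        n=n_suggestions,  # ask for several completions in one call
    )
    return [choice.text.strip() for choice in response.choices]

suggestions = copy_edit(
    "The boy is named John. The boy ate an apple. "
    "The boy ate broccoli. The boy was happy."
)
for i, s in enumerate(suggestions, 1):
    print(f"Suggestion {i}: {s}")
```

Asking for several completions with `n` rather than looping keeps the call cheap, since the prompt tokens are only sent once.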
From my experience, the vanilla text-davinci-003 model is as good as ChatGPT at copy-editing. I suspect this is because the fine-tuning ChatGPT received is designed to make it better at “talking to a human” rather than at performing a specific task. To get the vanilla model to rewrite a piece of text, you’d have to give it a custom prompt. For example, I’d submit this piece of text, called a “prompt,” to the Completion API:
Rewrite this to be clear, assuming an educated audience:
The boy is named John. The boy ate an apple. The boy ate broccoli. The boy was happy.
The Completion API then predicted the rest of the text and returned:
Rewrite this to be clear, assuming an educated audience:
The boy is named John. The boy ate an apple. The boy ate broccoli. The boy was happy.
The boy, John, ate both an apple and broccoli, which made him happy.
GPT-3 simply predicted what someone is likely to write when given the instruction to rewrite a sentence. Creating a product with GPT-3 involves understanding how to structure user input to obtain a desired effect, which may require a significant amount of trial and error. To make the process easier, OpenAI provides an online playground to preview how GPT-3 will respond to different prompts.
Another way to customize GPT-3 is through fine-tuning, which allows you to create a customized model with your own training data. Although I have not experimented with this API yet, I am looking forward to doing so and writing about it.
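Since I haven’t tried it myself, take this with a grain of salt: based on OpenAI’s documentation at the time, fine-tuning expects a JSONL file of prompt/completion pairs, which you upload and then train a base model on, roughly like this (the file name and the example pairs are invented for illustration):

```python
import json
import openai

openai.api_key = "sk-..."  # your OpenAI API key

# Fine-tuning data is JSONL: one {"prompt", "completion"} pair per line.
# These pairs are made up to suggest the shape of a copy-editing dataset.
examples = [
    {"prompt": "Rewrite: The cat sat. The cat purred.\n\n",
     "completion": " The cat sat and purred."},
    {"prompt": "Rewrite: It rained. We stayed in.\n\n",
     "completion": " Because it rained, we stayed in."},
]
with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# Upload the file, then kick off a fine-tune of a base model.
upload = openai.File.create(
    file=open("training_data.jsonl", "rb"), purpose="fine-tune"
)
fine_tune = openai.FineTune.create(training_file=upload.id, model="davinci")
print(fine_tune.id)  # poll this job until the custom model is ready
```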
As for costs, the most powerful model from OpenAI costs $0.02 per 1,000 tokens as of January 2023. Assuming this post has 1,250 tokens and a copy-editor consumes equal amounts of tokens for input and output, rewriting this post once will consume about 2,500 tokens, or $0.05. That is much cheaper than paying a copy-editor.
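If you’d rather count tokens than guess, OpenAI’s tiktoken library tokenizes text locally. A small sketch of the arithmetic, assuming the January 2023 price above and a hypothetical post.txt file holding this post:

```python
import tiktoken

PRICE_PER_1K_TOKENS = 0.02  # text-davinci-003, as of January 2023

def estimate_rewrite_cost(text: str) -> float:
    """Estimate one rewrite: input tokens plus an equally long output."""
    enc = tiktoken.encoding_for_model("text-davinci-003")
    n_tokens = len(enc.encode(text))
    total_tokens = 2 * n_tokens  # assume output is about as long as input
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS

post = open("post.txt").read()  # hypothetical file containing this post
print(f"~${estimate_rewrite_cost(post):.2f} per rewrite")
```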
Large language models such as GPT-3 contain a massive amount of human knowledge, and extracting relevant information from them requires skill and expertise. As large language models become more commonplace, being able to work with different models will become increasingly valuable, as how they respond to prompts may differ greatly. For technologists, a deep understanding of how to use these models will be a significant advantage. Learning to work with different models and honing prompt-hacking skills will be just as valuable in the next two decades as Google-fu was in the last two. For anyone who has yet to work with a large language model, I highly recommend setting up an OpenAI account and making it part of their daily routine.