OpenAI’s GPT and other large language models (LLMs) have unleashed a kind of breathless energy from parts of the tech community that have invested their time in the cutting edge of computing and recognize an opportunity when they see one. In my previous post, I took a serious look at how LLMs can help developers now, and at the existing utility of Copilot. But what about the degree to which everyone else can create software? Despite the lack of hard-hitting examples, there are now enough clues to see where all of this is probably headed.
Introduction to conversational programming
First, a fairly important naming issue. I will refer to conversational programming in this post, although that term is already very familiar in another domain (CNC machine tool software). Claiming it by force will not be trivial. However, I think the term “prompt engineering,” while popular among users of image generation services like Midjourney, implies a technical approach that is the antithesis of a conversation.
So what do I mean by “conversational programming”? ChatGPT has shown us that we no longer need a “machine whisperer” to get a computer to do useful things. Conversational programming moves the focus away from the specialization and planning associated with the classic use of software, towards informality and reactive steps. The conversation between a crew member and the ship’s computer in Star Trek is the hidden mental model held by most geeks, and it remains a useful lodestar. Whether through text or voice, computing will slowly fade into the background, much like Homer Simpson backing into the hedge. But there are a number of rules that should guide the early assessment of these systems, and I set out some of them below.
Let’s have a talk
Conversations in real life begin with a common or shared context. This helps us narrow down the seemingly limitless possibilities of language. LLMs need this context for the same reason: a conversation is very different from a blinking cursor in a system that has no expectations of past or future. Most innovations will arrive within existing applications because their users already share a context. The word “chat” at the front of “ChatGPT” does more work than you might think: it prepares us for the idea of a casual conversation with a stranger. Likewise, that computer in Star Trek shares the same circumstances as the crew.
Different conversations, same result
Two people explaining the same thing in different ways must get the same results if we are to trust a system. Otherwise, we just go back to prompt engineering and machine whisperers. While it seems like this should be covered by the training data, we already know that ChatGPT has a habit of responding in different ways to similar inputs. If GPT systems have access to a set of frequent requests, then they can better reason about the likely meaning of similar requests. Since ChatGPT is currently tied to a single company, and if we do not look too deeply into the legal and security issues, this is a reasonable proposal.
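To make the idea of a set of frequent requests concrete, here is a minimal sketch of how differently worded requests could be mapped to the same meaning. Everything in it is my own assumption for illustration: the catalogue, the intent names, and the crude word-overlap score, which stands in for whatever reasoning an LLM would actually apply.

```python
import re
from typing import Optional, Set

# Hypothetical catalogue of frequent requests and the intent each one means.
FREQUENT_REQUESTS = {
    "create a new team wiki page": "wiki.create_team_page",
    "rename the team page": "wiki.rename_page",
    "book a meeting room for tomorrow": "calendar.book_room",
}


def tokens(text: str) -> Set[str]:
    """Lower-case the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def resolve_intent(request: str, threshold: float = 0.5) -> Optional[str]:
    """Return the intent of the most similar frequent request, or None.

    Similarity is the Jaccard overlap of the two word sets, a deliberately
    simple stand-in for real language understanding.
    """
    words = tokens(request)
    best_intent, best_score = None, 0.0
    for known, intent in FREQUENT_REQUESTS.items():
        known_words = tokens(known)
        union = words | known_words
        score = len(words & known_words) / len(union) if union else 0.0
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None
```

With this sketch, “Please create a new wiki page for my team” and “Make a new team wiki page” both resolve to the same hypothetical intent, while an unrelated question resolves to nothing and would need a clarifying reply.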
Not standards, but understood objects
In addition to an agreed context, there must be agreement on the form of the unit of work, the outcome, or the progress made. But not in the old sense of a “standard.” As long as conversions exist, it takes an LLM little or no effort to apply them and change the described form of an object to fit a fixed specification. This is because the GPT engine can keep track of what an image, calendar appointment, document, or rock is while applying external features to it. The implication is that applications and organizations that already specialize in modeling these domains will come under pressure to make them available for autonomous requests. At the moment, LLMs are largely formed from what’s on the web, and that is the same for everyone. What we don’t want is corporate training that instills a brand or a product as a base type.
Hello AI, this is AI calling
Speaking of autonomous requests: to achieve their goals, LLMs will launch background tasks that return with the required information. AutoGPT projects attempt to use APIs to connect to other LLMs and act as task management agents. What sets an LLM apart from any other goal-oriented system is that it can analyze its own reasoning and even criticize the outcome.
How will people adapt their existing systems? This is the job of ChatGPT plugins, and Simon Willison gives a good overview of how to use them. He maintains Datasette, an established tool for exploring and publishing data:
“Creating ChatGPT plugins, like everything related to large language models, is both really easy and deceptively complicated. You provide ChatGPT with a brief human-language description of your plugin and how to use it, and a machine-readable OpenAPI schema with the API details. And that’s it! The language model figures out everything else. Datasette exposes a JSON API that speaks SQL. ChatGPT already knows SQL, so all I had to do was give it some hints.”
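The two ingredients in that quote, a human-language description and a machine-readable OpenAPI schema, can be sketched roughly as follows. This is an illustration under my own assumptions rather than the actual plugin manifest format or Datasette’s real API: the `/query` path, the `runSql` operation name, the key names in the returned dictionary, and the wording of the hints are all hypothetical.

```python
import json


def build_plugin_definition(base_url: str) -> dict:
    """Pair a plain-language description with a minimal OpenAPI schema."""
    # Hypothetical hints for the model, written the way you would brief a person.
    description_for_model = (
        "Run read-only SQL queries against the published data. "
        "Write SQL SELECT statements and pass them in the sql parameter."
    )
    # Minimal OpenAPI 3 document describing one hypothetical endpoint.
    openapi_schema = {
        "openapi": "3.0.1",
        "info": {"title": "Example data API (hypothetical)", "version": "0.1"},
        "servers": [{"url": base_url}],
        "paths": {
            "/query": {
                "get": {
                    "operationId": "runSql",
                    "summary": "Execute a SQL query and return rows as JSON",
                    "parameters": [
                        {
                            "name": "sql",
                            "in": "query",
                            "required": True,
                            "schema": {"type": "string"},
                        }
                    ],
                    "responses": {"200": {"description": "Query results"}},
                }
            }
        },
    }
    return {"description_for_model": description_for_model, "openapi": openapi_schema}


if __name__ == "__main__":
    print(json.dumps(build_plugin_definition("https://example.com"), indent=2))
```

The point of the sketch is how little is there: one paragraph of hints and one endpoint description, with the model left to work out how to combine them.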
The human stack
The scope of a conversation should reflect a human “mental stack,” not that of a computer. When I use a conventional Windows interface on my laptop, I am faced with the computer’s file system, which is presented as folders and files, and the effort of working within that model falls on me. That effort is inverted in conversational programming: the LLM system has to work with my limited human cognitive capabilities. This means creating things in response to requests and reporting the results at the same level at which I requested them. Returning arcane error codes in response to requests will immediately break the conversation. We’ve already seen ChatGPT reflect on its mistakes, which means that a conversation can retain its value to the user.
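As a small sketch of what reporting at the level of the request might look like, the function below turns internal failures into replies a person can act on. The error codes and the wording are hypothetical, chosen only to illustrate the idea that the code itself never reaches the user.

```python
from typing import Optional

# Hypothetical internal error codes mapped to explanations at the user's level.
PLAIN_LANGUAGE = {
    "ERR_403": "you don't have permission for that yet",
    "ERR_404": "I couldn't find what you asked for",
    "ERR_409": "something with that name already exists",
}


def conversational_reply(action: str, error_code: Optional[str] = None) -> str:
    """Describe the outcome of an action without exposing internal codes."""
    if error_code is None:
        return f"Done: {action}."
    # Unknown codes still get a human-level explanation rather than the raw code.
    reason = PLAIN_LANGUAGE.get(error_code, "something unexpected went wrong on my side")
    return f"I couldn't {action} because {reason}. Would you like me to try another way?"
```

A call such as `conversational_reply("create the page", "ERR_403")` keeps the conversation going by explaining the problem and offering a next step.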
Midjourney responds to requests for new images by showing you four alternatives, and you can then continue working with them. Discord’s scrolling log (Midjourney is commonly used within the Discord app) acts as a conventional “memory”, but otherwise I can work moment by moment.
An example of a conversation
Below is an example of me creating a wiki page through a conversation with a GPT-enhanced corporate wiki server.
David: Create a new team wiki page
Wiki: Hi David. You don’t have permission to create pages in the company wiki space at this time, but I’ll email the owner and ask for permission. In the meantime, I’ll create a temporary page in your private area and publish it once permission is granted. The space has a team template, so I’ll use that. I will call your team “Team Amazing Spoon”; you can change the name before publishing it. I added a team members section and added you as the first member. I have emailed you the link to the provisional page.
David: Rename the team page to “Team Awesome Teacup”
Wiki: I changed the name of the page to “Team Awesome Teacup”. I will register the page in the teams directory when it is published.
Notice how the system addresses the first problem and handles the required default information. It also understands the standard patterns that are already used. By removing the small obstacles that make technical tasks much less straightforward, the average employee can gain confidence in creating resources, without in any way degrading the existing process.
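The behavior in that exchange can be sketched as a fallback rather than a failure. This is a toy in-memory model under my own assumptions: the class names, the space names, and the idea of recording a permission request in a list are hypothetical, and a real wiki server would do far more.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Page:
    title: str
    space: str
    members: List[str]
    published: bool


@dataclass
class Wiki:
    # Users allowed to create pages in the company space (hypothetical model).
    writers: Set[str] = field(default_factory=set)
    # Users on whose behalf the owner has been asked for permission.
    permission_requests: List[str] = field(default_factory=list)

    def create_team_page(self, user: str, default_title: str = "Team Amazing Spoon") -> Page:
        """Create a team page, falling back to a private draft if needed."""
        if user in self.writers:
            return Page(default_title, "company", [user], published=True)
        # No permission: ask the owner, then keep working in the private area.
        self.permission_requests.append(user)
        return Page(default_title, f"private/{user}", [user], published=False)

    def rename(self, page: Page, new_title: str) -> Page:
        """Rename a page in place, whether or not it is published yet."""
        page.title = new_title
        return page
```

The design choice worth noticing is that the missing permission changes where the page lives, not whether the request succeeds, and that sensible defaults (a name, a first member) are filled in rather than asked for.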
The democratization of computing versus the risks
The industrialization of LLMs is the only thing we can be reasonably sure of, because the investment has already been made. However, the rapid advancement of GPT systems will likely run aground in the same areas as other large-scale projects in the past. The lack of collaboration between major competitors has undermined countless good ideas that depended on interoperability. Furthermore, no matter how slowly the law advances, it will catch up. How many autonomous decisions do you want based on Wikipedia articles?
Today, apps already give people notable access to data, but typically that’s on a personal basis. Conversational programming will lead to a democratization of computing, in the sense that more people will be able to take responsibility for a broader range of tasks, tasks that we would previously have described as specialized. And that includes building systems that others will then use. But don’t expect many of these to hit the consumer market as quickly as the progress of GPT systems might imply.