Auto-GPT is a revolutionary technology that unleashes new abilities for ChatGPT, enabling it to complete tasks all by itself, creating it’s own prompts to get the job done.
The buzz around Auto-GPT has recently surpassed ChatGPT itself, trending number one on Twitter for several days in a row.
What is Auto-GPT?
Auto-GPT is an experimental open-source interface to GPT-4 and GPT-3.5 that enables self-guided (autonomous) task completion.
One only has to provide a list of tasks that need completion and Auto-GPT completes them.
In contrast to ChatGPT which requires numerous detailed prompts, Auto-GPT generates its own prompts to complete the given goals.
If necessary, Auto-GPT will will access websites and search engines to gather data to complete tasks.
What makes its ability to gather external data extraordinary is that Auto-GPT is self-evaluating and can verify the accuracy of the collected data and discard what’s incorrect or substandard and spawn a new subtask to gather better data.
This ability to self-generate prompts to complete tasks is why it’s referred to as an autonomous AI agent.
The official GitHub page for Auto-GPT describes it like this:
“Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model.
This program, driven by GPT-4, chains together LLM “thoughts”, to autonomously achieve whatever goal you set.
As one of the first examples of GPT-4 running fully autonomously, Auto-GPT pushes the boundaries of what is possible with AI.”
To use Auto-GPT one needs to first create a paid account at OpenAI.
After the paid account is created the next step is to obtain an OpenAI API which will connect Auto-GPT to your OpenAI access account and bill you for whatever amounts you use.
API stands for Application Programming Interface.
It’s a technology that enables software to securely communicate with another software.
The API allows Auto-GPT to communicate with OpenAI’s GPT-4 and ChatGPT.
OpenAI explains how their API works:
“The OpenAI API can be applied to virtually any task that involves understanding or generating natural language, code, or images.
We offer a spectrum of models with different levels of power suitable for different tasks, as well as the ability to fine-tune your own custom models.
These models can be used for everything from content generation to semantic search and classification.
…The API is powered by a set of models with different capabilities and price points.
GPT-4 is our latest and most powerful model.
GPT-3.5-Turbo is the model that powers ChatGPT and is optimized for conversational formats.”
OpenAI account holders can set hard limits to how much OpenAI will charge and when the limit is reached the service stops working.
Users can also set a soft limit that will send a notification email to alert an account holder when a set limit is reached.
Pricing is based on a charge per a unit of measurement called a token.
A token can be thought of as a measurement of words.
OpenAI defines tokens like this:
“For English text, 1 token is approximately 4 characters or 0.75 words.”
The amount of how many words (tokens) that are sent through the API in the form of a prompt and the amount of tokens (words) contained in the output are used to calculate the usage costs.
One hundred tokens cost a fraction of a penny, $0.002
Examples of What Auto-GPT Can Do
Someone named, Jon Miller (@botzero_net) shared on Twitter a clever example of what Auto-GPT can do.
Write a Midjourney generative art prompt that will create a masterpiece to inspire fear in humans.”
@SullyOmarr tweeted what happened next:
“First: It went straight to google to find the top 5 waterproof shoes reviews.
Once it found links, it created questions for itself like
- “What are the pros and cons of each shoe”
- “What are the pros and cons of each top 5 waterproof shoe”
- “Top 5 waterproof shoes for men””
Then he documented the subsequent analysis:
“It continued to analyze the various sites, with a combination of googling, updating its queries, until it was happy with the results.
Here’s an example of when it thought “critically”.
It knew that some reviews could be biased to fake, so it had to validate the reviewer.”
The Auto-GPT agent spawned sub-agents that were assigned to analyze websites that were used for research and when the AI agent became stuck it figured out a way forward without any outside help.
Finally it finished the task, creating a multi-paragraph analysis of five shoes, listing the pros and cons of each, plus an introduction and a conclusion.
Shockingly, the entire research, analysis and creation process took only eight minutes and ten cents of GPT-4 use to complete.
How Does Auto-GPT Work?
The main feature that powers Auto-GPT is the ability to use ChatGPT to independently create prompts to plan how to complete a task and then create more prompts for finishing that task.
If the AI agent finds itself unable to complete the task it will create new prompts to figure out how to proceed.
Auto-GPT is a self-prompting AI agent that removes the need for creative and detailed prompts. All it needs is a set of goals for a task to complete.
It will generate the necessary prompts to complete the task.
This quality of Auto-GPT can be said to make GPT-4 and ChatGPT even more powerful, astoundingly capable.
One of the secrets to how Auto-GPT works is that it is able to create sub-tasks for each goal, breaking down each task into multiple steps.
Memory management provides Auto-GPT the ability to save important data for the short and long-term so that it doesn’t have to repeat steps, can store data for processing and to keep a running list of what it’s doing.
The GitHub page for Auto-GPT lists these important features that make Auto-GPT work:
- “Internet access for searches and information gathering
- Access to popular websites and platforms
- Long-Term and Short-Term memory management
- File storage and summarization with GPT-3.5
- GPT-4 instances for text generation”
What Do You Need to Make Auto-GPT Work?
Auto-GPT does not have a simple user interface like many consumer-facing software does.
But don’t let that be a discouragement because there is a way for everyone to use it.
There are two requirements to use Auto-GPT:
- An environment to run the program
- An OpenAI API key
The Auto-GPT GitHub page lists three environments to choose from:
- VSCode + devcontainer: It has been configured in the .devcontainer folder and can be used directly
- Python 3.10 or later
The GitHub page also links to a tutorial for installing Python on Windows.
Other Autonomous AI Agents
Auto-GPT is not the only autonomous AI agent, there’s another one called BabyAGI that’s a python script.
“This Python script is an example of an AI-powered task management system. The system uses OpenAI and Pinecone APIs to create, prioritize, and execute tasks. The main idea behind this system is that it creates tasks based on the result of previous tasks and a predefined objective.
The script then uses OpenAI’s natural language processing (NLP) capabilities to create new tasks based on the objective, and Pinecone to store and retrieve task results for context.
This is a pared-down version of the original Task-Driven Autonomous Agent (Mar 28, 2023).”
If all that sounds complicated, there’s still a way for non-developers to use AI Agents like Auto-GPT and BabyAGI.
Easy to Ways to Run an AI-Agent
The pace of AI innovation is incredibly fast and within a matter of two weeks developers created alternate ways to run Auto-GPT with user-friendly interfaces.
These interfaces are so brand new that they are currently in experimental or beta mode, but they work very well.
A brand new web-based AI agent user interface is Cognosys.ai. You still need an OpenAI API key to use Cognosys.ai web interface.
Once you have the OpenAI API key the next step is to sign in with your Google ID or create a log in and password.
Now using an AI agent is as simple as filling out a form and watching the machine complete the task.
Another easy to use interface is called AgentGPT, which is in beta. AgentGPT works similarly to Cognosys.ai.
AgentGPT describes itself like this:
“AgentGPT allows you to configure and deploy Autonomous AI agents.
Name your custom AI and have it embark on any goal imaginable. It will attempt to reach the goal by thinking of tasks to do, executing them, and learning from the results 🚀
This platform is currently in beta, we are currently working on:
- Long term memory
- Web browsing
- Interaction with websites and people”
A tweet from the creators of AgentGPT goes into further detail:
“It works through using models to generate a task list and then iteratively executes tasks, evaluating whether or not tasks are completed or require further sub-actions.
In the future, we’ll have long term memory via @pinecone and give models the ability to query the web…”
One of the latest AI agent interfaces is called Godmode.
To use it one first creates a task. The interface responds with prompts to use that define the task.
Choosing one of the prompts launches the AI agent which commences its work.
Godmode requires either a sign-in with a Google, GitHub, or Twitter account.
Using Godmode at this time doesn’t require an OpenAI API key to function but using one will add the power of GPT-4 to the Godmode output.
Autonomous AI Agents
Some people have been freaking out about ChatGPT.
But Autonomous AI agents like Auto-GPT reveal that there is more to what OpenAI products can do.
The breakthrough of autonomous AI agents is brand new and on the cutting edge. They’re produced by developers and not big companies like OpenAI and Google.
These technologies are still in experimental and beta stages but some of them are mature enough that they can accomplish amazing tasks at a level one expects from a human.
Technology like Auto-GPT make it easy to imagine a point where employers can hire one person to assign tasks to AI agents tasks to do the work of five employees.
It’s not difficult to imagine a time when employers can dispense with the human overseer and simply set loose an AI agent to manage the AI agents.