OpenAI Introduces Operator for Effortless Task Handling

By Huma Ishfaq | 1 year ago

At the beginning of this year, OpenAI CEO Sam Altman predicted in a blog post that artificial intelligence agents (tools capable of automating tasks and acting on behalf of users) would make a major impact in 2025. That vision is now taking shape.

OpenAI has now made its first serious effort.

On Thursday, OpenAI unveiled a research preview of Operator, a general-purpose AI agent that can autonomously control a web browser and carry out specific tasks. Operator will initially be available to US subscribers of ChatGPT Pro, the $200-per-month plan. OpenAI says it intends to expand availability to users on its Plus, Team, and Enterprise tiers in the future.

“[Operator] will be [in] other countries soon,” stated OpenAI CEO Sam Altman during a livestream on Thursday. “Europe will, unfortunately, take a while.”

The early research preview is available at operator.chatgpt.com; OpenAI plans to integrate Operator into all of its ChatGPT clients in the near future.

Features of Operator

Shopping online, making restaurant reservations, and booking travel accommodation are just a few of the tasks that OpenAI says Operator can automate. Within the Operator interface, users can choose from task categories that offer different forms of automation, including shopping, delivery, dining, and travel.

A small window appears when ChatGPT users enable Operator. It shows the agent’s dedicated web browser and explains what the agent is doing as it completes tasks. Because Operator uses its own dedicated browser, users retain control of their screen while it works.

To power Operator, OpenAI says it uses a Computer-Using Agent (CUA) model that combines the reasoning abilities of its more advanced models with the vision capabilities of its GPT-4o model. Because the CUA is designed to work with website front ends, it can access various services without using developer-facing APIs.

In simpler terms, the CUA is capable of interacting with online forms, menus, and buttons in the same way that a human would.

OpenAI says it is working with companies such as Uber, StubHub, DoorDash, eBay, Priceline, and Instacart to make sure Operator follows the rules those companies set out.

“The CUA model is trained to ask for user confirmation before finalizing tasks with external side effects, for example before submitting an order, sending an email, etc., so that the user can double-check the model’s work before it becomes permanent,” OpenAI explains in a Techjuice report. “[It] has already proven useful in a variety of cases, and we aim to extend that reliability across a wider range of tasks.”
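The confirmation behavior OpenAI describes can be pictured as a gate in front of any action with external side effects. The Python sketch below is illustrative only and is not OpenAI’s implementation: the Action type, the SIDE_EFFECT_KINDS set, and the run_action function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical action kinds treated as having external side effects.
# The article names submitting an order and sending an email as examples.
SIDE_EFFECT_KINDS = {"submit_order", "send_email"}


@dataclass
class Action:
    kind: str          # e.g. "click", "type", "submit_order"
    description: str   # human-readable summary shown to the user


def run_action(
    action: Action,
    execute: Callable[[Action], str],
    confirm: Callable[[Action], bool],
) -> str:
    """Run an action, asking the user first if it has external side effects."""
    if action.kind in SIDE_EFFECT_KINDS and not confirm(action):
        # The user declined, so nothing permanent happens.
        return "cancelled"
    return execute(action)
```

In this sketch, harmless steps such as clicking a menu run immediately, while an order submission only goes through if the confirm callback returns True.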

OpenAI cautions, however, that the CUA has its flaws. The company says it “[doesn’t] expect [the] CUA to perform reliably in all scenarios just yet.”

“Currently, Operator cannot reliably handle many complex or specialized tasks,” OpenAI says in a support article, “such as creating detailed slideshows, managing intricate calendar systems, or interacting with highly customized or non-standard web interfaces.”

Out of an abundance of caution, OpenAI requires supervision for specific tasks, such as financial transactions, that the CUA and Operator might otherwise undertake on their own. For example, users will have to take control to enter their credit card details. According to OpenAI, Operator does not record or capture any information.

“On particularly sensitive websites, such as email, Operator requires active user supervision, ensuring users can directly catch and address any potential mistakes the model might make,” according to the OpenAI documentation.

Sure, this reduces Operator’s utility, but it also prevents the agent from hallucinating and, say, spending your mortgage money on accent chairs. Google’s Project Mariner AI agent follows a similar strategy: it does not pre-fill fields with sensitive data like credit card information.

Limitations & Security Measures

It is important to be aware of Operator’s limitations. Both daily and task-dependent rate limits apply. According to OpenAI, there are “dynamic limits” on how many tasks Operator can handle simultaneously. There is also an overall daily usage cap.
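To make the two kinds of limit concrete, here is a minimal Python sketch of a limiter that enforces both a daily cap and a ceiling on simultaneous tasks. It is an assumption-laden illustration rather than a description of OpenAI’s system: the UsageLimiter class and its numbers are hypothetical, the simultaneous-task ceiling is fixed here whereas OpenAI calls its limits “dynamic”, and resetting the count when the date changes is assumed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class UsageLimiter:
    daily_cap: int        # hypothetical value; OpenAI has not published figures here
    max_concurrent: int   # fixed in this sketch; the real limit is described as dynamic
    _day: Optional[date] = None
    _used_today: int = 0
    _running: int = 0

    def try_start(self, today: date) -> bool:
        """Return True and count the task if both limits allow it."""
        if today != self._day:
            # Assumption: the daily count starts over when the date changes.
            self._day = today
            self._used_today = 0
        if self._used_today >= self.daily_cap:
            return False
        if self._running >= self.max_concurrent:
            return False
        self._used_today += 1
        self._running += 1
        return True

    def finish(self) -> None:
        """Mark one running task as finished."""
        self._running = max(0, self._running - 1)
```

A caller would invoke try_start before launching a task and finish once it completes; a False result means one of the two limits has been reached.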

For security reasons, Operator will not send emails (even though the CUA is capable of doing so) or delete calendar items at this stage of the release. OpenAI says this will change in the future, but provides no timetable.

Operator could also become “stuck” if it encounters an overly complicated interface, a password field, or a CAPTCHA check. According to OpenAI, if this happens, the user will be prompted to take control.
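The hand-off described here amounts to a simple rule: if the page contains something the agent should not or cannot handle, control passes to the user. The Python sketch below shows one way to express that rule. The page dictionary, its keys, and the MAX_INTERACTIVE_ELEMENTS threshold are hypothetical stand-ins invented for this example, not part of any OpenAI interface.

```python
# Hypothetical threshold above which a page counts as "overly complicated".
MAX_INTERACTIVE_ELEMENTS = 200


def detect_blockers(page: dict) -> list:
    """Return the reasons, if any, why the agent should hand control to the user."""
    blockers = []
    if page.get("has_captcha"):
        blockers.append("captcha")
    if page.get("has_password_field"):
        blockers.append("password_field")
    if page.get("interactive_elements", 0) > MAX_INTERACTIVE_ELEMENTS:
        blockers.append("complex_interface")
    return blockers


def who_controls(page: dict) -> str:
    """The user takes over whenever at least one blocker is present."""
    return "user" if detect_blockers(page) else "agent"
```

Returning the list of reasons, rather than a bare yes or no, would let an interface tell the user why they are being asked to step in.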

Future of Autonomous AI

The safety concerns around the technology may explain why OpenAI has been slower to release an AI agent than its competitors (see the agents from Rabbit, Google, and Anthropic).

When an AI system can act on the web, it opens the door to far riskier uses by malicious actors. An agent could be used to run email scams, launch DDoS attacks, or quickly buy concert tickets before anyone else. It is crucial that OpenAI takes measures to guard against this kind of abuse, particularly for a product as widely used as ChatGPT.

According to OpenAI, Operator is sufficiently stable to release in its current state, at least as a research preview.

“Operator employs tools that seek to limit the model’s susceptibility to malicious prompts, hidden instructions, and phishing attempts,” according to OpenAI’s website. “A monitoring system pauses execution if suspicious activity is detected, while automated and human-reviewed pipelines continuously update safeguards.”

Operator is OpenAI’s most ambitious effort to create an AI assistant. With the release of Tasks last week, ChatGPT gained basic automation tools such as reminders and the ability to schedule daily runs of prompts.

The addition of Tasks brought some much-needed familiar functionality, raising ChatGPT to a level of practicality comparable to Alexa or Siri. Operator, however, demonstrates skills that were unimaginable in earlier generations of virtual assistants.

Some have speculated that AI agents will revolutionize people’s interactions with computers and the internet, making them the industry’s next big thing after ChatGPT. Agents do not merely transmit and process data; they can also act and perform tasks.

With Operator, OpenAI has unveiled its first tangible approach to AI agents, offering a glimpse of how feasible and impactful the concept could truly be.