By Huma Ishfaq ⏐ 5 months ago ⏐ 3 min read
OpenAI Unveils Powerful ChatGPT Agent for Complex Digital Tasks

OpenAI has launched ChatGPT agent, a general-purpose AI tool designed to carry out complex digital tasks for users, going far beyond simple chatbot interactions. This upgrade marks a significant step towards turning ChatGPT into a truly agentic platform that acts on user instructions, rather than just responding to them.

What Can ChatGPT Agent Do?

According to OpenAI, the ChatGPT agent can now:

  • Navigate calendars
  • Generate and edit presentations
  • Execute code
  • Connect with third-party apps like Gmail and GitHub
  • Research competitors and create slide decks
  • Plan and shop for ingredients, for tasks like “making Japanese breakfast for four”

These tasks involve navigating websites, planning actions, and using external tools, all of which require far more coordination and reasoning than ChatGPT’s earlier capabilities.

How It Works

ChatGPT agent blends features from OpenAI’s previous agent tools. It inherits browsing and site navigation capabilities from Operator and research summarization power from Deep Research. Users can activate the agent by selecting “agent mode” in ChatGPT’s tools dropdown. Natural language is all it takes to give commands.

The new tool is available to Pro, Plus, and Team subscribers starting Thursday.

Significantly Improved Performance

OpenAI claims this model outperforms previous versions across several benchmarks:

  • Humanity’s Last Exam: Scores 41.6%, nearly double the performance of o3 and o4-mini.
  • FrontierMath (when tool access is enabled): Scores 27.4%, compared to just 6.3% by o4-mini.

This positions ChatGPT agent as a cutting-edge tool not just for productivity, but also for high-level reasoning and problem-solving.

OpenAI acknowledges that the ChatGPT agent introduces new risks due to its enhanced capabilities. As a result, it has been classified as “high capability” in the biological and chemical weapons domains, meaning the model could “amplify existing pathways to severe harm.”

Although there is no direct evidence of misuse so far, OpenAI is proactively implementing safeguards:

  • Real-time Monitoring: Every prompt is screened with a classifier that checks for biology-related content. If flagged, responses are passed through a second layer to assess for potential biological threats.
  • Memory Feature Disabled: To prevent potential misuse like prompt injection attacks, the agent doesn’t store or recall user conversations, a precaution OpenAI says may be revisited later.
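
The two-layer screening described under Real-time Monitoring can be pictured with a small sketch. This is only an illustration of the general idea, not OpenAI's implementation: the keyword lists, the function names (is_biology_related, is_potential_threat, screen) and the ScreeningResult type are all hypothetical, and simple keyword matching stands in for what would in practice be trained classifiers.

```python
from dataclasses import dataclass

# Hypothetical keyword lists standing in for real trained classifiers.
BIOLOGY_TERMS = {"pathogen", "virus", "toxin", "bacteria"}
THREAT_TERMS = {"weaponize", "aerosolize", "enhance transmissibility"}


@dataclass
class ScreeningResult:
    flagged_as_biology: bool  # outcome of the first-layer topic check
    blocked: bool             # outcome of the second-layer threat check


def is_biology_related(prompt: str) -> bool:
    """Layer 1 (assumed): a topic check applied to every prompt."""
    text = prompt.lower()
    return any(term in text for term in BIOLOGY_TERMS)


def is_potential_threat(response: str) -> bool:
    """Layer 2 (assumed): a stricter check, run only on flagged traffic."""
    text = response.lower()
    return any(term in text for term in THREAT_TERMS)


def screen(prompt: str, response: str) -> ScreeningResult:
    """Run the second layer only when the first layer flags the prompt."""
    if not is_biology_related(prompt):
        return ScreeningResult(flagged_as_biology=False, blocked=False)
    return ScreeningResult(
        flagged_as_biology=True,
        blocked=is_potential_threat(response),
    )
```

In this sketch, an ordinary request such as planning a breakfast never reaches the second check, while a biology-related prompt is flagged and its response is examined before anything is blocked.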

While the ChatGPT agent sets a new bar for general-purpose AI assistants, OpenAI acknowledges that early AI agents from other companies have often fallen short in real-world use. The true measure of success will be how effectively this new agent performs outside of benchmarks, in unpredictable, everyday scenarios.

That said, OpenAI is confident: this is its most capable AI agent yet.