VentureBeat, Mar 3, 02:00 PM
OpenAI's AI data agent, built by two engineers, now serves 4,000 employees — and the company says anyone can replicate it
When an OpenAI finance analyst needed to compare revenue across geographies and customer cohorts last year, it took hours of work — hunting through 70,000 datasets, writing SQL queries, verifying table schemas. Today, the same analyst types a plain-English question into Slack and gets a finished chart in minutes.
The tool behind that transformation was built by two engineers in three months. Seventy percent of its code was written by AI. And it is now used by more than 4,000 of OpenAI's roughly 5,000 employees every day — making it one of the most aggressive deployments of an AI data agent inside any company, anywhere.
In an exclusive interview with VentureBeat, Emma Tang, the head of data infrastructure at OpenAI whose team built the agent, offered a rare look inside the system — how it works, how it fails, and what it signals about the future of enterprise data. The conversation, paired with the company's blog post announcing the tool, paints a picture of a company that turned its own AI on itself and discovered something that every enterprise will soon confront: the bottleneck to smarter organizations isn't better models. It's better data.
"The agent is used for any kind of analysis," Tang said. "Almost every team in the company uses it."
A plain-English interface to 600 petabytes of corporate data
To understand why OpenAI built this system, consider the scale of the problem. The company's data platform spans more than 600 petabytes across 70,000 datasets. Even locating the correct table can consume hours of a data scientist's time. Tang's Data Platform team — which sits under infrastructure and oversees big data systems, streaming, and the data tooling layer — serves a staggering internal user base. "There are 5,000 employees at OpenAI right now," Tang said. "Over 4,000 use data tools that our team provides."
The agent, built on GPT-5.2 and accessible wherever employees already work — Slack, a web interface, IDEs, the Codex CLI, and OpenAI's internal ChatGPT app — accepts plain-English questions and returns charts, dashboards, and long-form analytical reports. In follow-up responses shared with VentureBeat on background, the team estimated it saves two to four hours of work per query. But Tang emphasized that the larger win is harder to measure: the agent gives people access to analysis they simply couldn't have done before, regardless of how much time they had.
"Engineers, growth, product, as well as non-technical teams, who may not know all the ins and outs of the company data systems and table schemas" can now pull sophisticated insights on their own, her team noted.
From revenue breakdowns to latency debugging, one agent does it all
Tang walked through several concrete use cases that illustrate the agent's range. OpenAI's finance team queries it for revenue comparisons across geographies and customer cohorts. "It can, just literally in plain text, send the agent a query, and it will be able to respond and give you charts and give you dash