
Illia Polosukhin on AI Agents and Why They Still Need Human Oversight

On a given day, Illia Polosukhin has a dozen agents completing different “missions” for him.

One such mission could be “I want to become a better CEO,” he said.

“So it effectively summarizes all of the meeting notes, Google Drive docs, Slack messages, and provides me with a coaching and executive summary of what happened, what I’m missing, and where decisions are stuck,” Polosukhin told Business Insider. “So that runs every week.”

Polosukhin calls these agents his “billionaire-chief-of-staff level of support.” The description is “literally” in the prompt, he said: “You’re a billionaire’s chief of staff.”

It’s an early glimpse of the future Polosukhin sees not only for individual workers or CEOs, but for the entire global economy: a world where agents can make trades, coordinate supply chains, and broker transactions on behalf of people and large companies. And in his view, we’re wholly unprepared for it.

“I think the bigger issue is that we have fundamentally not prepared the system for AGI (artificial general intelligence) being available,” he said, defining the system as “society, the internet, government institutions, etc.”

Polosukhin is one of the key figures behind generative AI. In 2017, he coauthored the seminal research paper “Attention Is All You Need,” which introduced the Transformer architecture, a novel approach to building AI models. That groundbreaking paper is the reason there’s a “T” at the end of ChatGPT.

Peeling back the black box

Very little about the trajectory of AI surprises the researcher-turned-founder.

The same year the Transformer architecture paper was published, Polosukhin started NEAR AI around the idea that machines could eventually generate software. His thesis was that humans would talk to computers in natural language, like English, and the machines would write the code.

“In 2017, that sounded pretty ridiculous,” he said. Today, it’s called vibe coding.

Polosukhin is also unsurprised by the capabilities some models are now showing. Anthropic on Tuesday said its latest preview model, Mythos, is so capable of finding and exploiting vulnerabilities that the lab is limiting access.

Polosukhin said he had been warning for years that “models will start breaking everything.” He described it to Business Insider as a “cat and mouse” game, where each model iteration can break whatever the previous model fixed.

In a world where people manage their health — or corporations manage logistics — with AI agents, Polosukhin sees a need for a backend trust and security layer meant to guard against those risks.

At NEAR, Polosukhin is building infrastructure to reduce AI agents’ dependence on a single company, such as a frontier AI lab, for controlling and overseeing every step of a task.

In practice, that could mean an AI agent — one that handles your login information, books your travel, and moves money to pay for an airline ticket — wouldn’t require a user to blindly trust a single gatekeeper.

“This is going to have all your information,” Polosukhin said of AI models handling data. “Literally, your life will be there. So you don’t want any singular company to have control or access to this.”

Another risk Polosukhin wants to guard against is manipulation. People are increasingly using AI to get information, from news summaries to investment suggestions. An AI lab, or a malicious actor within it, could quietly shape those answers, Polosukhin said.

One example came last year from xAI, when Grok repeatedly brought up “white genocide” in unrelated responses after what the company said was an “unauthorized modification” to its backend.

Polosukhin’s pitch with NEAR is to develop an open-source, auditable platform that gives users greater visibility into how an AI system operates, rather than treating it as a black box.

Supervision is what AI still needs

At the moment, his own agents are not fully trustworthy.

Polosukhin showed Business Insider how one of his agents can aggregate news around the US-Iran ceasefire and provide market reads. Others include “developer agents” that write code and a “growth agent” that proposes steps to increase a certain metric at his company.

As helpful as they are, Polosukhin doesn’t let an AI off the leash. The researcher said AI systems still need careful attention.

In his view, AI still struggles with sound judgment, even as online conversations overhype its current progress.

“If I just let it go and run and do things, I come back to something that makes no sense,” he said of AI models. “So you need to babysit it with your judgment.”
