VentureBeat, Apr 21, 04:55 PM
Kimi K2.6 runs agents for days — and exposes the limits of enterprise orchestration
Most orchestration frameworks were built for agents that run for seconds or minutes. Now that agents are running for hours — and in some cases days — those frameworks are starting to crack.
Several model providers, such as Anthropic with Claude Code and OpenAI with Codex, introduced early support for long-horizon agents through multi-session tasks, subagents and background execution. However, these systems sometimes assume agents are still operating within bounded-time workflows even when they run for extended periods.
Open-source model provider Moonshot AI wants to push beyond that with its new model, Kimi K2.6.
Moonshot says the model is designed for continuous execution, with internal use cases including agents that ran for hours and, in one case, five straight days, handling monitoring and incident response autonomously.
But the growing use of such agents is exposing a critical gap: most orchestration frameworks were not designed for continuous, stateful execution. Open-source models that rely on agent swarms, such as Kimi K2.6, are making the case that their orchestration approach comes close to managing stateful agents.
The difficulties of orchestrating long-running agents
Although some enterprises would rather bring their own orchestration frameworks to their agentic ecosystem, model providers and agent platforms recognize that offering agent management remains a competitive advantage.
Other model providers have begun exploring long-running agents, many through multi-session tasks and background execution. For example, Anthropic’s Claude Code orchestrates agents with a lead agent that directs other agents based on a set of user-instructed definitions. OpenAI’s Codex runs similarly.
Kimi K2.6 approaches orchestration with an improved version of its Agent Swarms, capable of managing up to 300 sub-agents “executing across 4,000 coordinated steps simultaneously,” Moonshot AI wrote in a blog post. Unlike Claude Code and Codex, K2.6 relies on the model, rather than pre-defined roles, to determine orchestration.
Kimi K2.6 is now available on Hugging Face, through its API, Kimi Code and the Kimi app.
Practitioners experimenting with long-horizon agents say the brittleness runs deeper than prompting can fix.
As one practitioner, Maxim Saplin, put it in a blog post, “That does not mean subagents are useless. It means orchestration is still fragile. Right now, it feels more like a product and training problem than something you can solve by writing a sufficiently stern prompt.”
The core problem with long-running agents is that their state is difficult to maintain, especially as the environment keeps changing while they work. Over its runtime, such an agent constantly calls different tools and APIs or taps into different databases. Most current agents, which may run for only one or two executions, also call different tools, but for a minute at most.
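To make the state problem concrete, here is a minimal sketch of one common mitigation: checkpointing an agent's state after every step so that an interrupted run can resume where it left off. It is an illustration only, not how Kimi K2.6, Claude Code or Codex manage state. The names `AgentState`, `save_checkpoint`, `load_checkpoint` and `run` are hypothetical, and the "tool call" is a stand-in string.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class AgentState:
    """Hypothetical state for a long-running agent (illustrative only)."""
    step: int = 0
    tool_results: dict = field(default_factory=dict)


def save_checkpoint(state: AgentState, path: str) -> None:
    # Write to a temp file, then rename over the target, so a crash
    # mid-write never leaves a half-written checkpoint behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(asdict(state), f)
    os.replace(tmp_path, path)


def load_checkpoint(path: str) -> AgentState:
    # Start from a fresh state if no checkpoint exists yet.
    if not os.path.exists(path):
        return AgentState()
    with open(path) as f:
        return AgentState(**json.load(f))


def run(path: str, total_steps: int, stop_after: Optional[int] = None) -> AgentState:
    """Resume from the last checkpoint and work toward total_steps.

    stop_after simulates an interruption after that many steps in this call.
    """
    state = load_checkpoint(path)
    done_this_call = 0
    while state.step < total_steps:
        if stop_after is not None and done_this_call >= stop_after:
            break
        # Stand-in for a real tool or API call (assumption).
        state.tool_results[str(state.step)] = f"result-{state.step}"
        state.step += 1
        save_checkpoint(state, path)
        done_this_call += 1
    return state
```

Calling `run("ckpt.json", 5, stop_after=2)` and then `run("ckpt.json", 5)` completes the remaining three steps without redoing the first two. The sketch only covers the agent's own state. It does nothing about the harder part of the problem, which is that the external environment may have changed between the checkpoint and the resume.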
Mark Lambert, chief