← Back to blog

    The Agentic Workload

    Written by Igor Zalutski · March 15, 2026

    For a while now I've been sitting on this uneasy feeling that the code we write when building agents does not fit nicely into any of the existing "kinds" of code that we are used to from the pre-AI era. But I didn't know why; the only hunch I had was that every time I made another agent, the code that I wrote came out awkward - and it wasn't Claude's fault. It took embarrassingly many repetitions of building the same thing over and over again for it to "click". It feels obvious in hindsight assuming a few plausible priors are true.

    Leading coding agents are CLIs for a reason

    Claude Code, Codex, Amp and similar command-line agents are so universally loved because their creators have figured something counterintuitive: for the "right" system design of an agent to become practically useful without disastrous security consequences, most of the existing developer infrastructure needs to be rebuilt for agentic autonomy. You could wait for that to happen - or you could ship something that works today, even if it looks somewhat wrong.

    doesn't work

    The "right" design

    Stateless Server
    git ???
    npm ???
    docker ???

    these live on laptops, not behind APIs

    just works

    $ npm install -g claude-code

    $ claude

    ▸ Agent has access to:

    git, npm, docker, databases

    your codebase (read/write)

    all terminal tools

    no network overhead

    ready

    What does the right design look like? An agent is basically LLM calls with tools in a loop. It reacts to user input, produces outputs, manages context, and so on. One might naturally design such a thing as a stateless server-side application, running in a container or a serverless function. Tools would be API calls, each tool cleanly abstracting away the internals. On a whiteboard this looks great! Scalability, reliability, all that.

    The only problem is this right design doesn't work. Not because of technical reasons; but because the code that the agent needs to modify doesn't sit neatly in one github repo waiting to be pulled. It's not even about the code - it's about all the countless utilities and services that developers use for building. They are all on their laptops - and no one is going to bother setting up remote development environments just to try this new AI thing.

    The wrong design on the other hand solves it beautifully. A CLI meets developers where they are - there's no need to set up anything else other than installing Claude Code or Codex. If a tool can be used by hand in the terminal, the agent can use it too. This way the actual development environment is fully at the harness's disposal. Also no networking overhead on calling remote services for every step - so it feels snappy, much more so than first-gen cloud-based agents.

    The "wrong" design works because it meets developers where they are. A CLI agent has access to everything a developer has — no setup, no remote environments, no API abstractions.

    The rise of harness-based agent SDKs

    The best coding agents are indeed CLIs - but that does not mean that agentic coding can now only happen on people's laptops! All sorts of code-generating agents are exploding in popularity, many of which are fully autonomous, or are built for non-technical users. For example apps like Lovable allow anyone to create a fully functional application in a few prompts in the browser; Greptile reviews pull requests for bugs and security; many other agents are built for implementing fixes or entire features in response to Slack mentions or Linear tickets.

    All such agents work with code - so if you are building one, which agent framework should you use? Turns out Claude, Codex and other CLI harnesses got so good that if you choose a conventional LLM framework like Langchain or AI SDK you'll have to put in a lot of work just to match their performance with code, especially on more challenging assignments that might take longer. Anthropic and other coding CLI creators put in a lot of effort into their agents to make them stable in a wide range of coding scenarios; matching that is anything but trivial.

    The shift: from DIY to harness SDKs

    DIY Harness

    Langchain, AI SDK, custom loops

    CLI Agents

    Claude Code, Codex, Amp — the breakthrough

    People shelling out to CLIs

    calling Claude Code / Codex from app code

    Official SDKs

    Claude Agent SDK, Codex SDK

    Each layer builds on the one below — the industry converged on wrapping CLI harnesses, not replacing them.

    Realising that, people started simply calling Codex or Claude CLIs in their applications - and achieved superior performance compared to a DIY harness. Big labs noticed that pattern, and shipped SDKs that make such usage easier while still relying on the harness CLI under the hood ( Claude Agent SDK, Codex SDK). Bill Chen from OpenAI suggests to "shift your mindset" - stop making direct model calls and treat the harness as the pluggable building block instead.

    "Shift your mindset" — stop making direct model calls and treat the harness as the pluggable building block instead.

    So where does my agent live?

    Because leading coding agents are CLIs, the most useful SDKs for building coding agents ended up built around those CLIs. So if you are building an agent with Claude Agent SDK, it will shell out to the Claude Code CLI to do its thing. If you are new to building agents but built some web apps or distributed systems before, you might be freaking out - and rightfully so! What do you mean it shells out??? Like, starting another process… on the same host where my application runs? For every request??? Forget scalability or even basic reliability - because it's going to read and write files also, oh and could also run arbitrary code.

    These properties make the code written with these SDKs much more similar to a CI job in nature than it is to an application. Because every "run" of an agent could potentially overwrite or delete any file if the harness decides so; also its resource consumption profile cannot be known beforehand. But at the same time, it's clearly an application - an agent might need to respond to user requests, make API calls and so on. However deploying such code to destinations that the application code is traditionally deployed to - like containers or serverless functions - is clearly not a good idea, for the reasons stated above. So what do we do?

    Agentic Workload

    interactive + destructive + unpredictable

    Traditional App

    Predictable, stateless, doesn't touch the filesystem

    CI Job

    Isolated, runs anything, fire-and-forget

    Responds to requests
    Makes API calls
    Interactive / multi-turn
    Runs arbitrary code
    Modifies files freely
    Isolated environment
    Predictable resources
    AppAgentCI

    The Agentic Workload

    Anthropic wrote a detailed guide on Agent SDK deployment patterns. Regardless of which pattern you pick, you'll likely end up implementing some of the following in your application:

    What every agent deployment ends up needing

    Gateway Service

    always-on, handles webhooks & requests

    Message Queue

    buffers requests while sandbox spins up

    Ad-hoc Sandbox

    isolated env per session, lifecycle tracked

    Your Agent Code

    injected into the sandbox at runtime

    ...regardless of what your agent actually does.

    Pretty much every agent using Claude Agent SDK or Codex SDK will end up doing something similar, regardless of what the agent does (other than generate some code).

    It's a new type of workload - not a traditional backend service; and not a frontend. This type of workload doesn't (yet) have an obvious "home" - as in, here are some services I could deploy it to. But that's strange; because for every piece of code that works locally (and such agents obviously do work locally) there's typically a range of services they can be deployed into. I think this is something that will be solved in the near future and we'll see some new awesome dev platforms emerge.

    It's a new type of workload — not a traditional backend service, and not a frontend. For every piece of code that works locally there's typically a range of services it can be deployed into. This one doesn't have a home yet.

    References