Code was never the moat: context is the only durable asset in AI services

The asset that a services-as-software firm sells, and the asset that makes it defensible, are turning out to be two different things. The deliverable a customer pays for is a running agent that does work. The thing that keeps a competitor from doing the same job next quarter at a lower price is the accumulated knowledge of how that customer's business actually operates, and that knowledge is becoming the scarce input precisely as the code stops being scarce.

The input that used to be the constraint is collapsing in price

For most of software history, the binding constraint on shipping a capability was the cost of building and running the thing that delivered it. That constraint is going away on the dimension that matters for agents. Stanford's AI Index 2025 reports that the inference cost to reach a fixed level of capability fell roughly 280-fold in under two years, from around $20 per million tokens in late 2022 to about $0.07 per million tokens in late 2024. Read that as a statement about the input to an agent rather than about the agent itself. The reasoning an agent performs over a process, meaning the classification, extraction, and decision steps, is the part whose unit price keeps falling toward zero. When the marginal cost of the work-producing computation approaches zero, the computation cannot be the source of durable margin, because anyone willing to call the same model can produce the same output at the same declining price.
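The arithmetic behind that figure is easy to check. What follows is a minimal sketch in Python that uses only the two prices quoted above. The function names are hypothetical, and the window lengths it tries (18 and 24 months) are illustrative assumptions about how long the drop took, not figures to rely on.

```python
import math


def fold_reduction(start_price: float, end_price: float) -> float:
    """How many times cheaper the end price is than the start price."""
    return start_price / end_price


def monthly_decline(start_price: float, end_price: float, months: float) -> float:
    """Constant monthly rate r such that start * (1 - r) ** months == end."""
    return 1 - (end_price / start_price) ** (1 / months)


# Prices per million tokens, as quoted in the text.
START_PRICE = 20.00
END_PRICE = 0.07

fold = fold_reduction(START_PRICE, END_PRICE)
halvings = math.log2(fold)  # how many successive halvings the drop equals

print(f"fold reduction: {fold:.0f}x")
print(f"price halvings: {halvings:.1f}")

# The window length is an assumption; try two plausible values.
for months in (18, 24):
    rate = monthly_decline(START_PRICE, END_PRICE, months)
    print(f"{months} months -> about {rate:.0%} cheaper each month")
```

On these inputs the ratio comes out near 286, consistent with the "roughly 280-fold" figure, and it corresponds to a little over eight successive halvings of the price. Whichever window is assumed, the implied decline is a steady double-digit percentage per month, which is the sense in which the computation cannot anchor a durable margin.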

What survives a price collapse is what cannot be copied by running the same model

A firm's moat is whatever a well-funded competitor cannot reproduce by buying the same inputs you bought. If the input is a foundation model accessible to everyone on roughly equal terms, then nothing built only from that input is defensible, and that includes the agent's code. A competitor can read your product, infer the prompts, retrain on public examples, or call a stronger model that has arrived since you shipped. The code an agent runs has the copyability profile of a configuration file rather than of a patent. So the question for any services-as-software firm is which part of the deliverable does not get cheaper or easier to copy as models improve. The answer is the part that was never in the model to begin with: the specific structural facts of one business.

Context is the thing the model never contained

By context we mean the structural knowledge of how a specific business runs in practice. It covers who actually touches a given piece of work and in what order, where work stalls and waits and why, the undocumented exceptions that the standard procedure omits but that consume most of the handling time, and the unwritten rules people apply when the form does not fit the case. None of this is in the foundation model, because none of it was ever written down anywhere the model could have read it. A general model trained on the public corpus knows how invoicing works in the abstract. It does not know that this company routes anything over a certain amount through a second approver who is on leave every August, or that one upstream system exports dates in a format that silently breaks the downstream import twice a quarter. That is the knowledge an agent needs to do the work reliably, and it is the knowledge no amount of model improvement supplies. This is why we argue, in "the context flywheel", that the compounding asset is context rather than code.

Why context is expensive to acquire and non-transferable

Two properties make context defensible. The first is that it is tacit, living in what people do rather than in what the documentation says they do, and the gap between the two is the whole point. Process documents describe the idealized path, and they go stale the moment the real process drifts, which it always does. We treat that dynamic at length in "why documented processes rot". Reliable context therefore has to be observed at the desk, in the actual sequence of clicks, handoffs, and corrections, not read out of a manual. Observation is expensive. It takes access, time, and a method for separating the signal of the real process from the noise of any single person's idiosyncrasy. The second property is that context is specific to one business. The knowledge harvested at one company describes that company's real process rather than a portable template, so it does not transfer wholesale to the next customer the way code does. A competitor who copies the agent inherits none of it. A competitor who wants the equivalent has to go acquire it the same slow way, at the same cost, against an incumbent who started earlier.

Why this argument applies to a firm like flowscope

The claim that falling model costs erode software moats is usually made about other companies. We have made it that way too, treating cheap code as a reason the packaged-software moat is dissolving in "SaaS is dead" and tracing the cost curve itself in "what changed in model cost". Intellectual honesty requires applying the same reasoning to a firm like flowscope, which ships agent code as its visible deliverable. If cheap code erodes the moat of a SaaS vendor, it erodes the moat of anyone whose defensibility rests on the code. flowscope's answer is to be built around the asset that does not erode. Its operating model, which observes a business, separates the real process from the idealized one, and compounds that knowledge across every engagement, is designed to harvest context, with the running agent as the form in which that context gets delivered and kept current. Sequoia's services-as-software framing holds that the customer is buying the work done rather than the tool. The corollary is that the firm's durable value is whatever lets it keep doing the work as the tool commoditizes, which is the context.

The counter-argument, and where it stops

The strongest objection is that context is not a moat either, because the same capable models that produce cheap code are getting better at extracting context, so harvesting will itself commoditize and the advantage will compress to nothing. There is real force here. Models are improving at reading a screen, summarizing a workflow, and inferring intent from behavior, and as that improves the cost of the first pass at context falls. The objection holds for the extraction technique and fails for the asset. Cheaper extraction tools lower the cost of observing a business you already have access to, which helps the firm that is already inside the account more than the one trying to get in, because the binding constraint on the newcomer was never the quality of the extraction model. What the newcomer lacks is the permission to watch the work at all, the trust required to be granted that access, and the time already spent watching, none of which a better model confers. Better extraction makes the accumulated context cheaper to keep current and richer over time, which compounds the incumbent's lead rather than erasing it. As the code and the extraction technique both commoditize, the observed, business-specific, continuously refreshed context is what retains the margin.