In May 2026, the two largest AI labs both did the same thing within days of each other. Anthropic announced a $1.5 billion venture to build and run enterprise AI services. OpenAI announced a parallel one. TechCrunch reported that both ventures “explicitly embrace the forward-deployed engineer (FDE) model popularized by Palantir” — describing engineers who sit down with the customer’s own staff “to build tools that fit into the workflows.”

This is not how software companies have sold for the last twenty years. The SaaS playbook was: build the product, sell the seats, let the customer figure out how it fits their work. The forward-deployed model inverts that. It says the hard part — figuring out how the technology fits a specific business’s actual process — is the company’s job, not the buyer’s, and the only way to do it is to put engineers inside the customer’s operation until the thing is running in production.

The shift is measurable. Per the Financial Times, relayed by PYMNTS, monthly job listings for forward deployed engineers rose more than 800% between January and September 2025. If you are a mid-market operator being pitched by an AI vendor in 2026, there is a good chance the pitch now includes a team that wants to embed with you. It’s worth understanding where this model came from, why it took over, and what its own advocates admit it costs — the part most write-ups skip.

The role started at Palantir

The forward deployed engineer is not a 2026 invention. Palantir created the role in the early 2010s, internally calling the earliest cohort “Delta.” As the Pragmatic Engineer documented, Palantir had more forward deployed engineers than ordinary software engineers until around 2016. The whole company was structured around sending engineers to the customer’s site.

The distinction that matters is what these people actually do. A traditional solutions consultant advises, configures, and hands off. A forward deployed engineer, as First Round Review describes the role, embeds with the customer to build the “last mile” in production — and writes and debugs production code to do it. Palantir’s own account of the job is an engineer who learns the customer’s problem in situ and ships software against it, not a consultant who leaves a deck.

For a decade this was treated as a Palantir peculiarity — a labor-heavy way to work that didn’t generalize. That assumption is what broke in 2025.

Why it’s suddenly everywhere

What changed is that the frontier AI labs started hiring for the role at scale. By mid-2025, a16z counted 22 of 311 open OpenAI roles (about 7%) in forward-deployed and solutions-engineering categories; the Pragmatic Engineer reported OpenAI running forward deployed engineers across eight cities. Treat those as point-in-time snapshots of job boards, not standing headcounts. Still, the direction was unmistakable, and it didn’t stop at the labs.

The legacy consultancies moved next. Accenture and Microsoft launched a joint forward-deployed engineering practice in March 2026, pledging “thousands” of AI engineers to work “shoulder-to-shoulder with clients.” EY launched its own forward-deployed engineering practice in April 2026, with roughly 50 senior AI engineers embedded in client delivery teams, billing itself as the first major consultancy to formally adopt the model. Both followed a February 2026 move by OpenAI, which assembled what it called a Frontier Alliance with BCG, McKinsey, Accenture, and Capgemini to sell and implement its enterprise agent platform, pairing its own forward deployed engineers with the consulting firms’ delivery muscle.

Within roughly fifteen months, a role that had been a Palantir oddity became the default enterprise motion for the AI labs, the cloud platforms, and the major consultancies. Which raises the obvious question: why run a software business this way at all, when embedding engineers is expensive and doesn’t scale the way software does?

The bet: trade margin for moat

The clearest articulation of the rationale comes from a16z’s Joe Schmidt, in a June 2025 essay titled, with no subtlety, “Trading Margin for Moat: Why the Forward Deployed Engineer Is the Hottest Job in Startups.”

The argument runs like this. AI has changed what software is. In Schmidt’s framing, “software is no longer aiding the worker — software is the worker.” When you sell something that does the work rather than something that helps a person do the work, you can’t just ship a login and walk away, because the customer doesn’t know how to wire the worker into their process. His analogy: “Enterprises buying AI are like your grandma getting an iPhone… they need you to set it up.” The forward deployed engineer is the person who sets it up, and who stays long enough to make it actually run.

The economic claim underneath is deliberately contrarian. Schmidt argues it is “shortsighted to be optimizing for 80% gross margin,” and that the only thing a company should optimize for now is “growing total gross profit as fast as possible” while building “a durable moat by controlling where and how the data enters the system.” The trade is explicit: accept lower margins and more labor up front, because embedding deeply enough to own the customer’s workflow and data is a defensibility you can’t buy back later. Trade the margin for the moat.
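The arithmetic behind that trade can be sketched in a few lines. Every figure below is a hypothetical chosen for illustration, not data from Schmidt's essay or any real company; the point is only that a lower gross margin on a larger revenue base can still produce more total gross profit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """A hypothetical business model; all numbers are illustrative assumptions."""
    name: str
    revenue: int           # annual revenue in whole dollars
    gross_margin_pct: int  # gross margin as a whole-number percentage

    @property
    def gross_profit(self) -> int:
        # Integer arithmetic keeps the dollar amounts exact.
        return self.revenue * self.gross_margin_pct // 100


# Hypothetical comparison: a high-margin seat-based product versus a
# lower-margin embedded motion that wins larger contracts.
seat_based = Scenario("seat-based SaaS", revenue=10_000_000, gross_margin_pct=80)
embedded = Scenario("forward-deployed", revenue=25_000_000, gross_margin_pct=55)

for s in (seat_based, embedded):
    print(f"{s.name}: margin {s.gross_margin_pct}%, gross profit ${s.gross_profit:,}")
```

Under these assumed numbers the embedded motion gives up 25 points of margin but ends with more gross profit in absolute terms. Whether real engagements grow revenue enough to make that true is exactly the bet the essay describes.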

This is the bull case, and it’s a coherent one. It is also, by the framing of the man making it, a bet — not a settled outcome. Which is the part worth slowing down on.

The part the bulls say out loud

Most vendor pitches for the embedded model stop at the moat argument. The honest version doesn’t, because the people most associated with the model are openly skeptical of how it’s being sold.

Start with a16z’s own essay. It doesn’t bury the criticism; it builds the whole piece around it, conceding that critics argue a professional-services motion “limits scalability,” that “lower gross margins reflect a commodity product,” and that “professional services should be done by ecosystem partners, rather than the company itself.” The bull case states the bear case in its own words. It also offers a concrete reminder of how long the “moat” takes to pay off: Salesforce, the essay notes, reportedly burned over $52 million to generate $22 million in revenue before its high-margin platform business materialized. The moat is real, but it arrives years late, and only if the bet pays off.

The sharpest named skepticism comes from practitioners who’ve run the model. James Honsa, co-founder of Genera and formerly head of legal engineering at Ironclad, told First Round Review in February 2026 that “forward deployed engineering is being framed as a panacea right now. But it’s a lot more complicated than that.” He’s blunt about the limits: “There are times in a company’s lifecycle where it makes sense, and there are customer segments where it makes sense, but it’s a pretty blunt instrument to try to use for your entire business.” First Round’s companion piece adds the financial warning directly: investing in an embedded engineering team early “is a costly bet — and can quickly burn a lot of cash if the math doesn’t add up.”

Then there’s the structural critique, which has followed the model since Palantir’s early days: that it’s a services business dressed up as software. The test is simple. If the value lives in specific embedded people rather than in a running product, pulling those people out ends the relationship, and what you have is consulting with a software logo. The analyst Larry Dignan, writing for Constellation Research in February 2026, frames the risk as forward deployed engineers becoming “human middleware” that masks “incomplete tooling, unstable APIs and so-so platforms.” On that reading, a heavy embedding motion can signal the opposite of product maturity.

The model is a tradeoff, not a free lunch, and any vendor who pitches it as pure upside is not being straight with you.

What embedding doesn’t solve

There is one more honest limit, and it’s the one most relevant to a buyer deciding whether an embedded AI engagement will actually stick.

Embedding engineers reduces the burden of figuring out what to build. It does not, by itself, reduce the organizational work of adopting what gets built, or the risk of operating it once it’s live. Those don’t disappear when an outside team shows up; they relocate to that team, and they stay unsolved if the team doesn’t have a structure for handling them.

The governance gap is the clearest evidence. Deloitte’s State of AI in the Enterprise 2026 found that only about one in five organizations has a mature governance model for autonomous agents. Grant Thornton’s 2026 AI Impact Survey of 950 senior leaders found 78% lacked confidence they could pass an independent AI governance audit within 90 days. Embedding a team to ship an agent into production, without a review layer that monitors what the agent does after go-live, doesn’t close that gap. It just moves the unmonitored risk from the customer’s staff to the vendor’s. (For more on why governance has to be an operating layer rather than a document, see our AI governance framework.)

This is what separates the embedded models worth buying from the ones that aren’t. In 2026, most vendors will offer to embed. What matters is what the embedding is for: shipping a system the customer can operate and audit, or creating a dependency the customer can’t see into and can’t unwind.

What a buyer should actually inspect

If you’re evaluating an embedded AI engagement, the useful diligence is operational, not technical. Four things to inspect:

Ask what the explicit endpoint is. An embedded engagement should be time-bounded, with a defined state where the customer operates independently. If the vendor can’t describe what “done” looks like, or if the model is just one engineer per use case forever, you’re buying labor, not capacity.

Ask what stays after the team leaves. Documentation, runbooks, and a knowledge-transfer path are the difference between a system you own and a black box you rent. The skeptic’s red flag, per both Constellation Research and First Round, is a vendor who can’t articulate how you’ll run the thing without them.

Ask where the review layer sits. An agent in production needs a checkpoint where a human catches the failure modes before they reach a customer or a regulator. If the answer is “the model handles it,” the risk hasn’t been managed — it’s been hidden.

Ask whether the work generalizes. Is the embedded team building something repeatable, or a bespoke one-off that will need to be rebuilt for the next process? This is the “services dressed up as software” test, applied to your own engagement.
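For teams that track vendor evaluations in a spreadsheet or script, the four questions above can be recorded as a simple checklist. This is a minimal sketch: the `Engagement` fields, the `RED_FLAGS` wording, and the `red_flags` helper are all hypothetical names invented for illustration, not an established framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engagement:
    """A vendor's answers to the four diligence questions (hypothetical schema)."""
    has_defined_endpoint: bool    # time-bounded, with a described "done" state
    leaves_artifacts: bool        # documentation, runbooks, knowledge transfer
    has_human_review_layer: bool  # a human checkpoint on the agent after go-live
    work_generalizes: bool        # repeatable work, not a bespoke one-off


# One warning per question, keyed by the field it checks.
RED_FLAGS = {
    "has_defined_endpoint": "no explicit endpoint: you may be buying labor, not capacity",
    "leaves_artifacts": "nothing stays behind: a black box you rent",
    "has_human_review_layer": "no review layer: the risk is hidden, not managed",
    "work_generalizes": "bespoke one-off: services dressed up as software",
}


def red_flags(engagement: Engagement) -> list[str]:
    """Return the warning for every question the vendor could not answer well."""
    return [
        message
        for field_name, message in RED_FLAGS.items()
        if not getattr(engagement, field_name)
    ]


# Example: a pitch with a clear endpoint and repeatable work, but no
# handover artifacts and no human review after go-live.
pitch = Engagement(
    has_defined_endpoint=True,
    leaves_artifacts=False,
    has_human_review_layer=False,
    work_generalizes=True,
)
for flag in red_flags(pitch):
    print("-", flag)
```

A yes/no field per question is deliberately crude. The value is in forcing a written answer to each one before signing, not in the scoring.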

The forward-deployed model points in the right direction: the burden of making AI fit real work should sit with the people who build the AI, not the buyer. But “embed engineers” is a posture, not a guarantee. What makes it work is the structure around it: a defined endpoint, transferable artifacts, a review layer, and work that generalizes. Without those, embedding is just consulting that happens to ship code.

Gridex runs the forward-deployed posture as operated capacity, not billed labor: we learn the work, build the workflow, and operate it with human review designed into the process, so the result is capacity your team can audit and rely on, not a person you can’t afford to lose.