Buy vs build: should your legal team build its own AI agents?

01 — The temptation to build

Why "we'll build it ourselves" is a reasonable instinct

On the surface, the argument for building in-house has never been stronger. Foundation models from OpenAI, Anthropic, and Google are commercially available via API. Agent frameworks like LangChain, CrewAI, and AutoGen lower the barrier to prototyping. Your engineering team can have a working demo in a week.

And the instinct is understandable. If the core technology is commoditised, why pay a vendor margin? If your legal team's playbooks are proprietary, who better to encode them than your own engineers? If data sovereignty matters, why send contracts to a third party when you could keep everything on your own infrastructure?

These are reasonable questions. The answers are more nuanced than either side of the buy-vs-build debate typically admits. Some organisations should build. Most should not, and for reasons that are not obvious until you have tried.

This piece is not a sales argument. It is an honest accounting of what building an agentic legal system actually requires, drawn from what we have observed in the market over the past two years. The goal is to help enterprise legal leaders make the decision with clear eyes, whichever way they land.

02 — The demo is not the product

What a prototype does not tell you

The most dangerous moment in any build-vs-buy evaluation is the internal demo. An engineer spends a week connecting a foundation model to a document parser, adds some prompt templates, and shows the GC a working NDA review. It looks impressive. The GC thinks: we are 80% of the way there.

They are not. They are perhaps 10% of the way there, and the remaining 90% is where every internal build project either stalls or quietly becomes a full-time engineering commitment that nobody budgeted for.

The demo

What a prototype proves

That a foundation model can read a contract and produce reasonable output. That the API works. That the concept is feasible. This is necessary but radically insufficient.

The product

What production requires

Handling every edge case in document parsing. Multi-model orchestration with fallbacks. Supervision workflows. Email integration. Audit trails. Security hardening. Playbook versioning. And maintaining all of it as models, APIs, and legal requirements change.

The gap between demo and production is not a matter of polish. It is a fundamentally different category of work. The demo asks "can AI do this task?" The product asks "can AI do this task reliably, at scale, on every document format, with appropriate supervision, and in a way that your legal team trusts enough to let it deal with the business without a lawyer checking every exchange?"

03 — The seven layers of effort

What you are actually building

An agentic legal system is not one thing. It is at least seven distinct engineering challenges, each of which requires specialist knowledge that most enterprise engineering teams do not have in-house.

Document ingestion and parsing

Contracts arrive as PDFs, Word documents, scanned images, email attachments, and occasionally handwritten amendments. Building a parser that reliably extracts structured content from all of these formats, including tables, headers, signature blocks, and embedded images, is a standalone engineering project. Most teams underestimate this by an order of magnitude.

Multi-model orchestration

No single foundation model is best at every legal task. Clause extraction, risk analysis, redline generation, and natural language response each have different accuracy profiles across models. A production system needs to route tasks to the right model, handle rate limits and outages, and manage fallback logic. This is infrastructure engineering, not prompt engineering.
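To make "routing and fallback logic" concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the model clients are stub functions rather than real provider SDK calls, and the names Route, run_task, and ModelUnavailable are invented for illustration. A real system would also need retries, timeouts, and per-task accuracy tracking.

```python
from dataclasses import dataclass
from typing import Callable


class ModelUnavailable(Exception):
    """Hypothetical error a client raises when a provider is rate-limited or down."""


# Assumption: a "model client" is any callable that takes a prompt and returns text.
ModelClient = Callable[[str], str]


@dataclass
class Route:
    task: str
    clients: list[tuple[str, ModelClient]]  # ordered by preference for this task


def run_task(route: Route, prompt: str) -> tuple[str, str]:
    """Try each client in order and return (model_name, output) from the first success."""
    errors = []
    for name, client in route.clients:
        try:
            return name, client(prompt)
        except ModelUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError(f"all models failed for task {route.task!r}: {errors}")


# Stub clients standing in for real provider calls.
def primary_model(prompt: str) -> str:
    raise ModelUnavailable("rate limited")


def backup_model(prompt: str) -> str:
    return f"[backup] extracted clauses from: {prompt}"


route = Route(
    task="clause_extraction",
    clients=[("primary", primary_model), ("backup", backup_model)],
)
name, output = run_task(route, "NDA text")
print(name, output)
```

The point of the sketch is that the fallback order is data, set per task, rather than something buried in a prompt. That is what makes this infrastructure work.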

Legal logic encoding

Your playbooks, preferred terms, fallback positions, escalation rules, and jurisdiction-specific variations need to be encoded in a format the system can use. This is not a one-time configuration. It is an ongoing process that requires someone who understands both the legal logic and the technical implementation. Most organisations discover they need a dedicated legal engineer role that did not previously exist.
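As a rough illustration of what "a format the system can use" might look like, the sketch below encodes one playbook rule as structured data in Python. The field names, the exact-string matching, and the example terms and jurisdiction codes are all assumptions made for the example, not a description of any real playbook or vendor schema.

```python
from dataclasses import dataclass, field


@dataclass
class ClauseRule:
    clause: str
    preferred: str
    fallbacks: list[str] = field(default_factory=list)
    # Jurisdiction code -> preferred position that replaces the default one.
    jurisdiction_overrides: dict[str, str] = field(default_factory=dict)

    def preferred_for(self, jurisdiction: str) -> str:
        return self.jurisdiction_overrides.get(jurisdiction, self.preferred)

    def evaluate(self, proposed: str, jurisdiction: str) -> str:
        """Return 'accept', 'accept_fallback', or 'escalate'.

        Exact string comparison is a deliberate simplification; real clause
        language would need semantic matching.
        """
        if proposed == self.preferred_for(jurisdiction):
            return "accept"
        if proposed in self.fallbacks:
            return "accept_fallback"
        return "escalate"


@dataclass
class Playbook:
    name: str
    version: str  # versioned so a change to a rule can be traced and rolled back
    rules: dict[str, ClauseRule]


# Made-up values for illustration only.
nda_playbook = Playbook(
    name="NDA",
    version="2.0",
    rules={
        "confidentiality_term": ClauseRule(
            clause="confidentiality_term",
            preferred="2 years",
            fallbacks=["3 years"],
            jurisdiction_overrides={"XX": "1 year"},
        )
    },
)

rule = nda_playbook.rules["confidentiality_term"]
print(rule.evaluate("3 years", "UK"))
```

Even this toy version shows why the work is ongoing: every new fallback position or jurisdiction is a change to the data, and someone has to own that change.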

Supervision and quality control

An agent that produces output without a supervision layer is a liability. Building a supervision interface that shows what the agent did, why it made each decision, and where human review is required is a product design challenge as much as an engineering one. Getting this wrong means your legal team either reviews everything (defeating the purpose) or reviews nothing (creating risk).
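One way to avoid both extremes is to send only some decisions to a human. The Python sketch below assumes each agent decision carries a rationale and a confidence score. The score, the 0.8 threshold, and the AgentDecision structure are hypothetical, and how a trustworthy confidence value would be produced is itself an open design question.

```python
from dataclasses import dataclass


@dataclass
class AgentDecision:
    clause: str
    action: str        # e.g. "accept", "redline", "escalate"
    rationale: str     # shown to the reviewer so the decision can be checked
    confidence: float  # hypothetical score between 0.0 and 1.0


REVIEW_THRESHOLD = 0.8  # assumption: tuned per workflow by the legal team


def needs_review(decision: AgentDecision, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Escalations always go to a human; other actions only when confidence is low."""
    return decision.action == "escalate" or decision.confidence < threshold


def review_queue(decisions: list[AgentDecision]) -> list[AgentDecision]:
    return [d for d in decisions if needs_review(d)]


decisions = [
    AgentDecision("governing_law", "accept", "matches preferred position", 0.95),
    AgentDecision("liability_cap", "redline", "cap below playbook minimum", 0.62),
    AgentDecision("indemnity", "escalate", "no playbook rule applies", 0.90),
]

queue = review_queue(decisions)
print([d.clause for d in queue])
```

Here two of three decisions reach a lawyer and one does not. Where the threshold sits decides whether the team ends up reviewing everything or nothing.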

Email and workflow integration

Enterprise legal work arrives via email. Building a system that monitors a shared inbox, understands the context of a request, executes the appropriate workflow, and replies via the same channel requires deep integration with Microsoft Exchange or Google Workspace, including authentication, domain allowlists, and thread management.

Security and compliance

Contract data is among the most sensitive information in any organisation. A production system needs encryption at rest and in transit, SOC 2-grade access controls, audit logging, data residency controls, and zero-data-retention agreements with every model provider. Your infosec team will have a long list of requirements. Meeting them takes months.

Ongoing maintenance and model drift

Foundation models change. APIs deprecate. Prompt behaviours shift between model versions. A system that works perfectly in April may produce different output in June because the underlying model was updated. Maintaining production quality requires continuous monitoring, regression testing, and prompt iteration. This is not a project with an end date. It is a permanent operational commitment.
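Regression testing for model drift can be sketched as a "golden set": a fixed list of inputs with known-good answers that is re-run whenever the model or prompt changes. The Python example below is a toy version. The golden clauses, the labels, and both stub classifiers are invented for illustration, and a real suite would call the production pipeline rather than keyword functions.

```python
from typing import Callable

# Hypothetical golden set: (clause text, expected label).
GOLDEN_SET = [
    ("Liability is capped at fees paid in the prior 12 months.", "liability_cap"),
    ("This Agreement is governed by the laws of England and Wales.", "governing_law"),
    ("Each party shall keep the other's information confidential.", "confidentiality"),
]

Classifier = Callable[[str], str]


def pass_rate(classify: Classifier, golden: list[tuple[str, str]]) -> float:
    passed = sum(1 for text, expected in golden if classify(text) == expected)
    return passed / len(golden)


def has_regressed(classify: Classifier, golden: list[tuple[str, str]],
                  baseline: float, tolerance: float = 0.0) -> bool:
    """True when the pass rate falls below the recorded baseline minus a tolerance."""
    return pass_rate(classify, golden) < baseline - tolerance


# Stubs standing in for the same pipeline before and after a model update.
def april_model(text: str) -> str:
    lowered = text.lower()
    if "capped" in lowered:
        return "liability_cap"
    if "governed by" in lowered:
        return "governing_law"
    return "confidentiality"


def june_model(text: str) -> str:
    label = april_model(text)
    return "other" if label == "governing_law" else label  # simulated drift


baseline = pass_rate(april_model, GOLDEN_SET)
print(has_regressed(june_model, GOLDEN_SET, baseline))
```

The check is simple. The ongoing cost is in curating the golden set and assigning someone to act when it fails.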

The resource question

Most enterprise legal teams that attempt to build estimate the initial project at 2 to 3 engineers for 3 to 6 months. The reality, based on what we have observed in the market, is closer to 4 to 6 engineers for 12 to 18 months to reach production quality, followed by 2 to 3 engineers permanently for maintenance and iteration. At fully loaded costs of £120K to £180K per engineer per year, the build team alone costs roughly £500K to £1M a year, and that money is spent before the system processes a single contract.
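The arithmetic behind these figures is easy to check. The short Python sketch below multiplies the team sizes and per-engineer costs quoted above, reading the fully loaded cost as an annual figure (an assumption). It uses no inputs beyond those in the text.

```python
# All amounts in thousands of pounds (GBP). Ranges are (low, high).
COST_PER_ENGINEER_K = (120, 180)   # fully loaded, assumed to be per year
BUILD_ENGINEERS = (4, 6)
BUILD_MONTHS = (12, 18)
MAINTENANCE_ENGINEERS = (2, 3)


def annual_cost_k(engineers: int, cost_per_engineer_k: int) -> int:
    return engineers * cost_per_engineer_k


# Annual run rate of the build team: low case and high case.
build_annual = tuple(
    annual_cost_k(n, c) for n, c in zip(BUILD_ENGINEERS, COST_PER_ENGINEER_K)
)

# Total spend over the whole build phase (annual rate scaled by duration).
build_total = tuple(
    rate * months // 12 for rate, months in zip(build_annual, BUILD_MONTHS)
)

# Annual run rate of the permanent maintenance team.
maintenance_annual = tuple(
    annual_cost_k(n, c) for n, c in zip(MAINTENANCE_ENGINEERS, COST_PER_ENGINEER_K)
)

print("build, per year:", build_annual)          # (480, 1080)
print("build, whole phase:", build_total)        # (480, 1620)
print("maintenance, per year:", maintenance_annual)  # (240, 540)
```

The £500K to £1M range is the first of these, rounded. Note that if the build runs the full 18 months at the top of the range, the total spend before launch is higher than the annual figure suggests.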

04 — The hidden cost: legal expertise

Engineering is not the bottleneck. Legal knowledge is.

The conversation about build-vs-buy tends to focus on engineering capability. Can our team build it? Do we have the right skills? These are the wrong questions. The harder problem is not building the technology. It is encoding the legal expertise.

An agentic legal system does not run on code alone. It runs on playbooks: the institutional knowledge of how your legal team handles each type of work. What your preferred NDA terms are. When to accept a counterparty's liability cap and when to push back. Which jurisdictions require specific clauses. How to triage a request that falls between two categories.

This knowledge typically lives in the heads of your most experienced lawyers. Getting it out of their heads and into a format that an AI system can use is a translation exercise that requires someone who speaks both languages. The lawyers know the rules but cannot express them as structured logic. The engineers can build the structure but do not understand the legal nuance.

Vendors who have built agentic legal systems have spent years developing this translation capability. They have legal engineers, supervision teams, and playbook frameworks that have been refined across dozens of enterprise deployments. An internal build starts from zero on this dimension, and it is the dimension that determines whether the system's output is trustworthy.

05 — When building makes sense

The honest case for doing it yourself

Despite everything above, there are situations where building internally is the right decision. It is worth being honest about when that is the case.

You have the team

You already employ AI/ML engineers with production LLM experience, and they have capacity. Not "our data science team could probably figure it out," but engineers who have shipped LLM applications to production and maintained them.

Your use case is narrow

You want to automate a single, well-defined workflow (e.g. NDA review only) rather than a broad set of legal operations. The narrower the scope, the more viable an internal build becomes.

Legal AI is strategic

Your organisation views legal AI capability as a long-term strategic asset worth investing in permanently, not a problem to be solved and moved on from. This implies executive sponsorship, multi-year budget commitment, and willingness to maintain the system indefinitely.

Regulatory constraints are absolute

Your regulatory environment genuinely prohibits any third-party processing of contract data, even under single-tenant, zero-data-retention architectures. This is rarer than most organisations assume, but it does exist in certain defence and intelligence contexts.

If all four of these conditions are met, building internally may be the right path. If any are missing, the probability of a successful build drops significantly. The most common failure mode is having conditions one and two but not three: an engineering team that can build a prototype, a narrow initial scope, but no long-term commitment to maintaining and expanding the system. The prototype ships, works for six months, and then slowly degrades as models change and nobody is assigned to keep it current.

06 — The comparison

What you get and what you give up

Dimension	Build internally	Buy from a vendor
Time to production	12 to 18 months (typical)	2 to 6 weeks
Upfront cost	£500K to £1M+ (engineering team)	Variable, per-unit or subscription
Ongoing cost	£300K to £600K/year (maintenance team)	Included in service fee
Legal expertise	Must build internally or hire	Included (supervision, playbook frameworks)
Model management	Your team manages updates, drift, fallbacks	Vendor manages model layer
Security posture	You control everything	Depends on vendor (look for single-tenant, SOC 2 Type II)
Customisation	Unlimited (you built it)	Varies by vendor (playbook-configurable vs. generic)
Opportunity cost	Engineering team not working on core product	None (no internal engineering required)
Risk if it fails	Sunk cost, team reallocation	Cancel the contract

07 — The questions to ask your team

A practical framework for the decision

If you are a GC or Head of Legal Ops weighing this decision, here is a framework that cuts through the theoretical arguments.

The three questions

1. Do we have engineers with production LLM experience who are available for 18+ months? If no, buy. Internal builds without experienced LLM engineers fail predictably.

2. Is our CTO willing to own this system permanently? If the answer is "we'll build it and hand it to legal ops," buy. Systems without engineering ownership degrade.

3. Can we wait 12 to 18 months for production quality? If the backlog is urgent and growing, the opportunity cost of waiting for an internal build may exceed the cost of buying.

These questions are deliberately practical. They do not ask about vision or strategy. They ask about resources, ownership, and urgency. The answers tend to be clarifying.

For most enterprise legal teams, the honest answers are: no dedicated LLM engineers, no CTO commitment to permanent ownership, and an urgent capacity problem that cannot wait 18 months. In that case, the buy decision is not about capability. It is about pragmatism.

For the smaller number of organisations with genuine engineering depth, strategic commitment, and patience, building is a legitimate path. The key is entering that path with realistic expectations about what it costs, how long it takes, and what it requires to maintain.