Daniel Hull

I built my first CLI for Attio and benchmarked it against MCP. The results aren't even close.

writing mode: ai assisted

Everyone’s building MCP servers. It’s the hot new thing - plug your tool into Claude, plug it into Cursor, watch the magic happen. I get it. I built an MCP course for Attio myself. The developer experience is seductive: declare some tools, return some JSON, let the model figure it out.

But I kept running into the same problems. Conversations burned through tokens, multi-step workflows felt sluggish, and tool schemas ballooned the context window before the model even started thinking. So I did something I never expected: I built a CLI. My first one. attio-cli is a command-line tool for the Attio CRM API, designed from the ground up for agents, scripts, and humans who prefer the terminal.

Then I ran actual benchmarks against the MCP approach.

The results made me rethink how agents should interact with external systems entirely.

The setup

I’m running these benchmarks against a real Attio workspace (our consulting CRM at 80x) on an M4 Pro MacBook. The CLI is attio-cli@0.3.1, a 110KB Node.js binary with no heavyweight dependencies. For MCP, I’m comparing against the typical Attio MCP server pattern: 30+ exposed tools with full JSON Schema definitions, the setup most teams use today.

The test scenario is mundane by design: search for a company, retrieve its details, and list the schema. Three API calls. The kind of thing an agent does fifty times a day managing your pipeline.

The raw numbers

Three operations through the CLI:

| Operation | Latency | Response Size | ~Tokens |
|---|---|---|---|
| Search company | 670ms | 413 bytes | 103 |
| Get company detail | 407ms | 10,351 bytes | 2,587 |
| List attributes | 318ms | 26,830 bytes | 6,707 |

Total API time: 1.4 seconds. Total response payload: 37,594 bytes (~9,400 tokens).
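Those token counts line up with the common ~4-bytes-per-token heuristic for English-heavy JSON. That heuristic is my assumption about how the estimates were made; a real tokenizer will land within a few percent of it:

```python
# Rough token estimate for the three CLI responses, assuming the
# common ~4 bytes-per-token heuristic (my assumption, not a measured count).
BYTES_PER_TOKEN = 4

responses = {
    "search": 413,            # bytes
    "get_detail": 10_351,
    "list_attributes": 26_830,
}

total_bytes = sum(responses.values())
total_tokens = total_bytes // BYTES_PER_TOKEN

print(total_bytes)   # 37594
print(total_tokens)  # 9398 -> roughly 9,400 tokens
```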

Standard HTTP calls to the Attio API. But the gap opens up when you wrap these operations in the two competing paradigms.

The tax you didn’t know you were paying

Nobody talks about this at conferences: tool schemas are sent with every single message.

A typical Attio MCP server exposes 34 tools. Each tool has a name, description, and a full JSON Schema for its parameters. That’s roughly 25,000 bytes of schema definition - about 6,300 tokens - injected into the context window on every turn of the conversation.

For our three-step workflow, the model processes four turns (three tool calls plus the final text response). The schema overhead alone: 25,248 tokens. That’s 65% of the total input tokens, and none of it does useful work. It’s just the model re-reading the menu every time it wants to order.

The full accounting:

| Approach | Total Input Tokens | Schema Overhead |
|---|---|---|
| MCP (34 tools) | 38,698 | 25,248 (65%) |
| CLI (bash tool) | 22,307 | 0 |
| Difference | 16,391 (42%) | - |

Forty-two per cent fewer tokens. For the same work.

This gap widens as conversations get longer, because the schema tax is paid again on every turn. A ten-turn conversation pays it ten times.
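For the record, here's how those numbers decompose, assuming the flat per-turn schema cost of 6,312 tokens implied by the table (25,248 ÷ 4 turns):

```python
# Schema tax model: MCP re-sends all tool schemas on every turn,
# so the overhead scales linearly with conversation length.
SCHEMA_TOKENS_PER_TURN = 6_312   # ~25 KB of JSON Schema for 34 tools
TURNS = 4                        # three tool calls + the final response

schema_overhead = SCHEMA_TOKENS_PER_TURN * TURNS
mcp_total = 38_698
cli_total = 22_307

print(schema_overhead)                                    # 25248
print(round(100 * schema_overhead / mcp_total))           # 65 (% of MCP input)
print(mcp_total - cli_total)                              # 16391
print(round(100 * (mcp_total - cli_total) / mcp_total))   # 42 (% saved by CLI)
```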

Output modes: the CLI’s secret weapon

MCP has no concept of output verbosity. When you call search_companies, you receive the full JSON payload. Every field, every nested object, every timestamp. The model has to parse all of it, even if it only needed a record ID.

The CLI gives you three modes:

| Mode | Output Size | Tokens | Use Case |
|---|---|---|---|
| --json (full) | 60,744 bytes | 15,186 | When you need everything |
| --table (summary) | 4,485 bytes | 1,121 | When you need an overview |
| -q (IDs only) | 185 bytes | 46 | When you need to chain |

That’s a 328x reduction from JSON to quiet mode.

For multi-step workflows, this changes everything. A CLI agent doing “search then get details” fetches 10,388 bytes total: the ID from the search (37 bytes via -q), then the full record. An MCP agent fetches 20,702+ bytes: the full record from the search (which it mostly ignores), then the full record again from the get call.

Before schema overhead.
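A quick sketch of the byte accounting behind that comparison, using the measured sizes from the tables above (treating the MCP search payload as at least one full record, hence the "+"):

```python
# Bytes transferred for a "search, then get details" chain.
FULL_RECORD = 10_351    # bytes for one company via get --json (measured)
QUIET_SEARCH = 37       # bytes for the matching ID via search -q (measured)
MCP_SEARCH = 10_351     # MCP search returns the full record too (lower bound)

cli_bytes = QUIET_SEARCH + FULL_RECORD   # ID first, full record second
mcp_bytes = MCP_SEARCH + FULL_RECORD     # full record twice

print(cli_bytes)             # 10388
print(mcp_bytes)             # 20702 (at minimum)
print(round(60_744 / 185))   # 328 -> the 328x --json vs -q reduction
```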

Quiet mode is something MCP cannot offer at all. MCP tool results are opaque blobs returned to the model. There’s no way for the model to say, “Just give me the IDs this time.” Every call returns everything, every time.

Composability: one turn vs three

A CLI agent can link operations in a single bash invocation:

attio companies search "Acme" -q \
  | head -1 \
  | xargs -I{} attio companies get {} --json

That’s a search-then-get-details pipeline executed in one agent turn: one LLM inference call, one round-trip of context.

With MCP, the same workflow requires:

  1. Turn 1 - Model reads all 34 tool schemas, decides to call search_companies, and waits for the result.
  2. Turn 2 - Model re-reads all 34 tool schemas, parses the search result, extracts the ID, calls get_company, waits for the result.
  3. Turn 3 - Model re-reads all 34 tool schemas, parses the detailed results, and generates the final response.

Three LLM turns, three schema loads, three inference latency penalties. Each turn adds 500–2,000ms of model thinking time on top of the API call latency.

CLI: 1 LLM turn + 2 API calls ≈ 2–3 seconds

MCP: 3 LLM turns + 2 API calls ≈ 5–8 seconds

And this is a trivial two-step workflow. Real CRM workflows that update a deal, create a follow-up task, log a note, and advise the team can chain five, six, or seven operations. It adds up fast.
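As a sanity check, here's a minimal latency model - my own simplification: the measured API latencies plus 500–2,000ms of inference per turn. Real MCP runs trend toward the top of the range, since larger contexts also slow inference:

```python
# Rough end-to-end latency model for the two-step workflow.
API_CALLS_MS = [670, 407]            # measured: search + get
THINK_MIN_MS, THINK_MAX_MS = 500, 2_000  # assumed inference time per turn

def total_ms(turns: int) -> tuple[int, int]:
    """Best/worst case wall-clock time for a workflow with N LLM turns."""
    api = sum(API_CALLS_MS)
    return api + turns * THINK_MIN_MS, api + turns * THINK_MAX_MS

print(total_ms(1))  # CLI, one turn:    (1577, 3077) -> ~2-3 seconds
print(total_ms(3))  # MCP, three turns: (2577, 7077) -> several seconds more
```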

The cost at scale

At Claude Sonnet pricing ($3/M input tokens), a single three-step workflow costs:

| Approach | Cost per Workflow |
|---|---|
| MCP | $0.1161 |
| CLI | $0.0669 |

Five cents saved per workflow. Sounds trivial. Now multiply.

A sales team running 100 agentic workflows per day:

| | Daily | Monthly |
|---|---|---|
| MCP | $11.61 | $348 |
| CLI | $6.69 | $201 |
| Savings | $4.92 | $148 |

Nearly a hundred and fifty dollars a month, on a conservative three-step workflow count. Real deployments with longer conversations and more complex chains can double or triple that.
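Those figures fall straight out of the input-token counts. A quick check - input tokens only, which is my simplification and keeps the estimate conservative:

```python
# Cost per workflow at $3 per million input tokens (Claude Sonnet).
# Output tokens are excluded, so these are conservative floors.
RATE = 3 / 1_000_000  # dollars per input token

mcp_cost = 38_698 * RATE
cli_cost = 22_307 * RATE

print(f"{mcp_cost:.4f}")              # 0.1161 per MCP workflow
print(f"{cli_cost:.4f}")              # 0.0669 per CLI workflow
print(f"{mcp_cost - cli_cost:.4f}")   # 0.0492 -> ~5 cents saved each run
print(f"{(mcp_cost - cli_cost) * 100 * 30:.2f}")  # 147.52 -> ~$148/month at 100/day
```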

And this is one integration. Most agentic systems connect to five or ten tools. The schema overhead from multiple MCP server stacks increases linearly. I’ve seen production setups where MCP tool schemas consume over 40,000 tokens before the user even says hello.


“But MCP is the standard”

I know. I’ve heard the arguments. And some of them are legitimate.

MCP has real advantages: discoverability (the model can see which tools are available), portability (the same server works in Claude Desktop, Cursor, Windsurf, and any MCP client), and safety (structured tool schemas provide stronger guardrails than arbitrary bash execution).

These matter. I’m not saying throw MCP away.

But I am saying the industry has overcorrected. We’ve treated MCP as the universal answer to “how should AI talk to external systems” without asking whether every interaction pattern actually benefits from it.

The protocol was designed for a world where models need rich, self-describing tool interfaces. That’s great for one-shot interactions where the model has never seen the tool before. But for repeated, high-frequency integrations - CRM workflows, database queries, deployment pipelines - the overhead is unjustifiable.

The CLI pattern for agents

The Attio CLI was designed for three consumers: humans typing in terminals, scripts running in CI, and AI agents executing via bash. The same binary serves all three, and that’s the entire point.

# Human in a terminal
attio companies list --limit 5

# Script in CI
ID=$(attio records create companies --set name="Acme" -q)
attio notes create --object companies --record $ID \
  --title "Onboarding" --content "Welcome call arranged."

# AI agent via Claude Code (identical syntax)
attio deals list --filter 'stage=Closed - won' \
  --sort created_at:desc --limit 10 --json

No protocol negotiation, no schema exchange, no SDK dependency, no server process to keep running.

The model already knows how to use bash. It’s been trained on millions of terminal sessions. The --help flag is the only documentation it needs, and even that’s optional if you put a cheat sheet in the system prompt.

The CLAUDE.md pattern, dropping CLI documentation directly into the system prompt, gives the model everything it needs in about 2,000 tokens, once, at the start of the conversation. Compare that to 6,300 tokens of MCP tool schemas re-sent on every turn.

What this means for the agentic future

I think we’re going to see a bifurcation in how agents interact with external systems.

MCP wins for breadth. When an agent needs to discover and use a tool that it’s never seen before, self-describing schemas are extremely useful. The “app store” model - browse available servers, connect, explore - is genuinely powerful. MCP is the right answer for one-off integrations, for tools the agent uses once a week, for the long tail of SaaS APIs.

CLI wins for depth. When an agent is doing the same kind of work repeatedly, like managing a pipeline, processing leads, or running deployments, the overhead of MCP is pure waste. A well-designed CLI with output modes, composable pipes, and a documented interface in the system prompt will outperform MCP on every metric that matters: speed, cost, and reliability.

The best agentic systems will use both. MCP for the edges, CLI for the core.

And the really interesting unlock is that CLIs compose with the entire Unix ecosystem: grep, awk, jq, xargs, cron jobs, shell scripts. MCP tools are islands. CLI tools are part of a continent.

I suspect we’ll also see a new generation of “agent-first CLIs” emerge: tools designed from the ground up with -q quiet modes, --json structured output, --filter expressions, and documentation that fits in a system prompt. Not because the AI hype machine demands it, but because these are genuinely better interfaces for programmatic consumption, whether the consumer is a Python script, a bash pipeline, or Claude.

The bottom line

MCP is a protocol; CLI is a pattern. Protocols have overhead; patterns have flexibility.

For agentic CRM workflows on Attio, the CLI approach delivers:

  - 42% fewer input tokens (and dollars) for the same work
  - one agent turn instead of three for chained operations
  - roughly 2–3 seconds end-to-end instead of 5–8
  - zero schema overhead, regardless of conversation length

The agentic future isn’t about picking one integration pattern and applying it everywhere. It’s about understanding trade-offs and choosing the right tool for the job. For high-frequency, repeated workflows against a known API, a good CLI isn’t simply competitive with MCP; it’s categorically better.

Build the MCP server for the demo. Ship the CLI for production.

I'm writing more about CLI-first agent patterns and shipping tools for Attio. Subscribe to follow along.

2× per month, pure signal, zero fluff.

