The Architect’s Dilemma

The agentic AI landscape is exploding. Every new framework, demo, and announcement promises to let your AI assistant book flights, query databases, and manage calendars. This rapid advancement of capabilities is thrilling for users, but for the architects and engineers building these systems, it poses a fundamental question: When should a new capability be a simple, predictable tool (exposed via the Model Context Protocol, MCP) and when should it be a sophisticated, collaborative agent (exposed via the Agent2Agent Protocol, A2A)?

The common advice is often circular and unhelpful: “Use MCP for tools and A2A for agents.” This is like telling a traveler that cars use motorways and trains use tracks, without offering any guidance on which is better for a specific journey. This lack of a clear mental model leads to architectural guesswork. Teams build complex conversational interfaces for tasks that demand rigid predictability, or they expose rigid APIs to users who desperately need guidance. The outcome is often the same: a system that looks great in demos but falls apart in the real world.

In this article, I argue that the answer isn’t found by analyzing your service’s internal logic or technology stack. It’s found by looking outward and asking a single, fundamental question: Who is calling your product/service? By reframing the problem this way—as a user experience challenge first and a technical one second—the architect’s dilemma evaporates.

This essay draws a line where it matters for architects: the line between MCP tools and A2A agents. I will introduce a clear framework, built around the “Vending Machine Versus Concierge” model, to help you choose the right interface based on your consumer’s needs. I will also explore failure modes, testing, and the powerful Gatekeeper Pattern that shows how these two interfaces can work together to create systems that are not just clever but truly reliable.

Two Very Different Interfaces

MCP presents tools: named operations with declared inputs and outputs. The caller (a person, program, or agent) must already know what it wants and provide a complete payload. The tool validates, executes once, and returns a result. If your mental image is a vending machine (insert a well-formed request, get a deterministic response), you’re close enough.

A2A presents agents—goal-first collaborators that converse, plan, and act across turns. The caller expresses an outcome (“book a refundable flight under $450”), not an argument list. The agent asks clarifying questions, calls tools as needed, and holds onto session state until the job is done. If you picture a concierge—interacting, negotiating trade-offs, and occasionally escalating—you’re in the right neighborhood.

Neither interface is “better.” They are optimized for different situations:

MCP is fast to reason about, easy to test, and strong on determinism and auditability.
A2A is built for ambiguity, long-running processes, and preference capture.

Bringing the Interfaces to Life: A Booking Example

To see the difference in practice, let’s imagine a simple task: booking a specific meeting room in an office.

The MCP “vending machine” expects a perfectly structured, machine-readable request for its book_room_tool. The caller must provide all necessary information in a single, valid payload:

{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "book_room_tool",
    "arguments": {
      "room_id": "CR-104B",
      "start_time": "2025-11-05T14:00:00Z",
      "end_time": "2025-11-05T15:00:00Z",
      "organizer": "user@example.com"
    }
  }
}

Any deviation—a missing field or incorrect data type—results in an immediate error. This is the vending machine: You provide the exact code of the item you want (e.g., “D4”) or you get nothing.
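To make the vending-machine behavior concrete, here is a minimal sketch of how such a tool might reject bad input. It is an illustration under assumptions, not the MCP SDK: the function names, the REQUIRED_FIELDS table, and the {"ok": ..., "errors": ...} return shape are all hypothetical, and only the four argument names come from the payload above.

```python
from datetime import datetime

# Hypothetical schema for book_room_tool: argument name -> expected type.
REQUIRED_FIELDS = {
    "room_id": str,
    "start_time": str,
    "end_time": str,
    "organizer": str,
}


def parse_timestamp(value: str) -> datetime:
    # Accept a trailing "Z" on Python versions where fromisoformat does not.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def validate_arguments(arguments: dict) -> list:
    """Return a list of problems; an empty list means the payload is valid."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in arguments:
            problems.append(f"missing field: {field}")
        elif not isinstance(arguments[field], expected_type):
            problems.append(f"wrong type for {field}")
    for field in arguments:
        if field not in REQUIRED_FIELDS:
            problems.append(f"unexpected field: {field}")
    # Only inspect the timestamps once the basic shape is right.
    if not problems:
        try:
            start = parse_timestamp(arguments["start_time"])
            end = parse_timestamp(arguments["end_time"])
            if end <= start:
                problems.append("end_time must be after start_time")
        except ValueError:
            problems.append("timestamps must be ISO 8601")
    return problems


def book_room_tool(arguments: dict) -> dict:
    """Validate, execute once, return. The tool never asks a follow-up question."""
    problems = validate_arguments(arguments)
    if problems:
        return {"ok": False, "errors": problems}
    # A real tool would write to a booking system here (assumed, not shown).
    return {"ok": True, "booking": dict(arguments)}
```

The design point is that the tool has exactly two outcomes, a result or an error list, and both are fully determined by the payload.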

The A2A “concierge,” an “office assistant” agent, is approached with a high-level, ambiguous goal. It uses conversation to resolve ambiguity:

User: “Hey, can you book a room for my 1-on-1 with Alex tomorrow afternoon?”
Agent: “Of course. To make sure I get the right one, what time works best, and how long will you need it for?”

The agent’s job is to take the ambiguous goal, gather the necessary details, and then likely call the MCP tool behind the scenes once it has a complete, valid set of arguments.
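The gather-then-call loop described above can be sketched as simple slot filling. This is a rough illustration, not the A2A protocol itself: the OfficeAssistant class, the QUESTIONS table, and the call_tool callback are hypothetical, and I assume the details arrive already extracted from the user's message (in practice a language model would do that extraction).

```python
# Arguments the downstream tool needs before it can be called.
REQUIRED_SLOTS = ["room_id", "start_time", "end_time", "organizer"]

# Hypothetical clarifying questions, one per missing argument.
QUESTIONS = {
    "room_id": "Which room would you like?",
    "start_time": "What time should the booking start?",
    "end_time": "When should it end?",
    "organizer": "Who is the organizer?",
}


class OfficeAssistant:
    """Keeps session state across turns and calls the tool only when ready."""

    def __init__(self, call_tool):
        self.call_tool = call_tool  # callable taking (tool_name, arguments)
        self.slots = {}             # details gathered so far in this session

    def handle(self, new_details: dict) -> dict:
        self.slots.update(new_details)
        missing = [slot for slot in REQUIRED_SLOTS if slot not in self.slots]
        if missing:
            # Still ambiguous: ask about the first missing detail.
            return {"type": "question", "text": QUESTIONS[missing[0]]}
        # Complete and unambiguous: make one precise tool call.
        result = self.call_tool("book_room_tool", dict(self.slots))
        return {"type": "result", "result": result}
```

Each call to handle represents one conversational turn. The tool is never invoked with a partial payload, which keeps the strict contract on the tool side intact.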

With this clear dichotomy established—the predictable vending machine (MCP) versus the stateful concierge (A2A)—how do we choose? As I argued in the introduction, the answer isn’t found in your tech stack. It’s found by asking the most important architectural question of all: Who is calling your service?

Step 1: Identify your consumer

The machine consumer: A need for predictability
Is your service going to be called by another automated system, a script, or another agent acting in a purely deterministic capacity? This consumer requires absolute predictability. It needs a rigid, unambiguous contract that can be scripted and relied upon to behave the same way every single time. It cannot handle a clarifying question or an unexpected update; any deviation from the strict contract is a failure. This consumer doesn’t want a conversation; it needs a vending machine. This nonnegotiable requirement for a predictable, stateless, and transactional interface points directly to designing your service as a tool (MCP).
The human (or agentic) consumer: A need for convenience
Is your service being built for a human end user or for a sophisticated AI that’s trying to fulfill a complex, high-level goal? This consumer values convenience and the offloading of cognitive load. They don’t want to specify every step of a process; they want to delegate ownership of a goal and trust that it will be handled. They’re comfortable with ambiguity because they expect the service—the agent—to resolve it on their behalf. This consumer doesn’t want to follow a rigid script; they need a concierge. This requirement for a stateful, goal-oriented, and conversational interface points directly to designing your service as an agent (A2A).

Once you start with the consumer, the architect’s dilemma often evaporates. Before you ever debate statefulness or determinism, you first define the user experience you are obligated to provide. In most cases, identifying your consumer will give you your answer.

Step 2: Validate with the four factors

Once you have identified who calls your service, you have a strong hypothesis for your design. A machine consumer points to a tool; a human or agentic consumer points to an agent. The next step is to validate this hypothesis with a technical litmus test. This framework gives you the vocabulary to justify your choice and ensure the underlying architecture matches the user experience you intend to create.

Determinism versus ambiguity
Does your service require a precise, unambiguous input, or is it designed to interpret and resolve ambiguous goals? A vending machine is deterministic. Its API is rigid: GET /item/D4. Any other request is an error. This is the world of MCP, where a strict schema ensures predictable interactions. A concierge handles ambiguity. “Find me a nice place for dinner” is a valid request that the agent is expected to clarify and execute. This is the world of A2A, where a conversational flow allows for clarification and negotiation.
Simple execution versus complex process
Is the interaction a single, one-shot execution, or a long-running, multistep process? A vending machine performs a short-lived execution. The entire operation—from payment to dispensing—is an atomic transaction that is over in seconds. This aligns with the synchronous-style, one-shot model of MCP. A concierge manages a process. Booking a full travel itinerary might take hours or even days, with multiple updates along the way. This requires the asynchronous, stateful nature of A2A, which can handle long-running tasks gracefully.
Stateless versus stateful
Does each request stand alone or does the service need to remember the context of previous interactions? A vending machine is stateless. It doesn’t remember that you bought a candy bar five minutes ago. Each transaction is a blank slate. MCP is designed for these self-contained, stateless calls. A concierge is stateful. It remembers your preferences, the details of your ongoing request, and the history of your conversation. A2A is built for this, using concepts like a session or thread ID to maintain context.
Direct control versus delegated ownership
Is the consumer orchestrating every step, or are they delegating the entire goal? When using a vending machine, the consumer is in direct control. You are the orchestrator, deciding which button to press and when. With MCP, the calling application retains full control, making a series of precise function calls to achieve its own goal. With a concierge, you delegate ownership. You hand over the high-level goal and trust the agent to manage the details. This is the core model of A2A, where the consumer offloads the cognitive load and trusts the agent to deliver the outcome.

Factor	Tool (MCP)	Agent (A2A)	Key question
Determinism	Strict schema; errors on deviation	Clarifies ambiguity via dialogue	Can inputs be fully specified up front?
Process	One-shot	Multi-step/long-running	Is this atomic or a workflow?
State	Stateless	Stateful/sessionful	Must we remember context/preferences?
Control	Caller orchestrates	Ownership delegated	Who drives: the caller or callee?

Table 1: Four-question framework

These factors are not independent checkboxes; they are four facets of the same core principle. A service that is deterministic, transactional, stateless, and directly controlled is a tool. A service that handles ambiguity, manages a process, maintains state, and takes ownership is an agent. By using this framework, you can validate that the technical architecture of your service matches the needs of your consumer.
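The litmus test can be written down as a small checklist. The sketch below is only one way to encode Table 1; the ServiceProfile fields and the recommend_interface function are hypothetical names, and the rule that mixed answers send you back to Step 1 is my reading of the framework rather than a formal algorithm.

```python
from dataclasses import dataclass


@dataclass
class ServiceProfile:
    """One yes/no answer per key question in Table 1."""
    inputs_fully_specified_up_front: bool  # determinism
    single_atomic_operation: bool          # process
    no_context_between_calls: bool         # state
    caller_orchestrates: bool              # control


def recommend_interface(profile: ServiceProfile) -> str:
    tool_votes = sum([
        profile.inputs_fully_specified_up_front,
        profile.single_atomic_operation,
        profile.no_context_between_calls,
        profile.caller_orchestrates,
    ])
    if tool_votes == 4:
        return "tool (MCP)"
    if tool_votes == 0:
        return "agent (A2A)"
    # The factors should agree; disagreement suggests the consumer
    # was not identified clearly in Step 1.
    return "mixed: revisit who the consumer is"
```

For example, a currency converter answers yes to all four questions and lands on a tool, while a trip planner answers no to all four and lands on an agent.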

No framework, no matter how clear…

…can perfectly capture the messiness of the real world. While the “Vending Machine Versus Concierge” model provides a robust guide, architects will eventually encounter services that seem to blur the lines. The key is to remember the core principle we’ve established: The choice is dictated by the consumer’s experience, not the service’s internal complexity.

Let’s explore two common edge cases.

The complex tool: The iceberg
Consider a service that performs a highly complex, multistep internal process, like a video transcoding API. A consumer sends a video file and a desired output format. This is a simple, predictable request. But internally, this one call might kick off a massive, long-running workflow involving multiple machines, quality checks, and encoding steps. It’s a hugely complex process.

However, from the consumer’s perspective, none of that matters. They made a single, self-contained call. They don’t need to manage the process; they just need a predictable result. This service is like an iceberg: 90% of its complexity is hidden beneath the surface. But because its external contract is that of a vending machine (a simple, deterministic, one-shot transaction), it is, and should be, implemented as a tool (MCP).

The simple agent: The scripted conversation
Now consider the opposite: a service with very simple internal logic that still requires a conversational interface. Imagine a chatbot for booking a dentist appointment. The internal logic might be a simple state machine: ask for a date, then a time, then a patient name. It’s not “intelligent” or particularly flexible.

However, it must remember the user’s previous answers to complete the booking. It’s an inherently stateful, multiturn interaction. The consumer cannot provide all the required information in a single, prevalidated call. They need to be guided through the process. Despite its internal simplicity, the need for a stateful dialogue makes it a concierge. It must be implemented as an agent (A2A) because its consumer-facing experience is that of a conversation, however scripted.
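To show how little intelligence a “concierge” needs, here is a sketch of the scripted dentist conversation as a fixed sequence of questions. The DentistBookingBot class and its prompts are hypothetical; the only thing taken from the text is the order of questions: date, then time, then patient name.

```python
# The script: (field to fill, question to ask), in a fixed order.
STEPS = [
    ("date", "What date would you like?"),
    ("time", "What time works for you?"),
    ("patient_name", "What is the patient's name?"),
]


class DentistBookingBot:
    """A scripted, stateful conversation with no planning or reasoning."""

    def __init__(self):
        self.answers = {}  # remembered across turns; this is the state

    def start(self) -> str:
        return STEPS[0][1]

    def is_done(self) -> bool:
        return len(self.answers) == len(STEPS)

    def reply(self, user_text: str) -> str:
        if self.is_done():
            return "Your appointment is already booked."
        # Store the answer to the question that is currently pending.
        field, _ = STEPS[len(self.answers)]
        self.answers[field] = user_text.strip()
        if not self.is_done():
            return STEPS[len(self.answers)][1]
        return "Booked {patient_name} on {date} at {time}.".format(**self.answers)
```

The logic is trivial, yet the object cannot be replaced by a single stateless call, because each reply only makes sense in light of the answers already stored.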

These gray areas reinforce the framework’s central lesson. Don’t get distracted by what your service does internally. Focus on the experience it provides externally. That contract with your consumer is the ultimate arbiter in the architect’s dilemma.

Testing What Matters: Different Strategies for Different Interfaces

A service’s interface doesn’t just dictate its design; it dictates how you validate its correctness. Vending machines and concierges have fundamentally different failure modes and require different testing strategies.

Testing MCP tools (vending machines):

Contract testing: Validate that inputs and outputs strictly adhere to the defined schema.
Idempotency tests: Ensure that calling the tool multiple times with the same inputs produces the same result and causes no additional side effects beyond the first call.
Deterministic logic tests: Use standard unit and integration tests with fixed inputs and expected outputs.
Adversarial fuzzing: Test for security vulnerabilities by providing malformed or unexpected arguments.
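Two of these checks, contract testing and idempotency testing, can be sketched in a few lines. The RoomBookingTool below is a hypothetical in-memory stand-in so that the example is self-contained; a real suite would target your actual tool, and the test function names follow the pytest convention only by assumption.

```python
class RoomBookingTool:
    """Hypothetical in-memory stand-in for a booking tool."""
    REQUIRED = ("room_id", "start_time", "end_time", "organizer")

    def __init__(self):
        self.bookings = {}  # (room, start, end) -> organizer

    def call(self, arguments: dict) -> dict:
        missing = [f for f in self.REQUIRED if f not in arguments]
        if missing:
            return {"ok": False, "errors": [f"missing field: {f}" for f in missing]}
        key = (arguments["room_id"], arguments["start_time"], arguments["end_time"])
        # Keying on the slot makes a repeated identical call harmless.
        owner = self.bookings.setdefault(key, arguments["organizer"])
        if owner != arguments["organizer"]:
            return {"ok": False, "errors": ["slot already booked"]}
        return {"ok": True, "booking_key": list(key)}


VALID = {
    "room_id": "CR-104B",
    "start_time": "2025-11-05T14:00:00Z",
    "end_time": "2025-11-05T15:00:00Z",
    "organizer": "user@example.com",
}


def test_contract_rejects_missing_field():
    tool = RoomBookingTool()
    incomplete = {k: v for k, v in VALID.items() if k != "organizer"}
    result = tool.call(incomplete)
    assert result == {"ok": False, "errors": ["missing field: organizer"]}
    assert tool.bookings == {}  # a rejected call must leave no trace


def test_repeat_call_is_idempotent():
    tool = RoomBookingTool()
    first = tool.call(VALID)
    second = tool.call(VALID)
    assert first == second           # same result
    assert len(tool.bookings) == 1   # no additional side effect
```

Because the tool is deterministic, both tests use fixed inputs and exact expected outputs, which is precisely what makes this side of the system cheap to verify.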

Testing A2A agents (concierges):

Goal completion rate (GCR): Measure the percentage of conversations where the agent successfully achieved the user’s high-level goal.
Conversational efficiency: Track the number of turns or clarifications required to complete a task.
Tool selection accuracy: For complex agents, verify that the right MCP tool was chosen for a given user request.
Conversation replay testing: Use logs of real user interactions as a regression suite to ensure updates don’t break existing conversational flows.
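The first two metrics are simple aggregates over conversation logs. The sketch below assumes a hypothetical ConversationLog record with a turn count and a success flag; how you decide that a goal was achieved (human labels, a confirmation step, a downstream booking record) is outside the scope of this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class ConversationLog:
    """Hypothetical summary of one finished conversation."""
    conversation_id: str
    turns: int
    goal_achieved: bool


def goal_completion_rate(logs: List[ConversationLog]) -> float:
    """Fraction of conversations in which the user's goal was achieved."""
    if not logs:
        return 0.0
    return sum(log.goal_achieved for log in logs) / len(logs)


def average_turns_to_success(logs: List[ConversationLog]) -> Optional[float]:
    """Mean number of turns across successful conversations only."""
    successful = [log.turns for log in logs if log.goal_achieved]
    return mean(successful) if successful else None
```

Tracking both together matters: a change that raises the completion rate by doubling the number of clarifying questions may not be an improvement for the user.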

The Gatekeeper Pattern

Our journey so far has focused on a dichotomy: MCP or A2A, vending machine or concierge. But the most sophisticated and robust agentic systems do not force a choice. Instead, they recognize that these two protocols don’t compete with each other; they complement each other. The ultimate power lies in using them together, with each playing to its strengths.

The most effective way to achieve this is through a powerful architectural choice we can call the Gatekeeper Pattern.

In this pattern, a single, stateful A2A agent acts as the primary, user-facing entry point—the concierge. Behind this gatekeeper sits a collection of discrete, stateless MCP tools—the vending machines. The A2A agent takes on the complex, messy work of understanding a high-level goal, managing the conversation, and maintaining state. It then acts as an intelligent orchestrator, making precise, one-shot calls to the appropriate MCP tools to execute specific tasks.

Consider a travel agent. A user interacts with it via A2A, giving it a high-level goal: “Plan a business trip to London for next week.”

The travel agent (A2A) accepts this ambiguous request and starts a conversation to gather details (exact dates, budget, etc.).
Once it has the necessary information, it calls a flight_search_tool (MCP) with precise arguments like origin, destination, and date.
It then calls a hotel_booking_tool (MCP) with the required city, check_in_date, and room_type.
Finally, it might call a currency_converter_tool (MCP) to provide expense estimates.

Each tool is a simple, reliable, and stateless vending machine. The A2A agent is the smart concierge that knows which buttons to press and in what order. This pattern provides several significant architectural benefits:

Decoupling: It separates the complex, conversational logic (the “how”) from the simple, reusable business logic (the “what”). The tools can be developed, tested, and maintained independently.
Centralized governance: The A2A gatekeeper is the perfect place to implement cross-cutting concerns. It can handle authentication, enforce rate limits, manage user quotas, and log all activity before a single tool is ever invoked.
Simplified tool design: Because the tools are just simple MCP functions, they don’t need to worry about state or conversational context. Their job is to do one thing and do it well, making them incredibly robust.
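A stripped-down sketch of the pattern might look like the following. The Gatekeeper class, its permissions table, and the audit log format are hypothetical, and the tools are passed in as plain callables rather than real MCP clients; only the tool names and argument names come from the travel example above.

```python
class Gatekeeper:
    """User-facing orchestrator that fronts a set of stateless tools."""

    def __init__(self, tools: dict, permissions: dict):
        self.tools = tools              # tool name -> callable(arguments)
        self.permissions = permissions  # user -> set of allowed tool names
        self.audit_log = []             # every attempted call is recorded

    def call_tool(self, user: str, name: str, arguments: dict):
        allowed = name in self.permissions.get(user, set())
        self.audit_log.append({"user": user, "tool": name, "allowed": allowed})
        if not allowed:
            raise PermissionError(f"{user} may not call {name}")
        return self.tools[name](arguments)

    def plan_trip(self, user: str, details: dict) -> dict:
        # By this point the conversation has already resolved the details.
        flights = self.call_tool(user, "flight_search_tool", {
            "origin": details["origin"],
            "destination": details["destination"],
            "date": details["date"],
        })
        hotel = self.call_tool(user, "hotel_booking_tool", {
            "city": details["destination"],
            "check_in_date": details["date"],
            "room_type": details["room_type"],
        })
        return {"flights": flights, "hotel": hotel}
```

Note where the policy lives: the permission check and the audit entry sit in one method, so no individual tool has to know who the user is.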

Making the Gatekeeper Production-Ready

Beyond its design benefits, the Gatekeeper Pattern is the ideal place to implement the operational guardrails required to run a reliable agentic system in production.

Observability: Each A2A conversation generates a unique trace ID. This ID must be propagated to every downstream MCP tool call, allowing you to trace a single user request across the entire system. Structured logs for tool inputs and outputs (with PII redacted) are critical for debugging.
Guardrails and security: The A2A Gatekeeper acts as a single point of enforcement for critical policies. It handles authentication and authorization for the user, enforces rate limits and usage quotas, and can maintain a list of which tools a particular user or group is allowed to call.
Resilience and fallbacks: The Gatekeeper must gracefully manage failure. When it calls an MCP tool, it should implement patterns like timeouts, retries with exponential backoff, and circuit breakers. Critically, it is responsible for the final failure state—escalating to a human in the loop for review or clearly communicating the issue to the end user.
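Two of these guardrails, trace propagation and retries with exponential backoff, fit in a short sketch. Everything here is an assumption made for illustration: the TransientToolError and ToolCallFailed exceptions, the trace_id keyword argument on the tool, and the default delays. A circuit breaker and timeouts are omitted to keep the example small.

```python
import time
import uuid


class TransientToolError(Exception):
    """Hypothetical error a tool raises for failures worth retrying."""


class ToolCallFailed(Exception):
    """Raised when the Gatekeeper gives up; the caller must handle it."""


def new_trace_id() -> str:
    # One ID per conversation, reused for every downstream tool call.
    return uuid.uuid4().hex


def call_with_retries(tool, arguments, trace_id,
                      max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call a tool, retrying transient failures with exponential backoff."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            # The same trace ID accompanies every attempt.
            return tool(arguments, trace_id=trace_id)
        except TransientToolError as error:
            last_error = error
            if attempt < max_attempts - 1:
                # Delays grow as base_delay * 1, 2, 4, ...
                sleep(base_delay * (2 ** attempt))
    raise ToolCallFailed(
        f"trace {trace_id}: gave up after {max_attempts} attempts"
    ) from last_error
```

When ToolCallFailed surfaces, the Gatekeeper owns the final failure state: it can escalate to a human or explain the problem to the user, and the trace ID in the message ties that explanation back to the logs.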

The Gatekeeper Pattern is the ultimate synthesis of our framework. It uses A2A for what it does best—managing a stateful, goal-oriented process—and MCP for what it was designed for—the reliable, deterministic execution of a task.

Conclusion

We began this journey with a simple but frustrating problem: the architect’s dilemma. Faced with the circular advice that “MCP is for tools and A2A is for agents,” we were left in the same position as a traveler trying to get to Edinburgh—knowing that cars use motorways and trains use tracks but with no intuition on which to choose for our specific journey.

The goal was to build that intuition. We did this not by accepting abstract labels, but by reasoning from first principles. We dissected the protocols themselves, revealing how their core mechanics inevitably lead to two distinct service profiles: the predictable, one-shot “vending machine” and the stateful, conversational “concierge.”

With that foundation, we established a clear, two-step framework for a confident design choice:

Start with your consumer. The most critical question is not a technical one but an experiential one. A machine consumer needs the predictability of a vending machine (MCP). A human or agentic consumer needs the convenience of a concierge (A2A).
Validate with the four factors. Use the litmus test of determinism, process, state, and ownership to technically justify and solidify your choice.

Ultimately, the most robust systems will synthesize both, using the Gatekeeper Pattern to combine the strengths of a user-facing A2A agent with a suite of reliable MCP tools.

The choice is no longer a dilemma. By focusing on the consumer’s needs and understanding the fundamental nature of the protocols, architects can move from confusion to confidence, designing agentic ecosystems that are not just functional but also intuitive, scalable, and maintainable.
