Why MCP is the future, like reading a restaurant's menu

Whenever I tell people that MCP is here to stay, the first thing they ask me is: “Why do you think so?”

Martin, formerly a CTO at VMWare, is now a partner at a16z. His tweet refers to AI and MCP.

MCP vs. function calls and WWW

Everyone is familiar with HTTP or the WWW today. Even if they don't know what the acronyms stand for; they understand it's an internet concept.
Both are the foundation of the internet which let us browse web sites.

But now with AI booming, and MCP (Model Context Protocol) becoming a real thing - turning any LLM into an agent that is now capable to operate or use tools its integrated with.

It's true that MCP servers are normally small wrappers around existing cloud APIs, but they can be a fully standalone server with unique functionality, like a filesystem server that can manipulate files on your desktop.

Similarly to how HTTP boomed in the 90's but people weren't sure where it's heading, we are seeing a rise in MCP servers.
While most MCPs still run on desktops, we will soon see more official servers running on the internet. Several companies, such as deepwiki, already host official MCP servers.

MCP is a method for extending the prompt of a LLM.
Extending is simply taking a normal prompt the user writes and add content to it.
Technically, it will look like <user prompt><extra content received from an MCP server>.
A prompt can be 'what's the weather in NYC', and the content is the already fetched text description from the weather website with all the info.
This is where MCP extends the model with extra data from an external source!
And the final result will be a summary or just the fahrenheit today in NYC...

But unfortunately, it is often misunderstood and confused with function calls.
Function calling is a method for LLMs to generate code dynamically and immediately use it to invoke APIs.
And it's time to put an end to this misunderstanding.

To make things clear, let’s dive in, while keeping it simple.

What is MCP?

Picture yourself walking into a Chinese restaurant you’ve never been to before. You open the menu, and even though you’ve never seen this exact one, you probably know what you want to order. That’s because you already understand how menus work and what to do next.

MCP works in a similar way. It’s a plug-and-play system for connecting tools to LLMs. An agentic LLM (one that can access external tools and take actions) can instantly connect to any integrated service and start using it—just like you can walk into any restaurant and navigate the menu without extra instructions. It does the same thing.

MCP = HTTP + Documentation + APIs

Unlike traditional function calls by LLMs or standard APIs, MCP servers allow a client to textually, as in natural language, understand what tools are available and what they do.
When a client connects to a server, it’s presented with a “menu” of supported tools (essentially APIs over JSONRPC). From there, it can decide how, when, and whether to use them.

MCP client architecture

The architecture diagram below illustrates how this works:

Let’s walk through the data flow step by step.

Connection phase:
When an MCP client (like Cursor or anything using LLM) connects to an MCP server, it first goes through handshaking and authentication, after which it receives the full list of available tools.
MCP pre-processing:
When a user submits a prompt, the LLM not only sees the input but also the list of tools exposed by the integrated MCP. Based on the prompt, the LLM decides whether it needs to use any of those tools. If it does, the client observes that the LLM has chosen to call an MCP tool using a structured, proprietary format shared between the LLM and the client. The LLM then prepares the tool call with the proper arguments.
Execution:
The MCP client forwards this tool call to the server, which performs the intended action. For example, accessing a third-party service on the user’s behalf and retrieving a response.
MCP post-processing:
That response, which can be of any type, is passed back to the LLM, which processes it and generates the final output for the user.

Of course, this is just one common implementation pattern—other application builders can design it differently, as long as they follow the official MCP standard.

The key idea is that for every tool call, an LLM is now in the loop both before and after: it decides how to make the call and then interprets the response.

To be clear - an LLM needs to pre-know how an API looks like (imagine, reading the documentation of the API) in order to call it (that's the function call way). And with MCP, it learns how an MCP tool (API) looks like by getting its information and signature from the MCP server!

Why is this so powerful? Well, it’s a double-edged sword:
On the one hand, an LLM can figure out how to interact with a system it has never seen before—effectively removing the need for traditional API or client-server integrations.
On the other hand, because it also analyzes the response, it becomes vulnerable to prompt injection attacks, which are notoriously difficult to guard against.
And gonna break the internet one day, mark my words, and get some 🍿.

The downside of using MCP

Using MCP is amazing but it comes with some dire costs (literally too), let's break them down:

Performance: Calling an API with MCP, even for a simple integer addition, requires a Large Language Model (LLM) to perform billions of GPU calculations. This translates into significant energy costs for any MCP-remote operation. Thus paying more in $$$ terms.
Latency: Powerful GPUs are a prerequisite for standard large language models (LLMs), but endpoint machines currently lack them. This necessitates routing requests to remote servers, such as OpenAI, which introduces network latency due to roundtrips. However, it's anticipated that smaller models (SLMs) will eventually be capable of running on edge devices, enabling the local execution of limited tasks.
Mistakes: A significant hurdle in deploying agents in customer-facing environments remains their inclination for errors, such as accidental data deletion or unintended data sharing can really happen.
This inherent risk stems from the non-deterministic nature of their underlying models, despite this characteristic also being a key advantage to make them come up with creative results.
But sometimes it’s too much, perhaps you have heard about the known issue with “How many R’s in Strawberry?” to which models would respond with ‘2’.

Or another known issue now is seeing how floating points are handled:

While certain tasks are sometimes easier for CPUs, the reliance on AI models for more significant and critical functions is somewhat concerning, hence human-in-the-loop is still required for the safety of critical tasks. And probably one day, this human-in-the-loop will actually be a strong AI model that we can trust. But the design is solid for supervising what’s going on.
Security: Due to the unified nature of prompts and contexts in LLMs—where all information is treated as a single block of text—an untrusted context, such as content from an email that you want to reply to, can manipulate a prompt's intended output. This poses a significant security risk, as it allows malicious actors to command the LLM-powered agent to perform harmful actions. Note that this is not a theoretical attack, in fact it’s dangerously too easy to exploit it right now.

Unfortunately, using MCP introduces additional security concerns, such as supply chain attacks, because users are required to run MCP servers on their own machines if they want to use certain tools, and some of these local servers are already found to be vulnerable in the wild with known CVEs.

By now, we hope, you share our excitement for MCP - developers can integrate whole systems in minutes, and agents can now access anything remote and operate it. But with it comes these risks that cannot be ignored.

At Autonomous Security, our mission is twofold: to simplify MCP accessibility and to bolster its security for enterprises.
You can begin chatting with your third-party accounts and exploring other features by creating a free account at mcptotal.io.
And finally, we prioritize enterprise security through our MCP gateway, desktop application, and endpoint visibility capabilities.

MCP Enters a Chinese Restaurant

MCP vs. function calls and WWW

What is MCP?

MCP client architecture

The downside of using MCP