Within the world of AI, there's lots of hype around MCP Servers. What are they and why should you care?
MCP stands for Model Context Protocol. To understand why it matters, we first need to understand how we got here.
In the early days of LLMs (Large Language Models such as ChatGPT and Claude), the model could simply regurgitate what it had been trained on. If information changed after the model was trained, it wouldn't know about it. It simply didn't have the ability to go off and connect to other sources.

Later, tool use was introduced, which allowed an LLM to call external systems via APIs. Products such as Perplexity can even query the web, augmenting what the model already knows with sources from websites. Of course, the LLM needs to be taught how to use each of these APIs: how to format requests and how to interpret results. This meant that each LLM was limited to the tools it had been specifically set up to use.

As the demand for smarter and more context-aware AI increased, a new challenge emerged: how can a model interact with an ever-increasing number of APIs, data models and services?
MCP helps to bridge this gap by standardising how models communicate with the world, including tools, data sources, user histories and even other models. The Model Context Protocol (MCP) is an open standard and open-source framework introduced by Anthropic (the start-up behind Claude) in November 2024.
Instead of training each LLM independently to handle specific APIs, MCP defines a universal structure for exchanging information, allowing models to interact more fluidly with systems and resources.
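
To make that "universal structure" a little more concrete, here is a rough sketch of what the messages can look like. MCP messages are JSON-RPC 2.0, and the protocol defines methods such as `tools/list` (ask a server what it offers) and `tools/call` (run one of those tools). The `make_request` helper, the `get_weather` tool and its `city` argument are made up for illustration, and a real session also involves an initialisation handshake and a transport, which are left out here.

```python
import json
from itertools import count

# Simple counter so each request gets a unique JSON-RPC id.
_ids = count(1)


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request of the kind MCP clients send."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = make_request("tools/list")

# Call one tool by name. "get_weather" and "city" are hypothetical.
call_tool = make_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "London"}},
)

print(json.dumps(call_tool, indent=2))
```

The point is that the shape of the request stays the same whichever tool sits on the other end. Only the tool name and its arguments change.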

Although the diagram above looks a little more complex, you'll see that the LLMs only use the Model Context Protocol to connect to the tools. The grunt work of figuring out how to format requests to the tools and interpret the responses is performed by the MCP Server. This means that the models can plug into new and diverse systems without retraining. Instead of hard-coding how each model integrates with every service, MCP Servers provide a layer of abstraction.
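
As a rough illustration of that layer of abstraction, the sketch below is a toy, in-memory stand-in for an MCP Server. It is not the official SDK and it skips the handshake and transport that a real server needs. The two back-end functions (`get_weather` and `count_rows`) and their canned data are invented for the example. The `tools/list` and `tools/call` method names and the JSON-RPC envelope follow the protocol.

```python
# Hypothetical back ends. In a real server these would call actual APIs.
def get_weather(city):
    return f"Weather for {city}: 18C, cloudy (canned demo data)"


def count_rows(table):
    fake_db = {"customers": 3, "orders": 7}
    return f"{table} has {fake_db.get(table, 0)} rows"


# Registry describing each tool to the model, plus the function behind it.
TOOLS = {
    "get_weather": {
        "description": "Look up the weather for a city",
        "inputSchema": {"type": "object", "properties": {"city": {"type": "string"}}},
        "handler": get_weather,
    },
    "count_rows": {
        "description": "Count the rows in a database table",
        "inputSchema": {"type": "object", "properties": {"table": {"type": "string"}}},
        "handler": count_rows,
    },
}


def _error(request, code, message):
    return {"jsonrpc": "2.0", "id": request.get("id"),
            "error": {"code": code, "message": message}}


def handle(request):
    """Translate one protocol request into a call on the right back end."""
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": [
            {"name": name, "description": tool["description"],
             "inputSchema": tool["inputSchema"]}
            for name, tool in TOOLS.items()
        ]}
    elif method == "tools/call":
        params = request.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(request, -32602, "Unknown tool")
        text = tool["handler"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return _error(request, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}


response = handle({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "count_rows", "arguments": {"table": "orders"}},
})
print(response["result"]["content"][0]["text"])
```

Adding a new system means adding an entry to `TOOLS`. The model's side of the conversation doesn't change at all, which is the whole appeal.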
In a later post, we'll look at how we can use MCP Servers in VS Code to allow Copilot to understand how our database is structured and generate code based on its schema.