
Best Practices for Building an MCP Server

 


  • Don't think of an MCP Server as a wrapper over REST or any other API. This is the #1 thing to keep in mind. Let me confess that when I first started, I fell into this trap, and I would blame the framework I chose first: fastapi_mcp. Its catchphrase "Expose your FastAPI endpoints as Model Context Protocol (MCP) tools, with Auth!" set the course for how I went about building my MCP server, until I realized that the seeming productivity didn't yield the desired result from my LLM Agent. I paused and reflected, to no avail. After long, hard, back-and-forth attempts, I started building an MCP server with fastmcp, which helped rewire my thought process.
    So what exactly do I mean by "do not see an MCP service as a wrapper for a REST service"? I mean: do NOT aim for a one-to-one mapping of APIs between the MCP and REST services. Designing REST APIs involves modeling the domain as RESOURCES and leveraging the various HTTP verbs as CRUD actions on those resources. Designing MCP APIs involves thinking in terms of Useful Actions that an LLM can perform, showcased as its capabilities; each such capability in LLM parlance is a tool or function, which is why you hear the terms "function calling" and "tool calling" used interchangeably. This also means MCP service APIs are higher-level abstractions over REST service APIs.
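To sketch the contrast (every endpoint and tool name below is hypothetical): the REST surface enumerates resources, while the MCP surface enumerates a smaller set of things an agent can do.

```python
# Illustrative contrast between a resource-oriented REST surface and an
# intent-oriented MCP surface over the same order domain. All names are
# made up for illustration.

REST_ENDPOINTS = [
    "GET /customers/{id}",
    "GET /customers/{id}/orders",
    "GET /orders/{id}",
    "POST /orders/{id}/cancellations",
]

# The MCP surface is smaller and phrased as actions an agent can take:
MCP_TOOLS = [
    "find_my_pending_orders",  # internally: GET customer + GET orders
    "cancel_order",            # internally: GET order + POST cancellation
]
```

Note how each tool quietly folds more than one REST call behind a single user intent.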
  • Don't be overwhelmed by the MCP Specification, and don't overlook its evolution. The MCP Specification is new and evolving fast, from version 2024-11-05 to 2025-06-18, the latest at the time of writing. Don't be overwhelmed by community experts and job postings that scream of years of experience in this. Pick a few libraries and do PoCs to get comfortable with it and build a better understanding of the ecosystem. If the MCP Specification is at a nascent stage, so are the frameworks and libraries implementing it. Choose your framework based on your comfort, tech stack, and the complexity of your AI Agent architecture. Some pointers below that might help you:
    • Gradio comes with the capability to build MCP servers. It provides a complete suite of MCP Server features out of the box that you may want to leverage. Gradio also natively supports talking to an MCP Server built with its library.
    • FastMCP is purpose-built from the ground up, not just for MCP Servers but also for MCP Clients. When you have an MCP Server, you will also need an MCP Client that talks to it without much hassle, and this framework has you covered there.
    • FastAPI-MCP is also under active development to keep up with the MCP Spec's evolution. Just so you know, it doesn't ship an MCP Client that you can leverage to talk to its MCP Server.
  • Do always think that the consumer of your MCP service is an LLM Agent, not a human. This should prompt you to ask, "What capabilities (tools) can I expose for an agent to use in fulfilling a human user's goal?" Beware: the Agent is the consumer of the service, not the human user.
  • Do document function definitions in an explicit and lucid style. The cleaner and richer the schema definition of a function or capability, the more meaningful it is for an LLM to deduce it as a capability and thus know when to call it based on the user's intent. For example, an MCP server declares a `find_orders(status)` tool. The agent discovers this tool and its `status` parameter automatically. New tools can be added to the server, and the agent discovers them on its next connection without any code changes on the client.
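A minimal sketch of what "explicit and lucid" can look like in plain Python. Frameworks like FastMCP derive the tool's parameter schema from type hints and the docstring, so this is exactly the text the LLM will reason over; the function body and data here are stubs for illustration.

```python
# Hypothetical tool definition: precise types plus a docstring that says
# when to use the tool and what it returns. The data store is stubbed.
from typing import Literal

def find_orders(status: Literal["pending", "shipped", "delivered"]) -> list[dict]:
    """Find the current user's orders filtered by fulfilment status.

    Use this when the user asks about their orders, e.g. "show my
    pending orders". Returns a list of orders, each with 'id',
    'status', and 'total' fields.
    """
    orders = [
        {"id": 1, "status": "pending", "total": 9.99},
        {"id": 2, "status": "shipped", "total": 24.50},
    ]
    return [o for o in orders if o["status"] == status]
```

The `Literal` type constrains the agent to valid status values instead of leaving it to guess free-form strings.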
  • Do understand that MCP isn't yet another API style like REST or SOAP. REST has long been the de-facto standard for communication between heterogeneous systems, but MCP's purpose is different: it is focused solely on LLM Agents. So do not mimic the rich domain modeling of REST APIs, with their resource hierarchies and parameters that can be nested and bloated due to domain-modeling abstractions. MCP APIs are a dynamic, flat list of callable functions (tools) and accessible data streams (resources). Expect only a small percentage of your REST APIs to surface as MCP APIs, because with MCP you are working at a much higher level of abstraction: user intent. This means you will typically make multiple REST API calls for every MCP API.
  • Do understand that the LLM Agent is not the system integrator; the glue code that manages API calls shouldn't ideally live at the Agent layer, but within the function definition of an MCP API. Because every function call is an inference of the end-user's intent, it sits at a very high level of abstraction. This translates to the LLM capability, i.e. the MCP API, encompassing multiple calls to REST (or other) APIs and transforming the data into meaningful information that the LLM can process further.
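To make this concrete, here is a minimal sketch of the glue code living inside the tool rather than in the agent. The endpoint paths, field names, and in-memory "REST service" are all hypothetical.

```python
# Hypothetical sketch: one MCP tool orchestrates three REST calls and
# returns a single intent-level answer. FAKE_REST stands in for a real
# HTTP backend; all paths and fields are made up.

FAKE_REST = {
    "/customers?email=jo@example.com": {"id": 42, "name": "Jo"},
    "/customers/42/orders?status=pending": [{"id": 7, "total": 19.99}],
    "/orders/7/shipment": {"eta_days": 3},
}

def rest_get(path: str):
    # Stand-in for an HTTP GET against the underlying REST service.
    return FAKE_REST[path]

def whats_my_customer_waiting_on(email: str) -> dict:
    """MCP tool: answers one user intent by stitching three REST calls."""
    customer = rest_get(f"/customers?email={email}")
    pending = rest_get(f"/customers/{customer['id']}/orders?status=pending")
    shipments = {o["id"]: rest_get(f"/orders/{o['id']}/shipment") for o in pending}
    return {
        "customer": customer["name"],
        "pending_orders": [
            {"order_id": o["id"], "total": o["total"],
             "eta_days": shipments[o["id"]]["eta_days"]}
            for o in pending
        ],
    }
```

The agent sees one capability and one response; the sequencing, joins, and reshaping stay server-side.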
  • Do ensure that your error handling is not just human-readable but also actionable for LLMs. For instance, if a `get_user(name="karthik")` tool returns
    ```json
    {
      "error_code": -99,
      "error_message": "Multiple users found having the name - karthik.",
      "suggested_action": "Please provide additional filters like email or user_id",
      "available_filters": ["email", "user_id", "department"]
    }
    ```
    the LLM Agent can understand not just what went wrong, but what it can do next to fulfill the user's intent.
  • Do implement progressive error handling where your MCP tools can guide agents through disambiguation. Instead of failing on ambiguous input, provide the agent with options to refine its request. Remember, the agent's job is to fulfill user intent, so your tools should be designed to help it succeed, not just report failures. This takes some real getting used to.
  • Do not bloat the response data from an MCP tool; return just enough data, transforming it if required. For example, it is much more cost-efficient for an MCP service to return plain text or markdown instead of raw HTML. This can significantly reduce token usage, thus reducing cost, and also improve the LLM's performance in terms of response latency.
  • Do optimize your responses for LLM comprehension and next actions, not just token efficiency. While returning plain text instead of raw HTML is good advice, the real question is: "What does the agent need to effectively process this information and take the next step?" Sometimes a well-structured JSON response with clear field names enables better agent reasoning than condensed plain text. 
  • Do consider that different tools serve different purposes in an agent workflow. A `search_documents` tool might return brief summaries for discovery, while a `get_document_details` tool might return rich content for processing. Design your response structure to match the tool's role in the agent's workflow.
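A sketch of that split, with hypothetical tool names and a stubbed corpus: the discovery tool returns thin summaries, the detail tool returns the full content for the document the agent picked.

```python
# Illustrative pair of tools over the same corpus, each shaped for its
# role in the agent workflow. Names and documents are made up.

DOCS = {
    "doc-1": {"title": "Refund policy", "body": "Refunds are issued within 14 days..."},
    "doc-2": {"title": "Shipping FAQ", "body": "Standard shipping takes 3-5 days..."},
}

def search_documents(query: str) -> list[dict]:
    """Discovery tool: brief summaries so the agent can pick a document."""
    q = query.lower()
    return [
        {"doc_id": doc_id, "title": d["title"]}
        for doc_id, d in DOCS.items()
        if q in d["title"].lower() or q in d["body"].lower()
    ]

def get_document_details(doc_id: str) -> dict:
    """Processing tool: the full content for the chosen document."""
    d = DOCS[doc_id]
    return {"doc_id": doc_id, "title": d["title"], "body": d["body"]}
```

Returning full bodies from the search tool would burn tokens on documents the agent discards; returning only titles from the detail tool would starve it.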
  • Do test your MCP service in a silo to get a feel for how it should work and to iterate quickly. I have used Gradio to quickly build a UI for this manual purpose and also to do a quick demo on HuggingFace. I also use the MCP4Humans extension in the VS Code IDE for manual testing of an MCP Server.
  • Do test your MCP tools with actual LLM interactions, not just API testing tools. Create test scenarios where you give an agent a user intent and verify it can successfully use your tools to achieve that goal. For example, if you have a `find_orders(status)` tool, test with prompts like "Show me my pending orders" and verify the agent correctly maps this intent to the right tool call.
  • Do validate your tool descriptions and parameter schemas by observing how LLMs interpret them. If agents consistently misuse a tool or ignore it when it should be relevant, the problem is often in how you've described the tool's purpose and parameters, not in the LLM's reasoning, assuming you chose your LLM right.
  • Do implement meaningful logging that captures not just what tools were called, but the context of agent decision-making. Log the user's original intent alongside the tool calls, so you can understand if agents are using your tools effectively to fulfill user goals. [If you have better alternative to this, I would love to know!]
  • Do optimize for agent workflow efficiency, not just individual API performance. A tool that takes 200ms but provides complete information is often better than a tool that takes 50ms but requires the agent to make three follow-up calls to get what it needs. Your mileage may vary on this one!