Build and Deploy a Custom MCP Server from Scratch

What is MCP and How Does it Work?
You can think of MCP like the USB-C port on a laptop. One port gives you access to multiple functions such as charging, data transfer, display output, and more, without needing separate connectors for each purpose.
In a similar way, the Model Context Protocol provides a standard, secure, real-time communication interface that allows AI systems to connect with external tools, API services, and data sources.
Unlike traditional API integrations, which require separate code, authentication flows, documentation, and ongoing maintenance for each connection, MCP provides a single unified interface. You write the integration once, and any AI model that supports MCP can use it directly. This makes tool development more consistent and scalable across different environments.
Why It Matters
Before MCP:
- Every AI app (M) needed custom code to connect with every tool (N), resulting in M × N unique integrations.
- There was no shared protocol across tools and models, so developers had to reinvent the wheel for each new connection.
After MCP:
- You can define or expose multiple tools within a single MCP server.
- Any AI app that supports MCP can use those tools directly.
- Integration complexity drops to M + N, since tools and models speak a shared protocol.
Architecture
MCP follows a client-server architecture:
- Client: An AI application (such as an LLM agent, RAG pipeline, or chatbot) that needs to perform external tasks.
- Server: Hosts callable tools such as “query CRM,” “fetch Slack messages,” or “run SQL.” These tools are invoked by the client and return structured responses.
The client sends structured requests to the MCP server. The server performs the requested operation and returns a response that the model can understand.
In this tutorial, you will see how to build a custom MCP server using FastMCP, test it locally, and then upload and deploy it on the Clarifai platform.
FastMCP is a high-level Python framework that takes care of the low-level protocol details. It lets you focus on defining useful tools and exposing them as callable actions, without having to write boilerplate code for handling the protocol.
Why Build a Custom MCP Server?
There are already many ready-to-use MCP servers available. For example, you can find MCP servers built specifically to connect with tools like GitHub, Slack, Notion, or even general-purpose REST APIs. These servers expose predefined tools that work well for common use cases.
However, not every workflow can be covered by existing servers. In many real-world scenarios, you will need to build a custom MCP server tailored to your specific environment or application logic.
You should consider building a custom server when:
- You need to connect with internal or unsupported tools: If your organization relies on proprietary systems, internal APIs, or custom workflows that aren’t publicly exposed, you’ll need a custom MCP server to interface with them. While MCP servers exist for many common tools, there won’t be one available for every system you want to integrate. A custom server allows you to securely wrap internal endpoints and expose them through a standardized, AI-accessible interface.
- You need full control over tool behavior and structure: Off-the-shelf MCP servers prioritize flexibility, but if you require custom logic, validation, response shaping, or tightly defined schemas tailored to your business rules, building your own tools gives you clean, maintainable control over both functionality and structure.
- You want to manage performance or handle large workloads: Running your own MCP server lets you choose the deployment environment and allocate specific GPU, CPU, and memory resources to match your performance and scaling needs.
Now that you’ve seen why building a custom MCP server can be necessary, let’s walk through how to build one from scratch.
Build a Custom MCP Server with FastMCP
In this section, let’s build a custom MCP server using the FastMCP framework. This MCP server comes with three tools designed for blog-writing tasks:
- Run a real-time search to find top blogs on a given topic
- Extract content from URLs
- Perform keyword research with autocomplete and trends data
Let’s first build this locally, test it, and then deploy it to the Clarifai platform where it can run securely, scale automatically, and serve any MCP-compatible AI agent.
What Tools Will This MCP Server Expose?
This server offers three tools (functions the LLM can invoke):
- multi_engine_search: Queries a search engine (like Google) using SERP API and returns the top 5 article URLs.
- extract_web_content_from_links: Uses newspaper3k to extract readable content from a list of URLs.
- keyword_research: Performs lightweight SEO analysis using SERP API’s autocomplete and trends features.
Step 1: Install Dependencies
Install the required Python packages:
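These are the same packages listed later in requirements.txt; versions are left unpinned here:

```bash
pip install clarifai mcp fastmcp anyio requests lxml newspaper3k google-search-results
```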
Also, set your Clarifai Personal Access Token (PAT) as an environment variable:
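For example, on macOS or Linux:

```bash
export CLARIFAI_PAT="YOUR_PERSONAL_ACCESS_TOKEN"
```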
Step 2: Project Structure
To create a valid Clarifai MCP server project, your directory should follow this structure:
your_model_directory/
├── 1/
│   └── model.py
├── requirements.txt
└── config.yaml
Let’s break that down:
- 1/model.py: Your core MCP logic goes here. You define and register your tools using FastMCP.
- requirements.txt: Lists Python packages needed by the server during deployment.
- config.yaml: Contains metadata and configuration settings needed for uploading the model to Clarifai.
You can also generate this template using the Clarifai CLI:
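For example (assuming your installed CLI version includes the model init command):

```bash
clarifai model init
```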
Step 3: Implement model.py
Here is the complete MCP server logic:
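The listing below is a sketch of what model.py can look like. It assumes a SERP_API_KEY environment variable for SerpAPI access, imports MCPModelClass from clarifai.runners.models.mcp_class following Clarifai’s MCP model examples, and the class and server names are illustrative; the tool bodies show one way to implement the three tools and may need adjusting to your SERP API response shapes.

```python
import os

from clarifai.runners.models.mcp_class import MCPModelClass
from fastmcp import FastMCP
from newspaper import Article            # provided by newspaper3k
from pydantic import Field
from serpapi import GoogleSearch         # provided by google-search-results

SERP_API_KEY = os.environ.get("SERP_API_KEY", "")

# Central FastMCP instance: every tool below is registered on this server.
server = FastMCP(
    "blog-research-mcp-server",
    instructions="Tools for blog research: web search, content extraction, and keyword research.",
    stateless_http=True,
)


@server.tool(
    name="multi_engine_search",
    description="Search the web via SERP API and return the top 5 article URLs for a topic.",
)
def multi_engine_search(
    query: str = Field(description="Search query, e.g. a blog topic."),
    engine: str = Field(default="google", description="Search engine to use."),
    location: str = Field(default="United States", description="Location to localize results."),
    device: str = Field(default="desktop", description="Device type: desktop, mobile, or tablet."),
) -> list[str]:
    """Return the top organic result links from the SERP API."""
    params = {
        "q": query,
        "engine": engine,
        "location": location,
        "device": device,
        "num": 5,
        "api_key": SERP_API_KEY,
    }
    results = GoogleSearch(params).get_dict().get("organic_results", [])
    return [r["link"] for r in results[:5] if "link" in r]


@server.tool(
    name="extract_web_content_from_links",
    description="Extract the readable text of each article in a list of URLs.",
)
def extract_web_content_from_links(
    urls: list[str] = Field(description="Article URLs to extract content from."),
) -> dict[str, str]:
    """Download and parse each page with newspaper3k; truncate long articles."""
    contents: dict[str, str] = {}
    for url in urls:
        try:
            article = Article(url)
            article.download()
            article.parse()
            contents[url] = article.text[:2000]  # truncated for brevity
        except Exception as exc:
            contents[url] = f"Extraction failed: {exc}"
    return contents


@server.tool(
    name="keyword_research",
    description="Suggest related keywords via autocomplete and rank them with trends data.",
)
def keyword_research(
    topic: str = Field(description="Seed topic for keyword research."),
) -> list[dict]:
    """Combine SERP API autocomplete suggestions with Google Trends popularity scores."""
    autocomplete = GoogleSearch(
        {"engine": "google_autocomplete", "q": topic, "api_key": SERP_API_KEY}
    ).get_dict()
    suggestions = [s["value"] for s in autocomplete.get("suggestions", [])][:5]

    ranked = []
    for keyword in suggestions:
        trends = GoogleSearch(
            {"engine": "google_trends", "q": keyword, "data_type": "TIMESERIES", "api_key": SERP_API_KEY}
        ).get_dict()
        timeline = trends.get("interest_over_time", {}).get("timeline_data", [])
        score = timeline[-1]["values"][0].get("extracted_value", 0) if timeline else 0
        ranked.append({"keyword": keyword, "popularity": score})

    return sorted(ranked, key=lambda k: k["popularity"], reverse=True)


class BlogResearchMCPModel(MCPModelClass):
    """Clarifai integration point: hands the FastMCP server to the platform runtime."""

    def get_server(self) -> FastMCP:
        return server
```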
Understanding the Components
Let’s break down each component of the above model.py file.
a. Initialize the FastMCP Server
The server is initialized using the FastMCP class. This instance acts as the central hub that registers all tools and serves requests. The name you assign to the server helps distinguish it during debugging or deployment.
Optionally, you can also pass parameters like instructions, which describes what the server does, or stateless_http, which allows the server to operate over stateless HTTP for simpler, lightweight deployments.
b. Define Tools Using Decorators
The power of an MCP server comes from the tools it exposes. Each tool is defined as a regular Python function and registered using the @server.tool(...)
decorator. This decorator marks the function as callable by LLMs through the MCP interface.
Each tool includes:
- A unique name (used as the tool ID)
- A short description that helps models understand when to invoke the tool
- Clearly typed and described input parameters, using Python type annotations and pydantic.Field
This example includes three tools:
- multi_engine_search: Uses SerpAPI to search for articles or blogs. It accepts a query and options like search engine, location, and device type. Returns a list of top URLs.
- extract_web_content_from_links: Takes a list of URLs and uses the newspaper3k library to extract main content from each page. Returns the extracted text (truncated for brevity).
- keyword_research: Combines autocomplete and trends APIs to suggest relevant keywords and rank them by popularity. Useful for SEO-focused content planning.
These tools can work independently or be chained together to create agent workflows like finding article sources, extracting content, and identifying SEO keywords.
c. Define Clarifai’s Model Class
The custom-named model class serves as the integration point between your MCP server and the Clarifai platform.
You must define it by subclassing Clarifai’s MCPModelClass and implementing the get_server() method. This method returns the FastMCP server instance (such as server) that Clarifai should use when running your model.
When Clarifai runs the model, it calls get_server() to load your MCP server and expose its defined tools and capabilities to LLMs or other agents.
Step 4: Define config.yaml and requirements.txt
To deploy your custom MCP server on the Clarifai platform, you need two key configuration files: config.yaml and requirements.txt. Together, they define how your server is built, what dependencies it needs, and how it runs on Clarifai’s infrastructure.
The config.yaml file is used to configure the build and deployment settings for a custom model (or, in this case, an MCP server) on the Clarifai platform. It tells Clarifai how to build your model’s environment and where to place it within your account.
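Here is a minimal config.yaml using the fields described below; the app ID, model ID, and user ID are placeholders to replace with your own values:

```yaml
build_info:
  python_version: "3.12"

inference_compute_info:
  cpu_limit: "1"
  cpu_memory: "1Gi"
  num_accelerators: 0

model:
  app_id: "blog-writer-tools"        # placeholder: your Clarifai app
  id: "blog-research-mcp-server"     # placeholder: your model ID
  model_type_id: "mcp"
  user_id: "YOUR_USER_ID"            # placeholder: your Clarifai username
```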
Understanding the config.yaml File
build_info
This section specifies the Python version that Clarifai should use to build the environment for your MCP server. It ensures compatibility with your dependencies. Clarifai currently supports Python 3.11 and 3.12 (with 3.12 being the default). Choosing the right version helps avoid issues with libraries like pydantic v2, fastmcp, or newspaper3k.
inference_compute_info
This defines the compute resources allocated when your MCP server is running inference — in other words, when it’s live and responding to agent requests.
- cpu_limit: 1 means the model gets one CPU core for its execution.
- cpu_memory: 1Gi allocates 1 gigabyte of RAM.
- num_accelerators: 0 specifies that no GPUs or other accelerators are needed.
This setup is usually enough for lightweight servers that just make API calls, run data parsing, or call Python tools. If you’re deploying heavier models (like LLMs or vision models), you can configure GPU-backed or high-performance compute using Clarifai’s Compute Orchestration.
model
This section registers your MCP server within the Clarifai platform.
- app_id groups your server under a specific Clarifai app. Apps act like logical containers for models, datasets, and workflows.
- id is your model’s unique identifier. This is how Clarifai refers to your MCP server in the UI and API.
- model_type_id must be set to mcp, which tells the platform this is a Model Context Protocol server.
- user_id is your Clarifai username, used to associate the model with your account.
Every MCP model must live inside an app. An app acts as a self-contained project for storing and managing data, annotations, models, concepts, datasets, workflows, searches, modules, and more.
requirements.txt: Define Dependencies
The requirements.txt
file lists all the Python packages your MCP server depends on. Clarifai uses this file during deployment to automatically install the necessary libraries, ensuring your server runs reliably in the specified environment.
Here’s the requirements.txt
for the custom MCP server we’re building:
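Versions are left unpinned here for simplicity; pin them if you need reproducible builds:

```text
clarifai
mcp
fastmcp
anyio
requests
lxml
newspaper3k
google-search-results
```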
This setup includes:
- clarifai, mcp, and fastmcp for MCP compatibility and deployment
- anyio and requests for networking and async support
- lxml and newspaper3k for content extraction and HTML parsing
- google-search-results for integrating SERP APIs
Make sure this file is located in the root directory alongside config.yaml. Clarifai will automatically install these dependencies during deployment, ensuring your MCP server is production-ready.
Test the MCP Server
Step 5: Test the MCP Server Locally
Before deploying to production, always test your MCP server locally to ensure your tools work as expected.
Option 1: Use Local Runners
Think of local runners like “ngrok for AI models.” They let you simulate your deployment environment, route real API calls to your machine, and debug in real time — all without pushing to the cloud.
To start:
clarifai model local-runner
This will:
- Spin up your MCP server locally
- Simulate real-world requests to your tools
- Let you validate outputs and catch errors early
Check out the Local Runner guide to learn how to configure the environment and run your models locally.
Option 2: Run Automated Unit Tests with test-locally
For a faster feedback loop during development, you can write test cases directly in your model.py
by implementing a test()
method in your model class. This lets you validate logic without spinning up a live server.
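For example, extending the model class from the earlier sketch — this assumes your installed FastMCP version supports the in-memory Client transport, so the test runs without any network access:

```python
class BlogResearchMCPModel(MCPModelClass):
    def get_server(self) -> FastMCP:
        return server

    def test(self):
        """Smoke test run by `clarifai model test-locally`."""
        import asyncio

        from fastmcp import Client

        async def _check():
            # Passing the server object directly uses FastMCP's in-memory
            # transport, so no deployed endpoint is required.
            async with Client(server) as client:
                tools = await client.list_tools()
                assert any(t.name == "multi_engine_search" for t in tools)

        asyncio.run(_check())
```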
Run it using:
clarifai model test-locally --mode container
This command:
- Launches a local container
- Automatically calls the test() method you’ve defined
- Runs assertions and logs results in your terminal
You can find the full test-locally guide here to properly set up your environment and run local tests.
Upload and Deploy MCP Server
After you’ve configured your model.py, config.yaml, and requirements.txt, the final step is to upload and deploy your MCP server so that it can serve requests from agents in real time.
Step 6: Upload the Model
From the root directory of your project, run the following command:
clarifai model upload
This command uploads your MCP server to the platform, using the configuration you specified in your config.yaml. Once the upload is successful, the CLI will return the public MCP endpoint:
https://api.clarifai.com/v2/ext/mcp/v1/users/YOUR_USER_ID/apps/YOUR_APP_ID/models/YOUR_MODEL_ID
This URL is the inference endpoint that agents will call when invoking tools from your server. It’s what connects your code to real-world use.
Step 7: Deploy on Compute
Uploading your server will register it to the Clarifai app you defined in the config.yaml
file. To make it accessible and ready to serve requests, you need to deploy it to dedicated compute.
Clarifai’s Compute Orchestration lets you create and manage your own compute resources. It brings the flexibility of serverless autoscaling to any environment — whether you’re running on cloud, hybrid, or on-prem hardware. It dynamically scales resources to meet workload demands while giving you full control over how and where your models run.
To deploy your MCP server, you’ll first need to:
- Create a compute cluster – a logical group to organize your infrastructure.
- Create a node pool – a set of machines with your chosen instance type.
- Select an instance type – since MCP servers are typically lightweight, a basic CPU instance is sufficient.
- Deploy the MCP server – once your compute is ready, you can deploy your model to the selected cluster and node pool.
This process ensures that your MCP server is always on, scalable, and able to handle real-time requests with low latency.
You can follow this guide or this tutorial to learn how to create your own dedicated compute environment and deploy your model to the platform.
Interact With Your MCP Server
Once your MCP server is deployed, you can interact with it using a FastMCP client. This allows you to list the tools you’ve registered and invoke them programmatically using your server’s endpoint.
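Here is a sketch of such a client. The endpoint is the placeholder URL returned by the upload step, the CLARIFAI_PAT environment variable is assumed to hold your token, and the exact shape of call_tool results can vary between FastMCP versions (the small helper below covers both common forms):

```python
import asyncio
import json
import os

from fastmcp import Client
from fastmcp.client.transports import StreamableHttpTransport

# Endpoint returned by `clarifai model upload` (placeholder IDs).
MCP_ENDPOINT = (
    "https://api.clarifai.com/v2/ext/mcp/v1/users/YOUR_USER_ID"
    "/apps/YOUR_APP_ID/models/YOUR_MODEL_ID"
)

# Authenticate every request with your Clarifai PAT as a bearer token.
transport = StreamableHttpTransport(
    url=MCP_ENDPOINT,
    headers={"Authorization": "Bearer " + os.environ["CLARIFAI_PAT"]},
)


def tool_text(result) -> str:
    """Return the text payload of a call_tool result (shape varies by FastMCP version)."""
    blocks = getattr(result, "content", result)
    return blocks[0].text


async def main():
    async with Client(transport) as client:
        # List the tools registered on the deployed server.
        tools = await client.list_tools()
        print("Available tools:", [t.name for t in tools])

        # 1. Find top blog/article links for a topic.
        search = await client.call_tool(
            "multi_engine_search", {"query": "model context protocol tutorials"}
        )
        urls = json.loads(tool_text(search))
        print("Top URLs:", urls)

        # 2. Extract readable content from those links.
        content = await client.call_tool(
            "extract_web_content_from_links", {"urls": urls[:2]}
        )
        print("Extracted content:", tool_text(content)[:500])

        # 3. Run keyword research for the same topic.
        keywords = await client.call_tool(
            "keyword_research", {"topic": "model context protocol"}
        )
        print("Keywords:", tool_text(keywords))


if __name__ == "__main__":
    asyncio.run(main())
```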
Here’s how the client works:
1. Client Setup
You’ll use the fastmcp.Client
class to connect to your deployed MCP server. This handles tool listing and invocation over HTTP.
2. Transport Layer
The client uses StreamableHttpTransport
to communicate with the server. This transport is well-suited for most deployments and enables smooth interaction between your app and the server.
3. Authentication
All requests are authenticated using your Clarifai Personal Access Token (PAT), which is passed as a bearer token in the request header.
4. Tool Execution Flow
In the example client, three tools from the MCP server are invoked:
- multi_engine_search: Takes a query and returns top blog/article links using SerpAPI.
- extract_web_content_from_links: Downloads and parses article content from given URLs using newspaper3k.
- keyword_research: Performs keyword research using autocomplete and trends data to return high-potential keywords.
Each tool is invoked via client.call_tool(...), and results are parsed using Python’s json module to display readable output.
Now that your custom MCP server is live, you can integrate it into your AI agents. The agents can use these tools to complete tasks more effectively. For example, they can use real-time search, content extraction, and keyword analysis to write better blogs or create more relevant content.
Conclusion
In this tutorial, we built a custom MCP server using FastMCP and deployed it to dedicated compute on Clarifai. We explored what MCP is, why building a custom server matters, how to define tools, configure the deployment, and test it locally before uploading.
Clarifai takes care of the deployment environment, including provisioning, scaling, and versioning, so you can focus entirely on building tools that LLMs and agents can call securely and reliably.
You can use the same process to deploy your own custom models, open source models, or models from Hugging Face or other providers. Clarifai’s Compute Orchestration supports all of these. Check out the docs or tutorials to get started.