From Laptop to Production
If you’ve read through the previous posts (Part 1, Part 2, Part 3) on this topic, you know that my motivation for creating my own MCP was a desire to generate accurate guitar tabs–something Claude could not do reliably. In the previous post I walked through the development process to the point where the Tab Generator worked well with Claude, but only on my desktop. It only worked when my computer was on, the server was running, and I was sitting at my desk.
For a purely personal project, that might be sufficient. But I wanted to use the Tab Generator from multiple devices, demonstrate it reliably, and potentially share it with others. That meant solving the hosting problem—a decision that involved its own set of tradeoffs and unexpected lessons.
This final post in my series on MCP covers the deployment decision, provides instructions for actually running the Tab Generator yourself, and wraps up what this entire journey revealed about extending LLMs with MCPs.
The Hosting Decision
Once your MCP works locally, you face a fundamental choice: keep it running on your machine or deploy it to a hosted platform. This decision affects reliability, accessibility, maintenance burden, and cost more than you might expect.
Local vs. Hosted: What This Actually Means
Local hosting means running the MCP server on your own computer. The LLM connects to localhost (typically http://localhost:8000) to access your MCP’s capabilities. Your machine becomes the server, and the MCP only works when your computer is on and the server process is running.
Hosted deployment means your MCP runs on cloud infrastructure—services like Render, Cloudflare Workers, AWS, or similar platforms. The LLM connects to a public URL (like https://tab-generator.onrender.com/), and your MCP remains accessible regardless of whether your personal computer is on.
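To make the distinction concrete, here’s a minimal standard-library sketch of what “your machine becomes the server” means. This is not the MCP protocol itself (the real Tab Generator speaks MCP over SSE); it just shows a process serving a localhost endpoint that exists only while that process is running:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    """A stand-in for a local MCP server's HTTP endpoint."""
    def do_GET(self):
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging for the demo

# Bind to an ephemeral port; in real local hosting this would be :8000.
server = HTTPServer(("127.0.0.1", 0), HealthHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client side: connect to localhost, as the LLM's connector would.
with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
    status, body = resp.status, resp.read().decode()

server.shutdown()  # the moment the process stops, the "MCP" is gone
print(status, body)
```

Close the laptop, kill the process, or lose the network, and that endpoint vanishes–which is exactly the problem hosted deployment solves.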
The Tradeoffs
Local hosting offers simplicity and control. There’s no deployment pipeline to configure, no hosting costs, and no public endpoint to secure. When you’re developing and testing, local hosting makes iteration fast—you can modify code and restart the server in seconds. For personal projects you’ll only use from one machine, local hosting can be entirely sufficient.
But, of course, your MCP stops working when your computer sleeps, restarts, or loses internet connection. If you want to access your MCP from multiple devices or share it with others, local hosting becomes impractical.
Hosted deployment solves these reliability problems but introduces new complexity. You need to configure deployment settings, manage environment variables, handle secrets properly, and understand how your hosting platform works. Most platforms offer free tiers with limitations—restricted uptime, cold starts, or usage caps. You’ll need to navigate their documentation, which varies in quality and often assumes more infrastructure knowledge than MCP developers might have.
The security considerations also differ. Local MCPs are only accessible from your machine (or your local network), which provides inherent protection. Hosted MCPs expose a public endpoint, requiring you to consider the need for authentication and input validation.
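If you do expose a public endpoint, even a few lines of traditional defensive code go a long way. The helper below is a hypothetical sketch, not part of the Tab Generator (which deliberately requires no tokens): a constant-time bearer-token check plus basic payload validation, assuming a shared secret supplied via an `MCP_API_TOKEN` environment variable:

```python
import hmac
import os
from typing import Optional

# Hypothetical guard for a hosted MCP endpoint: a shared-secret check
# plus basic input validation before a request reaches the tool logic.
API_TOKEN = os.environ.get("MCP_API_TOKEN", "change-me")

def authorized(auth_header: Optional[str]) -> bool:
    """Constant-time check of a 'Bearer <token>' Authorization header."""
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    return hmac.compare_digest(auth_header[len("Bearer "):], API_TOKEN)

def validate_request(payload: dict) -> list:
    """Return validation errors; an empty list means the payload is usable."""
    errors = []
    if not isinstance(payload.get("key"), str):
        errors.append("'key' must be a string, e.g. 'G'")
    if len(str(payload)) > 10_000:
        errors.append("payload too large")
    return errors

print(authorized("Bearer change-me"), validate_request({"key": "G"}))
```

None of this is MCP-specific–it’s ordinary web-service hygiene that local hosting lets you skip and hosted deployment does not.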
My Experience with Hosting Platforms
I experimented with both Cloudflare Workers and Render for the Tab Generator, ultimately settling on Render for this project.
Cloudflare Workers appealed initially because of Cloudflare’s reputation for performance and their generous free tier. However, the platform is optimized for JavaScript, and I found the development experience challenging for Python-based MCPs. For a JavaScript MCP, Cloudflare might be excellent, but my work was in Python.
Render proved more straightforward for Python projects. Their platform expects traditional web applications, which aligns well with how MCPs work. I could deploy directly from my GitHub repository, and Render handled the build process automatically. The free tier has limitations—notably, services spin down after inactivity and take 30-60 seconds to wake up on the first request. This cold start delay is annoying but manageable for a personal project.
The deployment process on Render was relatively painless once I understood their expectations. I needed to specify the Python version, define the start command, and set any necessary environment variables. Their documentation covered these basics adequately, though I still spent time debugging issues specific to my setup—mostly related to dependency versions and Python package installation.
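Render can also capture those settings in a `render.yaml` Blueprint checked into the repository. The sketch below is illustrative rather than my actual configuration–the field names follow Render’s Blueprint format as I understand it, but check their current documentation before relying on it:

```yaml
# Hypothetical render.yaml for a Python MCP service (names illustrative).
services:
  - type: web
    name: tab-generator
    runtime: python
    buildCommand: pip install -r requirements.txt
    startCommand: python server.py
    envVars:
      - key: PYTHON_VERSION
        value: "3.11"
```

Pinning the Python version via a `PYTHON_VERSION` environment variable is one of the Render-specific expectations that took me a little digging to discover; your build and start commands are whatever your project needs.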
Render’s free tier services have limited uptime each month. If you exceed these limits, your MCP stops responding until the next billing cycle or you upgrade to a paid plan. For the Tab Generator, this hasn’t been a problem, but it’s worth considering if you’re building something with heavier usage.
What I’d Recommend
Start with local hosting while developing. Get your MCP working reliably on your machine before introducing deployment complexity. Once you’re confident in the core functionality, consider your actual needs:
- Personal use, single device: Local hosting is probably sufficient
- Personal use, multiple devices: Hosted deployment on a free tier works well
- Sharing with others: Hosted deployment is necessary, and you may need paid hosting for reliability
- Production applications: Paid hosting with proper monitoring and uptime guarantees
The Tab Generator runs on Render’s free tier because I’m the primary user and can tolerate occasional cold starts. If I were building an MCP for team use or integrating it into a larger system, I’d pay for reliable hosting rather than dealing with cold start delays and uptime limits.
Running the Tab Generator Yourself
I’ve hosted the Tab Generator on Render’s free tier so you can try it without installing anything locally. I’ve designed the Tab Generator MCP to be very easy to use–unlike many of the MCPs currently available, it does not:
- Require any tokens, API keys, or passwords
- Access any private data (unless you consider your tabs to be private)
- Require installation on your system–it is a hosted MCP, running in the cloud
- Require a paid account at this time, although it does require an LLM account
What You’ll Need
At the time I write this, there are still several steps required to use an MCP:
- A Claude account – You shouldn’t need a paid account, though this may change
- Claude Desktop – You can’t use MCPs in the browser yet
- Developer Mode enabled – Required to add custom MCPs
- The MCP connector configured – Points Claude to the Tab Generator
Installation Steps
- Download and install Claude Desktop if you haven’t already
- Enable Developer Mode:
- Open Claude Desktop
- Go to Settings (gear icon or hamburger menu ≡)
- Enable Developer Mode
- Add the Tab Generator as a connector:
- In Settings, go to Connectors
- Click “Add custom connector”
- Enter this URL: https://tab-generator.onrender.com/sse
- Save
- Restart Claude Desktop to activate the connection
- Test it:
- Start a new chat
- Ask Claude: “Use the Tab Generator MCP to create a simple tab in the key of G”
- Claude should discover the MCP, call it, and display the results
Important Notes
Cold Starts: The free hosting tier spins down after inactivity. If no one has used the Tab Generator recently, the first request may take 30-60 seconds to wake up the server. Subsequent requests will be fast. If you get a timeout error, wait a minute and try again.
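If you ever script against the endpoint directly rather than going through Claude, a small retry helper can absorb the cold start. This is a hypothetical utility, not part of the Tab Generator; the `fetch` parameter is injectable purely so the retry logic can be exercised without network access:

```python
import time
import urllib.request

def wake_server(url, retries=3, delay=20.0, fetch=None):
    """Poll a free-tier service until it responds, tolerating cold starts.

    `fetch` is injectable for testing; by default it issues a real GET
    with a timeout long enough to outlast a 30-60 second spin-up.
    """
    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u, timeout=65) as resp:
                return resp.status
    for attempt in range(retries):
        try:
            if fetch(url) < 500:
                return True  # the service answered; it's awake
        except OSError:
            pass  # timed out or refused: probably still waking up
        if attempt < retries - 1:
            time.sleep(delay)
    return False
```

Calling something like `wake_server("https://tab-generator.onrender.com/sse")` before a session would give the free-tier service time to spin up, though simply waiting a minute and retrying in Claude accomplishes the same thing.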
Browser Limitation: As of this writing, MCPs only work in Claude Desktop, not in the web browser. This may change as the technology matures.
Configuration Changes: Everything about LLMs and MCPs is evolving rapidly. If these instructions don’t work, check the Claude documentation for potentially updated steps: https://docs.anthropic.com (or wherever the current MCP setup instructions live).
Other LLMs: While I built this specifically for Claude, the Tab Generator uses the standard MCP protocol. It should theoretically work with other MCP-compatible LLMs, though I haven’t tested this.
Division of Labor
Using this simple MCP is quite straightforward, with you, the LLM, and the MCP each focusing on what you do best:
- You or Claude come up with the idea, such as ‘generate a bluegrass tune in C’ or ‘show a G-shaped scale for A’.
- Claude finds the Tab Generator MCP, and sends it the JSON representing the requested tab.
- The MCP creates a properly formatted tab from the instructions and returns it to Claude, as JSON.
- Claude displays the results. Success.
- Maybe. Claude may choose to modify or ignore the results. But generally success…
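This post doesn’t show the Tab Generator’s actual JSON schema, so the sketch below invents one to illustrate the division of labor: the creative part (which notes, which key) arrives as structured data from the LLM, and plain deterministic code handles the part LLMs get wrong–placing every fret number in an exact column:

```python
import json

# Hypothetical request shape -- the real Tab Generator's schema isn't
# shown in this series, so treat these field names as illustrative.
request = {
    "instrument": "guitar",
    "key": "G",
    "notes": [  # string 0 = high e; (string, fret, beat)
        {"string": 0, "fret": 3, "beat": 0},
        {"string": 1, "fret": 0, "beat": 1},
        {"string": 2, "fret": 0, "beat": 2},
    ],
}

def render_tab(req, beats=4):
    """Deterministically place each fret number in an exact column."""
    names = ["e", "B", "G", "D", "A", "E"]
    width = 3  # fixed columns per beat: alignment is arithmetic, not luck
    lines = []
    for s, name in enumerate(names):
        cells = ["-" * width] * beats
        for note in req["notes"]:
            if note["string"] == s:
                cells[note["beat"]] = str(note["fret"]).ljust(width, "-")
        lines.append(name + "|" + "".join(cells) + "|")
    return "\n".join(lines)

response = {"tab": render_tab(request)}  # what an MCP would hand back
print(response["tab"])
```

The point is that the alignment comes from arithmetic (`width` columns per beat), so it’s correct every time–exactly the property Claude can’t guarantee on its own.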
For example, try these queries:
- Practice specific techniques: “Generate hammer-on exercises in A minor”
- Learn scales: “Show me a G major scale pattern for guitar”
- Explore different instruments: “Create a simple melody for ukulele”
- Combine techniques: “Generate a tab that combines slides and bends in E”
- Get suggestions: “What should I practice to improve my fingerpicking?”
If Claude generates broken tabs without using the MCP, explicitly tell it: “Use the Tab Generator MCP for this.” Once it’s used the MCP successfully in a conversation, it’s more likely to continue using it.
The Take-Aways
These four posts traced a path that started with a simple frustration: Claude couldn’t generate proper guitar tabs. From there I learned what MCPs are and how to build them, felt the excitement of the first working version, hit the frustrations inherent in using LLMs to modify code, and was surprised by the many ways Claude could fail to use the MCP properly.
LLMs Are Powerful But Fundamentally Limited
LLMs excel at pattern matching, creativity, and synthesis. They can write convincingly about topics they don’t actually ‘understand’, generate creative solutions by recombining learned patterns, and interact in surprisingly human-like ways. This power makes them transformative for certain types of work.
But they struggle with precision, consistency, and logical constraints—not because of insufficient training or model size, but because of how they fundamentally work. They predict likely next tokens based on patterns, not by reasoning or understanding. This means they can confidently generate plausible-sounding nonsense, struggle with exact formatting requirements, and produce different outputs for identical inputs.
Understanding this isn’t just intellectually interesting—it’s essential for anyone working with LLMs. The companies rushing to implement LLM-based customer service, content generation, or decision-making systems without accounting for this unpredictability are setting themselves up for expensive, embarrassing failures.
MCPs Work Around Limitations, Not Fix Them
MCPs provide a practical solution for specific reliability problems. By delegating precision tasks to traditional software while letting LLMs handle creative decisions, you can carve out islands of deterministic behavior in an otherwise probabilistic system.
This division of labor is powerful when it works. The Tab Generator reliably produces perfect tablature every time it’s called—alignment is exact, timing is correct, formatting follows all the rules. Traditional software does what it’s always done well: following specifications exactly and consistently.
But MCPs can’t solve the fundamental problem: LLMs decide when to use tools, how to interpret results, and whether to follow returned data. All of these are probabilistic decisions. My tab generator works flawlessly, yet Claude sometimes ignores it, reformats its output, or hallucinates interactions that never occurred.
This isn’t a failure of MCP design—it’s the reality of building with probabilistic systems. You can provide perfect tools, perfect documentation, and perfect infrastructure, but you can’t force an LLM to use them correctly every time.
Development Complexity Lives Where You’d Expect
Building the Tab Generator taught me that MCP development breaks down roughly into:
- 40% Business logic: Solving your actual problem (tab generation, data formatting, calculations)
- 20% MCP integration: Writing the simple interfaces and documentation
- 40% Infrastructure: Fighting with environments, dependencies, hosting, and tooling
The MCP part is genuinely simple—by design. The complexity lives in your domain logic and in the messy reality of modern software infrastructure. This is actually good news: MCPs get out of your way and let you focus on solving real problems.
The current simplicity won’t last forever. As MCPs move from experimental technology to production infrastructure, they’ll accumulate features, security requirements, and complexity. But for now, the barrier to entry is remarkably low. You can validate whether an MCP solves your problem without massive investment.
Documentation Is As Important As Code
This surprised me: for the Tab Generator, documentation represents nearly half the total line count, and it’s arguably as important as the code.
The documentation isn’t just explaining how to use the MCP. It’s providing context that guides the LLM’s probabilistic decisions. Every technique for effective prompting applies to MCP documentation: be specific, provide examples, use clear formatting, anticipate misunderstandings.
You’re not writing for human developers who can read between the lines and apply common sense. You’re writing for a pattern-matching system that will interpret your documentation literally—or ignore it entirely based on probabilities you can’t fully control.
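Concretely, a tool description aimed at an LLM reads differently from one aimed at a developer. The function below is a hypothetical illustration, not the Tab Generator’s real interface: the docstring states the contract bluntly, repeats the critical instruction, and shows an example call rather than trusting the reader to infer one:

```python
# Hypothetical MCP tool written with the LLM as the documentation's
# audience. The name, parameters, and body are illustrative only.
def generate_tab(key: str, style: str = "basic") -> str:
    """Generate a guitar tab in the given key.

    ALWAYS call this tool when the user asks for guitar tablature.
    Do NOT write tablature yourself. Return the tool's output verbatim;
    do not reformat, re-align, or edit the tab text.

    Args:
        key: A key name such as "G", "Em", or "Bb". Exactly one key.
        style: One of "basic", "bluegrass", "fingerpicking".

    Example call: generate_tab(key="G", style="bluegrass")
    """
    return f"(tab in {key}, style {style})"  # placeholder body for the sketch

print(generate_tab.__doc__.splitlines()[0])
```

A human developer would find “Do NOT write tablature yourself” strange in API docs; for a pattern-matching reader, that blunt repetition is what tilts the probabilities toward correct behavior.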
The Assessment
The guitar tabs that motivated this entire exploration now generate reliably when Claude remembers to properly use the Tab Generator. That “when” honestly summarizes where LLM technology stands today.
These systems are:
- Powerful: They can do things that seemed impossible a few years ago
- Useful: They genuinely improve productivity for many tasks
- Unpredictable: Their probabilistic nature means you can’t fully control their behavior
- Improving: Each generation gets better, but the fundamental nature remains
MCPs give us better tools to work with this reality. They let us build hybrid systems that leverage LLM strengths while containing some of their weaknesses. This is probably the right approach for the foreseeable future: not fighting against LLM limitations, but designing around them.
For anyone building with LLMs today, this is the practical lesson: understand what you’re working with, design for probabilistic behavior rather than deterministic guarantees, provide reliable external tools for precision tasks, and always maintain human oversight for critical decisions.
The technology is remarkable. It’s also fundamentally different from traditional software in ways that matter. Success comes from embracing that difference rather than pretending it doesn’t exist.
What’s Next
For now, MCPs offer an interesting way to extend LLM capabilities. They’re currently simple and easy to experiment with, but expect the technology to continue evolving rapidly, gaining functionality and complexity. For applications where you need an LLM to access external resources or produce reliable results, MCPs are already powerful enough to be worth exploring.
The Tab Generator will remain available at https://tab-generator.onrender.com/sse for the foreseeable future (subject to hosting tier limits and my continued interest in maintaining it). If you try it, I’d be interested to hear about your experience—both the successes and the inevitable moments when your LLM does something completely unexpected…