
MCPs on Databricks: hype train or real transformation?

Aug 24

9 min read

If you're not familiar with the MCP concept, please read this article first: MCP Explained: The New Standard Connecting AI to Everything

In case you are a really busy person, here's a short summary. MCP is an open protocol that standardizes how applications provide context to LLMs. It enables smooth integration with data sources and tools, making it easier to build agents and complex workflows on top of LLMs. And as you have perhaps seen, it's at the center of all the hype nowadays. Since there are already really great open-source code repos available, I decided to take a slightly different approach here and challenge the status quo. Let's dive into the logic behind MCP and try to understand how and where it should be used, including its limitations and possibilities.


The core logic behind MCP and why it became so popular


It’s ancient history by now, but I guess you remember when OpenAI released ChatGPT, powered by GPT-3.5, to a big audience. That moment started the AI avalanche. From then on began the mass production of chatbots, but something was missing - autonomy. Yes, chatbots were able to enrich their answers with retrieved information (RAG) and answer questions, but they weren’t able to act independently. A bit later, LLMs with tool support came to market. You could create a feedback loop in which an LLM could trigger tools, analyze the results and then either answer or trigger tools again. Really simple, but effective.


But how does tool calling really work? It’s just an output in the correct JSON format, which can be passed to parameterized functions. There’s no dark magic behind it - and there shouldn’t be. Usually the truth tends to be more boring than the tale. Here’s a simple illustration of how “agent magic” works (simplified core logic using Python):



Illustration: the simplified agent tool process

The example picture is kept oversimplified on purpose to show that the basic logic is actually very straightforward. Important parts like memory, state management and quality checks (e.g., LLM-as-a-Judge) are left out to keep the picture clean. It's easy to get started, and once you add more features and components, the sky is truly the limit. However, doing so requires creating dedicated tools and capabilities. Offering it as an out-of-the-box solution won’t be an easy task, but let’s see what Databricks can offer with Agent Bricks in the future. For now, the best way to ensure production-level quality is to develop robust tools that are fully optimized and minimize the risk of LLM hallucinations. In essence, this is software engineering with LLMs acting as smart orchestrators. It’s an incredibly powerful combination, though it doesn’t always scale easily.
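For the curious, the loop in the picture can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `fake_llm` plays the role of a real model call, and `get_weather` is an invented tool, but the feedback loop itself is the "agent magic":

```python
# A hypothetical tool the agent can call. In a real setup this might
# wrap a REST API or a Unity Catalog function.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def fake_llm(messages: list[dict]) -> dict:
    """Stand-in for a real LLM call. Returns either a structured tool
    call (the 'output in the correct JSON format') or a final answer."""
    last = messages[-1]
    if last["role"] == "user":
        return {"tool": "get_weather", "arguments": {"city": "Helsinki"}}
    return {"answer": f"The forecast says: {last['content']}"}

def run_agent(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    # The feedback loop: call the model, execute the requested tool,
    # feed the result back, repeat until a final answer arrives.
    for _ in range(5):  # hard cap to avoid infinite loops
        reply = fake_llm(messages)
        if "tool" in reply:
            result = TOOLS[reply["tool"]](**reply["arguments"])
            messages.append({"role": "tool", "content": result})
        else:
            return reply["answer"]
    return "Gave up after too many tool calls."
```

No dark magic: the "agent" is a plain loop that dispatches the model's JSON output to parameterized functions.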


So, the challenge here is scalability and reusability. And let's be honest, most of the tasks we handle daily are quite trivial and not business-critical, so we don’t need to be overly thorough in those cases. That’s where MCPs step in. Instead of everyone building their own functions to trigger separate REST APIs, we could and should use standardized frameworks. It’s like working on machine learning or deep learning projects where you might rely on prebuilt Python libraries instead of building or calculating everything from scratch. And yes, that’s an amazing idea! I love harmonization and standardization. But as with all shiny new things, there are important issues to address before blindly jumping on the hype train.


MCPs on Databricks


Quite recently, Databricks released MCP support in beta, and it now offers two options: managed MCP servers provided by Databricks, and hosting your own MCP servers using Databricks Apps. It’s quite straightforward and there are good examples and code repositories available, so I’ll let the official Databricks documentation do the work here:
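To give a flavor of the managed option (and only a flavor): the managed servers live at predictable workspace URLs. The path pattern below follows the beta-era documentation and may well change, and the hostname, catalog and schema names are made up:

```python
def managed_mcp_url(workspace_host: str, catalog: str, schema: str) -> str:
    """Build the URL of a Databricks-managed MCP server exposing Unity
    Catalog functions. Path pattern per the beta docs at the time of
    writing - treat it as illustrative, not authoritative."""
    return f"https://{workspace_host}/api/2.0/mcp/functions/{catalog}/{schema}"

# Hypothetical workspace, catalog and schema:
url = managed_mcp_url("adb-123.azuredatabricks.net", "main", "tools")
```

Your MCP client then connects to that URL with workspace OAuth credentials, and every UC function in the schema shows up as a tool.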



But the real question is, what does Databricks bring to the table compared to others and why does it matter? The first challenge is that MCP adds another abstraction layer. If you don’t know me yet, I’m a messenger of simplicity - the simpler the solution, the better it is. Since MCP introduces an extra layer, it must provide clear, new value to justify its place in the architecture. Inspired by the Model Context Protocol’s nicely designed illustration, I quickly drafted my own version to include Databricks.


Illustration: MCP logic on Databricks - MCPs everywhere

It provides a good foundation for what we’re going to discuss. The main focus will be on these five aspects:


  • IAM (Identity and Access Management)

  • Secure authentication

  • Secure data movement

  • Authorization

  • Logical challenges


1) IAM - Just when we got past sharing Excel files


As you have probably noticed, MCPs have mainly been used in local development. And that's amazing when you're using personal assistant agents to automate your own tasks. But for business process automation, you aren't running production operations on your local computer, are you? And if you do run these agents on your local machine, traceability with audit logs is missing or can be corrupted. So yes, we face the same issues we did with Excel file sharing back in the day (please don’t tell me you’re still sharing Excel files 😅). Another thing is transparency - you want to see how the agent is performing and have automated evaluations in place to ensure quality, as a centralized solution. Everything nicely in one place, while you sip your coffee and watch the dashboards stay green.
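To make the traceability point concrete, here is a hedged sketch of a tool-call audit wrapper. In a real deployment you would write to a centralized, append-only log (e.g., a table in Unity Catalog), not an in-memory list, and `lookup_customer` is an invented tool:

```python
import functools
import json
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []  # stand-in for a real, centralized audit table

def audited(tool_fn):
    """Wrap a tool so every invocation leaves an audit trail:
    what was called, when, with which arguments, and the result."""
    @functools.wraps(tool_fn)
    def wrapper(*args, **kwargs):
        result = tool_fn(*args, **kwargs)
        AUDIT_LOG.append({
            "tool": tool_fn.__name__,
            "ts": datetime.now(timezone.utc).isoformat(),
            "args": json.dumps({"args": args, "kwargs": kwargs}, default=str),
            "result": str(result),
        })
        return result
    return wrapper

@audited
def lookup_customer(customer_id: str) -> dict:
    # Hypothetical tool; imagine a REST API call here.
    return {"id": customer_id, "status": "active"}

customer = lookup_customer("42")  # call is now recorded in AUDIT_LOG
```

Run this on a laptop and the log dies with the process; run it on a platform and the trail survives audits.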


For development and personal assistant agents, this local setup works amazingly well. But for a more robust, production-ready approach to automating business processes, it needs to run in the cloud or in a protected on-prem environment to ensure full transparency, monitoring capabilities and IAM. And yes, I know how we all love corporate permission processes via ticketing systems (ugh…), but sometimes they’re truly mandatory. All in all, Databricks earns points here, as it delivers everything required for production use.


2) Secure authentication - the problems start here


Since MCPs are an extra layer on top of REST APIs (in most cases), you have to host them somewhere. If you host your own MCP servers in the cloud, that means extra costs and the work of keeping them up to date. If you have hundreds of agents using a centralized MCP server, a dedicated Ops team (or one unfortunate, overworked person) is required. If something fails, everything breaks (until backup processes are up). So you are creating centralized risks instead of limiting risks to a single agent.


But the biggest problem here is the opportunity for malicious actors. MCP culture is leading toward a carefree approach where you just copy-paste prebuilt MCP servers from a repo or use third-party managed hosting solutions. That means you have to authenticate to those MCP servers so they can access target endpoints, which means passing credentials. Let me tell you, this will lead to a lot of data and secret leaks. It has never been easier to stand up your own “secure” MCP-hosted servers and steal everything. So please, be really careful out there.


That's why I recommend using your own hosted MCP servers, such as hosting MCPs using Databricks Apps. But how about extra costs and maintenance work? IMO, running your own MCP server for a couple of agents doesn’t justify the costs and extra complexity unless the agents are really providing good business value. But once you have an agent army, that’s another story. Then you can scale the development process and the hours saved can provide nice ROI. Or just set up a centralized location for certified tools where they can be used instead of MCPs (UC functions / own custom solution). This falls under agent architecture – more on this in the future.


And yes, there are amazing third-party MCP server providers as well and the market is constantly evolving. But still, be careful out there, especially with sensitive data. Data laws are different in each country and most likely you haven’t read all the fine-print clauses.


3) Secure data movement


When using REST APIs, you probably move the data via the public internet. For most use cases this doesn’t truly matter, and if the data leaks, then it leaks. But when handling really sensitive, business-critical data, that's another story - you should not do it. That requires setting up proper Private Links, or a similar solution, to ensure secure data transfer, so having an MCP layer here might cause some security concerns until the ecosystem evolves a bit more. For business-sensitive data, it’s time to investigate how to ensure traffic travels via Private Links or at least through the cloud provider’s backbone network.


4) Authorization


One master token to rule them all... Unfortunately, this still seems to be the approach in most cases. It's a fun hobby for quite a few people to collect API tokens from public repos after someone's vibe coding session.


When building more sophisticated agentic solutions, you might want dynamic authorization based on the user instead of static service principal authorization. That means using on-behalf-of-user authorization, which, by the way, already works great in Databricks (still in beta). When handling user access tokens, you want to handle them as securely as possible. So here I’d stick to custom solutions instead of external MCPs, unless you have validated the MCP server logic and can truly trust it. And always use OAuth to limit permissions, and service principals when possible. If a personal access token with admin rights leaks, oh man…
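The least-privilege idea can be shown with a tiny sketch: before executing a tool on behalf of a user, check that the user's token actually carries the scope that tool requires. The scope names, tool names and token shape below are all invented for illustration:

```python
# Hypothetical mapping of tools to the OAuth scope each one requires.
REQUIRED_SCOPES = {
    "read_sales_table": "sql.read",
    "drop_table": "sql.admin",
}

def authorize(token_scopes: set[str], tool_name: str) -> None:
    """Raise if the (on-behalf-of-user) token lacks the scope a tool
    needs. Called by the agent runtime before every tool execution."""
    needed = REQUIRED_SCOPES.get(tool_name)
    if needed is None:
        raise PermissionError(f"Unknown tool: {tool_name}")
    if needed not in token_scopes:
        raise PermissionError(
            f"Token missing scope '{needed}' for tool '{tool_name}'"
        )

# A read-only user token can call read tools...
authorize({"sql.read"}, "read_sales_table")  # passes silently
# ...but an attempt to run an admin tool fails fast:
try:
    authorize({"sql.read"}, "drop_table")
except PermissionError as exc:
    blocked = str(exc)
```

The point of on-behalf-of-user authorization is exactly this: the agent's reach shrinks to whatever the actual user is allowed to do, instead of one master token ruling them all.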


5) Brute force is a valid tactic, until you hit the wall


As a reminder, the core idea of agents using MCP is to standardize and harmonize tools for agents instead of creating and optimizing them for each use case. Building solutions quicker and more easily, and getting results faster. Sounds really good, doesn’t it? But this tends to lead to vibe coding and outsourcing thinking and problem-solving to the agents. And even though agents are evolving quickly, we are still quite far from “Skynet” level. It’s one thing to let an agent brute-force a solution that “works”; it’s another to not truly understand why it works. Yes, LLMs are already amazingly good at coding and solving different tasks, but we are still quite far from agents truly understanding the bigger picture.


Another thing is the agent’s logic for how MCPs are used. Instead of building custom tools that might contain multiple steps, including carefully validating REST API results, you let the agent run freely and rely blindly on MCP documentation. If something goes wrong, you can’t just blame your agent by saying, “It read the MCP documentation wrong.” You’re still the one responsible, not the agent. But if that excuse works and you get away with it, please share your secrets with me.


How many times have you put solutions into production that use REST APIs after only skimming the abstract documentation on the data? Hopefully not too many. You really have to understand the data: how it’s used, how it should be used, and so on. Otherwise, it doesn’t matter whether the process is done by a human or an agent - the results are questionable. So using MCPs to skip this heavy part will probably just be a shortcut to bad results. Sorry, but you still need to hunt down experts to find the “hidden knowledge”, which is the most valuable kind. As we all know, documentation quality is sometimes what it is, if things are documented at all.
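That "heavy part" is exactly what a robust custom tool encodes. A hedged sketch of the difference: instead of handing raw API output to the agent, validate and normalize it first. The payload and field names are invented:

```python
def fetch_orders_raw() -> list[dict]:
    # Stand-in for a REST call; note the messy, partially broken payload
    # that raw MCP access would pass straight to the agent.
    return [
        {"order_id": "A1", "amount": "19.90", "currency": "EUR"},
        {"order_id": "A2", "amount": None, "currency": "EUR"},  # bad row
    ]

def fetch_orders_validated() -> list[dict]:
    """A 'robust tool': coerce types and drop rows the agent should
    never have to reason about, so hallucination risk shrinks."""
    clean = []
    for row in fetch_orders_raw():
        if row.get("amount") is None:
            # In production: log and alert instead of silently dropping.
            continue
        clean.append({
            "order_id": row["order_id"],
            "amount": float(row["amount"]),
            "currency": row["currency"],
        })
    return clean
```

The validation layer is boring software engineering, which is precisely why it works: the agent only ever sees data that already makes sense.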


When an agent uses MCPs, it’s like a new employee getting their hands on a tool for the first time: experimenting, figuring things out and gradually improving (hopefully). The catch is that this learning process costs extra tokens and sometimes skips important steps. If you’re not saving the “golden path” in long-term memory, the agent will repeat the process every time, driving up both costs and latency. Sure, it’s exciting to watch agents act autonomously, and sometimes use cases require exactly that. But if you care about robustness and quality, you might want to fine-tune the tools and lock in that golden path. Otherwise, hallucinations will be your friend. In code development, this isn’t a huge issue - you can rely on test results and error messages to guide improvements. But apply the same trial-and-error approach to something like day trading and the costs will skyrocket very quickly.


Then there’s providing a secure trial-and-error environment for the agent. You don’t let new employees start testing in the production environment; that’s why dev and test environments exist. But how do you ensure the agent's thought process stays exactly the same in production as well? The more freedom and autonomy you give an agent, the harder it becomes to guarantee identical behavior across different environments or use cases. There are a few approaches to tackle this, but let's get into those in the near future. It’s an intriguing problem to solve.


All things considered, agents and AI have huge potential, yet what we’re doing today only scratches the surface of what we can truly understand and observe. Developing these capabilities will take time, so the AGI-agent champagne party for next year might be worth canceling in advance.


Will MCP take agents to the promised land?


Is MCP overhyped? Currently, I'd say yes. It's not a magic bullet that will automatically solve all these challenges. You shouldn’t start hosting your own MCP server farms just because it’s cool or because everyone is talking about it.


Do MCPs have the potential to help solve some of the problems mentioned above? Definitely! But it’s still too early to say how things will develop. Their impact will depend on how well adoption takes off and how the technology evolves (so far, so good). Right now, they’re amazing for experimentation, coding, automating personal tasks and “vibing”, but a bit too immature for most business process automation use cases. Creating agents is easy, but achieving production readiness is another story. This requires a comprehensive platform underneath as requirements escalate. To fulfill this need, Databricks is second to none.


It's good to keep in mind we’re still in the early days of agent development, with plenty of challenges left to tackle. Progress is happening every day, but let’s resist the urge to overhype and instead focus on solving the real, underlying problems.




Written by Aarni Sillanpää
