You have probably noticed how all the fun AI features come to the EU a bit delayed. This time, I didn’t have enough patience to wait it out and decided to create a workspace in Azure to test the new Agent Bricks. There is so much hype around it, so of course I wanted to get my hands dirty.
Agent Bricks provides a simple approach to building and optimizing domain-specific, high-quality AI agent systems for common AI use cases. It streamlines the implementation of these systems so that users can focus on the problem, data, and metrics instead.
This sounds really good, so time to test out how well it really performs! Building production-grade agents is hard and demand is currently outpacing supply. It truly doesn’t make sense for every customer to reinvent the wheel themselves. That’s where Agent Bricks comes in.
So, how does Databricks Agent Bricks work? First you have to activate it from Previews in the workspace (and have a US workspace, for now). After refreshing the page, a new 'Agents' section appears in the AI/ML sidebar. Currently there are four options available, so let’s try out all of them.

1) Information extractor agent
2) Custom LLM
3) Knowledge Assistant
4) Multi-Agent Supervisor
Currently PDFs and images aren’t supported directly as agent inputs, but you can trigger a parameterized workflow that automatically generates a parsing pipeline for you. It shows the SQL query (using ai_parse) used to convert PDFs into a usable format in a Delta table. Simple, but efficient and user-friendly. Keep in mind that it parses every file in the folder with a suitable file format, so use dedicated folders containing only the files you want parsed.
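Just to give a feel for it, here is a minimal sketch of what that kind of parsing step can look like, written with the ai_parse_document SQL function. The catalog, schema, and volume names are placeholders from my own setup, and the exact query Agent Bricks generates may well differ.

```python
# Rough sketch of a PDF-parsing step, not the exact query Agent Bricks generates.
# Catalog/schema/volume names are hypothetical placeholders.
parse_pdfs_sql = """
CREATE OR REPLACE TABLE my_catalog.my_schema.parsed_docs AS
SELECT
  path,
  ai_parse_document(content) AS parsed  -- Databricks AI function for parsing PDFs/images
FROM read_files(
  '/Volumes/my_catalog/my_schema/contracts/',  -- dedicated folder: everything here gets parsed
  format => 'binaryFile'
)
"""
spark.sql(parse_pdfs_sql)
```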
You don’t always need your own agent for everything, so I liked this approach a lot. Nowadays almost everything seems to be called an agent. Yesterday I came across a plain Python function with no LLM, DL, or ML in it. You probably guessed it: it was advertised as an agent 😃 Yes, there’s a bubble.
1) Information extractor agent

This is quite a straightforward agent, but it offers solid business process automation capabilities. It does exactly what the name suggests: extracting information from the given dataset. You can use unlabeled documents or labeled dataset tables as a source, and the agent automatically creates an output schema with descriptions. You can also edit fields manually, for example if the descriptions aren't descriptive enough. Once your agent is ready, it's time to generate a quality report and optimize the agent if it doesn’t perform well enough. Quick and easy. Keep in mind that you can have only one target schema here, which is a bit limiting, but it’s a really robust choice if you want to set up strong business automation. Finance is a good example: too many processes there are still handled manually even nowadays, and they typically need just a single output schema. If you need multiple schemas, building your own agent is always an option. So now you can let the agent handle the information extraction for you, and the processed data can then be easily utilized or pushed forward to another system, automating the whole process with Databricks as the heart of your automation engine.
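To make the single-schema point concrete, here is a hypothetical example of what an extraction output schema could look like for invoice handling in finance. The field names and descriptions below are mine, not something Agent Bricks generated.

```python
# Hypothetical single output schema for invoice extraction (illustrative only).
invoice_schema = {
    "vendor_name":    "Legal name of the company that issued the invoice",
    "invoice_number": "Unique identifier printed on the invoice",
    "invoice_date":   "Issue date in ISO 8601 format (YYYY-MM-DD)",
    "total_amount":   "Grand total including taxes, as a decimal number",
    "currency":       "Three-letter ISO 4217 currency code, e.g. EUR",
    "due_date":       "Payment due date in ISO 8601 format",
}
# One agent, one schema: if you also need e.g. a purchase-order schema,
# that means a second extractor (or your own custom agent).
```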
2) Custom LLM
Sometimes you might want to fine-tune an LLM with your own data, and that’s what this one is for. You can use an unlabeled dataset, a labeled dataset, or a few examples as source data. But keep in mind that to optimize an LLM you need enough source data; it’s kind of pointless to fine-tune a model with just a couple of examples. For a labeled dataset, a minimum of 100 rows was required. I have to confess that here I got a bit lazy and used another LLM to create a dummy customer feedback dataset, which I provided as example data for the optimization run. I also quickly accepted the predefined guideline suggestions without any further improvement. It wasn’t a big surprise that the quality report wasn’t showing only green before the optimization, heh. But it was time to start the optimization and hit the gym while waiting to see if the model performance improved.
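For reference, the labeled data I generated looked roughly like this: simple (input, expected output) pairs written to a Unity Catalog table. The table and column names are just my own placeholders, and in a real case you would obviously want genuine feedback instead of LLM-generated dummies.

```python
# Sketch of the dummy labeled dataset: (feedback, expected label) pairs.
# At least ~100 rows were required; table and column names are placeholders I chose.
rows = [
    ("The delivery was two weeks late and nobody answered my emails.",
     "negative | theme: delivery delay, poor support"),
    ("Great product, setup took five minutes and it just works.",
     "positive | theme: ease of setup"),
    # ... ~100+ similar rows ...
]
df = spark.createDataFrame(rows, schema="request STRING, expected_response STRING")
df.write.mode("overwrite").saveAsTable("my_catalog.my_schema.feedback_labeled")
```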

After optimization, I noticed that the model wasn’t usable in AI Playground at all. There were two options: use SQL or workflows. I tested with SQL ai_query but only got the following response: “Could be improved with more features.” It didn’t work so well as an out-of-the-box solution, but I guess I should blame my laziness and the bad source data 😃 All in all, the optimization process was fully automated, which was cool to see.
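For the curious, this is roughly how you call a served model from SQL with ai_query; the endpoint and table names here are placeholders for my own setup.

```python
# Calling the optimized model's serving endpoint with ai_query.
# Endpoint and table names are placeholders from my own workspace.
result = spark.sql("""
  SELECT
    feedback,
    ai_query('my-custom-feedback-llm', feedback) AS model_response
  FROM my_catalog.my_schema.new_feedback
  LIMIT 10
""")
display(result)
```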
I know you all want to know how much the optimization cost with these 100 rows: about 100 euros. Whether that’s a lot or a little depends on your use case. If you want to use optimized LLMs instead of commercial or the best open-source models, you really have to pay attention here to get that extra value. And you have to pay this just to find out whether optimization even provides better results. If pre-validation of the different optimization options were free or close to free, I think this would make more sense. But right now, paying that much without knowing the end result might be a bit risky.
3) Knowledge Assistant

This is basically about creating a RAG chatbot on top of your own data. As sources, you can use files in Unity Catalog volumes or existing vector search indexes. Once again, you can set it up in minutes and grant it access to documentation files instantly. The nice thing here is that it provides references (footnotes), improving reliability. But the document count is kind of limited (well, it's still in BETA) and updating documents requires manual steps (they aren’t synced automatically). The world isn't ready yet, but not bad at all. After seeing consultants work on these for months without anything concrete, getting it done in just a few minutes is pretty neat. Still, it’s important to remember the prework, meaning the freshness and quality of the data being used.
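The assistant manages retrieval for you, but if you are curious what sits underneath, a direct vector search lookup looks roughly like this. The endpoint and index names are placeholders, and this is not something the Knowledge Assistant requires you to do.

```python
# Peeking at the vector search layer directly (not required by the assistant).
from databricks.vector_search.client import VectorSearchClient

vsc = VectorSearchClient()
index = vsc.get_index(
    endpoint_name="my_vs_endpoint",                # placeholder
    index_name="my_catalog.my_schema.docs_index",  # placeholder
)
hits = index.similarity_search(
    query_text="How do I file a travel expense?",
    columns=["doc_name", "chunk_text"],
    num_results=3,
)
print(hits)
```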
4) Multi-agent supervisor
This was the most interesting one and worked surprisingly well as an out-of-the-box solution. Genie spaces and knowledge assistants were supported at the time of testing (MCPs coming soon). Easy to implement and use, just a couple of button clicks. In my use case, I added two newly created Genie spaces (sales and product domains) and a previously created knowledge assistant. I like this approach a lot, since the Genies handle your structured data in a specific domain (working as a data analyst) and the knowledge assistant handles unstructured data (documentation, etc.). But from a cost perspective, it raises the question of why separately deployed agents are needed here.
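Conceptually, the supervisor pattern boils down to something like the sketch below: an LLM picks which sub-agent should handle the question and forwards it. This is a plain-Python illustration of the pattern itself, not how Databricks implements it.

```python
# Plain-Python illustration of the supervisor pattern (not Databricks' implementation).
def route(question: str, llm) -> str:
    """Ask a supervisor LLM which registered sub-agent should answer."""
    agents = {
        "sales_genie":    "Structured sales data: revenue, orders, pipeline",
        "product_genie":  "Structured product data: SKUs, inventory, pricing",
        "docs_assistant": "Unstructured documentation: policies, manuals",
    }
    catalog = "\n".join(f"- {name}: {desc}" for name, desc in agents.items())
    prompt = (
        "Pick exactly one agent name for this question.\n"
        f"{catalog}\nQuestion: {question}"
    )
    return llm(prompt).strip()  # the chosen agent then receives the original question
```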
The natively integrated feedback loop (with SME support) was a really nice touch. After you merge feedback data, it adds that information as example data instantly. Based on my investigation, it seems to use a vector search index under the hood. It would be fun to have an optimization possibility here as well at some point in the future (like with custom LLMs).
For monitoring, a new MLflow experiment is created automatically and traces are activated there. Naturally, some of the information is hidden in the traces, but you can still see quite a lot. If you aren’t familiar with MLflow tracing, it’s basically a logging system for agent behavior that can then be used for evaluation and monitoring purposes.
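If MLflow tracing is new to you, the generic mechanism looks roughly like this. This is plain MLflow, not the experiment Agent Bricks creates for you, and the experiment path is a placeholder.

```python
# Minimal MLflow tracing sketch: each decorated call is logged as a trace.
import mlflow

mlflow.set_experiment("/Users/me/agent-traces")  # placeholder experiment path

@mlflow.trace
def answer(question: str) -> str:
    # ... call your agent or model here ...
    return f"stub answer to: {question}"

answer("What were Q2 sales in Finland?")
print(mlflow.search_traces().head())  # traces are queryable for evaluation/monitoring
```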
Cost & quality
Currently, it’s ridiculous how much money and time are spent on agent PoCs that never see the light of day. Now companies can instantly validate their business ideas with prebuilt Agent Bricks agents. I think that already more than justifies its use and minor costs. But let's look at the costs in more detail.
Even when agents are scaled to zero after some inactivity, the vector search indexes seemed to keep generating costs the whole time. So if you have a knowledge assistant or multi-agent supervisor, the real idle cost was about $5/day, plus additional costs based on agent activity. The current model optimization might be a bit hard to justify, since you have to pay up front before seeing the optimization results. A bit of a casino feeling - sometimes you get lucky, sometimes not. In all seriousness, this might cause really nasty cost spikes if it’s used for fun testing with huge datasets without understanding how it generates costs, so don't be too careless. Personally, I’d prefer a two-layer optimization: a first pass that gives a rough estimate of the optimization options and expected improvements, after which you can proceed with the longer, more expensive optimization. But once again, it’s still in BETA and the whole agent industry is so young (a couple of years old, people tend to forget that).
To summarize how Databricks Agent Bricks works
Agent Bricks offers quick and easy agents for different general use cases. It’s still in the early development phase (BETA), and I’d expect it to develop really quickly. The use cases I see are general needs (which all companies share) and non-technical users. It makes a lot of sense to harmonize and standardize basic tasks for agents. Instead of everyone creating their own, it’s better to use one working solution and scale it. And that’s what Agent Bricks can offer.
For more demanding agentic use cases, it’s better to build custom agentic solutions. There will always be limitations with a one-size-fits-all approach. Providing black-box solutions is always a bit risky for business-critical tasks, since you are fully relying on the external service provider’s logic, which you cannot modify. But only a few companies really have the resources or expertise to achieve production-grade agents on their own. When the hype around agents calms down, self-built agents from non-developers will quickly be forgotten, and it will become clear that they only brought costs instead of creating real value. This already happened with data platforms, and it was the reason I originally fell in love with Databricks. It’s an ecosystem where you have the option to build your own solutions or use prebuilt features. Your data, your code and your future.
In the end, it’s all about a simple ROI equation: time & resources -> generated value. And seeing how many agents never mature out of the PoC stage, it’s safer to get started with Agent Bricks and get value instantly, instead of buying expensive consulting projects and hoping for the best. I have to admit it's pretty neat that you can create agents in just minutes without any tech skills, and that's a game changer for many companies.

Written by Aarni Sillanpää
You get an agent... Everybody gets an agent!



