
time to read 13 min | 2490 words

In the previous post, I talked about the PropertySphere Telegram bot (you can also watch the full video here). In this post, I want to show how we can make it even smarter. Take a look at the following chat screenshot:

What is actually going on here? This small interaction showcases a number of RavenDB features, all at once. Let’s first focus on how Telegram hands us images. This is done using Photo or Document messages (depending on exactly how you send the message to Telegram).

The following code shows how we receive and store a photo from Telegram:


// Download the largest version of the photo from Telegram:
var ms = new MemoryStream();
var fileId = message.Photo.MaxBy(ps => ps.FileSize).FileId;
var file = await botClient.GetInfoAndDownloadFile(fileId, ms, cancellationToken);

// Create a Photo document to store metadata:
var photo = new Photo
{
    ConversationId = GetConversationId(chatId),
    Id = "photos/" + Guid.NewGuid().ToString("N"),
    RenterId = renter.Id,
    Caption = message.Caption ?? message.Text
};

// Store the image as an attachment on the document:
await session.StoreAsync(photo, cancellationToken);
ms.Position = 0;
session.Advanced.Attachments.Store(photo, "image.jpg", ms);
await session.SaveChangesAsync(cancellationToken);

// Notify the user that we're processing the image:
await botClient.SendMessage(
    chatId,
    "Looking at the photo you sent... this may take me a moment...",
    cancellationToken
);

A Photo message in Telegram may contain multiple versions of the image at various resolutions. Here I simply select the largest one by file size, download the image from Telegram’s servers into a memory stream, then create a Photo document and add the image stream to it as an attachment.

We also tell the user to wait while we process the image, but note that there is no further code in this handler that does anything with it.
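
For reference, the Photo document itself is just a small data holder. Here is a minimal sketch inferred from the fields used in the snippets above (the actual class lives in the sample app; the Description field is filled in later by the Gen AI task):

public class Photo
{
    public string Id { get; set; }
    public string ConversationId { get; set; }
    public string RenterId { get; set; }
    public string Caption { get; set; }

    // Populated by the Gen AI task described below
    public string Description { get; set; }
}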

Gen AI & Attachment processing

We use a Gen AI task to actually process the image, handling it in the background since it may take a while and we want to keep the chat with the user open. That said, if you look at the actual screenshots, the entire conversation took under a minute.

Here is the actual Gen AI task definition for processing these photos:


var genAiTask = new GenAiConfiguration
{
    Name = "Image Description Generator",
    Identifier = TaskIdentifier,
    Collection = "Photos",
    Prompt = """
        You are an AI Assistant looking at photos from renters in 
        rental property management, usually about some issue they have. 
        Your task is to generate a concise and accurate description of what 
        is depicted in the photo provided, so maintenance can help them.
        """,


    // Expected structure of the model's response:
    SampleObject = """
        {
            "Description": "Description of the image"
        }
        """,


    // Apply the generated description to the document:
    UpdateScript = "this.Description = $output.Description;",


    // Pass the caption and image to the model for processing:
    GenAiTransformation = new GenAiTransformation
    {
        Script = """
            ai.genContext({
                Caption: this.Caption
            }).withJpeg(loadAttachment("image.jpg"));
            """
    },
    ConnectionStringName = "Property Management AI Model"
};

What we are doing here is asking RavenDB to send the caption and image contents from each document in the Photos collection to the AI model, along with the given prompt. Then we ask it to explain in detail what is in the picture.

Here is an example of the results of this task after it completed. For reference, here is the full description of the image from the model:

A leaking metal pipe under a sink is dripping water into a bucket. There is water and stains on the wooden surface beneath the pipe, indicating ongoing leakage and potential water damage.

What model is required for this?

I’m using the gpt-4.1-mini model here; there is no need for anything beyond that. It is a multimodal model capable of handling both text and images, so it works great for our needs.

You can read more about processing attachments with RavenDB’s Gen AI here.

We still need to close the loop, of course. The Gen AI task that processes the images is actually running in the background. How do we get the output of that from the database and into the chat?

To close that loop, we create a RavenDB Subscription on the Photos collection, which looks like this:


store.Subscriptions.Create(new SubscriptionCreationOptions
{
    Name = SubscriptionName,
    Query = """
        from "Photos" 
        where Description != null
        """
});

RavenDB sends documents to this subscription whenever a document in the Photos collection is created or updated and its Description has a value. In other words, it will be triggered when the Gen AI task updates the photo after it runs.

The actual handling of the subscription is done using the following code:


_documentStore.Subscriptions.GetSubscriptionWorker<Photo>("After Photos Analysis")
    .Run(async batch =>
    {
        using var session = batch.OpenAsyncSession();
        foreach (var item in batch.Items)
        {
            var renter = await session.LoadAsync<Renter>(item.Result.RenterId!);
            await ProcessMessageAsync(_botClient, renter.TelegramChatId!,
                $"Uploaded an image with caption: {item.Result.Caption}\r\n" +
                $"Image description: {item.Result.Description}.",
                cancellationToken);
        }
    });

In other words, we iterate over the items in the subscription batch, and for each one, we emit a “fake” message as if it were sent by the user to the Telegram bot. Note that we aren’t invoking the RavenDB conversation directly, but instead reusing the Telegram message handling logic. This way, the reply from the model goes directly back into the user’s chat.

You can see how that works in the screenshot above. The model looked at the image and then acted on it; in this case, by creating a service request. We previously looked at charging a credit card, so now let’s see how we handle the model creating a service request.

The AI Agent is defined with a CreateServiceRequest action, which looks like this:


Actions = [
    new AiAgentToolAction
    {
        Name = "CreateServiceRequest",
        Description = "Create a new service request for the renter's unit",
        ParametersSampleObject = JsonConvert.SerializeObject(
            new CreateServiceRequestArgs
            {
                Type = """
                    Maintenance | Repair | Plumbing | Electrical | 
                    HVAC | Appliance | Community | Neighbors | Other
                    """,
                Description = """
                    Detailed description of the issue with all 
                    relevant context
                    """
            })
    },
]

As a reminder, this is the description of the action that the model can invoke. Its actual handling is done when we create the conversation, like so:


conversation.Handle<PropertyAgent.CreateServiceRequestArgs>(
    "CreateServiceRequest",
    async args =>
{
    using var session = _documentStore.OpenAsyncSession();
    var unitId = renterUnits.FirstOrDefault();
    var propertyId = unitId?.Substring(0, unitId.LastIndexOf('/'));

    var serviceRequest = new ServiceRequest
    {
        RenterId = renter.Id!,
        UnitId = unitId,
        Type = args.Type,
        Description = args.Description,
        Status = "Open",
        OpenedAt = DateTime.UtcNow,
        PropertyId = propertyId
    };

    await session.StoreAsync(serviceRequest);
    await session.SaveChangesAsync();

    return $"Service request created ID `{serviceRequest.Id}` for your unit.";
});

There isn’t much to do in this particular handler, but hopefully it conveys the kind of code this approach allows you to write.
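
For reference, the argument and document classes used above map directly to the fields you can see in the snippets. A minimal sketch (the real definitions, including the nesting of CreateServiceRequestArgs inside PropertyAgent, are in the sample app):

public class CreateServiceRequestArgs
{
    public string Type { get; set; }         // e.g. "Plumbing"
    public string Description { get; set; }  // detailed description from the model
}

public class ServiceRequest
{
    public string Id { get; set; }
    public string RenterId { get; set; }
    public string UnitId { get; set; }
    public string PropertyId { get; set; }
    public string Type { get; set; }
    public string Description { get; set; }
    public string Status { get; set; }
    public DateTime OpenedAt { get; set; }
}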

Summary

The PropertySphere sample application and its Telegram bot are interesting, mostly because of everything that isn’t here. We have a bot that has a pretty complex set of behaviors, but there isn’t a lot of complexity for us to deal with.

This behavior emerges from the capabilities we entrust to the model and the scope we give it. At the same time, I’m not blindly trusting the model; I’m verifying that what it does always stays within the scope of the user’s permissions.

Extending what we have here to allow additional capabilities is easy. Consider adding the ability to get invoices directly from the Telegram interface, a great exercise in extending what you can do with the sample app.

There is also the full video where I walk you through all aspects of the sample application, and as always, we’d love to talk to you on Discord or in our GitHub discussions.

time to read 23 min | 4458 words

In the previous post, I introduced the PropertySphere sample application (you can also watch the video introducing it here). In this post, I want to go over how we build a Telegram bot for this application, so Renters can communicate with the application, check their status, raise issues, and even pay their bills.

I’m using Telegram here because the process of creating a new bot is trivial, the API is really fun to work with, and it takes very little effort.

Compare that to something like WhatsApp, where just the process for creating a bot is a PITA.

Without further ado, let’s look at what the Telegram bot looks like:

There are a bunch of interesting things that you can see in the screenshot. We communicate with the bot using natural text. There aren’t a lot of screens or options that you have to go through; it is just a natural conversation.

The process is pretty streamlined from the perspective of the user. What does that look like from the implementation perspective? A lot of the time, that kind of interface involves… a large amount of complexity in the backend.

Here is what I usually think when I consider those demos:

In our example, we can implement all of this in about 250 lines of code. The magic behind it is the fact that we can rely on RavenDB’s AI Agents feature to do most of the heavy lifting for us.

Inside RavenDB, this is defined as follows:

For this post, however, we’ll look at how we actually built this AI-powered Telegram bot. The full code is here if you want to browse through it.

What model is used here?

It’s worth mentioning that I’m not using anything fancy; the agent uses the baseline gpt-4.1-mini model for the demo. There is no need for training or customization; the way we create the agent already takes care of that.

Here is the overall agent definition:


store.AI.CreateAgent(
    new AiAgentConfiguration
    {
        Name = "Property Assistant",
        Identifier = "property-agent",
        ConnectionStringName = "Property Management AI Model",
        SystemPrompt = """
            You are a property management assistant for renters.
            ... redacted ...
            Do NOT discuss non-property topics. 
            """,
        Parameters = [
            // Visible to the model:
            new AiAgentParameter("currentDate", 
"Current date in yyyy-MM-dd format"),
            // Agent scope only, not visible to the model directly
            new AiAgentParameter("renterId", 
"Renter ID; answer only for this renter", sendToModel: false),
            new AiAgentParameter("renterUnits", 
"List of unit IDs occupied by the renter", sendToModel: false),
        ],
        SampleObject = JsonConvert.SerializeObject(new Reply
        {
            Answer = "Detailed answer to query (markdown syntax)",
            Followups = ["Likely follow-ups"],
        }),
        // redacted
    });

The code above will create an agent with the given prompt. It turns out that a lot of work actually goes into that prompt to explain to the AI model exactly what its role is, what it is meant to do, etc.

I reproduced the entire prompt below so you can read it more easily, but take into account that you’ll likely tweak it a lot, and that it is usually much longer than what we have here (although what we have below is quite functional, as you can see from the screenshots).

The agent’s prompt

You are a property management assistant for renters.

Provide information about rent, utilities, debts, service requests, and property details.

Be professional, helpful, and responsive to renters’ needs.

You can answer in Markdown format. Make sure to use ticks (`) whenever you discuss identifiers.

Do not suggest actions that are not explicitly allowed by the tools available to you.

Do NOT discuss non-property topics. Answer only for the current renter.

When discussing amounts, always format them as currency with 2 decimal places.

When we define an AI Agent in RavenDB, we specify two very important aspects of it. First, we have the parameters, which define the scope of the system. In this case, you can see that we pass the currentDate, as well as the renterId and renterUnits that this agent is going to deal with.

We expose the current date to the model, but not the renter ID or the units that define the scope (we’ll touch on that in a bit). The model needs the current date so it understands when it is running and has context for things like “last month”. But we don’t need to give it the IDs; they have no meaning to the model and are instead used to define the scope of a particular conversation.

The sample object we use defines the structure of the reply that we require the model to give us. In this case, we want to get a textual message from the model in Markdown format, as well as a separate array of likely follow-ups that we can provide to the user.
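
In code, that reply shape is just a small class. Here is a sketch, inferred from the SampleObject above and the RunAsync<PropertyAgent.Reply>() call later in the post (the actual class is nested in PropertyAgent in the sample app):

public class Reply
{
    // Markdown-formatted answer shown to the renter
    public string Answer { get; set; }

    // Likely follow-up questions, surfaced as quick-reply buttons in Telegram
    public string[] Followups { get; set; }
}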

In order to do its job, the agent needs to be able to access the system. RavenDB handles that by letting you define queries that the model can ask the agent to execute when it needs more information. Here are some of them:


Queries = [
    new AiAgentToolQuery
    {
        Name = "GetRenterInfo",
        Description = "Retrieve renter profile details",
        Query = "from Renters where id() = $renterId",
        ParametersSampleObject = "{}",
        Options = new AiAgentToolQueryOptions
        {
            AllowModelQueries = false,
            AddToInitialContext = true
        }
    },
    new AiAgentToolQuery
    {
        Name = "GetOutstandingDebts",
        Description = "Retrieve renter's outstanding debts (unpaid balances)",
        Query = """
            from index 'DebtItems/Outstanding'
            where RenterIds in ($renterId) and AmountOutstanding > 0
            order by DueDate asc
            limit 10
            """,
        ParametersSampleObject = "{}"
    },
    new AiAgentToolQuery
    {
        Name = "GetUtilityUsage",
        Description = """
            Retrieve utility usage for renter's unit within a date 
            range (for calculating bills)
            """,
        Query = """
            from 'Units'
            where id() in ($renterUnits)
            select 
                timeseries(from 'Power' 
                    between $startDate and $endDate 
                    group by 1d 
                    select sum()),
                timeseries(from 'Water' 
                    between $startDate and $endDate 
                    group by 1d 
                    select sum())
            """,
        ParametersSampleObject = """
            {
                "startDate": "yyyy-MM-dd", 
                "endDate": "yyyy-MM-dd"
            }
            """
    },
]

The first query in the previous snippet, GetRenterInfo, is interesting. You can see that it is marked as: AllowModelQueries = false, AddToInitialContext = true. What does that mean?

It means that as part of creating a new conversation with the model, we are going to run the query to get all the renter’s details and add that to the initial context we send to the model. That allows us to provide the model with the information it will likely need upfront.

Note that we use the $renterId and $renterUnits parameters in the queries. While they aren’t exposed directly to the model, they affect what information the model can see. This is a good thing, since it means we place guardrails very early on. The model simply cannot see any information that is out of scope for it.

The model can ask for additional information when it needs to…

An important observation about the design of AI agents with RavenDB: note that we provided the model with a bunch of potential queries that it can run. GetRenterInfo is run at the beginning, since it gives us the initial context, but the rest are left for the judgment of the model.

The model can decide what queries it needs to run in order to answer the user’s questions, and it does so of its own accord. This decision means that once you have defined the set of queries and operations that the model can run, you are mostly done. The AI is smart enough to figure out what to do and then act according to your data.

Here is an example of what this looks like from the backend:

Here you can see that the user asked about their utilities; the model then ran the appropriate query and formulated an answer for the user.

The follow-ups UX pattern

You might have noticed that we asked the model for follow-up questions that the user may want to ask. This is a subtle way to guide the user toward the set of operations that the model supports.

The model will generate the follow-ups based on its own capabilities (queries and actions that it knows it can run), so this is a pretty simple way to “tell” that to the user without being obnoxious about it.

Let’s look at how things work when we actually use this to build the bot, then come back to the rest of the agent’s definition.

Plugging the model into Telegram

We have looked at the agent’s definition so far - let’s see how we actually use it. The Telegram API is really nice, basically boiling down to:


_botClient = new TelegramBotClient(botSecretToken);
_botClient.StartReceiving(
    HandleUpdateAsync,
    HandleErrorAsync,
    new ReceiverOptions
    {
        AllowedUpdates = [
            UpdateType.Message, 
            UpdateType.CallbackQuery 
            ]
    },
    _cts.Token
);


async Task HandleUpdateAsync(ITelegramBotClient botClient,
    Update update, CancellationToken cancellationToken)
{
    switch (update)
    {
        case { Message: { Text: { } messageText } message }:
            await ProcessMessageAsync(botClient,
                message.Chat.Id.ToString(),
                messageText,
                cancellationToken);
            break;
    }
}

And then the Telegram API will call the HandleUpdateAsync method when there is a new message to the bot. Note that you may actually get multiple (concurrent) messages, possibly from different chats, at the same time.

We’ll focus on the process message function, where we start by checking exactly who we are talking to:


async Task ProcessMessageAsync(ITelegramBotClient botClient,
    string chatId, string messageText, CancellationToken cancellationToken)
{
    using var session = _documentStore.OpenAsyncSession();

    var renter = await session.Query<Renter>()
        .FirstOrDefaultAsync(r => r.TelegramChatId == chatId,
            cancellationToken);

    if (renter == null)
    {
        await botClient.SendMessage(chatId,
            "Sorry, your Telegram account is not linked to a renter profile.",
            cancellationToken: cancellationToken
        );
        return;
    }
    var conversationId = $"chats/{chatId}/{DateTime.Today:yyyy-MM-dd}";
    // more code in the next snippet
}

Telegram uses the term chat ID in their API, but it is what I would call the renter’s ID. When we register renters, we also record their Telegram chat ID, which means that when we get a message from a user, we can check whether they are a valid renter in our system. If not, we fail early and are done.

If they are, this is where things start to get interesting. Look at the conversation ID that we generated in the last line. RavenDB uses the notion of a conversation with the agent to hold state. The conversation we create here means that the bot will use the same conversation with the user for the same day.

Another way to do that would be to keep the same conversation ID open for the same user. Since RavenDB will automatically handle summarizing and trimming the conversation, either option is fine and mostly depends on your scenario.

The next stage is to create the actual conversation. To do that, we need to provide the model with the right context it is looking for:


var renterUnits = await session.Query<Lease>()
    .Where(l => l.RenterIds.Contains(renter.Id!))
    .Select(l => l.UnitId)
    .ToListAsync(cts);


var conversation = _documentStore.AI.Conversation("property-agent",
    conversationId,
    new AiConversationCreationOptions
    {
        Parameters = new Dictionary<string, object?>
        {
            ["renterId"] = renter.Id,
            ["renterUnits"] = renterUnits,
            ["currentDate"] = DateTime.Today.ToString("yyyy-MM-dd")
        }
    });

You can see that we pass the renter ID and the relevant units for the renter to the model. Those form the creation parameters for the conversation and cannot be changed. That is one of the reasons why you may want to have a different conversation per day, to get the updated values if they changed.

With that done, we can run the conversation with the model and send the result back to the user, like so:


var result = await conversation.RunAsync<PropertyAgent.Reply>(cts);

var replyMarkup = new ReplyKeyboardMarkup(result.Answer.Followups
    .Select(text => new KeyboardButton(text))
    .ToArray())
{
    ResizeKeyboard = true,
    OneTimeKeyboard = true
};

await botClient.SendMessage(
    chatId,
    result.Answer.Answer,
    replyMarkup: replyMarkup,
    cancellationToken: cts);

The RunAsync() method handles the entire interaction with the model, and most of the code is just dealing with the reply markup for Telegram.

If you look closely at the chat screenshot above, you can see that we aren’t just asking the model questions; we also get the bot to perform actions. For example, paying the rent. Here is what this looks like:

How does this work?

Paying the rent through the bot

When we looked at the agent, we saw that we exposed some queries that the agent can run. But that isn’t the complete picture; we also give the model the ability to request actions. Here is what this looks like from the agent’s definition side:


Actions = [
    new AiAgentToolAction
    {
        Name = "ChargeCard",
        Description = """
Record a payment for one or more outstanding debts. The 
renter can pay multiple debt items in a single transaction. 
Can pay using any stored card on file.
""",
        ParametersSampleObject = JsonConvert.SerializeObject(new ChargeCardArgs
        {
            DebtItemIds = ["debtitems/1-A", "debtitems/2-A"],
            PaymentMethod = "Card",
            Card = "Last 4 digits of the card"
        })
    }
]

The idea here is that we expose to the model the kinds of actions it can request, and we specify what parameters it should pass to them, etc. What we are not doing here is giving the model control over actually running any code or modifying any data.

Instead, when the model needs to charge a card, it will have to call your code and go through validation, business logic, and authorization. Here is what this looks like on the other side. When we create a conversation, we specify handlers for all the actions we need to take, like so:


conversation.Handle<PropertyAgent.ChargeCardArgs>("ChargeCard", async args =>
{
    using var paySession = _documentStore.OpenAsyncSession();


    var renterWithCard = await paySession.LoadAsync<Renter>(renter.Id!, cts);
    var card = renterWithCard?.CreditCards
.FirstOrDefault(c => c.Last4Digits == args.Card);


    if (card == null)
    {
        throw new InvalidOperationException(
$"Card ending in {args.Card} not found in your profile.");
    }


    var totalPaid = await PaymentService.CreatePaymentForDebtsWithCardAsync(
        paySession,
        renter.Id!,
        args.DebtItemIds,
        card,
        args.PaymentMethod,
        cts);


    return $"Charged {totalPaid:C2} to {card.Type}" +
    $" ending in {card.Last4Digits}.";
});

Note that we do some basic validation, then we call the CreatePaymentForDebtsWithCardAsync() method to perform the actual operation. It is also fun that we can just return a message string to give the model an idea about what the result of the action is.

Inside CreatePaymentForDebtsWithCardAsync(), we also verify that the debts we are asked to pay are associated with the current renter; we may have to apply additional logic, etc. The concept is that we assume the model is not to be trusted, so we need to carefully validate the input and use our code to verify that everything is fine.
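
The body of CreatePaymentForDebtsWithCardAsync() isn’t shown in this post, but a minimal sketch of the kind of validation it performs might look like the following. The DebtItem and Payment classes here are assumptions based on the fields used elsewhere in the post; the real implementation is in the sample app:

public static class PaymentService
{
    public static async Task<decimal> CreatePaymentForDebtsWithCardAsync(
        IAsyncDocumentSession session, string renterId, string[] debtItemIds,
        CreditCard card, string paymentMethod, CancellationToken token)
    {
        decimal totalPaid = 0;
        foreach (var debtItemId in debtItemIds)
        {
            var debt = await session.LoadAsync<DebtItem>(debtItemId, token);

            // Never trust the model: only accept debts that belong to this renter
            if (debt == null || !debt.RenterIds.Contains(renterId))
                throw new InvalidOperationException(
                    $"Debt item {debtItemId} is not associated with this renter.");

            totalPaid += debt.AmountOutstanding;
            debt.AmountOutstanding = 0;     // assumed field, mirrors the debts query above
        }

        // Record the payment itself (hypothetical Payment document)
        await session.StoreAsync(new Payment
        {
            RenterId = renterId,
            DebtItemIds = debtItemIds,
            Method = paymentMethod,
            CardLast4 = card.Last4Digits,
            Amount = totalPaid,
            PaidAt = DateTime.UtcNow
        }, token);

        await session.SaveChangesAsync(token);
        return totalPaid;
    }
}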

Summary

This post has gone on for quite a while, so I think we’ll stop here. As a reminder, the PropertySphere sample application code is available. And if you are one of those who prefer videos to text, you can watch the video here.

In the next post, I’m going to show you how we can make the bot even smarter by adding visual recognition to the mix.

time to read 11 min | 2077 words

This post introduces the PropertySphere sample application. I’m going to talk about some aspects of the sample application in this post, then in the next one, we will introduce AI into the mix.

You can also watch me walk through the entire application in this video.

This is based on a real-world scenario from a customer. One of the nicest things about AI being so easy to use is that I can generate what is effectively throwaway code for a conversation with a customer, and still end up with a full-blown application.

The full code for the sample application is available on GitHub.

Here is the application dashboard, so you can get some idea about what this is all about:

The idea is that you have Properties (apartment buildings), which have Units (apartments), which you then Lease to Renters. Note the capitalized words in the last sentence; those are the key domain entities that we work with.
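
For orientation, here is a rough sketch of those entities, inferred from the code in this series of posts (the actual classes, with more fields, are in the GitHub repository):

public class Property
{
    public string Id { get; set; }
    public string Name { get; set; }
    public double Latitude { get; set; }   // used for the spatial dashboard queries
    public double Longitude { get; set; }
}

public class Unit
{
    public string Id { get; set; }
    public string UnitNumber { get; set; }
    // "Power" and "Water" utility readings are stored as time series on the document
}

public class Lease
{
    public string Id { get; set; }
    public string UnitId { get; set; }
    public List<string> RenterIds { get; set; }
}

public class Renter
{
    public string Id { get; set; }
    public string TelegramChatId { get; set; }  // used by the Telegram bot to link chats to renters
}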

Note that this dashboard shows quite a lot of data from many different places in the system. The map defines which properties we are looking at. It’s not just a static map; it is interactive. You can zoom in on a region to apply a spatial filter to the data in the dashboard.

Let’s take a closer look at what we are doing here. I’m primarily a backend guy, so I’m ignoring what the front end is doing to focus on the actual behavior of the system.

Here is what a typical endpoint will return for the dashboard:


[HttpGet("status/{status}")]
public IActionResult GetByStatus(string status, [FromQuery] string? boundsWkt)
{
    var docQuery = _session
.Query<ServiceRequests_ByStatusAndLocation.Result,
 ServiceRequests_ByStatusAndLocation>()
        .Where(x => x.Status == status)
        .OrderByDescending(x => x.OpenedAt)
        .Take(10);


    if (!string.IsNullOrWhiteSpace(boundsWkt))
    {
        docQuery = docQuery.Spatial(
x => x.Location, spatial => spatial.Within(boundsWkt));
    }


    var results = docQuery.Select(x => new
    {
        x.Id,
        x.Status,
        x.OpenedAt,
        x.UnitId,
        x.PropertyId,
        x.Type,
        x.Description,
        PropertyName = RavenQuery.Load<Property>(x.PropertyId).Name,
        UnitNumber = RavenQuery.Load<Unit>(x.UnitId).UnitNumber
    }).ToList();


    return Ok(results);
}

We use a static index (we’ll see exactly why in a bit) to query for all the service requests by status and location, and then we project data from the document, including related document properties (the last two lines in the Select call).

A ServiceRequest doesn’t have a location, it gets that from its associated Property, so during indexing, we pull that from the relevant Property, like so:


Map = requests =>
    from sr in requests
    let property = LoadDocument<Property>(sr.PropertyId)
    select new Result
    {
        Id = sr.Id,
        Status = sr.Status,
        OpenedAt = sr.OpenedAt,
        UnitId = sr.UnitId,
        PropertyId = sr.PropertyId,
        Type = sr.Type,
        Description = sr.Description,
        Location =  CreateSpatialField(property.Latitude, property.Longitude),
    };

You can see how we load the related Property and then index its location for the spatial query (last line).
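
For completeness, here is roughly how that map sits inside a static index class. This is a sketch following the standard AbstractIndexCreationTask pattern; the Result fields mirror the projection used in the controller above, and the actual index class is in the sample app:

public class ServiceRequests_ByStatusAndLocation
    : AbstractIndexCreationTask<ServiceRequest, ServiceRequests_ByStatusAndLocation.Result>
{
    public class Result
    {
        public string Id { get; set; }
        public string Status { get; set; }
        public DateTime OpenedAt { get; set; }
        public string UnitId { get; set; }
        public string PropertyId { get; set; }
        public string Type { get; set; }
        public string Description { get; set; }
        public object Location { get; set; }  // spatial field created in the map
    }

    public ServiceRequests_ByStatusAndLocation()
    {
        // The map shown above: load the related Property and
        // index its coordinates as a spatial field
        Map = requests =>
            from sr in requests
            let property = LoadDocument<Property>(sr.PropertyId)
            select new Result
            {
                Id = sr.Id,
                Status = sr.Status,
                OpenedAt = sr.OpenedAt,
                UnitId = sr.UnitId,
                PropertyId = sr.PropertyId,
                Type = sr.Type,
                Description = sr.Description,
                Location = CreateSpatialField(property.Latitude, property.Longitude),
            };
    }
}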

You can see more interesting features when you drill down to the Unit page, where both its current status and its utility usage are displayed. That is handled using RavenDB’s time series feature, and then projected to a nice view on the frontend:

In the backend, this is handled using the following action call:


[HttpGet("unit/{*unitId}")]
public IActionResult GetUtilityUsage(string unitId, 
[FromQuery] DateTime? from, [FromQuery] DateTime? to)
{
var unit = _session.Load<Unit>(unitId);
if (unit == null)
    return NotFound("Unit not found");


var fromDate = from ?? DateTime.Today.AddMonths(-3);
var toDate = to ?? DateTime.Today;


var result = _session.Query<Unit>()
    .Where(u => u.Id == unitId)
    .Select(u => new
    {
        PowerUsage = RavenQuery.TimeSeries(u, "Power")
            .Where(ts => ts.Timestamp >= fromDate && ts.Timestamp <= toDate)
            .GroupBy(g => g.Hours(1))
            .Select(g => g.Sum())
            .ToList(),
        WaterUsage = RavenQuery.TimeSeries(u, "Water")
            .Where(ts => ts.Timestamp >= fromDate && ts.Timestamp <= toDate)
            .GroupBy(g => g.Hours(1))
            .Select(g => g.Sum())
            .ToList()
    })
    .FirstOrDefault();


return Ok(new
{
    UnitId = unitId,
    UnitNumber = unit.UnitNumber,
    From = fromDate,
    To = toDate,
    PowerUsage = result?.PowerUsage?.Results?
.Select(r => new UsageDataPoint
    {
        Timestamp = r.From,
        Value = r.Sum[0],
    }).ToList() ?? new List<UsageDataPoint>(),
    WaterUsage = result?.WaterUsage?.Results?
.Select(r => new UsageDataPoint
    {
        Timestamp = r.From,
        Value = r.Sum[0],
    }).ToList() ?? new List<UsageDataPoint>()
});

As you can see, we run a single query to fetch data from multiple time series, which allows us to render this page.
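
For reference, the UsageDataPoint DTO used in the projection above is just a timestamp/value pair (a sketch inferred from the code; the actual class is in the sample app):

public class UsageDataPoint
{
    public DateTime Timestamp { get; set; }  // start of the aggregated bucket
    public double Value { get; set; }        // summed usage for that bucket
}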

By now, I think you have a pretty good grasp of what the application is about. So get ready for the next post, where I will talk about how to add AI capabilities to the mix.

time to read 4 min | 763 words

We are a database company, and many of our customers and users are running in the cloud. Fairly often, we field questions about the recommended deployment pattern for RavenDB.

Given the… rich landscape of DevOps options, RavenDB supports all sorts of deployment models:

  • Embedded in your application
  • Physical hardware (from a Raspberry Pi to massive servers)
  • Virtual machines in the cloud
  • Docker
  • AWS / Azure marketplaces
  • Kubernetes
  • Ansible
  • Terraform

As well as some pretty fancy permutations of the above in every shape and form.

With so many choices, the question is: what do you recommend? In particular, we were recently asked about deployment to a “naked machine” in the cloud versus using Kubernetes. The core requirements are to ensure high performance and high availability.

Our short answer is almost always: Best to go with direct VMs and skip Kubernetes for RavenDB.

While Kubernetes has revolutionized the deployment of stateless microservices, deploying stateful applications, particularly databases, on K8s introduces significant complexities that often outweigh the benefits, especially when performance and operational simplicity are paramount.

A great quote in the DevOps world is “cattle, not pets”, in reference to how you should manage your servers. That works great if you are dealing with stateless services. But when it comes to data management, your databases are cherished pets, and you should treat them as such.

The Operational Complexity of Kubernetes for Stateful Systems

Using an orchestration layer like Kubernetes complicates the operational management of persistent state. While K8s provides tools for stateful workloads, they require a deep understanding of storage classes, Persistent Volumes (PVs), and Persistent Volume Claims (PVCs).

Consider a common, simple maintenance task: Changing a VM's disk type or size.

As a VM, this is typically a very easy operation and can be done with no downtime. The process is straightforward, well-documented, and often takes minutes.

For K8s, this becomes a significantly more complex task. You have to go deep into Kubernetes storage primitives to figure out how to properly migrate the data to a new disk specification.

There is an allowVolumeExpansion: true option that should make it work, but the details matter, and for databases, that is usually something DBAs are really careful about.

Databases tend to care about their disk. So what happens if we don’t want to just change the size of the disk, but also its type? Such as moving from Standard to Premium. Doing that using VMs is as simple as changing the size. You may need to detach, change, and reattach the disk, but that is a well-trodden path.

In Kubernetes, you need to run a migration, delete the StatefulSets, make the configuration change, and reapply (crossing your fingers and hoping everything works).

Database nodes are not homogeneous

Databases running in a cluster configuration often require granular control over node upgrades and maintenance. I may want to designate a node as “this one is doing backups”, so it needs a bigger disk. Easy to do if each node is a dedicated VM, but much harder in practice inside K8s.

A recent example we ran into is controlling the upgrade process of a cluster. As any database administrator can tell you, upgrades are something you approach cautiously. RavenDB has great support for running cross-version clusters.

In other words, take a node in your cluster, upgrade that to an updated version (including across major versions!), and it will just work. That allows you to dip your toes into the waters with a single node, instead of doing a hard switch to the new version.

In a VM environment: Upgrading a single node in a RavenDB cluster is a simple, controlled process. You stop the database on the VM, perform the upgrade (often just replacing binaries), start the database, and allow the cluster to heal and synchronize. This allows you to manage the cluster's rolling upgrades with precision.

In K8s: Performing a targeted upgrade on just one node of the cluster is hard. The K8s deployment model (StatefulSets) is designed to manage homogeneous replicas. While you can use features like "on delete" update strategy, blue/green deployments, or canary releases, they add layers of abstraction and complexity that are necessary for stateless services but actively harmful for stateful systems.

Summary

For mission-critical database infrastructure where high performance, high availability, and operational simplicity are non-negotiable, the added layer of abstraction introduced by Kubernetes for managing persistence often introduces more friction than value.

While Kubernetes is an excellent platform for stateless services, we strongly recommend deploying RavenDB directly on dedicated Virtual Machines. This provides a cleaner operational surface, simpler maintenance procedures, and more direct control over the underlying resources—all critical factors for a stateful, high-performance database cluster.

Remember, your database nodes are cherished pets, don’t make them sleep in the barn with the cattle.

time to read 5 min | 849 words

I ran into this tweet from about a month ago:

dax @thdxr

programmers have a dumb chip on their shoulder that makes them try and emulate traditional engineering there is zero physical cost to iteration in software - can delete and start over, can live patch our approach should look a lot different than people who build bridges

I have to say that I would strongly disagree with this statement. Using the building example, it is obvious that moving a window in an already built house is expensive. Obviously, it is going to be cheaper to move this window during the planning phase.

The answer is that it may be cheaper, but it won’t necessarily be cheap. Let’s say that I want to move the window by 50 cm to the right. Would it be up to code? Is there any wiring that needs to be moved? Do I need to consider the placement of the air conditioning unit? What about the emergency escape? Any structural impact?

This is when we are at the blueprint stage - the equivalent of editing code on screen. And it is obvious that such changes can be really expensive. Similarly, in software, every modification demands a careful assessment of the existing system, long-term maintenance, compatibility with other components, and user expectations. This intricate balancing act is at the core of the engineering discipline.

A civil engineer designing a bridge faces tangible constraints: the physical world, regulations, budget limitations, and environmental factors like wind, weather, and earthquakes. While software designers might not grapple with physical forces, they contend with equally critical elements such as disk usage, data distribution, rules & regulations, system usability, operational procedures, and the impact of expected future changes.

Evolving an existing software system presents a substantial engineering challenge. Making significant modifications without causing the system to collapse requires careful planning and execution. The notion that one can simply "start over" or "live deploy" changes is incredibly risky. History is replete with examples of major worldwide outages stemming from seemingly simple configuration changes. A notable instance is the Google outage of June 2025, where a simple missing null check brought down significant portions of GCP. Even small alterations can have cascading and catastrophic effects.

I’m currently working on a codebase whose age is near the legal drinking age. It also has close to 1.5 million lines of code and a big team operating on it. Being able to successfully run, maintain, and extend that over time requires discipline.

In such a project, you face issues such as different versions of the software deployed in the field, backward compatibility concerns, etc. For example, I may have a better idea of how to structure the data to make a particular scenario more efficient. That would require updating the on-disk data, which is a 100% engineering challenge. We have to take into consideration physical constraints (updating a multi-TB dataset without downtime is a tough challenge).

The moment you are actually deployed, you have so many additional concerns to deal with. A good example of this may be that users are used to stuff working in a certain way. But even for software that hasn’t been deployed to production yet, the cost of change is high.

Consider the effort associated with this update to a JobApplication class:
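
The screenshot of the change isn’t reproduced here, but judging by the list that follows, it is presumably along these lines (a purely hypothetical sketch, with made-up field names):

public class JobApplication
{
    public string Id { get; set; }
    public string ApplicantId { get; set; }
    public string Status { get; set; }

    // Before: a job application targeted a single position
    // public string PositionId { get; set; }

    // After: a job application can target multiple positions
    public List<string> PositionIds { get; set; }
}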

This looks like a simple change, right? It just requires that you (partial list):

  • Set up database migration for the new shape of the data.
  • Migrate the existing data to the new format.
  • Update any indexes and queries on the position.
  • Update any endpoints and decide how to deal with backward compatibility.
  • Create a new user interface to match this whenever we create/edit/view the job application.
  • Consider any existing workflows that inherently assume that a job application is for a single position.
  • Can you be partially rejected? What is your status if you interviewed for one position but received an offer for another?
  • How does this affect the reports & dashboard?

This is a simple change, no? Just a few characters on the screen. No physical cost. But it is also a full-blown Epic Task for the project - even if we aren’t in production yet, have no data to migrate, and have no integrations to deal with.

Software engineers operate under constraints similar to other engineers, including severe consequences for mistakes (global system failure because of a missing null check). Making changes to large, established codebases presents a significant hurdle.

The moment that you need to consider more than a single factor, whether in your code or in a bridge blueprint, there is a pretty high cost to iterations. Going back to the bridge example, the architect may have a rough idea (is it going to be a Roman-style arch bridge or a suspension bridge) and have a lot of freedom to play with various options at the start. But the moment you begin to nail things down and fill in the details, the cost of change escalates quickly.

Finally, just to be clear, I don’t think that the cost of changing software is equivalent to changing a bridge after it was built. I simply very strongly disagree that there is zero cost (or indeed, even low cost) to changing software once you are past the “rough draft” stage.

time to read 12 min | 2301 words

Hiring the right people is notoriously difficult. I have been personally involved in hiring decisions for about two decades, and it is an unpleasant process. You deal with an utterly overwhelming influx of applications, often from candidates using the “spray and pray” approach of applying to all jobs.

At one point, I got the resume of a divorce lawyer in response to a job posting for a backend engineer role. I was curious enough to follow up on that, and no, that lawyer didn’t want to change careers. He was interested in being a divorce lawyer. What kind of clients would want their divorce handled by a database company, I refrained from asking.

Companies often resort to expensive external agencies to sift through countless candidates.

In the age of AI and LLMs, is that still the case? This post will demonstrate how to build an intelligent candidate screening process using RavenDB and modern AI, enabling you to efficiently accept applications, match them to appropriate job postings, and make an initial go/no-go decision for your recruitment pipeline.

We’ll start our process by defining a couple of open positions:

  • Staff Engineer, Backend & DevOps
  • Senior Frontend Engineer (React/TypeScript/SaaS)

Here is what this looks like at the database level:

Now, let’s create a couple of applicants for those positions. We have James & Michael, and they look like this:

Note that we are not actually doing a lot here in terms of the data we ask the applicant to provide. We mostly gather the contact information and ask them to attach their resume. You can see the resume attachment in RavenDB Studio. In the above screenshot, it is in the right-hand Attachments pane of the document view.

Now we can use RavenDB’s new Gen AI attachments feature. I defined an OpenAI connection with gpt-4.1-mini and created a Gen AI task to read & understand the resume. I’m assuming that you’ve read my post about Gen AI in RavenDB, so I’ll skip going over the actual setup.

The key is that I’m applying the following context extraction script to the Applicants collection:


const resumePdf = loadAttachment("resume.pdf");
if(!resumePdf) return;


ai.genContext({name: this.applicantName})
    .withPdf(resumePdf);

When I test this script on James’s document, I get:

Note that we have the attachment in the bottom right - that will also be provided to the model. So we can now write the following prompt for the model:


You are an HR data parsing specialist. Your task is to analyze the provided CV/resume content (from the PDF) 
and extract the candidate's professional profile into the provided JSON schema.
In the requiredTechnologies object, every value within the arrays (languages, frameworks_libraries, etc.) must be a single, 
distinct technology or concept. Do not use slashes (/), commas, semicolons, or parentheses () to combine items within a single string. Separate combined concepts into individual strings (e.g., "Ruby/Rails" becomes "Ruby", "Rails").

We also ask the model to respond with an object matching the following sample:


{
  "location": "The primary location or if interested in remote option (e.g., Pasadena, CA or Remote)",
  "summary": "A concise overview of the candidate's history and key focus areas (e.g., Lead development of data-driven SaaS applications focusing on React, TypeScript, and Usability).",
  "coreResponsibilities": [
    "A list of the primary duties and contributions in previous roles."
  ],
  "requiredTechnologies": {
    "languages": [
      "Key programming and markup languages that the candidate has experience with."
    ],
    "frameworks_libraries": [
      "Essential UI, state management, testing, and styling libraries."
    ],
    "tools_platforms": [
      "Version control, cloud platforms, build tools, and project management systems."
    ],
    "data_storage": [
      "The database technologies the candidate is expected to work with."
    ]
  }
}

Testing this on James’s applicant document results in the following output:

I actually had to check where the model got the “LA Canada” issue. That shows up in the real resume PDF, and it is a real place. I triple-checked, because I was sure this was a hallucination at first ☺️.

The last thing we need to do is actually deal with the model’s output. We use an update script to apply the model’s output to the document. In this case, it is as simple as just storing it in the source document:


this.resume = $output;

And here is what the output looks like:

Reminder: Gen AI tasks in RavenDB use a three-stage approach:

  • Context extraction script - gets data (and attachment) from the source document to provide to the model.
  • Prompt & Schema - instructions for the model, telling it what it should do with the provided context and how it should format the output.
  • Update script - takes the structured output from the model and applies it back to the source document.

In our case, this process starts with the applicant uploading their CV, and then we have the Read Resume task running. This parses the PDF and puts the result in the document, which is great, but it is only part of the process.

We now have the resume contents in a structured format, but we need to evaluate the candidate’s suitability for all the positions they applied for. We are going to do that using the model again, with a new Gen AI task.

We start by defining the following context extraction script:


// wait until the resume (parsed CV) has been added to the document
if (!this.resume) return; 


for(const positionId of this.targetPosition) {
    const position = load(positionId);
    if(!position) continue;
    ai.genContext({
        position,
        positionId,
        resume: this.resume
    })
}

Note that this relies on the resume field that we created in the previous task. In other words, we set things up in such a way that we run this task after the Read Resume task, but without needing to put them in an explicit pipeline or manage their execution order.

Next, note that we output multiple contexts for the same document. Here is what this looks like for James; we have two separate contexts, one for each position he applied for:

This is important because we want to process each position and resume independently. This avoids context leakage from one position to another. It also lets us process multiple positions for the same applicant concurrently.

Now, we need to tell the model what it is supposed to do:


You are a specialized HR Matching AI. Your task is to receive two structured JSON objects — one describing a Job Position and one 
summarizing a Candidate Resume — and evaluate the suitability of the resume for the position.


Assess the overlap in jobTitle, summary, and coreResponsibilities. Does the candidate's career trajectory align with the role's needs (e.g., has matching experience required for a Senior Frontend role)?
Technical Match: Compare the technologies listed in the requiredTechnologies sections. Identify both direct matches (must-haves) and gaps (missing or weak areas). Consider substitutions such as js or ecmascript to javascript or node.js. 


Evaluate if the candidate's experience level and domain expertise (e.g., SaaS, Data Analytics, Mapping Solutions) meet or exceed the requirements.

And the output schema that we want to get from the model is:


{
  "explanation": "Provide a detailed analysis here. Start by confirming the high-level match (e.g., 'The candidate is an excellent match because...'). Detail the strongest technical overlaps (e.g., React, TypeScript, Redux, experience with BI/SaaS). Note any minor mismatches or significant overqualifications (e.g., candidate's deep experience in older technologies like ASP.NET classic is not required but demonstrates full-stack versatility).",
  "isSuitable": false
}

Here I want to stop for a moment and talk about what exactly we are doing here. We could ask the model just to judge whether an applicant is suitable for a position and save a bit on the number of tokens we spend. However, getting just a yes/no response from the model is not something I recommend.

There are two primary reasons why we want the explanation field as well. First, it serves as a good check on the model itself. The order of properties matters in the output schema. We first ask the model to explain itself, then to render the verdict. That means it is going to be more focused.

The other reason is a bit more delicate. You may be required to provide an explanation to the applicant if you reject them. I won’t necessarily put this exact justification in the rejection letter to the applicant, but it is something that is quite important to retain in case you need to provide it later.

Going back to the task itself, we have the following update script:


this.suitability = this.suitability || {};
this.suitability[$input.positionId] = $output;

Here we are doing something quite interesting. We extracted the positionId at the start of this process, and we are using it to associate the output from the model with the specific position we are currently evaluating.

Note that we are actually evaluating multiple positions for the same applicant at the same time, and we need to execute this update script for each of them. So we need to ensure that we don’t overwrite previous work.

I’m not mentioning this in detail because I covered it in my previous Gen AI post, but it is important to note that we have two tasks sourced from the same document. RavenDB knows how to handle the data being modified by both tasks without triggering an infinite loop. It seems like a small thing, but it is the sort of thing that not having to worry about really simplifies the whole process.

With these two tasks, we have now set up a complete pipeline for the initial processing of applicants to open positions. As you can see here:

This sort of process allows you to integrate into your system stuff that, until recently, looked like science fiction. A pipeline like the one above is not something you could just build before, but now you can spend a few hours and have this capability ready to deploy.

Here is what the tasks look like inside RavenDB:

And the final applicant document after all of them have run is:

You can see the metadata for the two tasks (which we use to avoid going to the model again when we don’t have to), as well as the actual outputs of the model (resume, suitability fields).

A few more notes before we close this post. I chose to use two GenAI tasks here, one to read the resume and generate the structured output, and the second to actually evaluate the applicant’s suitability.

From a modeling perspective, it is easier to split this into distinct steps. You can ask the model to both read the resume and evaluate suitability in a single shot, but I find that it makes it harder to extend the system down the line.

Another reason you want to have different tasks for this is that you can use different models for each one. For example, reading the resume and extracting the structured output is something you can run on gpt-4.1-mini or gpt-5-nano, while evaluating applicant suitability can make use of a smarter model.

I’m really happy with the new RavenDB AI integration features. We got some early feedback that is really exciting, and I’m looking forward to seeing what you can do with them.

time to read 2 min | 291 words

You might have noticed a theme going on in RavenDB. We care a lot about performance. The problem with optimizing performance is that sometimes you have a great idea, you implement it, the performance gains are there to be had - and then a test fails… and you realize that your great idea now needs to be 10 times more complex to handle a niche edge case.

We did a lot of work around optimizing the performance of RavenDB at the lowest levels for the next major release (8.0), and we got a persistently failing test that we started to look at.

Here is the failing message:

Restore with MaxReadOpsPerSecond = 1 should take more than '11' seconds, but it took '00:00:09.9628728'

The test in question is ShouldRespect_Option_MaxReadOpsPerSec_OnRestore, part of the MaxReadOpsPerSecOptionTests suite of tests. What it tests is that we can limit how fast RavenDB can restore a database.

The reason you want to do that is to avoid consuming too many system resources when performing a big operation. For example, I may want to restore a big database, but I don’t want to consume all the IOPS on the server, because there are additional databases running on it.

At any rate, we started to get test failures on this test. And a deeper investigation revealed something quite amusing. We made the entire system more efficient. In particular, we managed to reduce the size of the buffers used significantly, so we can push more data faster. It turns out that this is enough to break the test.

The fix was to reduce the actual time that we budget as the minimum viable time. And I have to say that this is one of those pull requests that lights a warm fire in my heart.

time to read 7 min | 1290 words

I got a question from one of our users about how they can use RavenDB to manage scheduled tasks. Stuff like: “Send this email next Thursday” or “Cancel this reservation if the user didn’t pay within 30 minutes.”

As you can tell from the context, this is both more straightforward and more complex than the “run this every 2nd Wednesday” you’ll typically encounter when talking about scheduled jobs.

The answer for how to do that in RavenDB is pretty simple, you use the Document Refresh feature. This is a really tiny feature when you consider what it does. Given this document:


{
   "Redacted": "Details",
   "@metdata": {
      "@collection": "RoomAvailabilities",
      "@refresh": "2025-09-14T10:00:00.0000000Z"
   }
}

RavenDB will remove the @refresh metadata field at the specified time. That is all this does, nothing else. That looks like a pretty useless feature, I admit, but there is a point to it.

The act of removing the @refresh field from the document will also (obviously) update the document, which means that everything that reacts to a document update will also react to this.
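
From the client side, scheduling something is just a metadata write on the session. Here is a minimal sketch; the RoomAvailability class and its field values are assumptions based on the index below, while the "@refresh" metadata key is the actual mechanism:

using var session = store.OpenAsyncSession();

// Hold the room for 15 minutes
var reservation = new RoomAvailability
{
    RoomId = "rooms/1-A",
    Date = DateTime.Today
};
await session.StoreAsync(reservation);

// RavenDB will strip this field (and thus touch the document) when the time is up
var metadata = session.Advanced.GetMetadataFor(reservation);
metadata["@refresh"] = DateTime.UtcNow.AddMinutes(15).ToString("O");

await session.SaveChangesAsync();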

I wrote about this in the past, but it turns out there are a lot of interesting things you can do with this. For example, consider the following index definition:


from RoomAvailabilities as r
where true and not exists(r."@metadata"."@refresh")
select new { 
  r.RoomId,
  r.Date,
  // etc...
}

What you see here is an index that lets me “hide” documents (that were reserved) until that reservation expires.

I can do quite a lot with this feature. For example, I can use it in a RabbitMQ ETL task to build automatic delayed sending of documents. Let’s implement a “dead-man switch”: a document will be automatically sent to a RabbitMQ channel if a server doesn’t contact us often enough:


if (this['@metadata']["@refresh"]) 
    return; // no need to send if refresh didn't expire


var alertData = {
    Id: id(this),
    ServerId: this.ServerId,
    LastUpdate: this.Timestamp,
    LastStatus: this.Status || 'ACTIVE'
};


loadToAlertExchange(alertData, 'alert.operations', {
    Id: id(this),
    Type: 'operations.alerts.missing_heartbeat',
    Source: '/operations/server-down/no-heartbeat'
});

The idea is that whenever a server contacts us, we’ll update the @refresh field to the maximum duration we are willing to miss updates from the server. If that time expires, RavenDB will remove the @refresh field, and the RabbitMQ ETL script will send an alert to the RabbitMQ exchange. You’ll note that this is actually reacting to inaction, which is a surprisingly hard thing to actually do, usually.
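
The heartbeat side isn’t shown here, but a minimal sketch could look like this. The ServerHeartbeat class, the document ID convention, and the five-minute threshold are all assumptions; the important part is pushing @refresh forward on every contact:

using var session = store.OpenAsyncSession();

var docId = "heartbeats/" + serverId;
var heartbeat = await session.LoadAsync<ServerHeartbeat>(docId)
    ?? new ServerHeartbeat { ServerId = serverId };

heartbeat.Timestamp = DateTime.UtcNow;
heartbeat.Status = "ACTIVE";
await session.StoreAsync(heartbeat, docId);

// If the server stays silent for 5 minutes, RavenDB removes @refresh
// and the RabbitMQ ETL script above sends the alert
var metadata = session.Advanced.GetMetadataFor(heartbeat);
metadata["@refresh"] = DateTime.UtcNow.AddMinutes(5).ToString("O");

await session.SaveChangesAsync();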

You’ll notice that, like many things in RavenDB, most features tend to be small and focused. The idea is that they compose well together and let you build the behavior you need with a very low complexity threshold.

The common use case for @refresh is when you use RavenDB Data Subscriptions to process documents. For example, you want to send an email in a week. This is done by writing an EmailToSend document with a @refresh of a week from now and defining a subscription with the following query:


from EmailToSend as e
where true and not exists(e.'@metadata'.'@refresh')

In other words, we simply filter out those that have a @refresh field, it’s that simple. Then, in your code, you can ignore the scheduling aspect entirely. Here is what this looks like:


var subscription = store.Subscriptions
    .GetSubscriptionWorker<EmailToSend>("EmailToSendSubscription");


await subscription.Run(async batch =>
{
    using var session = batch.OpenAsyncSession();
    foreach (var item in batch.Items)
    {
        var email = item.Result;
        await EmailProvider.SendEmailAsync(new EmailMessage
        {
            To = email.To,
            Subject = email.Subject,
            Body = email.Body,
            From = "no-reply@example.com"
        });


        email.Status = "Sent";
        email.SentAt = DateTime.UtcNow;
    }
    await session.SaveChangesAsync();
});

Note that nothing in this code handles scheduling. RavenDB is in charge of sending the documents to the subscription when the time expires.
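
For completeness, here is a sketch of the scheduling side that the worker never sees: creating the subscription (a one-time operation) and writing an EmailToSend document with a @refresh date a week out. The EmailToSend fields here are illustrative:


// One-time setup: create the subscription with the filtering query shown above:
await store.Subscriptions.CreateAsync(new SubscriptionCreationOptions
{
    Name = "EmailToSendSubscription",
    Query = @"from EmailToSend as e
              where true and not exists(e.'@metadata'.'@refresh')"
});

// Scheduling an email is just writing a document with a @refresh date:
using var session = store.OpenAsyncSession();
var email = new EmailToSend
{
    To = "renter@example.com",
    Subject = "Weekly follow-up",
    Body = "Just checking in..."
};
await session.StoreAsync(email);

// Hide the document from the subscription until a week from now:
session.Advanced.GetMetadataFor(email)["@refresh"] =
    DateTime.UtcNow.AddDays(7).ToString("o");

await session.SaveChangesAsync();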

Using @refresh + Subscriptions in this manner provides us with a number of interesting advantages:

  • Missed Triggers: Handles missed schedules seamlessly, resuming on the next subscription run.
  • Reliability: Automatically retries subscription processing on errors.
  • Rescheduling: When @refresh expires, your subscription worker will get the document and can decide to act or reschedule a check by updating the @refresh field again.
  • Robustness: You can rely on RavenDB to keep serving subscriptions even if nodes (both clients & servers) fail.
  • Scaleout: You can use concurrent subscriptions to have multiple workers read from the same subscription.

You can take this approach really far, in terms of load, throughput, and complexity. The nice thing about this setup is that you don’t need to glue together cron, a message queue, and worker management. You can let RavenDB handle it all for you.

time to read 7 min | 1278 words

Since version 7.0, RavenDB has native support for vector search. One of my favorite queries ever since has been this one:


$query = 'Italian food'
from "Products" 
where vector.search(embedding.text(Name), $query)
limit 5

If you run that on the sample database for RavenDB (Northwind), you’ll get the following results:

  • Mozzarella di Giovanni
  • Ravioli Angelo
  • Chef Anton's Gumbo Mix
  • Mascarpone Fabioli
  • Chef Anton's Cajun Seasoning

I think we can safely state that the first two are closely related to Italian food, but the last three? What is that about?

The query above is using a pretty simple embedding model (bge-micro-v2 with 384 dimensions), so there is a limit to how sophisticated it can get.

I defined an index using OpenAI’s text-embedding-3-small model with 1,536 dimensions. Here is the index in question:


from p in docs.Products
select new
{
    NameVector = LoadVector("Name", "products-names")
}

And here is the query:


$query = 'Italian food'


from index 'Products/SemanticSearch'
where vector.search(NameVector, $query)
limit 5

The results we got are much better, indeed:

  • Ravioli Angelo
  • Mozzarella di Giovanni
  • Gnocchi di nonna Alice
  • Gorgonzola Telino
  • Original Frankfurter grüne Soße

However… that last result looks very much like a German sausage, not really a hallmark of the Italian kitchen. What is going on?

Vector search is also known as semantic search, and it gets you the closest items in vector space to what you were looking for. Leaving aside the quality of the embeddings model we use, we’ll find anything that is close. But what if we don’t have anything close enough?

For example, what will happen if I search for something that is completely unrelated to the data I have?


$query = 'Giant leap for man'

Remember that vector search finds the nearest matching elements. In this case, here are the results:

  • Sasquatch Ale
  • Vegie-spread
  • Chang
  • Maxilaku
  • Laughing Lumberjack Lager

I think we can safely agree that this isn’t really that great a result. It isn’t the fault of the vector search, by the way. You can define a minimum similarity threshold, but… those are usually fairly arbitrary.

I want to find “Ravioli” when I search for “Italian food”, but that has a score of 0.464, while the score of “Sasquatch Ale” from “Giant leap for man” is 0.267.
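
For reference, a similarity floor is just another argument on the query. Here is a sketch, assuming the optional minimum-similarity value goes right after the search term (the 0.4 cutoff is arbitrary):


$query = 'Giant leap for man'

from index 'Products/SemanticSearch'
where vector.search(NameVector, $query, 0.4)
limit 5

A cutoff between 0.267 and 0.464 happens to separate these two examples, but any fixed number you pick will be wrong for some other query.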

We need to add some intelligence into the mix, and luckily we can do that in RavenDB with the help of AI Agents. In this case, we aren’t going to build a traditional chatbot, but rely on the model to give us good results.

Here is the full agent definition, in C#:


var agent = new AiAgentConfiguration
{
    Name = "Search Agent",
    Identifier = "search-agent",
    ConnectionStringName = "OpenAI-Orders-ConStr",
    Parameters = [],
    SystemPrompt = @"
Your task is to act as a **product re-ranking agent** for a product
catalog. Your goal is to provide the user with the most relevant and
accurate product results based on their search query.


### Instructions


1.  **Analyze the User Query:** Carefully evaluate the user's
    request, identifying key product attributes, types, and intent.
2.  **Execute Search:** Use the `Search` query tool to perform a
    semantic search on the product catalog. Formulate effective and
    concise search terms derived from the user's query to maximize the
    initial retrieval of relevant products.
3.  **Re-rank Results:** For each product returned by the `Search`
    function, analyze its features (e.g., title, description,
    specifications) and compare them against the user's original
    query. Re-order the list of products from most to least
    relevant. **Skip any products that are not a good match for
    the user's request, regardless of their initial search score.**
4.  **Finalize & Present:** Output the re-ranked list of products,
    ensuring the top results are the most likely to satisfy the
    user's request.
",
    SampleObject = JsonConvert.SerializeObject(new
    {
        Products = new[]
        {
            new { Id = "The Product ID", Name = "The Product Name"}
        }
    }),
    Queries = [new AiAgentToolQuery
    {
        Name = "Search",
        Description = "Searches the product catalog for matches to the terms",
        ParametersSampleObject = JsonConvert.SerializeObject(new
        {
            query = "The terms to search for"
        }),
        Query = @"from index 'Products/SemanticSearch'
where vector.search(NameVector, $query)
select Name
limit 10"
    }],
};

Assuming that you are not familiar with AI Agent definitions in RavenDB, here is what is going on:

  • We configure the agent to use the OpenAI-Orders-ConStr (which uses the gpt-4.1-mini model) and specify no intrinsic parameters, since we only perform searches in the public product catalog.
  • We tell the agent what it is tasked with doing. You’ll note that the system prompt is the most complex aspect here. (In this case, I asked the model to generate a good prompt for me from the initial idea).
  • Then we define (using a sample object) how the results should be formatted.
  • Finally, we define the query that the model can call to get results from the product catalog.

With all of that in hand, we can now perform the actual search. Here is how it looks when we run it from the RavenDB Studio:

You can see that it invoked the Search tool to run the query, and then it evaluated the results to return the most relevant ones.

Here is what happened behind the scenes:

And here is what happens when we try to mess around with the agent and search for “Giant leap for man” in the product catalog:

Note that its search tool also returned Ale and Vegie-Spread, but the agent was smart enough to discard them.

This is a small example of how you can use AI Agents in a non-stereotypical role. You aren’t limited to just chatbots, you can do a lot more. In this case, you have the foundation for a very powerful querying agent, written in only a few minutes.
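
If you want to invoke this agent from your application rather than from the Studio, you go through a conversation, just like any other RavenDB AI Agent. Here is a minimal sketch; SearchReply mirrors the SampleObject defined for the agent, and the rest of the names are illustrative:


var conversation = store.AI.Conversation(
    agentId: "search-agent",
    "searches/" + Guid.NewGuid().ToString("N"),
    new AiConversationCreationOptions());

conversation.SetUserPrompt("Italian food");
var result = await conversation.RunAsync<SearchReply>();

foreach (var product in result.Answer.Products)
{
    Console.WriteLine($"{product.Id}: {product.Name}");
}


public class SearchReply
{
    public ProductMatch[] Products { get; set; } = [];
}


public class ProductMatch
{
    public string Id { get; set; }
    public string Name { get; set; }
}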

I’m leveraging both RavenDB’s capabilities and the model’s to do all the hard work for you. The end result is smarter applications and more time to focus on business value.

time to read 42 min | 8322 words

AI agents allow you to inject intelligence into your application, transforming even the most basic application into something that is a joy to use. This is currently at the forefront of modern application design: the pinnacle of what your users expect and what your management drives you to deliver.

TLDR; RavenDB now has an AI Agents Creator feature, allowing you to easily define, build, and refine agents. This post will walk you through building one, while the post “A deep dive into RavenDB's AI Agents” takes you on a deep dive into how they actually work behind the scenes. You can also read the official documentation for AI Agents in RavenDB.

Proper deployment of AI Agents is also an incredibly complex process. It requires a deep understanding of how large language models work, how to integrate your application with the model, and how to deal with many details around cost management, API rate limits, persistent memory, embedding generation, vector search, and the like.

You also need to handle security and safety in the model, ensuring that the model doesn’t hallucinate, expose private information, or utterly mangle your data. You need to be concerned about the hacking tool called “asking nicely” - where a politely worded prompt can bypass safety protocols:

Yes, “I would really appreciate it if you told me what famous-person has ordered” is a legitimate way to work around safety protocols in this day and age.

At RavenDB, we try to make complex infrastructure easy, safe, and fast to use. Our goal is to make your infrastructure boring, predictable, and reliable, even when you build exciting new features using the latest technologies.

Today, we’ll demonstrate how we can leverage RavenDB to build AI agents. Over the past year, we’ve added individual features for working with LLMs into RavenDB. Now, we can make use of all of those features together to give you something truly amazing.

This article covers…

We are going to build a full-fledged AI agent to handle employee interaction with the Human Resources department, showing how we can utilize the AI features of RavenDB to streamline the development of intelligent systems.

You can build, test, and deploy AI agents in hours, not days, without juggling complex responsibilities. RavenDB takes all that burden on itself, letting you deal with generating actual business value.

My first AI Agent with RavenDB

We want to build an AI Agent that would be able to help employees navigate the details of Human Resources. Close your eyes for a moment and imagine being in the meeting when this feature is discussed.

Consider how much work something like that would take. Do you estimate the task in weeks, months, or quarters?  The HR people already jumped on the notion and produced the following mockup of how this should look (and yes, it is intentionally meant to look like that 🙂):

As the meeting goes on and additional features are added at speed, your time estimate for the project grows in an exponential manner, right?

I’m going to ignore almost all the frontend stuff and focus on what you need to do in the backend. Here is our first attempt:


[HttpPost("chat")]
public Task<ActionResult<ChatResponse>> Chat([FromBody] ChatRequest request)
{
    var response = new ChatResponse
    {
        Answer = "To be implemented...",
        Followups = [
            "How can I help you today?",
            "What would you like to know?",
            "Do you have any other questions?"
        ]
    };


    return Task.FromResult<ActionResult<ChatResponse>>(Ok(response));
}


public class ChatRequest
{
    public string? ChatId { get; set; }
    public string Message { get; init; }
    public string EmployeeId { get; init; }
}

Here is what this looks like when I write the application to use the agent.

With all the scaffolding done, we can get straight to actually building the agent. I’m going to focus on building the agent in a programmatic fashion.

In the following code, I’m using OpenAI API and gpt-4.1-mini as the model. That is just for demo purposes. The RavenDB AI Agents feature can work with OpenAI, Ollama with open source models, or any other modern models.

RavenDB now provides a way to create an AI Agent inside the database. You can see a basic agent defined in the following code:


public static class HumanResourcesAgent
{
    public class Reply
    {
        public string Answer { get; set; } = string.Empty;
        public string[] Followups { get; set; } = [];
    }


    public static Task Create(IDocumentStore store)
    {
        return store.AI.CreateAgentAsync(
          new AiAgentConfiguration
          {
              Name = "HR Assistant",
              Identifier = "hr-assistant",
         1️⃣   ConnectionStringName = "HR's OpenAI",
         2️⃣   SystemPrompt = @"You are an HR assistant. 
Provide info on benefits, policies, and departments. 
Be professional and cheery.


Do NOT discuss non-HR topics. 
Provide details only for the current employee and no others.
",
         3️⃣   Parameters = [
                new AiAgentParameter("employeeId", 
"Employee ID; answer only for this employee")],
         4️⃣   SampleObject = JsonConvert.SerializeObject(new Reply
              {
                  Answer = "Detailed answer to query",
                  Followups = ["Likely follow-ups"],
              }),
              Queries = [],
              Actions = [],
          });
    }
}

There are a few interesting things in this code sample:

  1. You can see that we are using OpenAI here. The agent is configured with a connection string named “HR’s OpenAI”, which uses the gpt-4.1-mini model and includes the HR API key.
  2. The agent configuration includes a system prompt that explains what the agent will do.
  3. We have parameters that define who this agent is acting on behalf of. This will be quite important very shortly.
  4. Finally, we define a SampleObject to tell the model in what format it should provide its response. (You can also use a full-blown JSON schema, of course, but usually a sample object is easier, certainly for demos.)

The idea is that we’ll create an agent, tell it what we want it to do, specify its parameters, and define what kind of answer we want to get. With this in place, we can start wiring everything up. Here is the new code that routes incoming chat messages to the AI Agent and returns the model’s response:


[HttpPost("chat")]
public async Task<ActionResult<ChatResponse>> Chat(
                  [FromBody] ChatRequest request)
{
  var conversationId = request.ConversationId ?? 
"hr/" + request.EmployeeId + "/" + DateTime.Today.ToString("yyyy-MM-dd");
  var conversation = _documentStore.AI.Conversation(
        agentId: "hr-assistant", conversationId ,
        new AiConversationCreationOptions
        {
            Parameters = new Dictionary<string, object>
            {
                ["employeeId"] = request.EmployeeId
            },
            ExpirationInSec = 60 * 60 * 24 * 30 // 30 days
        });
  conversation.SetUserPrompt(request.Message);
  var result = await conversation.RunAsync<HumanResourcesAgent.Reply>();
  var answer = result.Answer;


  return Ok(new ChatResponse
  {
        ConversationId = conversation.Id,
        Answer = answer.Answer,
        Followups = answer.Followups,
        GeneratedAt = DateTime.UtcNow
  });
}

There is quite a lot that is going on here. Let’s go over that in detail:

  • We start by creating a new conversation. Here, we can either use an existing conversation (by specifying the conversation ID) or create a new one.
  • If we don’t already have a chat, we’ll create a new conversation ID using the employee ID and the current date. This way, we have a fresh chat every day, but you can go back to the AI Agent on the same date and resume the conversation where you left off.
  • We provide a value for the employeeId parameter so the agent knows what context it operates in.
  • After setting the user prompt in the conversation, we run the agent itself.
  • Finally, we take the result of the conversation and return that to the user.

Note that calling this endpoint represents a single message in an ongoing conversation with the model. We use RavenDB’s documents as the memory for storing the entire conversation exchange - including user messages and model responses. This is important because it allows you to easily switch between conversations, resume them later, and maintain full context.

Now, let’s ask the agent a tough question:

I mean, the last name is right there at the top of the page… and the model is also hallucinating quite badly with regard to the HR Portal, etc. Note that it is aware of the employee ID, which we added as an agent parameter.

What is actually going on here? If I wanted to show you how easy it is to build AI Agents, I certainly showed you, right? How easy it is to build a bad one, that is.

The problem is that the model is getting absolutely no information from the outside world. It is able to operate only on top of its own internal knowledge - and that does not include the fictional last name of our sample character.

The key here is that we can easily fix that. Let’s teach the model that it can access the current employee details.

I’ve added the following section to the agent definition in the HumanResourcesAgent.Create() method:


Queries = [
    new AiAgentToolQuery
    {
        Name = "GetEmployeeInfo",
        Description = "Retrieve employee details",
        Query = "from Employees where id() = $employeeId",
        ParametersSampleObject = "{}"
    },
]

Let’s first see what impact this code has, and then discuss what we actually did.

Here is the agent fielding the same query again:

On a personal note, for an HR agent, that careful phrasing is amusingly appropriate.

Now, how exactly did this happen? We just added the GetEmployeeInfo query to the agent definition. The key here is that we have now made it available to the AI model, and it can take advantage of it.

Let’s look at the conversation’s state behind the scenes in the RavenDB Studio, and see what actually happened:

As you can see, we asked a question, and in order to answer it, the model used the GetEmployeeInfo query tool to retrieve the employee’s information, and then used that information to generate the answer.

I can continue the chat with the model and ask additional questions, such as:

Because the employee info we already received contains details about vacation time, the model can answer based on the information it has in the conversation itself, without any additional information requested.

How does all of that work?

I want to stop for a second to discuss what we actually just did. The AI Agent feature in RavenDB isn’t about providing an API for you to call the model. It is a lot more than that.

As you saw, we can define queries that will be exposed to the model, which will be executed by RavenDB when the model asks, and that the model can then use to compose its answers.

I’m skipping a bunch of details for now because I want to focus on the important aspect. We didn’t have to do complex integration or really understand anything about how AI models work. All we needed to do was write a query, and RavenDB does the rest for us.

The key here is that you need the following two lines:


conversation.SetUserPrompt(request.Message);
var result = await conversation.RunAsync<Reply>();

And RavenDB handles everything else for you. The model can ask a query, and RavenDB will hand it an answer. Then you get the full reply back. For that matter, notice that you aren’t getting back just text, but a structured reply. That allows you to work with the model’s reply in a programmatic fashion.

A final thought about the GetEmployeeInfo query for the agent. Look at the query we defined:


from Employees where id() = $employeeId

In particular, you can see that as part of creating the conversation, we provide the employeeId parameter. This is how we limit the scope of the agent to just the things it is permitted to see.

This is a hard limit - the model has no way to override the conversation-level parameters, and the queries will always respect their scope. You can ask the model to pass arguments to queries, but the way AI Agents in RavenDB are built, we assume a hard security boundary between the model and the rest of the system. Anything the model provides is suspect, while the parameters provided at conversation creation are authoritative and override anything else.

In the agent’s prompt above (the system prompt), you can see that we instruct it to ignore any questions about other employees. That is considered good practice when working with AI models. However, RavenDB takes this much further. Even if you are able to trick the model into trying to give you answers about other employees, it cannot do that because we never gave it the information in the first place.

Let me summarize that for you…

Something else that is happening behind the scenes, which you may not even be aware of, is the handling of memory for the AI model. It’s easy to forget when you look at the ChatGPT interface, but the model is always working in one-shot mode.

With each new message you send to the model, you also need to send all the previous messages so it will know what was already said. RavenDB handles that for you, so you can focus on building your application and not get bogged down in the details.


Q: Wait, if on each message I need to include all previous messages… Doesn’t that mean that the longer my conversation goes on, the more messages I send the model?

A: Yes, that is exactly what it means.

Q: And don’t I pay the AI model by the token?

A: Yes, you do. And yes, that gets expensive.


RavenDB is going to help you here as well. As the conversation grows too large, it is able to summarize what has been said so far, so you can keep talking to the model (with full history and context) without the token costs exploding.

This happens transparently, and by default, it isn’t something that you need to be aware of. I’m calling this out explicitly here because it is something that is handled for you, which otherwise you’ll have to deal with. Of course, you also have configurable options to tune this behavior for better control.

Making the agent smarter

Previously, we gave the agent access to the employee information, but we can make it a lot smarter. Let’s look at the kind of information we have in the sample database I’m working with. We have the following collections:

Let’s start by giving the model access to the vacation requests and see what that lets it do. We’ll define another query:


new AiAgentToolQuery
{
    Name = "GetVacations",
    Description = "Retrieve recent employee vacation details",
    Query = @"
from VacationRequests
where EmployeeId = $employeeId 
order by SubmittedDate desc
limit 5
",
    ParametersSampleObject = "{}"
},

This query is another simple example of directly exposing data from the database to the model. Note that we are again constraining the query to the current employee only. With that in place, we can ask the model new questions, as you can see:

The really interesting aspect here is that we need so little work to add a pretty significant new capability to the system. A single query is enough, and the model is able to tie those disparate pieces of information into a coherent answer for the user.

Smart queries make powerful agents

The next capability we want to build is integrating questions about payroll into the agent. Here, we need to understand the structure of the PayStub in the system. Here is a simplified version of what it looks like:


public record PayStub(string Id, string EmployeeId, DateTime PayDate,
    decimal GrossPay, decimal NetPay, ACHBankDetails? DirectDeposit,
    // ... redacted ...
    );

As you can imagine, payroll data is pretty sensitive. There are actually two types of control we want to have over this information:

  • An employee can ask for details only about their own salary.
  • Some details are too sensitive to share, even with the model (for example, bank details).

Here is how I add the new capability to the agent:


new AiAgentToolQuery
{
    Name = "GetPayStubs",
    Description = "Retrieve employee's paystubs within a given date range",
    Query = @"
    from PayStubs 
    where EmployeeId = $employeeId 
        and PayDate between $startDate and $endDate
    order by PayDate desc
    select PayPeriodStart, PayPeriodEnd, PayDate, GrossPay, NetPay, 
            Earnings, Deductions, Taxes, YearToDateGross, YearToDateNet, 
            PayPeriodNumber, PayFrequency
    limit 5",
    ParametersSampleObject = 
"{\"startDate\": \"yyyy-MM-dd\", \"endDate\": \"yyyy-MM-dd\"}"
},

Armed with that, we can start asking all sorts of interesting questions:

Now, let’s talk about what we actually did here. We have a query that allows the model to get pay stubs (for the current employee only) within a given date range.

  • The employeeId parameter for the query is taken from the conversation’s parameters, and the AI model has no control over it.
  • The startDate and endDate, on the other hand, are query parameters that are provided by the model itself.

Notice also that we provide a manual select statement which picks the exact fields from the pay stub to include in the query results sent to the model. This is a way to control exactly what data we’re sending to the model, so sensitive information is never even visible to it.

Effective agents take action and get things done

So far, we have only looked at exposing queries to the model, but a large part of what makes agents interesting is when they can actually take action on your behalf. In the context of our system, let’s add the ability to report an issue to HR.

In this case, we need to add both a new query and a new action to the agent. We’ll start by defining a way to search for existing issues (again, limiting to our own issues only), as well as our HR policies:


new AiAgentToolQuery
{
    Name = "FindIssues",
    Description = "Semantic search for employee's issues",
    Query = @"
    from HRIssues
    where EmployeeId = $employeeId 
        and (vector.search(embedding.text(Title), $query) 
or vector.search(embedding.text(Description), $query))
    order by SubmittedDate desc
    limit 5",
    ParametersSampleObject = 
"{\"query\": [\"query terms to find matching issue\"]}"
},
new AiAgentToolQuery
{
    Name = "FindPolicies",
    Description = "Semantic search for employer's policies",
    Query = @"
    from HRPolicies
    where (vector.search(embedding.text(Title), $query) 
or vector.search(embedding.text(Content), $query))
    limit 5",
    ParametersSampleObject = 
"{\"query\": [\"query terms to find matching policy\"]}"
},

You might have noticed a trend by now: exposing data to the model follows a pretty repetitive process of defining the query, deciding which parameters the model should fill in the query (defined in the `ParametersSampleObject`), and… that is it.

In this case, the FindIssues query is using another AI feature - vector search and automatic embedding - to find the issues using semantic search for the current employee. Semantic search allows you to search by meaning, rather than by text.

Note that the FindPolicies query is an interesting one. Unlike all the other queries, it isn’t scoped to the employee, since the company policies are all public. We are using vector search again, so an agent search on “pension plan” will find the “benefits package policy” document.

With that, we can now ask complex questions of the system, like so:

Now, let’s turn to actually performing an action. We add the following action to the code:


Actions = [
    new AiAgentToolAction
    {
        Name = "RaiseIssue",
        Description = "Raise a new HR issue for the employee (full details)",
        ParametersSampleObject = JsonConvert.SerializeObject(
   new RaiseIssueArgs{
            Title = "Clear & short title describing the issue",
            Category = "Payroll | Facilities | Onboarding | Benefits",
            Description = "Full description, with all relevant context",
            Priority = "Low | Medium | High | Critical"
        })
    },
]

The question is how do I now perform an action? One way to do that would be to give the model the ability to directly modify documents. That looks like an attractive option until you realize that this means that you need to somehow duplicate all your existing business rules, validation, etc.

Instead, we make it simple for you to integrate your own code and processes into the model, as you can see below:


conversation.Handle<RaiseIssueArgs>("RaiseIssue", async (args) =>
{
    using var session = _documentStore.OpenAsyncSession();
    var issue = new HRIssue
    {
        EmployeeId = request.EmployeeId,
        Title = args.Title,
        Description = args.Description,
        Category = args.Category,
        Priority = args.Priority,
        SubmittedDate = DateTime.UtcNow,
        Status = "Open"
    };
    await session.StoreAsync(issue);
    await session.SaveChangesAsync();


    return "Raised issue: " + issue.Id;
});
var result = await conversation.RunAsync<Reply>();

The code itself is pretty simple. We have a function that accepts the parameters from the AI model, saves the new issue, and returns its ID. Boring, predictable code, nothing to write home about.

This is still something that makes me very excited, because what actually happens here is that RavenDB will ensure that when the model attempts this action, your code will be called. The fun part is all the code that isn’t there. The call will return a value, which will then be processed by the model, completing the cycle.

Note that we are explicitly using a lambda here so we can use the employeeId that we get from the request. Again, we are not trusting the model for the most important aspects. But we are using the model to easily create an issue with the full context of the conversation, which often captures a lot of important details without undue burden on the user.

Here are the results of the new capabilities:

Integrating with people in the real world

So far we have built a pretty rich system, and it didn’t take much code or effort at all to do so. Our next step is going to be a bit more complex, because we want to integrate our agent with people.

The simplest example I could think of for HR is document signing. For example, signing an NDA during the onboarding process. How can we integrate that into the overall agent experience?

The first thing to do is add an action to the model that will ask for a signature, like so:


new AiAgentToolAction
{
    Name = "SignDocument",
    Description = "Asks the employee to sign a document",
    ParametersSampleObject = JsonConvert.SerializeObject(new SignDocumentArgs{
        Document = "unique-document-id (take from the FindDocumentsToSign query tool)",
    })
},

Note that we provide a different query (and reference it) to allow the model to search for documents that are available for the user to sign. This way we can add documents to be signed without needing to modify the agent’s configuration. And by now you should be able to predict what the next step is.

Boring as a feature - the process of building and working with AI Agents is pretty boring. Expose the data it needs, add a way to perform the actions it calls, etc. The end result can be pretty amazing. But building AI Agents with RavenDB is intentionally streamlined and structured to the point that you have a clear path forward at all times.

We need to define another query to let the model know which documents are available for signature.


new AiAgentToolQuery
{
    Name = "FindDocumentsToSign",
    Description = "Search for documents that can be signed by the employee",
    Query = @"
    from SignatureDocuments
    where vector.search(embedding.text(Title), $query)
    select id(), Title
    limit 5",
    ParametersSampleObject = 
"{\"query\": [\"query terms to find matching documents\"]}"
},

You’ll recall (that’s a pun 🙂) that we are using semantic search here to search for intent. We can search for “confidentiality contract” to find the “non-disclosure agreement”, for example.

Now we are left with actually implementing the SignDocument action, right?

Pretty much by the nature of the problem, we need to have a user action here. In a Windows application, we could have written code like this:


conversation.Handle<SignDocumentArgs>("SignDocument", async (args) => {
    using var session = _documentStore.OpenAsyncSession();
    var document = await session.LoadAsync<SignatureDocument>(args.Document);
    var signDocumentWindow = new SignDocumentWindow(document);
    signDocumentWindow.ShowDialog();
    return signDocumentWindow.Result
        ? "Document signed successfully."
        : "Document signing was cancelled.";
});

In other words, we could have pulled the user’s interaction directly into the request-response loop of the model.

You aren’t likely to be writing Windows applications; it is far more likely that you are writing a web application of some kind, so you have the following actors in your system:

  1. User
  2. Browser
  3. Backend server
  4. Database
  5. AI model

When the model needs to call the SignDocument action, we need to be able to convey that to the front end, which will display the signature request to the user, then return the result to the backend server, and eventually pass it back to the model for further processing.

For something that is conceptually pretty simple, it turns out to be composed of a lot of moving pieces. Let’s see how using RavenDB’s AI Agent helps us deal with it.

Here is what this looks like from the user’s perspective. I couldn’t resist showing it to you live, so below you can see an actual screen recording of the behavior. It is that fancy 🙂.

We start by telling the agent that we want to sign a “confidentiality contract”. It is able to figure out that we are actually talking about the “non-disclosure agreement” and brings up the signature dialog. We then sign the document and send it back to the model, which replies with a confirmation.

On the server side, as we mentioned, this isn’t something we can just handle inline. We need to send it to the user. Here is the backend handling of this task:


conversation.Receive<SignDocumentArgs>("SignDocument", async (req, args) =>
{
    using var session = _documentStore.OpenAsyncSession();
    var document = await session.LoadAsync<SignatureDocument>(args.Document);
    documentsToSign.Add(new SignatureDocumentRequest
    {
        ToolId = req.ToolId,
        DocumentId = document.Id,
        Title = document.Title,
        Content = document.Content,
        Version = document.Version
    });
});

After we call RunAsync() to invoke the model, we need to handle any remaining actions that we haven’t already registered a handler for using Handle (like we did for raising issues). We use the Receive() method to get the arguments that the model sent us, but we aren’t actually completely processing the call.

Note that we aren’t returning anything from the function above. Instead, we’re adding the new document to sign to a list, which we’ll send to the front end for the user to sign.

The conversation cannot proceed until you provide a response to all requested actions. Future calls to RunAsync will return with no answer and will re-invoke the Receive()/Handle() calls for all still-pending actions until all of them are completed. We’ll need to call AddActionResponse() explicitly to return an answer back to the model.

The result of the chat endpoint now looks like this:


var finalResponse = new ChatResponse
{
    ConversationId = conversation.Id,
    Answer = result.Answer?.Answer,
    Followups = result.Answer?.Followups ?? [],
    GeneratedAt = DateTime.UtcNow,
    DocumentsToSign = documentsToSign // new code
};

Note that we send the ToolId to the browser, along with all the additional context it needs to show the document to the user. That will be important when the browser calls back to the server to complete the operation.

You can see the code to do so below. Remember that this is handled in the next request, and we add the signature response to the conversation to make it available to the model. We pass both the answer and the ToolId so the model can understand what action this is an answer to.


foreach (var signature in request.Signatures ?? [])
{
    conversation.AddActionResponse(signature.ToolId, signature.Content);
}

Because we expose the SignDocument action to the model, it may call the Receive() method to process this request. We’ll then send the relevant details to the browser for the user to actually sign. Then we’ll send all those signature confirmations back to the model by calling the chat action endpoint again, this time passing the collected signatures.

The key here is that we accept the list of signatures from the request and register the action response (whether the employee signed or declined the document), then we call RunAsync and let the model continue.
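
Putting the pieces together, the order of operations inside the chat endpoint looks roughly like this (a sketch assembled from the snippets above, with error handling omitted):


var documentsToSign = new List<SignatureDocumentRequest>();

// 1. Answer any signatures the browser collected during the previous round:
foreach (var signature in request.Signatures ?? [])
{
    conversation.AddActionResponse(signature.ToolId, signature.Content);
}

// 2. Register how a pending SignDocument request from the model is captured:
conversation.Receive<SignDocumentArgs>("SignDocument", async (req, args) =>
{
    using var session = _documentStore.OpenAsyncSession();
    var document = await session.LoadAsync<SignatureDocument>(args.Document);
    documentsToSign.Add(new SignatureDocumentRequest
    {
        ToolId = req.ToolId,
        DocumentId = document.Id,
        Title = document.Title,
        Content = document.Content,
        Version = document.Version
    });
});

// 3. Let the model continue. If it asks for a signature, documentsToSign is
//    sent to the browser and we go around the loop on the next request.
if (!string.IsNullOrEmpty(request.Message))
    conversation.SetUserPrompt(request.Message);

var result = await conversation.RunAsync<HumanResourcesAgent.Reply>();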

The API design here is explicitly about moving as much as possible away from developers needing to manage state, and leaning on the model to keep track of what is going on. In practice, all the models we tried gave really good results in this mode of operation. More on that below.

The end result is that we have a bunch of moving pieces, but we don’t need to keep track of everything that is going on. The state is built into the manner in which you are working with the agent and conversations. You have actions that you can handle inline (raising an issue) or send to the user (signing documents), and the conversation will keep track of that for you.

In essence, the idea is that we turn the entire agent model into a pretty simple state machine, with the model deciding on the transitions between states and requesting actions to be performed. Throughout the process, we lean on the model to direct us, but only our own code is taking actions, subject to our own business rules & validations.

Design principles

When we started designing the AI Agents Creator feature in RavenDB, we had a very clear idea of what we wanted to do. We want to allow developers to easily build smart AI Agents without having to get bogged down with all the details.

At the same time, it is really important that we don’t surrender control over what is going on in our applications. The underlying idea is that we can rely on the agent to facilitate things, not to actually act with unfettered freedom.

The entire design is centered on putting guardrails in place so you can enjoy all the benefits of using an AI model without losing control over what is going on in your system.

You can see that with the strict limits we place on what data the model can access (and how we can narrow its scope to just the elements it should see, without a way to bypass that), the model operates only within the boundaries we define. When there is a need to actually do something, it isn’t the model that is running the show. It can request an action, but it is your own code that runs that action.

Your own code running means that you don’t have to worry about a cleverly worded prompt bypassing your business logic. It means that you can use your own business logic & validation to ensure that the operations being run are done properly.

The final aspect we focused on in the design of the API is the ability to easily and incrementally build more capabilities into the agent. This is a pretty long article, but take note of what we actually did here.

We built an AI agent that is capable of (among other things):

  • Provide details about scheduled vacation and remaining time off - “How many vacation days will I have in October after the summer vacation?”
  • Ask questions about payroll information - “How much was deducted from my pay for federal taxes in Q1?”
  • Raise and check the status of workplace issues - “I need maintenance to fix the AC in room 431” or “I didn’t get a reply to my vacation request from two weeks ago”
  • Automate onboarding and digital filing - “I’ve completed the safety training…, what’s next?”
  • Query about workplace policies - “What’s the dress code on Fridays?”

And it only took a few hundred lines of straightforward code to do so.

Even more importantly, there is a clean path forward if we want to introduce additional behaviors into the system. Our vision includes being able to very quickly iterate on those sorts of agents, both in terms of adding capabilities to them and creating “micro agents” that deal with specific tasks.

All the code you didn’t have to write

Before I close this article, I want to shine a spotlight on what isn’t here - all the concerns that you don’t have to deal with when you are working with AI Agents through RavenDB. A partial list of these includes:

  • Memory - conversation memory, storing & summarizing are handled for you, avoiding escalating token costs over time.
  • Query Integration - directly expose data (in a controlled & safe manner) from your database to the model, without any hassles.
  • Actions - easily integrate your own operations into the model, without having to deal with the minutiae of working with the model in the backend.
  • Structured approach - allows you to easily integrate a model into your code and work with the model’s output in a programmatic fashion.
  • Vector search & embedding - everything you need is in the box. You can integrate semantic search, history queries, and more without needing to reach for additional tools.
  • State management - the RavenDB conversation tracks the state, the pending actions, and everything you need to have an actual back & forth rather than one-shot operations.
  • Defined scope & parameters - allows you to define exactly what the scope of operations is for the agent, which then gives you a safe way to expose just the data that the agent should see.

The goal is to reduce complexity and streamline the path for you to have much smarter systems. At the end of the day, the goal of the AI Agents feature is to enable you to build, test, and deploy an agent in hours.

You are able to quickly iterate over their capabilities without being bogged down by trying to juggle many responsibilities at the same time.

Summary

RavenDB's AI Agents Creator makes it easy to build intelligent applications. You can craft complex AI agents quickly with minimal work. RavenDB abstracts intricate AI infrastructure, giving you the ability to create feature-rich agents in hours, not months.

You can find the final version of the code for this article in the following repository.

The HR Agent built in this article handles employment details, vacation queries, payroll, issue reporting, and document signing. The entire system was built in a few hours using the RavenDB AI Agent Creator. A comparable agent, built directly using the model API, would take weeks to months to build and would be much harder to change, adapt, and secure.

Developers define agents with straightforward configurations — prompts, queries, and actions — while RavenDB manages conversation memory, summarization, and state, reducing complexity and token costs.

Features like vector search and secure parameter control enable powerful capabilities, such as semantic searches over your own data with minimal effort. This streamlined approach ensures rapid iteration and robust integration with business logic.

For more:
