Type-Safe LLM Response Handling with BoundaryML for Java Developers

Type-safe LLM responses with BoundaryML



Are you tired of messy LLM JSON outputs in your Java project? Well, this might be the tool to try — BoundaryML. Have a look at this:

// BAML
class Ticket {
  category Category
  urgency Urgency
  summary string
}
function ClassifyTicket(message: string) -> Ticket {
  client LocalOllama
  prompt #"""
  ...
  Message:
  {{ message }}

  {{ctx.output_format}}
  """#
}
------------------------------------------------------------------
// Java
Ticket supportTicket = defaultApi.classifyTicket(classifyTicketRequest);

It can make our apps much more reliable because it excels at producing structured outputs, and best of all, it works with any LLM and any language!

Boundary ML (BAML) is a domain-specific language to generate structured outputs from LLMs — with the best developer experience.

With BAML you can build reliable Agents, Chatbots with RAG, extract data from PDFs, and more.

So with BAML, instead of plain text, we get structured output from our LLM calls: typed objects that we can easily inspect and process further, or pass on to other AI agents and APIs as type-safe data.

In BAML, a prompt is a function — if we’re extracting, for example, information about a support ticket from a text with some LLM, we would want that ticket type returned from the BAML prompt function.

Example

Now let’s review our real-world example here and see how BAML lets us play around with its prompts:

We want to call a Large Language Model with a text from a customer’s support request. Let’s say we send:

Hello Customer Support Team! I was charged twice for my monthly subscription this morning! Can you please refund the extra charge ASAP? I’m worried this might happen again.

Normally, we’d get back free-form text. Maybe something like: The user has a billing problem. This seems urgent because they were charged twice.

That's OK for us humans to read, but not so nice for your backend: you'd still need to parse it, handle edge cases, validate it, and so on.
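
To make that pain concrete, here is a purely illustrative sketch (the class and its keyword lists are hypothetical, not part of BAML or this example) of the brittle string matching you'd otherwise end up writing against free-form output:

// Purely illustrative: the kind of brittle keyword matching you'd have to write
// to pull structure out of free-form LLM text (this class is hypothetical, not BAML).
class NaiveTicketParser {

    static String guessCategory(String llmText) {
        String lower = llmText.toLowerCase();
        if (lower.contains("charge") || lower.contains("refund") || lower.contains("billing")) {
            return "Billing";
        }
        if (lower.contains("error") || lower.contains("crash") || lower.contains("bug")) {
            return "Technical";
        }
        return "Other"; // urgency, summary, synonyms, and typos are still unhandled...
    }
}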

Instead, we want the model to output a clear object with three fields:

1. Category (ENUM — Billing, Technical, Account, or Other)

2. Urgency (ENUM — Low, Medium, or High)

3. Summary of the problem (string)
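
In Java terms, the shape we're after looks roughly like this (illustration only; the real classes will be generated for us later and may differ in naming and style):

// Illustration of the target shape only; the actual classes will be generated from the BAML definitions.
enum Category { BILLING, TECHNICAL, ACCOUNT, OTHER }
enum Urgency { LOW, MEDIUM, HIGH }
record Ticket(Category category, Urgency urgency, String summary) { }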


I think the support ticket use case is a pretty common real-world scenario, so let’s use it for our example. It’s easy to get started, and you can get the hang of Boundary ML in no time. We’ll use locally hosted Ollama for our examples to make things even easier — no API Keys or Subscriptions required:

  1. Install Ollama

Just go to their website: https://ollama.com/download.

Download the appropriate file for your OS and install it like any other application. This installs the ollama command-line tool, and you'll be able to run models locally with a simple command like:

ollama run gemma3:latest

This will run gemma3 locally on your computer.

  2. Open VSCode or a similar editor and install the BAML extension — besides syntax highlighting, it gives us a pretty neat playground and testing feature, which makes things very easy to debug

  3. Create a folder baml_src and, inside it, a file called ticket-classifier.baml

Paste the following code inside:

client<llm> LocalOllama {
  provider "openai-generic"
  options {
    base_url "http://localhost:11434/v1"
    model "gemma3:latest"
  }
}

enum Category {
  Billing
  Technical
  Account
  Other
}

enum Urgency {
  Low
  Medium
  High
}

class Ticket {
  category Category
  urgency Urgency
  summary string
}

function ClassifyTicket(message: string) -> Ticket {
  client LocalOllama
  prompt #"""
  You are a customer support ticket assistant.  
  Given the customer’s message, classify it into:
  - category: Billing, Technical, Account, or Other
  - urgency: Low, Medium, or High
  - summary: A short 1-sentence summary of the issue.

  Message:
  {{ message }}

  {{ctx.output_format}}
  """#
}

test TestName {
  functions [ClassifyTicket]
  args {
    message #"
      Hi, I was charged twice for my monthly subscription this morning. 
      Can you please refund the extra charge ASAP? 
      I'm worried this might happen again.
    "#
  }
}

Thanks to the BAML extension you installed in your VSCode, you’d get the following actions in your code.

[Screenshot: BAML extension code actions in VSCode]

The code is pretty straightforward.

  • First, we define our locally hosted Ollama model
  • We then define the Ticket class, which contains a summary string and two enum fields, Category and Urgency.
  • ClassifyTicket is our function (remember, we talked about how in BAML a prompt is seen as a function) where we define our prompt and pass the user message to it. Its return type is Ticket.
  • The last piece of code is a test that we can run and see how the parsing happens:
[Screenshot: BAML playground test run showing the parsed Ticket output]

Perfect, our example works, and BAML does its job quite well!


I promised you Java integration, though, so let’s look into it:

BAML offers native integration only for a limited set of languages (TypeScript, Python, Ruby…). It's pretty nice: you get seamless type safety, autocompletion, and direct function calls as part of the code in your app.

They have a simple-enough solution for other languages (like Java, Go, and others), though: an OpenAPI integration that exposes BAML functions as a RESTful API. So we essentially need to generate a client using the OpenAPI Generator. They have done a good job documenting this, but I'll guide you through it in detail here too. (I'll also use Gradle instead of Maven, which is what their documentation uses.)

Set up Java Project

Create a Java Gradle project and add the mavenLocal() repository to the build.gradle file:

repositories {
    ...
    mavenLocal()
}

dependencies {

    ...

}

Install npm and the OpenAPI Generator

On my Mac, I just needed to execute: brew install npm openapi-generator

You can easily find the instructions to install both for Linux and Windows online.

Add BAML to our existing Java project

Open your terminal in the root folder of your Java project and execute the following command, which will give us some starter BAML code:

npx @boundaryml/baml init \
  --client-type rest/openapi --openapi-client-type java

You should see the following output in your terminal:

[Screenshot: terminal output of the baml init command]

This command creates a baml_src folder in the root of our project. It contains three files: clients.baml, generators.baml, and resume.baml.

clients.baml defines which AI services (OpenAI, Anthropic, etc.) you want to connect to. It's basically a list of available models you can choose from (Ollama isn't there by default).

Note: You don’t always need a separate clients.baml file. You can define clients directly in your main BAML files (like resume.baml or as in our example from above ticket-classifier.baml) if you prefer to keep everything in one place.

resume.baml - Your AI Functions. This is the file that contains the actual AI functions: the prompts, expected outputs, and which client to use. (Same as our example, ticket-classifier.baml, from above.)

generators.baml — Code Generation Settings: This file tells BAML how to generate client libraries in your programming language of choice. In our case, we have output_type "rest/openapi", which generates an OpenAPI spec plus an HTTP client. So BAML converts your functions into a standard REST API specification and generates an OpenAPI client.

So, after understanding what these files do, let’s:

  • Delete the clients.baml (We’ll define our Ollama Client in the same file as our AI Functions and prompts.)
  • Go to generators.baml and remove the && mvn clean install from the generated on_generate command (we're building with Gradle, not Maven).
  • Rename resume.baml to ticket-classifier.baml, since we want to implement our support ticket example and a more fitting file name is only logical.
  • Delete the pre-generated content (everything) in ticket-classifier.baml and add our BAML code for the support tickets classifier. Here is the code again:
client<llm> LocalOllama {
  provider "openai-generic"
  options {
    base_url "http://localhost:11434/v1"
    model "gemma3:latest"
  }
}

enum Category {
  Billing
  Technical
  Account
  Other
}

enum Urgency {
  Low
  Medium
  High
}

class Ticket {
  category Category
  urgency Urgency
  summary string
}

function ClassifyTicket(message: string) -> Ticket {
  client LocalOllama
  prompt #"""
  You are a customer support ticket assistant.  
  Given the customer's message, classify it into:
  - category: Billing, Technical, Account, or Other
  - urgency: Low, Medium, or High
  - summary: A short 1-sentence summary of the issue.

  Message:
  {{ message }}

  {{ctx.output_format}}
  """#
}

test TestName {
  functions [ClassifyTicket]
  args {
    message #"
      Hi, I was charged twice for my monthly subscription this morning. 
      Can you please refund the extra charge ASAP? 
      I'm worried this might happen again.
    "#
  }
}

(Again: at the beginning of the file, we define our Ollama Client using the gemma3:latest model)

Generate OpenAPI Schema

Run the following command:

npx @boundaryml/baml generate

What this does: BAML reads your ticket-classifier.baml and generators.baml files and creates a baml_client/ directory containing:

  • The generated Java client code
  • An OpenAPI specification (under baml_client/openapi.yaml)
  • All the type-safe classes and API methods you need to call your AI functions

Building and Running Our BAML Application

Now, this generated code is just source files — we need to compile them into a usable library and install it locally so our main application can import it as a dependency. Let’s go into our baml_client folder and execute:

cd baml_client
./gradlew clean build publishToMavenLocal

This will compile the code and add the library to our local Maven repository (~/.m2/repository/), so all our projects can use it (remember we added mavenLocal() to our build.gradle file when we started with our Java project).

Now that we’ve published our BAML client to the local Maven repository, we need to tell our main Java project to use it:

  dependencies {
     ...

     implementation "org.openapitools:openapi-java-client:0.1.0"

     ...
  }

(Your IDE won’t recognize the BAML Client classes if you don’t add this, and you’ll get compilation errors.)

We’re ready to write some Java code now. Go to your Main.java file and add the following sample code:

import com.boundaryml.baml_client.ApiClient;
import com.boundaryml.baml_client.ApiException;
import com.boundaryml.baml_client.Configuration;
import com.boundaryml.baml_client.api.DefaultApi;
import com.boundaryml.baml_client.model.ClassifyTicketRequest;
import com.boundaryml.baml_client.model.Ticket;

public class Main {

    private static final String userMessage = """
            Hi, I was charged twice for my monthly subscription this morning.
            Can you please refund the extra charge ASAP?
            I'm worried this might happen again.
            """;

    public static void main(String[] args) throws ApiException {
        // The generated client talks to the BAML dev server over HTTP
        ApiClient apiClient = Configuration.getDefaultApiClient();
        DefaultApi defaultApi = new DefaultApi(apiClient);

        // Wrap the customer's message in the generated request type
        ClassifyTicketRequest classifyTicketRequest = new ClassifyTicketRequest().message(userMessage);

        // Call our BAML function ClassifyTicket and get a typed Ticket back
        Ticket supportTicket = defaultApi.classifyTicket(classifyTicketRequest);

        System.out.println("Here is the well-structured output from the LLM:");
        System.out.println(supportTicket);
    }
}

You'll see that all the types (like ClassifyTicketRequest, Ticket, etc.) are available.

This will call the ClassifyTicket function we defined in our BAML file and convert the LLM's response into a well-structured Ticket object with all the needed fields.

Start the BAML development server

npx @boundaryml/baml dev
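
(A side note, and an assumption on my part rather than something from the BAML docs: the generated ApiClient has a default base URL baked in from the OpenAPI spec. If the address the dev server prints at startup differs from it, you can override it in main() right after fetching the default client. The URL below is only an example.)

// Optional override, only if the dev server runs at a different address than the generated default.
// Use whatever host/port `npx @boundaryml/baml dev` prints at startup; this URL is just an example.
apiClient.setBasePath("http://localhost:2024");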

Let’s run the program and see the output:

Here is the well-structured output from the LLM:
class Ticket {
    category: Billing
    summary: The customer was incorrectly charged twice for their subscription and requests a refund.
    urgency: Medium
}

And that’s it! We get a normal type-safe Java Object out of the LLM!
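
And because it's a plain typed object, downstream logic stays simple. As a quick illustration (the getter and enum constant names below are assumptions; check the generated classes in baml_client, since the OpenAPI generator decides the exact naming), routing a ticket could look something like this:

// Illustration only: consuming the typed Ticket in plain Java.
// Getter and enum constant names are assumed; check the generated classes in baml_client.
static String routeTicket(Ticket ticket) {
    if (ticket.getUrgency() == Urgency.HIGH) {
        return "escalate-to-on-call"; // e.g. page someone immediately
    }
    return switch (ticket.getCategory()) {
        case BILLING -> "billing-queue";
        case TECHNICAL -> "tech-queue";
        default -> "general-queue";
    };
}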

Conclusion

Well-structured responses from LLMs are crucial for AI workflows and AI agent calls nowadays. You might know I'm a big fan of LangChain4j, which offers similar functionality out of the box by simply returning the object type from the AI interface method call. The BAML way, however, is supposed to scale better in real-world scenarios and decouples the work of writing the prompt itself, meaning non-technical people can play around in the .baml files too. Another BAML advantage is, of course, that it can be used from multiple programming languages. It can also be very helpful when working with popular AI platforms like AWS Bedrock, for example; I think the two can play well together. Feel free to share if you have experience with this combo. I'd love to hear about your experiments in the comments!
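
For comparison, a LangChain4j version of the same idea might look roughly like this (written from memory against a recent langchain4j and langchain4j-ollama release; double-check names against the version you use, and note that Ticket here is a plain record or POJO you define yourself, not the BAML-generated one):

// Rough LangChain4j sketch of the same ticket classification, for comparison with BAML.
// API details may differ slightly between langchain4j versions.
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.UserMessage;

public class LangChain4jTicketExample {

    interface TicketClassifier {
        @UserMessage("Classify this support message into category, urgency and a one-sentence summary: {{it}}")
        Ticket classify(String message); // Ticket is a plain record/POJO you define yourself
    }

    public static void main(String[] args) {
        OllamaChatModel model = OllamaChatModel.builder()
                .baseUrl("http://localhost:11434")
                .modelName("gemma3:latest")
                .build();

        TicketClassifier classifier = AiServices.create(TicketClassifier.class, model);
        Ticket ticket = classifier.classify("Hi, I was charged twice for my monthly subscription this morning...");
        System.out.println(ticket);
    }
}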

Thanks for reading
