AI Agents Need a New Syntax

Jan 24, 2025

Disclaimer: We build BAML, a new syntax for building AI applications and agents.

How Syntax Shapes Thinking

The concepts a syntax exposes fundamentally change how developers think about problems and solve them. Let's explore an example from web development.

We used to write websites like this:

def home():
    return """
<button onClick="() => console.log('Button clicked!')">Click me</button>
"""

This approach of mixing JavaScript, HTML, CSS, and backend logic proved unmaintainable. A single unclosed tag could break an entire site – an error that a compiler could catch in milliseconds, yet our tooling at that time didn't.

React Entered The Scene

React transformed web development by asking: "What if we brought HTML and JavaScript into the same syntax?" This led to JSX (and later TSX), which solved several critical problems:

  • Enforced component-based architecture
  • Standardized state management
  • Enabled compile-time error checking

This wasn't just a theoretical improvement. Facebook literally made React because their newsfeed had become too complex to maintain using traditional approaches.

Tailwind CSS followed a similar pattern by bringing CSS into React components. Traditional style.css files were full of custom 1-off classes that, in practice, were rarely reused. Turns out, the answer was always inline css, just with different syntax and amazing dev-x.

// inline CSS
<div style="{ 'background': 'red' }"/>
// Tailwind
<div className="bg-red-300" />

Read more about Why Tailwind.

Tailwind CSS Evolution

5 years ago, I felt the exact same way as @swyx about React. Being a C++ dev, I didn't want to learn React. It felt like bloatware. So, I wrote my own React (see the code), router and state management and all. I was wrong. React is better. Please forgive my sins.

Engineering problems with Agents

  • Prompts should be easy to find and read. Fundamentally they are strings like HTML, but often they are incredibly dynamic - closer to TSX. Strings everywhere are horribly unmaintable.
    • Why not TSX? Not everyone in the world uses Javascript/Typescript. But everyone will use AI.
  • Swapping models should be easy. Models get better every week, your app should too.
  • You need a hot-reload loop. When building a frontend, you change your code, cmd+s, the browser reloads (while persisting state from parent components)! What's the equivalent for agents?
  • It should work with any language (C++, Java, Ruby, …). Every existing piece of software will eventually be agentic. We shouldn't have to stand up a python microservice to build anything LLM related.
  • You need structured outputs. Not every model will support this, and traditional implementations degrade quality.
  • Agents stream - and streaming requires a different type-system. If the LLM returns a T[], while streaming I likely want a Partial<T>[]. This is even more complicated for multi-step agents.
  • No internet dependency - I don't want my app to require a paid SAAS to call a REST API.

Our Proposal: Basic-Ass Machine Learning (BAML)

or "Basically a Made-up Language" for your boss

This is our attempt at a syntax that starts to address some of these problems. In some ways the problem of creating a new syntax is even harder than the ones listed above. Everything is fair game when making new syntax. If you can code it, it can be yours. We needed a design philosophy to help restrict ideas:

BAML's Design Philosophy

  • 1: Avoid invention when possible
    • Yes, prompts need versioning — we have a great versioning tool: git
    • Yes, you need to save and iterate on prompts — we have a great storage: filesystems
  • 2: Any file editor and any terminal should be enough to use it
  • 3: Be fast
  • 4: A first year university student should be able to understand it

The core BAML principle: LLM Prompts are functions

The fundamental building block in BAML is a function. Every prompt is a function that takes in parameters and returns a type.

function ChatAgent(message: Message[], tone: "happy" | "sad") -> string

Every function additionally defines which models it uses and what its prompt is.

function ChatAgent(message: Message[], tone: "happy" | "sad") -> string {
    client "openai/gpt-4o"

    prompt #"
        Be a {{ tone }} bot.

        {% for m in message %}
        {{ _.role(m.role) }}
        {{ m.content }}
        {% endfor %}
    "#
}

class Message {
    role string
    content string
}

Making prompts easy to find and read

Since every prompt is a function, you can use standard LSP (Language Server Protocols) tools to find every prompt you've written. But we've taken BAML one step further and built native tooling for VSCode (jetbrains + neovim coming soon).

  1. You can see the full prompt (including any multi-modal assets)
  2. You can see the exact network request we are making
  3. You can see every function you've ever written Functions

Swapping models: 1-line changes

It's just 1 line. Sorry Sam

Retries and fallbacks? All statically defined.

Want to do this dynamically at runtime? Check out our Client Registry documentation.

Hot-reloading for prompts

Using AI is all about iteration speed.

If testing your pipeline takes 2 minutes, in 20 minutes, you can only test 10 ideas.

If testing your pipeline took 5 seconds, in 20 minutes, you can test 240 ideas.

Introducing testing, for prompts.

Plugging into every programming language

Since BAML is a configuration file format, the BAML toolchain allows you to transpile BAML into native Python, Typescript, Ruby, etc code. So you get autocomplete and static analysis without thinking about it. Plus, our compiler is fast - it takes less than 30 ms to generate code!

All interfaces use the same Rust core, so no more "typescript is broken!". Every language we natively support is Tier 1 and "works". Every feature ships to all of them at the same time.

For languages we don't yet have native support for, BAML can also turn into a local REST API Server and use OpenAPI to generate a typesafe client. Devs are using this for C++, Java, Erlang, Go, PHP etc). However, we do plan on adding more languages with native codegen! Just takes time. (PRs Welcome! 🙃)

Structured outputs with any LLM

JSON is amazing for REST APIs, but way too strict and verbose for LLMs. LLMs need something flexible. We created the SAP (schema-aligned parsing) algorithm to support the flexible outputs LLMs can provide, like markdown within a json blob or chain-of-thought prior to answering.

SAP works with any model on day-1, without depending on tool-use or function-calling APIs.

Read more about this in our R1 blog post and O1 blog post.

Streaming (when it's a first class citizen)

Streaming is way harder than it should be. With our [python/typescript/ruby] generated code, streaming becomes natural and type-safe.

No strings attached

  • 100% open-source (Apache 2)
  • 100% private. AGI will not require an internet connection, neither will BAML
    • No network requests beyond the model call you explicitly set
    • Not stored or used for any training data
  • BAML files can be saved locally on your machine and checked into Github for easy diffs.

Conclusion

As models get better, we'll continue expecting even more out of them. But what will never change is that we'll want a way to write maintainable code that uses those models. The current way we all just assemble strings is very reminiscent of the early days PHP/HTML soup in web development. We hope some of the ideas we shared today can make a tiny dent in helping us all shape the way we all code tomorrow.

Try BAML Today

Thanks for reading!