You could invent the next coding agent!
How BAML makes tool calls easy to integrate into apps

Greg Hale
Do you use Claude Code? Codex? Do you feel like there are parts you could have made yourself, even if some parts of the apps remain mysterious? Or have you thought that your agent could work so much better if only it had this one extra feature, tool, or piece of context?
If you have dreamed about hacking on coding assistants even a little, trust me, stop what you're doing, initialize a new repo and get started! It's much easier than you think. And more than ever, the limiting factor behind the next great coding agent isn't some advanced build system or leetcode-hard algorithm. It's simple creativity.
There is now an easy recipe for writing coding assistants. With BAML and some vibe coding you can spin up your own either from scratch or as a Zed plugin in just a few hours. Then you have an infinite playground for teaching the coding assistant new tricks.
The recipe
- Describe your tool calls and engineer your context in BAML.
- Build a TUI with ink or ratatui, or create an ACP plugin for Zed.
- Connect the tool calls to the filesystem.
- Play!
Vaibhav and Dex recently made a video discussing this process in detail. It's a great watch!
Dex himself is no stranger to coding assistants, having released his production-grade CodeLayer project. The full source code from the video and extensive notes are available on the AI That Works GitHub Repo.
How to write the next great coding assistant
Let's walk through the recipe. As a motivating example, let's build our assistant around a great idea that Theo shared in a recent video: Cursor should have two model dropdowns, so you can choose an expensive model for high-level planning and a different, cheaper model for simpler tasks, like implementing small independent features.
Describe your tool calls in BAML
The tools available to your agent are the core of the whole system, so that's the place to start building your agent. And in my biased opinion, BAML is the best way to define tool calls and the prompts that generate them. We have comprehensive docs on Tool calling in BAML.
The exact tools you provide to the coding agent will depend on the special features of your agent, but you will almost always need tools for reading and writing files:
class ReadFileTool {
  type "read_file_tool"
  reasoning string
  file_path string
}

class WriteFileTool {
  type "write_file_tool"
  reasoning string
  file_path string
  content string
}

class SendUserMessage {
  type "send_user_message"
  content string
}

type Tool = ReadFileTool | WriteFileTool | SendUserMessage
This example comes from my Zed ACP Plugin, which implements Theo's dual-difficulty subagent idea.
Next, write a BAML function along these lines to allow an LLM to generate tool calls that satisfy the user's request:
// Execute tasks with tool calls using Sonnet.
// A twin function, ExecuteTaskWithToolsHaiku, is identical except
// that it uses `client ClaudeHaiku`.
function ExecuteTaskWithTools(
  user_prompt: string,
  conversation_history: string
) -> Tool[] {
  client ClaudeSonnet
  prompt #"
    You are an expert coding assistant with access to tools.

    Conversation history:
    {{ conversation_history }}

    Current task: {{ user_prompt }}

    Break complex tasks into steps, e.g.:
    1. Understand the codebase
    2. Plan changes
    3. Implement
    4. Test

    This describes your output format, i.e. the encoding of tools.
    {{ ctx.output_format }}
  "#
}
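The ClaudeSonnet and ClaudeHaiku clients referenced by these functions live in your baml_src too. Here's a minimal sketch, assuming the Anthropic provider; substitute whatever model ids you prefer:

client<llm> ClaudeSonnet {
  provider anthropic
  options {
    model "claude-sonnet-4-5"        // substitute your preferred model id
    api_key env.ANTHROPIC_API_KEY
  }
}

client<llm> ClaudeHaiku {
  provider anthropic
  options {
    model "claude-3-5-haiku-latest"  // a cheaper model for simple tasks
    api_key env.ANTHROPIC_API_KEY
  }
}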
Now let's add some BAML types and an LLM function to implement Theo's idea. We need to classify the user's query as either high-level planning or simple implementation.
enum TaskComplexity {
  COMPLEX
  SIMPLE
}

class TaskClassification {
  complexity TaskComplexity
  reasoning string
}

function ClassifyTask(
  user_prompt: string
) -> TaskClassification {
  client ClaudeHaiku
  prompt #"
    Classify this query as either a complex planning task or a simple
    implementation task:

    {{ user_prompt }}

    Respond in this format:
    {{ ctx.output_format }}
  "#
}
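Before wiring anything up, you can sanity-check the classifier with a BAML test block, which runs in the BAML playground or via the CLI. A quick sketch, with a made-up prompt:

test ClassifySimpleTask {
  functions [ClassifyTask]
  args {
    user_prompt "Rename the variable `tmp` to `buffer` in src/utils.ts"
  }
}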
With these BAML types and functions, we've modeled all our LLM interactions.
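BAML's code generator turns these definitions into a typed client (the b object used throughout the rest of this post), and the tools come back as a discriminated union you can narrow on. A minimal sketch of calling the generated functions from TypeScript, assuming the default generated baml_client directory:

import { b } from "./baml_client";

async function demo() {
  // ClassifyTask returns a typed TaskClassification object
  const classification = await b.ClassifyTask("Add a dark mode toggle");
  console.log(classification.complexity, classification.reasoning);

  // ExecuteTaskWithTools returns Tool[], a discriminated union
  const tools = await b.ExecuteTaskWithTools("Read package.json", "");
  for (const tool of tools) {
    if (tool.type === "read_file_tool") {
      console.log(`Agent wants to read ${tool.file_path}`);
    }
  }
}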
Choose your platform
Do you want to build a standalone TUI like Claude Code or Codex? Or do you want an assistant that integrates with an IDE?
If you're making a TUI, just build your baml_src, crack open your existing coding assistant, and start vibe-coding the TUI. You can even point your assistant at the codex GitHub repo as a reference. Even if your app is written in TypeScript and uses ink, modern LLMs will have no trouble translating the concepts from codex, which uses Rust and ratatui.
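To make the shape concrete, here's a minimal ink sketch of a chat loop. runAgenticLoop is the orchestrator sketched in the next section (imported here from a hypothetical ./agent module), and ink-text-input is an add-on package for the input field:

import React, { useState } from "react";
import { render, Box, Text } from "ink";
import TextInput from "ink-text-input";
import { runAgenticLoop } from "./agent"; // hypothetical module holding the loop below

const App = () => {
  const [query, setQuery] = useState("");
  const [log, setLog] = useState<string[]>([]);

  const onSubmit = async (value: string) => {
    setQuery("");
    setLog((lines) => [...lines, `> ${value}`]);
    const { message } = await runAgenticLoop(value, "sonnet", []);
    setLog((lines) => [...lines, message]);
  };

  return (
    <Box flexDirection="column">
      {log.map((line, i) => (
        <Text key={i}>{line}</Text>
      ))}
      <TextInput value={query} onChange={setQuery} onSubmit={onSubmit} />
    </Box>
  );
};

render(<App />);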
Connect the tool calls with the file system
Vibe coding will likely do this for you, but it's worth briefly mentioning here. Your coding assistant is an agent that not only chooses tool calls, but also executes them.
When you call your ExecuteTaskWithTools BAML function, you get back a list of tool calls: the calls the LLM chose to implement the user's query. Depending on the tool (and on your platform, TUI vs. IDE), you will have to do some kind of file modification, either by directly editing the tool's specified file on disk or (in the Zed case) by signaling the change via the ACP SDK.
Here is a simplified version of the context orchestrator and tool executor from baml-acp.
import { readFile, writeFile } from "node:fs/promises";
import { b } from "./baml_client";

async function runAgenticLoop(
  userPrompt: string,
  selectedModel: "sonnet" | "haiku",
  priorHistory: string[] // Context from previous user queries
): Promise<{ message: string, history: string[] }> {
  // History accumulates all context across iterations
  const conversationHistory = [
    ...priorHistory,
    `User: ${userPrompt}`
  ];

  // Loop up to 10 times
  for (let iteration = 0; iteration < 10; iteration++) {
    // 1. Call the LLM with the prompt plus accumulated history
    const toolCalls = selectedModel === "haiku"
      ? await b.ExecuteTaskWithToolsHaiku(
          userPrompt,
          conversationHistory.join("\n") // Full history as context
        )
      : await b.ExecuteTaskWithTools(
          userPrompt,
          conversationHistory.join("\n")
        );

    // 2. Execute each tool the agent requested
    for (const toolCall of toolCalls) {
      // A send_user_message tool means the agent is done
      if (toolCall.type === "send_user_message") {
        return {
          message: toolCall.content,
          history: conversationHistory
        };
      }

      let result: string;
      if (toolCall.type === "read_file_tool") {
        const fileContent = await readFile(toolCall.file_path, "utf8");
        result = `Read file: ${fileContent.substring(0, 100)}...`;
      } else {
        // write_file_tool
        await writeFile(toolCall.file_path, toolCall.content);
        result = `Wrote file: ${toolCall.file_path}`;
      }
      // (The full baml-acp repo defines more tools, like running shell
      // commands; they follow the same pattern.)

      // 3. Add the tool result to history for the next iteration
      conversationHistory.push(
        `Tool: ${toolCall.type}\n` +
        `Reasoning: ${toolCall.reasoning}\n` +
        `Result: ${result}`
      );
    }
    // Loop continues: the next iteration calls the LLM with updated history
  }

  return {
    message: "Max iterations reached",
    history: conversationHistory
  };
}
This function takes a selectedModel as input, supplied by the main ACP loop, which chooses a model based on the complexity of the user's query.
async handleTask(
  userPrompt: string,
  complexModel: string,
  simpleModel: string
): Promise<AgenticLoopResult> {
  // Classify the task complexity
  const classification: TaskClassification =
    await b.ClassifyTask(userPrompt);

  // Determine which model to use
  const selectedModel =
    classification.complexity === "COMPLEX" ? complexModel : simpleModel;
  const modelName = selectedModel === "sonnet" ? "Sonnet 4.5" : "Haiku 3.5";
  const taskType =
    classification.complexity === "COMPLEX" ? "complex" : "simple";

  // Send status message to user
  await this.connection.sessionUpdate({
    sessionId: this.sessionId,
    update: {
      sessionUpdate: "agent_thought_chunk",
      content: {
        type: "text",
        text: `Task classified as ${taskType.toUpperCase()}\n` +
          `${classification.reasoning}\n` +
          `Using ${modelName} for this ${taskType} task`,
      },
    },
  });

  // Run the agentic loop with the selected model
  // (this.history: prior session context, simplified here)
  const { message } = await this.runAgenticLoop(userPrompt, selectedModel, this.history);
  ...
}
With this logic in place, we've connected our Agent to Zed through the ACP and implemented Theo's nice idea for a split-model coding assistant.
Notice the model selectors at the bottom right, and the dynamic selection by task complexity under "Thinking".
Play!
baml-acp isn't meant to be serious, production-grade software. It demonstrates that you can, and should, whip up quick experiments to try out new ideas.
See how easy it is to define new tools and execute them? Do you see how you could do some context engineering to inject different structured data into the prompt?
There are so many ways to extend coding agents. Maybe you want to encourage the model to make not a Todo list, but a Todo tree. Maybe that tree should be stored specially in your context, rather than living implicitly in the conversation history. Maybe you heard about beads, the AI-friendly, local task manager, and you want to build a tight integration between beads and a coding agent.
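That Todo-tree idea, for instance, is only a few lines of BAML away. A sketch, assuming a BAML version with support for recursive classes:

class TodoNode {
  title string
  done bool
  children TodoNode[]
}

function PlanTodoTree(user_prompt: string) -> TodoNode {
  client ClaudeSonnet
  prompt #"
    Break this task into a tree of subtasks:
    {{ user_prompt }}

    {{ ctx.output_format }}
  "#
}

Render the tree in your TUI, or serialize it back into the prompt on each iteration instead of relying on raw conversation history.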
Whatever your creative idea is, try it out! The world is your oyster.