Structured outputs with Gemini 2.0
Here is how we can get structured output from Gemini 2.0 reliably without having to write tedious JSON schemas.
For this, we are going to use BAML, a simple prompt configuration format that lets you write prompts with types and then import them in your code. It keeps your code clean and organized and, best of all, comes with a VSCode playground so you can run your prompts immediately.
Then, when you want to run it in Python, save the BAML file and the VSCode extension will generate a baml_client with your types and your function:
from baml_client import b
response = b.ExtractHierarchy(message="""
George is the CEO of the company.
Kelly is the VP of Sales. Asif is the global head of product development.
Mohammed manages the shopping cart experience.
Tim manages sales in South.
Stefan is responsible for sales in the f100 company.
Carol is in charge of user experience""")
print(response) # fully type-safe and validated!
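To make "type-safe and validated" concrete, here is a minimal sketch of the kind of parsing and checking a typed client saves you from writing by hand. It does not use BAML or call Gemini. The `Person` shape, the `parse_person` helper, and the sample JSON are all hypothetical stand-ins, since the real fields come from the classes you declare in your BAML file.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Person:
    # Hypothetical shape; the real fields come from your BAML class definitions.
    name: str
    role: str
    reports: list["Person"] = field(default_factory=list)


def parse_person(raw: object) -> Person:
    """Check a decoded JSON object and convert it into a Person tree."""
    if not isinstance(raw, dict):
        raise ValueError(f"expected an object, got: {raw!r}")
    if not isinstance(raw.get("name"), str) or not isinstance(raw.get("role"), str):
        raise ValueError(f"expected string 'name' and 'role', got: {raw!r}")
    reports = raw.get("reports", [])
    if not isinstance(reports, list):
        raise ValueError("'reports' must be a list")
    # Recurse so that every nested report is validated as well.
    return Person(raw["name"], raw["role"], [parse_person(r) for r in reports])


# Hand-written sample standing in for raw model output (illustrative only).
raw_output = """
{"name": "George", "role": "CEO", "reports": [
  {"name": "Kelly", "role": "VP of Sales", "reports": [
    {"name": "Tim", "role": "Sales, South"}
  ]}
]}
"""

root = parse_person(json.loads(raw_output))
print(root.reports[0].reports[0].name)
```

With a generated client, this hand-written checking is replaced by the types produced from your BAML definitions, so the calling code only deals with already-validated objects.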
Tool calling with Gemini 2.0 and BAML
Here is another example, this time using function calling with Gemini 2.0, where Gemini decides whether to book an appointment or search for an item.
You can call it from Python like this (we also support other languages!):
from baml_client import b
# The imported names must match the classes declared in your BAML file.
from baml_client.types import ItemSearch, BookAppointment

response = b.ChooseOneTool(user_message="Find me running shoes under $100 in the sports category")
print(response)

# The result is one of the tool types, so branch on its class.
if isinstance(response, ItemSearch):
    print("Item Search called:")
    print(f"Query: {response.query}")
    print(f"Max Price: ${response.maxPrice}")
    print(f"Category: {response.category}")
elif isinstance(response, BookAppointment):
    print("Book Appointment called:")
    print(f"Customer: {response.clientName}")
    print(f"Service: {response.serviceRequested}")
    print(f"Date: {response.datePreferred}")
    print(f"Duration: {response.timeDuration} minutes")
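If you want to try the branching above without calling a model, here is a minimal sketch that uses hand-written stand-ins for the two tool types. The class and field names are taken from the snippet above. The field types and the `describe` helper are assumptions made for illustration and are not part of the generated client.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ItemSearch:
    # Field names from the snippet above; the types are assumed.
    query: str
    maxPrice: float
    category: str


@dataclass
class BookAppointment:
    # Field names from the snippet above; the types are assumed.
    clientName: str
    serviceRequested: str
    datePreferred: str
    timeDuration: int


def describe(response: Union[ItemSearch, BookAppointment]) -> str:
    """Hypothetical helper: turn whichever tool was chosen into one line of text."""
    if isinstance(response, ItemSearch):
        return (f"search {response.category!r} for {response.query!r} "
                f"up to ${response.maxPrice}")
    if isinstance(response, BookAppointment):
        return (f"book {response.serviceRequested!r} for {response.clientName} "
                f"on {response.datePreferred} ({response.timeDuration} min)")
    # Fail loudly if a new tool type is added without a matching branch.
    raise TypeError(f"unexpected tool type: {type(response).__name__}")


print(describe(ItemSearch(query="running shoes", maxPrice=100, category="sports")))
```

Returning a string instead of printing inside each branch keeps the dispatch easy to unit test, and the final `TypeError` makes a missing branch obvious if another tool is added later.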
That's it!
BAML works with many languages (Ruby, TypeScript, Python, etc.), so feel free to check those out!
Check out the docs for more!