Handling UUIDs in LLM Calls
Guidance on dealing with entity IDs in LLM functions

Greg Hale
This article will show you a trick we use for dealing with UUIDs.
UUIDs: Great for applications, expensive for LLMs
Your LLM data transformation tasks might involve UUIDs. For example, you may have a list of messages with IDs, and you want to filter the list down to the IDs of messages that have a positive sentiment.

Cases like this require the model to read and produce UUIDs. This is a problem because UUIDs have a lot of entropy and cost a lot of tokens! The more tokens in a prompt, the harder it is for the model to respond correctly.
Just how many tokens? Let's check the OpenAI Tokenizer.

Each UUID costs a whopping 24 tokens, far more than the roughly 1.25 tokens per word typical of written language.
The tokenizer can also show us that a 3-digit number is encoded by a single token. What that means is that if you have fewer than 1000 unique UUIDs in your prompt, a single token is sufficient to identify each one.
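As a quick sanity check on the sizes involved (character counts here are only a rough proxy; exact token counts come from the tokenizer itself):

```python
import uuid

# A UUID rendered as a string is always 36 characters
# (32 hex digits plus 4 hyphens); the OpenAI tokenizer
# splits that into roughly two dozen tokens.
uuid_str = str(uuid.uuid4())
print(len(uuid_str))  # 36

# A 3-digit integer ID is just 3 characters and a single
# token, so up to 1000 distinct IDs each fit in one token.
int_id = str(999)
print(len(int_id))    # 3
```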
Remapping UUIDs to ints
When working with structured data, the remapping trick is simple:
- Collect all the UUIDs from your input data.
- Find the unique ones and assign them int ids.
- Replace all the UUIDs in your prompt with their corresponding ints
- Make your LLM request
- Map all the ints in the response back to their corresponding UUIDs
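In plain Python, the whole round trip can be sketched as a few helpers. This is a minimal sketch over raw prompt text; the structured-field version later in this article is what we actually use in practice:

```python
import re

def build_mappings(uuids):
    # Steps 1-2: find the unique UUIDs and assign them int IDs
    unique = sorted(set(uuids))
    uuid_to_int = {u: str(i) for i, u in enumerate(unique)}
    int_to_uuid = {str(i): u for i, u in enumerate(unique)}
    return uuid_to_int, int_to_uuid

def remap_prompt(prompt, uuid_to_int):
    # Step 3: replace every UUID occurrence with its int ID
    for u, i in uuid_to_int.items():
        prompt = prompt.replace(u, i)
    return prompt

def restore_response(response, int_to_uuid):
    # Step 5: map int IDs in the response back to UUIDs.
    # Replace longest IDs first so "12" isn't clobbered after
    # "1" has already been substituted.
    for i in sorted(int_to_uuid, key=len, reverse=True):
        response = re.sub(rf"\b{i}\b", int_to_uuid[i], response)
    return response
```

Note that raw text substitution can collide with other numbers in the response; remapping typed ID fields in structured outputs avoids that ambiguity.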
A concrete example
Let's look at a real example where we aggregate items by their class IDs. We'll compare three approaches:
- Using UUIDs directly
- Using integers from the start
- Remapping UUIDs to integers before the LLM call
Note: You might wonder why we'd use an LLM for a simple aggregation task - after all, this is trivial to do programmatically. We're using aggregation as a benchmark because it's easy to scale (vary the number of items and IDs) and verify correctness automatically. The technique applies equally well to more realistic tasks like sentiment analysis on messages, categorizing support tickets, or any other structured data transformation where you need the LLM to accurately read and reproduce entity identifiers.
Here's a BAML function that takes items with UUID identifiers and aggregates them by class:
```baml
class ItemUuid {
  class_id string @description("UUID")
  name string
}

class AggregationUuid {
  class_id string @description("Item identifier (UUID)")
  count int
  names string[]
}

function AggregateItemsUuid(items: ItemUuid[]) -> AggregationUuid[] {
  client CustomHaiku
  prompt #"
    Aggregate the items by their class_id.
    The output should list each class_id seen in the inputs
    exactly once, collecting all the names of items with that
    class_id and counting them.

    {{ ctx.output_format }}

    {{ items }}
  "#
}
```
And here's the same function but using integer IDs:
```baml
class ItemInt {
  class_id string
  name string
}

class AggregationInt {
  class_id int
  count int
  names string[]
}

function AggregateItemsInt(items: ItemInt[]) -> AggregationInt[] {
  client CustomHaiku
  prompt #"
    Aggregate the items by their class_id.
    The output should list each class_id seen in the inputs
    exactly once, collecting all the names of items with that
    class_id and counting them.

    {{ ctx.output_format }}

    {{ items }}
  "#
}
```
The results
We ran an experiment with 200 items across 100 distinct class IDs using Claude Haiku. We ran each approach twice to check consistency. Here are the total error counts (sum of misspelled IDs, dropped IDs, and incorrect counts):
| Approach | Run 1 | Run 2 | Average |
|---|---|---|---|
| Direct UUIDs | 29 errors | 68 errors | 48.5 errors |
| Integer IDs | 7 errors | 5 errors | 6 errors |
| UUID→Int Remapping | 5 errors | 6 errors | 5.5 errors |
The UUID approach is dramatically worse: Haiku makes 29-68 errors depending on the run. The model struggles to accurately read and reproduce the high-entropy UUID strings, leading to typos, truncations, and dropped identifiers.
By contrast, both the direct integer approach and the UUID remapping approach consistently produce only 5-7 errors total. The remapping technique gives you the best of both worlds: you can use UUIDs in your application code while getting integer-level accuracy from the LLM.
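To give a rough idea of how the error totals above can be tallied, here is a sketch of the kind of checker involved (not our exact harness; the `(count, names)` representation is an assumption for illustration):

```python
def count_errors(expected, actual):
    """Compare expected aggregations to the model's output.

    Both arguments map class_id -> (count, frozenset of names).
    Tallies dropped/misspelled IDs, incorrect counts or name
    lists, and IDs the model invented.
    """
    errors = 0
    for cid, (count, names) in expected.items():
        if cid not in actual:
            errors += 1  # dropped or misspelled ID
            continue
        got_count, got_names = actual[cid]
        if got_count != count or got_names != names:
            errors += 1  # incorrect count or member list
    # IDs the model produced that never existed in the input
    errors += sum(1 for cid in actual if cid not in expected)
    return errors
```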
Implementing UUID remapping
Here's how to implement the remapping in Python:
```python
from baml_client import b
from baml_client.types import ItemInt, ItemUuid, AggregationInt, AggregationUuid

def aggregate_items_with_remapping(items: list[ItemUuid]) -> list[AggregationUuid]:
    # Step 1: Collect all UUIDs from input data
    unique_uuids = list({item.class_id for item in items})

    # Step 2: Assign them int IDs
    uuid_to_int = {uuid: str(i) for i, uuid in enumerate(unique_uuids)}
    int_to_uuid = {str(i): uuid for i, uuid in enumerate(unique_uuids)}

    # Step 3: Replace UUIDs in prompt with corresponding ints
    items_int = [
        ItemInt(class_id=uuid_to_int[item.class_id], name=item.name)
        for item in items
    ]

    # Step 4: Make LLM request
    result_int: list[AggregationInt] = b.AggregateItemsInt(items_int)

    # Step 5: Map ints in response back to corresponding UUIDs
    result_uuid = [
        AggregationUuid(
            class_id=int_to_uuid[str(agg.class_id)],
            count=agg.count,
            names=agg.names,
        )
        for agg in result_int
    ]
    return result_uuid
```
When to use this technique
This UUID remapping technique is valuable when:
- You have fewer than 1000 unique IDs (to stay in single-token territory)
- Your LLM task requires reading and reproducing IDs accurately
- You're seeing ID-related errors in your outputs
- Your IDs are high-entropy strings like UUIDs or hash values
You should benchmark your specific use case with and without UUID remapping. The benefit depends heavily on your exact task, the number of UUIDs in your prompt, and your model. For example, when we reduced our test to 100 items across 50 classes, all three approaches performed similarly (2-4 errors each), showing that UUID remapping provides diminishing returns for simpler tasks with fewer unique identifiers.
For cases where accuracy is critical, benchmarking is even more important. For instance, our experiment shows that Claude Haiku does not reach perfect aggregation accuracy even with int identifiers. When we switch the model to Claude Opus 4, we get 100% accuracy with ints but only 80% accuracy with UUIDs.
For tasks where the LLM only needs to read IDs (not produce them), or where you have more than 1000 unique IDs, you may need different optimization strategies, such as breaking your data down into smaller batches.
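One simple batching approach is to split the input greedily so that each batch stays under the single-token ID threshold. A minimal sketch (the `max_unique_ids` default and `get_id` accessor are illustrative assumptions):

```python
def batch_items(items, get_id, max_unique_ids=1000):
    """Greedily split items into batches, each containing at
    most max_unique_ids distinct identifiers."""
    batches, current, seen = [], [], set()
    for item in items:
        item_id = get_id(item)
        # Start a new batch before admitting an ID that would
        # push this batch over the distinct-ID limit.
        if item_id not in seen and len(seen) >= max_unique_ids:
            batches.append(current)
            current, seen = [], set()
        current.append(item)
        seen.add(item_id)
    if current:
        batches.append(current)
    return batches
```

For a task like aggregation you would then merge the per-batch results, since items sharing a class_id can land in different batches.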
Future: Automatic remapping in BAML
While the manual remapping approach works well, we're exploring ways to automate this pattern directly in BAML. Imagine a built-in `RemappingId` type that you could use anywhere in your schemas:
```baml
class ItemUuid {
  class_id RemappingId @description("UUID")
  name string
}

class AggregationUuid {
  class_id RemappingId @description("Item identifier (UUID)")
  count int
  names string[]
}

function AggregateItemsUuid(items: ItemUuid[]) -> AggregationUuid[] {
  client CustomHaiku
  prompt #"
    Aggregate the items by their class IDs...

    {{ ctx.output_format }}

    {{ items }}
  "#
}
```
The BAML runtime would automatically:
- Collect all `RemappingId` values from the input data
- Build a UUID→int mapping for unique identifiers
- Replace all `RemappingId` fields with their integer equivalents in the prompt
- Parse the LLM response using integer IDs
- Map the integers back to their original UUID values in the output
From the developer's perspective, you'd write your BAML functions as if you were working with UUIDs directly, but get the token efficiency and accuracy of integer IDs automatically.
If this kind of automatic optimization would be useful for your use case, we'd love to hear from you! Reach out to us on Discord or open a discussion on GitHub.
Conclusion
UUIDs are great for application code, but they're expensive and error-prone in LLM prompts. By remapping UUIDs to single-token integers before your LLM call, you can significantly reduce both token costs and error rates: in our experiments, from roughly 50% error rates down to under 5% on a 200-item aggregation task.
This is a simple transformation that can have a big impact on the reliability of your structured LLM outputs.