What is Template Induction?

Template induction is Butter’s automatic process for detecting variable patterns in your LLM requests and responses. Instead of requiring you to manually specify bindings for every variable, Butter learns the structure of your requests over time and automatically identifies reusable templates. For example, given the query "Say hello to Erik" and the response "Hello, Erik", Butter will automatically induce the query template "Say hello to {{name}}" and the response template "Hello, {{name}}". That way, Butter can later respond to any matching query, such as "Say hello to Raymond", by binding name="Raymond".
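To make the idea of matching a query against an induced template more concrete, here is a minimal Python sketch. It is not Butter’s actual implementation; the helper names (template_to_regex, match_query, render) are hypothetical, and it assumes that placeholders use the {{name}} syntax shown above and that a simple regular expression is enough to extract the bindings.

```python
import re

# Matches placeholders such as {{name}} (syntax taken from the example above).
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def template_to_regex(template):
    """Turn a template string into a compiled regex with one named group per variable."""
    parts = PLACEHOLDER.split(template)  # literal text at even indices, names at odd
    seen = set()
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # literal text must match exactly
        elif part in seen:
            pattern += f"(?P={part})"           # repeated variable must match the same text
        else:
            seen.add(part)
            pattern += f"(?P<{part}>.+?)"       # first occurrence captures the value
    return re.compile(pattern)


def match_query(template, query):
    """Return the variable bindings if the query fits the template, else None."""
    match = template_to_regex(template).fullmatch(query)
    return match.groupdict() if match else None


def render(template, bindings):
    """Fill a template's placeholders from a bindings dictionary."""
    return PLACEHOLDER.sub(lambda m: bindings[m.group(1)], template)


# Hypothetical illustration using the templates from the example above.
bindings = match_query("Say hello to {{name}}", "Say hello to Raymond")
if bindings is not None:
    print(render("Hello, {{name}}", bindings))  # prints: Hello, Raymond
```

A real induction pipeline has to discover the templates in the first place; this sketch only shows how an already-induced pair of templates could serve a new query.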

How It Works

Template induction is enabled by default and runs asynchronously whenever Butter cannot respond to a query from cache. Butter analyzes the content of your messages and applies a multi-stage pipeline to identify variables and their interrelationships. Because this process happens in the background, it doesn’t impact your request latency.

Template Induction vs Manual Bindings

There are two ways to trigger Butter to create templates from messages:

Automatic (Template Induction)

  • Butter learns patterns automatically from your requests
  • No code changes required
  • Works best for common, recurring patterns

Manual (Butter-Bindings)

  • You explicitly declare variables using the Butter-Bindings header
  • Immediate template creation on first request
  • Useful when you know the pattern upfront
Both approaches can be used together. Manual bindings can jump-start the template learning process, while automatic induction detects patterns without any declarations from you.

Disabling Template Induction

For certain use cases, you may want to disable automatic template induction:
  • When handling sensitive or unique data that shouldn’t be templated
  • During initial testing or debugging
  • When you want complete control over template creation
To opt out, pass an extra header in your requests that sets Butter-Disable-Induction to "true". See the Butter-Disable-Induction reference for details.
Template induction is experimental and actively being improved. If you encounter unexpected behavior, you can disable it on a per-request basis while we continue to refine the feature.
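As a rough illustration of the per-request opt-out, the sketch below builds a header dictionary that includes Butter-Disable-Induction set to "true". The helper name with_induction_disabled and the Authorization value are hypothetical; only the header name and its value come from the text above. Pass the resulting dictionary to whichever HTTP client or SDK you already use to reach Butter.

```python
def with_induction_disabled(headers=None):
    """Return a copy of the given headers with template induction turned off.

    Hypothetical helper: only the header name and the "true" value are taken
    from the documentation; everything else is an assumption for illustration.
    """
    merged = dict(headers or {})                  # copy so the caller's dict is untouched
    merged["Butter-Disable-Induction"] = "true"   # string value, as HTTP headers are text
    return merged


# Example: add the opt-out header to one request's existing headers.
base_headers = {"Authorization": "Bearer <your-api-key>"}  # placeholder value
request_headers = with_induction_disabled(base_headers)
print(request_headers["Butter-Disable-Induction"])  # prints: true
```

Keeping the opt-out in a small helper makes it easy to apply to individual requests, for example those carrying sensitive data, while leaving induction enabled everywhere else.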