
Getting Started

This guide will walk you through:
  • Setup
  • Your first request
  • Your first cache hits:
    • Exact content matching
    • Template-aware matching

Step 1: Get an API Key

Sign in to the Butter dashboard. This creates an empty cache and an API key you’ll use to authenticate requests. To view your active API keys, visit the Keys page. Export your Butter API key, along with the API key for your model provider (here, OpenAI):
export BUTTER_API_KEY=your-butter-api-key
export OPENAI_API_KEY=your-openai-api-key

Step 2: Configure your Client

Modify your LLM client or curl command to use Butter’s base URL:
export BASE_URL=https://proxy.butter.dev
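If you’re calling the proxy from Python rather than curl, the routing boils down to a base URL plus two headers. A minimal stdlib sketch, assuming the environment variables from Step 1 are set (the helper name `butter_headers` is illustrative, not part of any SDK):

```python
import os

# Butter's proxy base URL from this step.
BASE_URL = os.environ.get("BASE_URL", "https://proxy.butter.dev")

def butter_headers(openai_key: str, butter_key: str) -> dict:
    """Assemble the headers Butter expects: the provider key in
    Authorization, and the Butter key in Butter-Auth."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {openai_key}",
        "Butter-Auth": f"Bearer {butter_key}",
    }

headers = butter_headers(
    os.environ.get("OPENAI_API_KEY", "sk-placeholder"),
    os.environ.get("BUTTER_API_KEY", "butter-placeholder"),
)
```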

Step 3: Make your First Request

Let’s submit our first message:
curl -X POST $BASE_URL/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Butter-Auth: Bearer $BUTTER_API_KEY" \
-d '{"messages":[{"role":"user", "content":"What is the English word for mantequilla?"}],"model":"gpt-4o"}'

Step 4: Make your Second Request

Invoke the previous request once more. You should see a cache hit.
{
  "id":"cached",
  # ...
  "choices":[
    {
      # ...
      "message":{
        "content":"The English word for \"mantequilla\" is \"butter\".",
        "role":"assistant"
      }
    }
  ]
}
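In the sample response above, the hit is marked with "id": "cached". A minimal check, assuming that marker is stable (the function name is illustrative):

```python
import json

# Abbreviated version of the sample response above.
sample = """
{
  "id": "cached",
  "choices": [
    {"message": {"role": "assistant",
                 "content": "The English word for \\"mantequilla\\" is \\"butter\\"."}}
  ]
}
"""

def is_cache_hit(response: dict) -> bool:
    # Assumption: cached responses carry id == "cached", as in the sample.
    return response.get("id") == "cached"

resp = json.loads(sample)
print(is_cache_hit(resp))  # True for the sample above
```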

Step 5: View your Cache

Visit the Requests feed to view your requests and their cache status. The cache is a tree of all message sequences that have been seen. You can view a message’s respective node in the tree by clicking on it from the request view.
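Conceptually, that tree behaves like a prefix tree over messages: each node corresponds to a message seen at that position in some conversation, and a completion hangs off the node that ends a cached sequence. A toy sketch of the idea (the class and field names are illustrative, not Butter’s actual schema):

```python
class CacheNode:
    """One message in the tree; children are keyed by the next message."""
    def __init__(self):
        self.children = {}    # (role, content) -> CacheNode
        self.response = None  # cached completion ending here, if any

def insert(root: CacheNode, messages: list, response: str) -> None:
    """Walk (creating nodes as needed) and store the completion at the leaf."""
    node = root
    for m in messages:
        key = (m["role"], m["content"])
        node = node.children.setdefault(key, CacheNode())
    node.response = response

def lookup(root: CacheNode, messages: list):
    """Return the cached completion for an exact message sequence, or None."""
    node = root
    for m in messages:
        node = node.children.get((m["role"], m["content"]))
        if node is None:
            return None
    return node.response

root = CacheNode()
msgs = [{"role": "user", "content": "What is the English word for mantequilla?"}]
insert(root, msgs, "butter")
```

With exact content matching, only an identical message sequence reaches the stored node; template-aware matching (covered next) relaxes that constraint.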

Next Steps

Now that you’ve run your first requests, continue onward to template-aware caching, or learn more about the cache structure.