Fix extract_cacheable_prefix to handle string content with message-level cache_control #19266

Merged
Sameerlite merged 1 commit into BerriAI:litellm_staging_01_20_2026 from VedantMadane:fix-prompt-caching-string-content on Jan 20, 2026

@VedantMadane VedantMadane commented Jan 17, 2026

Contributor

Relevant issues

Fixes #19228

Summary

Fixed extract_cacheable_prefix in PromptCachingCache to correctly handle messages where cache_control is a sibling key of string content.

Problem

When a message has string content with a sibling cache_control key like:
    {
      "role": "user",
      "content": "large_message",
      "cache_control": {"type": "ephemeral", "ttl": "5m"}
    }

The extract_cacheable_prefix function would return an empty prefix because it only looked for cache_control inside content blocks (when content is a list), not at the message level.

This is a valid message format per LiteLLM's ChatCompletionUserMessage type definition (lines 692-693 in types/llms/openai.py).

Solution

Added a check for cache_control at the message level before checking within content blocks. When a message has message-level cache_control with type="ephemeral", the entire message is now correctly included in the cacheable prefix.
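
The approach can be sketched roughly as follows. This is an illustration under stated assumptions rather than the merged implementation: `extract_prefix_sketch` and `_is_ephemeral` are hypothetical names, and skipping the block scan once a message-level marker is found is a simplification chosen for this sketch.

```python
import copy
from typing import Any, Dict, List, Optional

Message = Dict[str, Any]


def _is_ephemeral(obj: Any) -> bool:
    """True if obj is a dict carrying cache_control with type == "ephemeral"."""
    if not isinstance(obj, dict):
        return False
    cache_control = obj.get("cache_control")
    return isinstance(cache_control, dict) and cache_control.get("type") == "ephemeral"


def extract_prefix_sketch(messages: List[Message]) -> List[Message]:
    """Return messages up to and including the last cacheable message or block."""
    last_message_idx: Optional[int] = None
    last_content_idx: Optional[int] = None  # None means the entire message is cacheable

    for msg_idx, message in enumerate(messages):
        # Message-level check first (assumed ordering, per the PR description).
        if _is_ephemeral(message):
            last_message_idx = msg_idx
            last_content_idx = None
            continue  # sketch assumption: the marker covers the whole message
        content = message.get("content")
        if isinstance(content, list):
            for content_idx, block in enumerate(content):
                if _is_ephemeral(block):
                    last_message_idx = msg_idx
                    last_content_idx = content_idx

    if last_message_idx is None:
        return []

    prefix = copy.deepcopy(messages[: last_message_idx + 1])
    if last_content_idx is not None:
        # Block-level marker: keep content blocks only up to the marked one.
        prefix[-1]["content"] = prefix[-1]["content"][: last_content_idx + 1]
    return prefix


conversation = [
    {"role": "system", "content": "You are helpful."},
    {
        "role": "user",
        "content": "large_message",
        "cache_control": {"type": "ephemeral", "ttl": "5m"},
    },
    {"role": "user", "content": "follow-up question"},
]
string_prefix = extract_prefix_sketch(conversation)

block_conversation = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "a", "cache_control": {"type": "ephemeral"}},
            {"type": "text", "text": "b"},
        ],
    }
]
block_prefix = extract_prefix_sketch(block_conversation)
```

Using `None` as the content index keeps one variable for both cases: a number truncates the block list, while `None` leaves the message untouched.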

Changes

  • litellm/router_utils/prompt_caching_cache.py:

    • Added check for message-level cache_control in extract_cacheable_prefix
    • Set last_cacheable_content_idx = None to indicate entire message is cacheable
  • tests/router_unit_tests/test_router_prompt_caching.py:

    • Added 3 new tests for string content with message-level cache_control
    • Tests cover: string with cache_control, string without cache_control, mixed formats
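
The three cases named above could be exercised along these lines. This is a hedged sketch, not the tests from the PR: `count_prefix_messages` is a hypothetical stand-in that only counts how many messages fall inside the prefix, and the test names are illustrative.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def _marked(obj: Any) -> bool:
    """True if obj is a dict carrying cache_control with type == "ephemeral"."""
    cache_control = obj.get("cache_control") if isinstance(obj, dict) else None
    return isinstance(cache_control, dict) and cache_control.get("type") == "ephemeral"


def count_prefix_messages(messages: List[Message]) -> int:
    """Hypothetical stand-in: number of messages up to the last cacheable one."""
    last = -1
    for idx, message in enumerate(messages):
        content = message.get("content")
        blocks = content if isinstance(content, list) else []
        if _marked(message) or any(_marked(block) for block in blocks):
            last = idx
    return last + 1


def test_string_with_cache_control() -> None:
    messages = [
        {"role": "user", "content": "large_message", "cache_control": {"type": "ephemeral"}}
    ]
    assert count_prefix_messages(messages) == 1


def test_string_without_cache_control() -> None:
    messages = [{"role": "user", "content": "large_message"}]
    assert count_prefix_messages(messages) == 0


def test_mixed_formats() -> None:
    messages = [
        {"role": "system", "content": "rules", "cache_control": {"type": "ephemeral"}},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "doc", "cache_control": {"type": "ephemeral"}}
            ],
        },
        {"role": "user", "content": "question"},
    ]
    assert count_prefix_messages(messages) == 2
```

Written as plain functions with bare assertions, these can be collected by pytest or called directly.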

Pre-Submission checklist

  • I have added testing in the tests/litellm/ directory
  • My PR passes all unit tests on make test-unit (unable to run locally due to dependencies)
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

Bug Fix

CLAassistant commented Jan 17, 2026

CLA assistant check
All committers have signed the CLA.

vercel Bot commented Jan 17, 2026

Project Deployment Review Updated (UTC)
litellm Error Error Jan 17, 2026 5:07am

@Sameerlite Sameerlite changed the base branch from main to litellm_staging_01_20_2026 January 20, 2026 04:41

@Sameerlite Sameerlite left a comment

Collaborator

LGTM

@Sameerlite Sameerlite merged commit 931998f into BerriAI:litellm_staging_01_20_2026 Jan 20, 2026
4 of 8 checks passed
fzowl pushed a commit to fzowl/litellm that referenced this pull request Jun 24, 2026
…-string-content

Fix extract_cacheable_prefix to handle string content with message-level cache_control
Development

Successfully merging this pull request may close these issues.

[Bug]: PromptCachingCache extract_cacheable_prefix broken when message.content is a string?

3 participants