Detect context length exceeded errors on HTTP 400 responses #642
Merged
OpenAI returns HTTP 400 (not 429) for GPT-4 when context length is exceeded, e.g. "This model's maximum context length is 8192 tokens." The context_length_exceeded? check was only applied to 429 responses, causing these to surface as BadRequestError instead of ContextLengthExceededError.
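For reviewers who want the behavior in one place, the classification described above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the library's actual middleware: the error classes are local stand-ins, `classify_error` and `CONTEXT_LENGTH_PATTERNS` are hypothetical names, and the message patterns are assumptions about what the real `context_length_exceeded?` check matches.

```ruby
# Local stand-ins for the library's error classes, defined here only so the
# sketch runs on its own.
class BadRequestError < StandardError; end
class RateLimitError < StandardError; end
class ContextLengthExceededError < StandardError; end

# Assumed message patterns; the real check may match different wording.
CONTEXT_LENGTH_PATTERNS = [
  /maximum context length/i,
  /context length exceeded/i
].freeze

# True when the provider's error message looks like a context length error.
def context_length_exceeded?(message)
  text = message.to_s
  CONTEXT_LENGTH_PATTERNS.any? { |pattern| pattern.match?(text) }
end

# Hypothetical helper: maps an HTTP status and provider message to an error
# class. Returns nil for statuses this sketch does not cover.
def classify_error(status, message)
  case status
  when 400
    # The fix: apply the same check on 400 that was already applied on 429.
    context_length_exceeded?(message) ? ContextLengthExceededError : BadRequestError
  when 429
    context_length_exceeded?(message) ? ContextLengthExceededError : RateLimitError
  end
end

message = "This model's maximum context length is 8192 tokens."
puts classify_error(400, message) # prints ContextLengthExceededError
```

With this shape, a caller that rescues ContextLengthExceededError (for example to compact the conversation and retry) sees the same error class whether the provider answered with 400 or 429, while unrelated 400s still surface as BadRequestError.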
21b4617 to be84b6d
crmne approved these changes Feb 27, 2026
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #642   +/-   ##
=======================================
  Coverage   80.13%   80.14%
=======================================
  Files         113      113
  Lines        5095     5097    +2
  Branches     1307     1308    +1
=======================================
+ Hits         4083     4085    +2
  Misses       1012     1012
What this does
OpenAI returns HTTP 400 (not 429) when the context length is exceeded. The error message looks like: "This model's maximum context length is 8192 tokens."
Currently, ErrorMiddleware only checks context_length_exceeded? on 429 responses. This means OpenAI's context length errors surface as BadRequestError instead of ContextLengthExceededError, which breaks any downstream logic that relies on catching ContextLengthExceededError (e.g. conversation compaction, retry strategies).
This PR adds the same context_length_exceeded? check to the 400 handler, so context length errors are correctly classified regardless of whether the provider returns 400 or 429.
Type of change
Scope check
Required for new features
N/A — this is a bug fix.
Quality check
overcommit --install and all hooks pass
bundle exec rake vcr:record[provider_name]
bundle exec rspec
models.json, aliases.json
AI-generated code
API changes