Detect context length exceeded errors on HTTP 400 responses #642

Merged: crmne merged 1 commit into crmne:main from plehoux:fix/context-length-exceeded-on-400 on Feb 27, 2026

@plehoux plehoux commented Feb 27, 2026

Contributor

What this does

OpenAI returns HTTP 400 (not 429) when the context length is exceeded. The error message looks like:

This model's maximum context length is 8192 tokens. However, your messages resulted in 10061 tokens (7911 in the messages, 2150 in the functions). Please reduce the length of the messages or functions.

Currently, ErrorMiddleware only checks context_length_exceeded? on 429 responses. This means OpenAI's context length errors surface as BadRequestError instead of ContextLengthExceededError, which breaks any downstream logic that relies on catching ContextLengthExceededError (e.g. conversation compaction, retry strategies).
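
To illustrate the downstream impact, here is a minimal sketch of a compaction-style retry. The error classes are local stand-ins defined only for this example (they are not RubyLLM's own classes, and their sibling hierarchy is an assumption), and `ask_with_compaction` is a hypothetical helper, not part of the library.

```ruby
# Stand-in error classes for illustration only; assumed to be siblings.
class BadRequestError < StandardError; end
class ContextLengthExceededError < StandardError; end

# Hypothetical helper: yields the message list to the block and, when the
# block raises ContextLengthExceededError, drops the oldest message and retries.
def ask_with_compaction(messages, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield messages
  rescue ContextLengthExceededError
    # Give up when out of attempts or when nothing is left to trim.
    raise if attempts >= max_attempts || messages.size <= 1

    messages = messages.drop(1)
    retry
  end
end
```

If the same failure arrives as `BadRequestError`, the `rescue` clause never fires and the error propagates to the caller without any compaction, which is the behavior described above.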

This PR adds the same context_length_exceeded? check to the 400 handler, so context length errors are correctly classified regardless of whether the provider returns 400 or 429.
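
The classification described here can be sketched roughly as follows. This is not the actual diff: the module name, the `classify` method, and the regular expression are assumptions made for illustration, with the pattern based only on the sample OpenAI message quoted above.

```ruby
# Illustrative sketch only; names and pattern are hypothetical.
module ClassificationSketch
  class Error < StandardError; end
  class BadRequestError < Error; end
  class RateLimitError < Error; end
  class ContextLengthExceededError < Error; end

  # Assumed pattern, derived from the sample error message in this PR.
  CONTEXT_LENGTH_PATTERN = /maximum context length/i

  def self.context_length_exceeded?(message)
    message.to_s.match?(CONTEXT_LENGTH_PATTERN)
  end

  # Returns the error class for a status and message, or nil for
  # statuses this sketch does not cover.
  def self.classify(status, message)
    case status
    when 400
      # The check that was previously applied only to 429 responses.
      return ContextLengthExceededError if context_length_exceeded?(message)

      BadRequestError
    when 429
      return ContextLengthExceededError if context_length_exceeded?(message)

      RateLimitError
    end
  end
end
```

With the check present in both branches, a context length message maps to the same error class whether it arrives with status 400 or 429, while other 400 and 429 responses keep their existing classification.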

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Performance improvement

Scope check

  • I read the Contributing Guide
  • This aligns with RubyLLM's focus on LLM communication
  • This isn't application-specific logic that belongs in user code
  • This benefits most users, not just my specific use case

Required for new features

N/A — this is a bug fix.

Quality check

  • I ran overcommit --install and all hooks pass
  • I tested my changes thoroughly
  • For provider changes: Re-recorded VCR cassettes with bundle exec rake vcr:record[provider_name]
    • All tests pass: bundle exec rspec
  • I updated documentation if needed
  • I didn't modify auto-generated files manually (models.json, aliases.json)

AI-generated code

  • I used AI tools to help write this code
  • I have reviewed and understand all generated code (required if above is checked)

API changes

  • Breaking change
  • New public methods/classes
  • Changed method signatures
  • No API changes

OpenAI returns HTTP 400 (not 429) for GPT-4 when context length is exceeded, e.g. "This model's maximum context length is 8192 tokens." The context_length_exceeded? check was only applied to 429 responses, causing these to surface as BadRequestError instead of ContextLengthExceededError.

@plehoux force-pushed the fix/context-length-exceeded-on-400 branch from 21b4617 to be84b6d on February 27, 2026 17:40

codecov Bot commented Feb 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.14%. Comparing base (bed3bfb) to head (be84b6d).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #642   +/-   ##
=======================================
  Coverage   80.13%   80.14%           
=======================================
  Files         113      113           
  Lines        5095     5097    +2     
  Branches     1307     1308    +1     
=======================================
+ Hits         4083     4085    +2     
  Misses       1012     1012           

@crmne crmne merged commit 66a665f into crmne:main Feb 27, 2026
22 checks passed

2 participants