fix(modelinfo): offer the max effort tier in the Shift+Tab thinking cycle #3178

Merged: Sayt-0 merged 1 commit into main from fix/3100-anthropic-thinking-max-tier on Jun 19, 2026

@Sayt-0 commented Jun 19, 2026

What

The TUI Shift+Tab thinking-level cycle hid the max effort tier on several Claude models that support it: Opus 4.7/4.8 and Fable 5 were capped at xhigh, and Sonnet 4.6 was capped at high.

Root cause: anthropicTopEffort() returned a single highest tier, implicitly treating the selectable tiers as a linear ladder. But xhigh and max are independent capabilities, not a ladder: Opus 4.6 / Sonnet 4.6 accept max without xhigh, while Opus 4.7+ / Fable 5 / Mythos 5 accept both. A single returned tier cannot express that, and because max is an explicit-only tier it was dropped by effort.SupportedLevels().

This is a selector-gating fix only: the request path (pkg/model/provider/anthropic/thinking.go) already sends adaptive/<effort> verbatim, so authored thinking_budget values were never affected.

Fix

  • Replace anthropicTopEffort (one tier) with anthropicTopEfforts, which returns the full {xhigh, max} subset a model supports.
  • Recognise the Sonnet family (4.6 → max) — previously only Opus/Fable were handled.
  • Handle Mythos ids best-effort (no catalogue entry exists yet): every variant → max, the full release adds xhigh, the preview tops out at max.

Capability matrix (now matches Anthropic's effort docs)

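| Model | low | medium | high | xhigh | max |
|---|---|---|---|---|---|
| Opus 4.5 | ✓ | ✓ | ✓ | ✗ | ✗ |
| Sonnet 4.5 / Haiku | ✓ | ✓ | ✓ | ✗ | ✗ |
| Sonnet 4.6 | ✓ | ✓ | ✓ | ✗ | ✓ |
| Opus 4.6 | ✓ | ✓ | ✓ | ✗ | ✓ |
| Opus 4.7 | ✓ | ✓ | ✓ | ✓ | ✓ |
| Opus 4.8 | ✓ | ✓ | ✓ | ✓ | ✓ |
| Fable 5 | ✓ | ✓ | ✓ | ✓ | ✓ |
| Mythos 5 | ✓ | ✓ | ✓ | ✓ | ✓ |
| Mythos Preview | ✓ | ✓ | ✓ | ✗ | ✓ |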

Tests

  • Unit (TestSupportedThinkingLevels, TestAnthropicTopEfforts): all matrix models + bedrock-prefixed / dotted / date-stamped variants.
  • End-to-end (TestCycleAgentThinkingLevel_PerModelTopTier): walks a full Shift+Tab cycle for all 9 models — the automated equivalent of the manual repro.

Verified: gofmt clean, golangci-lint 0 issues, go vet clean, all tests green.

Out of scope (follow-ups surfaced by review)

  • Docs (docs/providers/anthropic/index.md, docs/guides/thinking/index.md) still describe effort-based adaptive thinking as "Opus 4.6+ only" — predates this PR and concerns the API request path, not the cycle.
  • RejectsTokenThinking enumerates only Opus 4.6/4.7/4.8 — the issue explicitly scopes that path as not affected; whether Sonnet 4.6 / Fable / Mythos reject token budgets needs authoritative confirmation before changing.

Fixes #3100

@Sayt-0 Sayt-0 requested a review from a team as a code owner June 19, 2026 15:58
The interactive thinking-level cycle hid the `max` effort tier on Claude
models that support it. `anthropicTopEffort` returned a single "top" tier,
implicitly treating the selectable tiers as a linear ladder: Opus 4.7/4.8
and Fable 5 surfaced only `xhigh` and never `max`, and Sonnet 4.6 surfaced
neither.

xhigh and max are independent capabilities, not a single ladder: Opus 4.6
and Sonnet 4.6 accept max without xhigh, while Opus 4.7+, Fable 5 and
Mythos 5 accept both. Replace `anthropicTopEffort` with
`anthropicTopEfforts`, which returns the full {xhigh, max} subset a model
supports per Anthropic's effort matrix. Sonnet is now recognised
(4.6 -> max), and Mythos ids are handled best-effort (no catalogue entry
yet).

Selector-gating fix only: the request path already sends adaptive/<effort>
verbatim, so authored thinking_budget values were unaffected.

Fixes #3100
@Sayt-0 force-pushed the fix/3100-anthropic-thinking-max-tier branch from b6ccbae to 7a06c95 on June 19, 2026 16:00

@docker-agent docker-agent left a comment

Assessment: 🟢 APPROVE

The fix correctly replaces the single-return anthropicTopEffort with anthropicTopEfforts (returning a slice), properly handling the xhigh and max capability independence, adding Sonnet 4.6 and Mythos support. The capability matrix logic and tests are thorough and cover the documented cases well.

One low-severity structural note (no blocking issues):

In claudeFamilyMinor, the guard branches inside the for range loop use return "", 0, false instead of continue:

if rest == "" || (rest[0] != '-' && rest[0] != '.') {
    return "", 0, false  // exits loop entirely instead of trying next family
}
minor, width := leadingInt(rest[1:])
if width == 0 || width > 2 {
    return "", 0, false  // same — stops after first family match attempt
}

If a model ID contains an opus prefix substring that fails the format guards, the sonnet family is never tried. In practice, real Claude IDs are single-family, so this is not a regression. A continue would make the intent explicit and guard against future ambiguity. Not blocking.

The test coverage is excellent — unit tests cover all 9 capability-matrix models plus Bedrock-prefixed, dotted, and date-stamped variants, and the new end-to-end TestCycleAgentThinkingLevel_PerModelTopTier is a solid regression guard for the Shift+Tab cycle.

        continue
    }
    if rest == "" || (rest[0] != '-' && rest[0] != '.') {
        return "", 0, false

[LOW] Loop guard uses return instead of continue, preventing subsequent family checks

The two guard branches inside the for range loop call return "", 0, false rather than continue. This means: if a model ID contains an "opus" substring that partially matches fam+"-4" but then fails the separator or digit-width check, the function exits entirely without trying "sonnet".

For all real Claude IDs today this is harmless — each ID contains exactly one family name. But the structural issue means future composite or synthetic IDs would be silently mis-classified. Using continue would match the intent of the loop and be more defensive:

if rest == "" || (rest[0] != '-' && rest[0] != '.') {
    continue  // try next family instead of returning
}
minor, width := leadingInt(rest[1:])
if width == 0 || width > 2 {
    continue  // same
}

Not blocking — no current model IDs are affected.
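
For illustration, a self-contained sketch of the `continue` variant. Only the guard conditions, the `fam+"-4"` match, and the names `claudeFamilyMinor` and `leadingInt` come from the review; the function bodies, the family list, the use of `strings.Cut`, and the sample ids are assumptions made to get something runnable, not the merged code.

```go
package main

import (
	"fmt"
	"strings"
)

// leadingInt parses the run of decimal digits at the start of s and
// returns its value and how many bytes it spans (width 0 = no digits).
func leadingInt(s string) (int, int) {
	value, width := 0, 0
	for width < len(s) && s[width] >= '0' && s[width] <= '9' {
		value = value*10 + int(s[width]-'0')
		width++
	}
	return value, width
}

// claudeFamilyMinor looks for "<family>-4" followed by "-N" or ".N" and
// returns the family and minor version N. A failed guard moves on to the
// next family (continue) instead of aborting the whole lookup.
func claudeFamilyMinor(id string) (string, int, bool) {
	for _, fam := range []string{"opus", "sonnet"} {
		_, rest, found := strings.Cut(id, fam+"-4")
		if !found {
			continue
		}
		if rest == "" || (rest[0] != '-' && rest[0] != '.') {
			continue // no separator after the major version: try next family
		}
		minor, width := leadingInt(rest[1:])
		if width == 0 || width > 2 {
			continue // not a 1-2 digit minor (e.g. a date stamp): try next family
		}
		return fam, minor, true
	}
	return "", 0, false
}

func main() {
	// The last id is synthetic: "opus" fails the separator guard, so only
	// the continue variant goes on to match "sonnet".
	for _, id := range []string{"claude-opus-4.7", "claude-sonnet-4-6", "opus-4x-sonnet-4-6"} {
		fam, minor, ok := claudeFamilyMinor(id)
		fmt.Println(id, fam, minor, ok)
	}
}
```

With `return "", 0, false` in the two guards, the synthetic third id would come back unmatched; real single-family ids behave the same either way, which is why the note is non-blocking.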

@aheritier aheritier added area/providers For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.) area/tui For features/issues/fixes related to the TUI kind/fix PR fixes a bug (maps to fix: commit prefix) labels Jun 19, 2026
@Sayt-0 Sayt-0 merged commit 6ce5097 into main Jun 19, 2026
10 checks passed
@Sayt-0 Sayt-0 deleted the fix/3100-anthropic-thinking-max-tier branch June 19, 2026 16:08
pull Bot pushed a commit to TheTechOddBug/cagent that referenced this pull request Jun 20, 2026
…hinking cycle

PR docker#3178 fixed the Shift+Tab thinking level cycle to offer the 'max'
effort tier on Claude models that support it.

Changes:
- docs/guides/thinking/index.md: Replace the flat effort-level table
  with a per-model capability matrix. Correct the Shift+Tab cycle
  description to include 'max' for Opus 4.7+, Fable 5, Mythos 5; and
  document that Sonnet 4.6 and Opus 4.6 cycle to 'max' but not 'xhigh'.
  Update the adaptive-thinking heading and callout to mention Sonnet 4.6.
- docs/providers/anthropic/index.md: Change 'Claude Opus 4.6+ only' to
  'Claude Opus 4.6+, Sonnet 4.6' in the adaptive/effort-based section;
  update the coercion note to include Sonnet 4.6.

Source: docker#3178

Development

Successfully merging this pull request may close these issues.

Shift+Tab thinking cycle hides the max effort tier on Opus 4.7/4.8 (and Sonnet 4.6, Fable 5) — single-top-tier gating can't express xhigh + max

4 participants