fix(modelinfo): offer the max effort tier in the Shift+Tab thinking cycle by Sayt-0 · Pull Request #3178 · docker/docker-agent

Sayt-0 · 2026-06-19T15:58:00Z

What

The TUI Shift+Tab thinking-level cycle hid the max effort tier on several Claude models that support it — Opus 4.7/4.8 capped at xhigh, Sonnet 4.6 capped at high, and Fable 5 capped at xhigh.

Root cause: anthropicTopEffort() returned a single highest tier, implicitly treating the selectable tiers as a linear ladder. But xhigh and max are independent capabilities, not a ladder: Opus 4.6 / Sonnet 4.6 accept max without xhigh, while Opus 4.7+ / Fable 5 / Mythos 5 accept both. A single returned tier cannot express that, and because max is an explicit-only tier it was dropped by effort.SupportedLevels().

This is a selector-gating fix only: the request path (pkg/model/provider/anthropic/thinking.go) already sends adaptive/<effort> verbatim, so authored thinking_budget values were never affected.

Fix

Replace anthropicTopEffort (one tier) with anthropicTopEfforts, which returns the full {xhigh, max} subset a model supports.
Recognise the Sonnet family (4.6 → max) — previously only Opus/Fable were handled.
Handle Mythos ids best-effort (no catalogue entry exists yet): every variant → max, the full release adds xhigh, the preview tops out at max.
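
The shape of the change can be illustrated with a minimal sketch. This is not the code from the PR: the `model` struct, the `topEfforts` name, and the pre-parsed family/version fields are hypothetical stand-ins, and the rules are transcribed from the capability matrix in this description. It only shows why returning a subset of {xhigh, max} can express what a single "top" tier cannot.

```go
package main

import "fmt"

// Tier names as used in the PR description.
const (
	effortXHigh = "xhigh"
	effortMax   = "max"
)

// model is a hypothetical, already-parsed view of a Claude model id.
// The real code works on raw id strings; parsing is left out of this sketch.
type model struct {
	family  string // "opus", "sonnet", "haiku", "fable", "mythos"
	major   int
	minor   int
	preview bool // only meaningful for the Mythos preview in this sketch
}

// topEfforts returns the subset of {xhigh, max} a model supports,
// following the capability matrix in the PR description.
// A nil result means the model tops out at "high".
func topEfforts(m model) []string {
	switch m.family {
	case "opus":
		if m.major == 4 && m.minor >= 7 {
			return []string{effortXHigh, effortMax}
		}
		if m.major == 4 && m.minor == 6 {
			return []string{effortMax} // max without xhigh
		}
	case "sonnet":
		// Assumption: only 4.6 is listed in the matrix, so only 4.6 is matched.
		if m.major == 4 && m.minor == 6 {
			return []string{effortMax}
		}
	case "fable":
		if m.major == 5 {
			return []string{effortXHigh, effortMax}
		}
	case "mythos":
		// Every Mythos variant gets max; only the full release adds xhigh.
		if m.preview {
			return []string{effortMax}
		}
		return []string{effortXHigh, effortMax}
	}
	return nil
}

func main() {
	examples := []struct {
		name string
		m    model
	}{
		{"Opus 4.5", model{family: "opus", major: 4, minor: 5}},
		{"Sonnet 4.6", model{family: "sonnet", major: 4, minor: 6}},
		{"Opus 4.7", model{family: "opus", major: 4, minor: 7}},
		{"Mythos Preview", model{family: "mythos", major: 5, preview: true}},
	}
	for _, e := range examples {
		fmt.Printf("%-15s %v\n", e.name, topEfforts(e.m))
	}
}
```

A function returning one tier would have to pick either xhigh or max for Opus 4.7, and could not describe Sonnet 4.6 (max without xhigh) as anything other than a rung on the same ladder.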

Capability matrix (now matches Anthropic's effort docs)

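| Model | low | medium | high | xhigh | max |
|---|---|---|---|---|---|
| Opus 4.5 | ✓ | ✓ | ✓ | — | — |
| Sonnet 4.5 / Haiku | ✓ | ✓ | ✓ | — | — |
| Sonnet 4.6 | ✓ | ✓ | ✓ | — | ✓ |
| Opus 4.6 | ✓ | ✓ | ✓ | — | ✓ |
| Opus 4.7 | ✓ | ✓ | ✓ | ✓ | ✓ |
| Opus 4.8 | ✓ | ✓ | ✓ | ✓ | ✓ |
| Fable 5 | ✓ | ✓ | ✓ | ✓ | ✓ |
| Mythos 5 | ✓ | ✓ | ✓ | ✓ | ✓ |
| Mythos Preview | ✓ | ✓ | ✓ | — | ✓ |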

Tests

Unit (TestSupportedThinkingLevels, TestAnthropicTopEfforts): all matrix models + bedrock-prefixed / dotted / date-stamped variants.
End-to-end (TestCycleAgentThinkingLevel_PerModelTopTier): walks a full Shift+Tab cycle for all 9 models — the automated equivalent of the manual repro.
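
What "walking a full cycle" means can be sketched as follows. The helper names (`cycleLevels`, `nextLevel`, `walkCycle`) are hypothetical and the real TUI cycle may contain further entries (for example an "off" state) that this PR text does not describe; the sketch only assumes that the base tiers low/medium/high are followed by whatever extra tiers the model supports, and that the cycle wraps around.

```go
package main

import "fmt"

// cycleLevels builds the ordered list of selectable levels:
// the base tiers followed by the model-specific extras (a subset of xhigh, max).
func cycleLevels(extra []string) []string {
	levels := []string{"low", "medium", "high"}
	return append(levels, extra...)
}

// nextLevel returns the level after current, wrapping to the first one.
// An unknown current level restarts the cycle at the first level.
func nextLevel(levels []string, current string) string {
	for i, l := range levels {
		if l == current {
			return levels[(i+1)%len(levels)]
		}
	}
	return levels[0]
}

// walkCycle simulates one key press per level, starting at the first level,
// and records every level that was visited on the way.
func walkCycle(levels []string) []string {
	visited := make([]string, 0, len(levels))
	cur := levels[0]
	for range levels {
		visited = append(visited, cur)
		cur = nextLevel(levels, cur)
	}
	return visited
}

func main() {
	// Sonnet 4.6: max without xhigh. Opus 4.7: both.
	fmt.Println("Sonnet 4.6:", walkCycle(cycleLevels([]string{"max"})))
	fmt.Println("Opus 4.7:  ", walkCycle(cycleLevels([]string{"xhigh", "max"})))
}
```

A regression test in this style asserts that the visited list for each model ends at the tier the matrix promises, which is exactly what the manual repro checks by hand.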

Verified: gofmt clean, golangci-lint 0 issues, go vet clean, all tests green.

Out of scope (follow-ups surfaced by review)

Docs (docs/providers/anthropic/index.md, docs/guides/thinking/index.md) still describe effort-based adaptive thinking as "Opus 4.6+ only" — predates this PR and concerns the API request path, not the cycle.
RejectsTokenThinking enumerates only Opus 4.6/4.7/4.8 — the issue explicitly scopes that path as not affected; whether Sonnet 4.6 / Fable / Mythos reject token budgets needs authoritative confirmation before changing.

Fixes #3100

The interactive thinking-level cycle hid the `max` effort tier on Claude models that support it. `anthropicTopEffort` returned a single "top" tier, implicitly treating the selectable tiers as a linear ladder: Opus 4.7/4.8 and Fable 5 surfaced only `xhigh` and never `max`, and Sonnet 4.6 surfaced neither.

xhigh and max are independent capabilities, not a single ladder: Opus 4.6 and Sonnet 4.6 accept max without xhigh, while Opus 4.7+, Fable 5 and Mythos 5 accept both.

Replace `anthropicTopEffort` with `anthropicTopEfforts`, which returns the full {xhigh, max} subset a model supports per Anthropic's effort matrix. Sonnet is now recognised (4.6 -> max), and Mythos ids are handled best-effort (no catalogue entry yet).

Selector-gating fix only: the request path already sends adaptive/<effort> verbatim, so authored thinking_budget values were unaffected.

Fixes #3100

docker-agent

Assessment: 🟢 APPROVE

The fix correctly replaces the single-return anthropicTopEffort with anthropicTopEfforts, which returns a slice. This treats xhigh and max as independent capabilities and adds Sonnet 4.6 and Mythos support. The capability matrix logic and tests are thorough and cover the documented cases well.

One low-severity structural note (no blocking issues):

In claudeFamilyMinor, the guard branches inside the for range loop use return "", 0, false instead of continue:

if rest == "" || (rest[0] != '-' && rest[0] != '.') {
    return "", 0, false  // exits loop entirely instead of trying next family
}
minor, width := leadingInt(rest[1:])
if width == 0 || width > 2 {
    return "", 0, false  // same — stops after first family match attempt
}

If a model ID contains an opus prefix substring that fails the format guards, the sonnet family is never tried. In practice, real Claude IDs are single-family, so this is not a regression. A continue would make the intent explicit and guard against future ambiguity. Not blocking.

The test coverage is excellent — unit tests cover all 9 capability-matrix models plus Bedrock-prefixed, dotted, and date-stamped variants, and the new end-to-end TestCycleAgentThinkingLevel_PerModelTopTier is a solid regression guard for the Shift+Tab cycle.

docker-agent · 2026-06-19T16:01:05Z

+			continue
+		}
+		if rest == "" || (rest[0] != '-' && rest[0] != '.') {
+			return "", 0, false


[LOW] Loop guard uses return instead of continue, preventing subsequent family checks

The two guard branches inside the for range loop call return "", 0, false rather than continue. This means: if a model ID contains an "opus" substring that partially matches fam+"-4" but then fails the separator or digit-width check, the function exits entirely without trying "sonnet".

For all real Claude IDs today this is harmless — each ID contains exactly one family name. But the structural issue means future composite or synthetic IDs would be silently mis-classified. Using continue would match the intent of the loop and be more defensive:

if rest == "" || (rest[0] != '-' && rest[0] != '.') {
    continue // try next family instead of returning
}
minor, width := leadingInt(rest[1:])
if width == 0 || width > 2 {
    continue // same
}

Not blocking — no current model IDs are affected.
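
To make the difference concrete, here is a self-contained sketch of the `continue` variant. Only the two guard conditions, the `fam+"-4"` match, and the `leadingInt(rest[1:])` call shape come from the review above; the family list, the use of `strings.Cut`, the body of `leadingInt`, and the synthetic id in `main` are assumptions made for illustration and may differ from the real claudeFamilyMinor.

```go
package main

import (
	"fmt"
	"strings"
)

// leadingInt parses the decimal digits at the start of s and returns the
// value together with the number of digits consumed (0 if there are none).
func leadingInt(s string) (int, int) {
	n, width := 0, 0
	for width < len(s) && s[width] >= '0' && s[width] <= '9' {
		n = n*10 + int(s[width]-'0')
		width++
	}
	return n, width
}

// claudeFamilyMinor extracts the family and the minor version from ids such
// as "claude-opus-4-7" or "claude-sonnet-4.6". Every failed guard moves on
// to the next family instead of returning early.
func claudeFamilyMinor(id string) (string, int, bool) {
	for _, fam := range []string{"opus", "sonnet"} {
		_, rest, found := strings.Cut(id, fam+"-4")
		if !found {
			continue
		}
		// The major version must be followed by a '-' or '.' separator.
		if rest == "" || (rest[0] != '-' && rest[0] != '.') {
			continue // try next family instead of returning
		}
		// One or two digits are a minor version; longer runs are date stamps.
		minor, width := leadingInt(rest[1:])
		if width == 0 || width > 2 {
			continue // same
		}
		return fam, minor, true
	}
	return "", 0, false
}

func main() {
	for _, id := range []string{
		"claude-opus-4-7",
		"claude-sonnet-4.6",
		"opus-4x-sonnet-4-6", // synthetic id: the opus match fails its guard
	} {
		fam, minor, ok := claudeFamilyMinor(id)
		fmt.Println(id, fam, minor, ok)
	}
}
```

With early returns in the guards, the synthetic third id would be rejected as soon as the opus match fails its separator check; with `continue`, the loop falls through to the sonnet family and classifies it.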

…hinking cycle

PR docker#3178 fixed the Shift+Tab thinking level cycle to offer the 'max' effort tier on Claude models that support it.

Changes:
- docs/guides/thinking/index.md: Replace the flat effort-level table with a per-model capability matrix. Correct the Shift+Tab cycle description to include 'max' for Opus 4.7+, Fable 5, Mythos 5; and document that Sonnet 4.6 and Opus 4.6 cycle to 'max' but not 'xhigh'. Update the adaptive-thinking heading and callout to mention Sonnet 4.6.
- docs/providers/anthropic/index.md: Change 'Claude Opus 4.6+ only' to 'Claude Opus 4.6+, Sonnet 4.6' in the adaptive/effort-based section; update the coercion note to include Sonnet 4.6.

Source: docker#3178

Sayt-0 requested a review from a team as a code owner June 19, 2026 15:58

dgageot approved these changes Jun 19, 2026

Sayt-0 force-pushed the fix/3100-anthropic-thinking-max-tier branch from b6ccbae to 7a06c95 Compare June 19, 2026 16:00

docker-agent reviewed Jun 19, 2026

aheritier added the area/providers, area/tui, and kind/fix labels Jun 19, 2026

Sayt-0 merged commit 6ce5097 into main Jun 19, 2026
10 checks passed

Sayt-0 deleted the fix/3100-anthropic-thinking-max-tier branch June 19, 2026 16:08

BrewTestBot mentioned this pull request Jun 19, 2026

docker-agent 1.83.0 Homebrew/homebrew-core#288848

Merged

aheritier mentioned this pull request Jun 20, 2026

docs: update /docs for PRs merged 2026-06-18–20 #3183

Merged
