#92688: fix(qwen): use DashScope native image format for Qwen vision models by sheyanmin · Pull Request #92704 · openclaw/openclaw

sheyanmin · 2026-06-13T14:23:03Z

Summary

Fix Qwen vision models returning 400 "Unexpected item type in content" on DashScope by converting image content parts from standard OpenAI format (type: image_url) to DashScope native format (type: image).

Root Cause

DashScope's OpenAI-compatible chat completions endpoint (/compatible-mode/v1/chat/completions) does not support the standard image_url content part type for Qwen vision models (qwen3.7-max, qwen3.7-plus, etc.). When the image tool sends multimodal requests, the image is formatted as {type: "image_url", image_url: {url: "data:..."}} which DashScope rejects with HTTP 400:

400 InternalError.Algo.InvalidParameter: The provided messages input is invalid. 
The error info is [Unexpected item type in content.]

The DashScope native multimodal API expects images as {type: "image", image: "data:..."} — a flat structure where the image field is a direct data URI string rather than a {url: ...} wrapper object. This fix detects DashScope endpoints (by provider name or base URL) and converts the image format accordingly, while preserving standard OpenAI format for all other providers.

Real behavior proof

behavior

Detect DashScope endpoints and convert image content parts from OpenAI image_url format to DashScope native image format.

environment

OS: Windows 10 Enterprise LTSC 2019
Runtime: Node.js v24.14.0
Setup: OpenClaw workspace with tsx for TypeScript execution

steps

Import convertMessages from production code (src/llm/providers/openai-completions.js)
Construct a Qwen/DashScope model config (provider: "qwen", baseUrl: dashscope.aliyuncs.com)
Construct a standard OpenAI model config for comparison
Build messages containing both text and image content parts
Call convertMessages with each model and inspect the content format
Verify DashScope models get type: "image" while OpenAI models keep type: "image_url"

observedResult

Reproduction script output (./node_modules/.bin/tsx scripts/repro-92688.ts):

======================================================================
Issue #92688 — DashScope Qwen Vision Image Format Fix
======================================================================

--- Test 1: DashScope (qwen) model ---
   Text part: What's in this image?
✅ Image format: type=image (DashScope native format)
   image field starts with: data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEA...

--- Test 2: Standard OpenAI model ---
   Text part: What's in this image?
✅ Image format: type=image_url (standard OpenAI format)
   image_url.url starts with: data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEA...

--- Test 3: qwen-dashscope provider ---
✅ Image format: type=image (DashScope native format)

--- Test 4: Custom provider with dashscope.aliyuncs.com baseUrl ---
✅ Image format: type=image (DashScope native format by URL detection)

======================================================================
All checks passed — DashScope/Qwen endpoints now receive correct image format.
======================================================================

Production change:

// Detection: provider includes "dashscope", provider is "qwen"/"qwen-dashscope",
// or baseUrl includes "dashscope.aliyuncs.com"
function isDashScopeEndpoint(model: Model<"openai-completions">): boolean {
  const provider = model.provider?.toLowerCase() ?? "";
  const baseUrl = model.baseUrl?.toLowerCase() ?? "";
  return (
    provider.includes("dashscope") ||
    provider === "qwen" ||
    provider === "qwen-dashscope" ||
    baseUrl.includes("dashscope.aliyuncs.com")
  );
}

// In convertMessages(), user message image formatting:
if (useDashScopeFormat) {
  return {
    type: "image",                              // ← DashScope native type
    image: `data:${item.mimeType};base64,${item.data}`,  // ← direct string
  } as unknown as ChatCompletionContentPart;
}
// Standard OpenAI format (unchanged for all other providers):
return {
  type: "image_url",
  image_url: {
    url: `data:${item.mimeType};base64,${item.data}`,
  },
} satisfies ChatCompletionContentPartImage;
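To make the difference between the two wire shapes concrete, here is a minimal self-contained sketch. The names `ImageItem`, `OpenAIImagePart`, `DashScopeImagePart` and `toImagePart` are hypothetical and exist only for illustration; the actual change lives inside convertMessages() and uses the OpenAI SDK content-part types.

```typescript
// Hypothetical local types for illustration only; the production code uses
// the OpenAI SDK's ChatCompletionContentPart types instead.
interface ImageItem {
  mimeType: string;
  data: string; // base64-encoded image bytes
}

type OpenAIImagePart = { type: "image_url"; image_url: { url: string } };
type DashScopeImagePart = { type: "image"; image: string };
type ImagePart = OpenAIImagePart | DashScopeImagePart;

// Build one image content part in either wire shape.
function toImagePart(item: ImageItem, useDashScopeFormat: boolean): ImagePart {
  const dataUri = `data:${item.mimeType};base64,${item.data}`;
  if (useDashScopeFormat) {
    // Flat shape: the data URI is the value of `image` itself.
    return { type: "image", image: dataUri };
  }
  // Standard OpenAI shape: the data URI is wrapped in `image_url.url`.
  return { type: "image_url", image_url: { url: dataUri } };
}

const sample: ImageItem = { mimeType: "image/png", data: "iVBORw0KGgo=" };
console.log(JSON.stringify(toImagePart(sample, true)));
console.log(JSON.stringify(toImagePart(sample, false)));
```

Modelling both shapes as one union is just one possible design; it would avoid the `as unknown as` cast, at the cost of diverging from the SDK's own types.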

Regression Test Plan

Run pnpm test -- --run src/llm/providers/openai-completions.test.ts — existing OpenAI completions streaming tests pass unchanged
Verify DashScope endpoint detection covers: provider "qwen", provider "qwen-dashscope", any provider with URL containing "dashscope.aliyuncs.com"
Verify standard OpenAI provider images remain as image_url format (no regression)
Change is minimal (1 file, 42 insertions, 9 deletions) and does not affect any non-DashScope code path
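The detection cases in the plan above can be checked with a small table-driven sketch. `ModelLike` is a hypothetical stand-in for the real `Model<"openai-completions">` type, the base URLs are illustrative, and the predicate is restated here only so the block runs on its own.

```typescript
// Hypothetical minimal model shape; the real Model type has more fields.
interface ModelLike {
  provider?: string;
  baseUrl?: string;
}

// Behaviorally equivalent to the predicate in the PR. The explicit
// "qwen-dashscope" comparison is omitted because includes("dashscope")
// already covers it.
function isDashScopeEndpoint(model: ModelLike): boolean {
  const provider = model.provider?.toLowerCase() ?? "";
  const baseUrl = model.baseUrl?.toLowerCase() ?? "";
  return (
    provider.includes("dashscope") ||
    provider === "qwen" ||
    baseUrl.includes("dashscope.aliyuncs.com")
  );
}

// Each entry pairs a model config with the expected detection result.
const cases: Array<[ModelLike, boolean]> = [
  [{ provider: "qwen", baseUrl: "https://dashscope.aliyuncs.com/compatible-mode/v1" }, true],
  [{ provider: "qwen-dashscope" }, true],
  [{ provider: "custom", baseUrl: "https://dashscope.aliyuncs.com/compatible-mode/v1" }, true],
  [{ provider: "Qwen" }, true], // matching is case-insensitive
  [{ provider: "openai", baseUrl: "https://api.openai.com/v1" }, false],
  [{}, false],
  // Provider "qwen" matches regardless of host, even a non-DashScope one.
  [{ provider: "qwen", baseUrl: "https://example.invalid/v1" }, true],
];

const results = cases.map(([model, expected]) => isDashScopeEndpoint(model) === expected);
console.log(results.every(Boolean) ? "all cases match" : "mismatch");
```

The last case is worth noting: because the provider name alone is sufficient, a model registered as "qwen" but served from a different host would also receive the native image shape.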

AI-assisted: built with Claude Code

DashScope's OpenAI-compatible endpoint rejects the standard `image_url` content part type with 'Unexpected item type in content' for Qwen vision models. Convert to DashScope native format (`type: image` with direct data URI string) when the provider or baseUrl indicates a DashScope endpoint. Detection: provider includes 'dashscope', provider is 'qwen' or 'qwen-dashscope', or baseUrl includes 'dashscope.aliyuncs.com'. Closes openclaw#92688

clawsweeper · 2026-06-13T14:24:47Z

Codex review: needs real behavior proof before merge. Reviewed June 15, 2026, 2:02 PM ET / 18:02 UTC.

Summary
This PR changes src/llm/providers/openai-completions.ts to detect Qwen/DashScope endpoints and serialize user and tool-result images as { type: "image", image: dataUri } instead of image_url.

PR surface: Source +33. Total +33 across 1 file.

Reproducibility: yes, for the PR regression: source and provider-contract inspection show the patch changes compatible-mode image requests away from the documented image_url. The original user-facing DashScope 400 still needs live credentials for an end-to-end reproduction.

Review metrics: 1 noteworthy metric.

Provider Wire Format: 1 serializer branch added. The added branch changes the request payload sent to every detected DashScope/Qwen OpenAI-compatible image request.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🦪 silver shellfish
Patch quality: 🧂 unranked krab
Result: blocked until stronger real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

[P1] Add redacted live DashScope compatible-mode terminal output or logs showing the image request succeeds after the fix and the reported 400 is gone.
[P1] Keep compatible-mode images as image_url, or provide current provider docs/live proof that { type: "image" } is required despite the official compatible-mode examples.
Remove the unused DashScopeImageContentPart alias so typecheck and lint can pass.

Proof guidance:

[P1] Needs stronger real behavior proof before merge: The PR body shows local terminal output from a serialization script, but not a redacted live DashScope compatible-mode request proving the 400 is gone. After adding live output or logs (with secrets and private details redacted), update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

[P1] Merging this PR can make existing DashScope OpenAI-compatible image requests use an undocumented { type: "image" } payload even though the official compatible-mode examples use image_url.
[P1] The PR body proves only local serialization output, not a real DashScope request succeeding or the reported 400 disappearing after the patch.
[P1] The sibling PRs for the same issue are still open and unmerged, so this branch cannot be closed as safely superseded by a landed fix.

Maintainer options:

Keep Compatible-Mode Serialization On image_url (recommended)
Remove the DashScope image serializer branch and move the repair to the media-understanding Qwen prompt/model-selection path with focused regression coverage.
Accept A Provider Contract Exception With Proof
Maintainers could accept the serializer branch only if redacted live DashScope compatible-mode output proves image_url fails and { type: "image" } succeeds for the affected supported model path.
Pause Until A Canonical Fix Lands
If this branch is not going to be repaired, keep the linked issue and the stronger open fix path as the canonical place to resolve the Qwen/DashScope failure.
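The live proof requested above amounts to sending the same prompt twice, once per image shape, and recording the HTTP status of each. A hedged sketch of how the two requests could be built follows. `buildProbe` and `redact` are hypothetical helpers, the Bearer-token header is an assumption modelled on the OpenAI-style API, and the model name and API key are placeholders. Nothing is sent by this block.

```typescript
// Hypothetical probe helper: it only builds requests, so no credentials are
// used until a caller passes the result to fetch().
type ImageShape = "image_url" | "image";

interface ProbeRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildProbe(shape: ImageShape, model: string, apiKey: string, dataUri: string): ProbeRequest {
  const imagePart =
    shape === "image_url"
      ? { type: "image_url", image_url: { url: dataUri } }
      : { type: "image", image: dataUri };
  return {
    // Host and path are the ones named in the PR description.
    url: "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`, // assumption: OpenAI-style auth
    },
    body: JSON.stringify({
      model,
      messages: [
        {
          role: "user",
          content: [{ type: "text", text: "What's in this image?" }, imagePart],
        },
      ],
    }),
  };
}

// Replace the key before logging, since the review asks for redacted output.
function redact(req: ProbeRequest): ProbeRequest {
  return { ...req, headers: { ...req.headers, Authorization: "Bearer <redacted>" } };
}

const probes = (["image_url", "image"] as const).map((shape) =>
  buildProbe(shape, "qwen3.7-plus", "sk-placeholder", "data:image/png;base64,iVBORw0KGgo="),
);
for (const p of probes) console.log(JSON.stringify(redact(p).headers), p.body);
```

Each request could then be sent with `fetch(p.url, { method: "POST", headers: p.headers, body: p.body })` and the status code logged next to the shape name, which would show directly which shape the endpoint accepts for the affected model.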

Next step before merge

[P1] The next action is contributor or maintainer follow-up because the branch needs either live provider proof for the contract exception or a different fix direction, and automation cannot supply the contributor's DashScope environment proof.

Security
Cleared: No concrete security or supply-chain issue found; the diff only changes in-process provider request serialization and does not touch secrets, dependencies, workflows, or executable artifacts.

Review findings

[P1] Keep DashScope compatible-mode images as image_url — src/llm/providers/openai-completions.ts:975-979
[P2] Remove the unused DashScope image type alias — src/llm/providers/openai-completions.ts:95

Review details

Best possible solution:

Keep DashScope compatible-mode image serialization on image_url and repair the Qwen image-tool failure through the media-understanding prompt/model-selection path with redacted live provider proof.

Do we have a high-confidence way to reproduce the issue?

Yes for the PR regression: source and provider-contract inspection show the patch changes compatible-mode image requests away from documented image_url. The original user-facing DashScope 400 still needs live credentials for an end-to-end reproduction.

Is this the best way to solve the issue?

No: switching the shared OpenAI-compatible serializer to { type: "image" } is not the best fix without live proof that the official contract is wrong. The safer path keeps image_url and fixes the confirmed Qwen image-tool routing/prompt shape.

Full review comments:

[P1] Keep DashScope compatible-mode images as image_url — src/llm/providers/openai-completions.ts:975-979
This branch replaces the documented image_url shape with { type: "image", image: ... } for every detected DashScope/Qwen compatible-mode image request. Alibaba's OpenAI-compatible Vision docs say to pass images via image_url, and related live provider evidence reports the native image block fails for image-capable Qwen models, so this can break working compatible-mode vision calls. (alibabacloud.com)
Confidence: 0.9
[P2] Remove the unused DashScope image type alias — src/llm/providers/openai-completions.ts:95
Both prod typecheck and lint fail on the submitted merge commit because DashScopeImageContentPart is declared but never used, so the PR cannot pass required checks until that alias is removed or actually used.
Confidence: 0.99

Overall correctness: patch is incorrect
Overall confidence: 0.9

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against a0b16f37e835.

Label changes

Label justifications:

P2: The PR addresses a provider-specific image-understanding failure with limited blast radius, but the current patch is not safe to merge.
merge-risk: 🚨 compatibility: The diff changes the documented image_url compatible-mode request shape for existing DashScope/Qwen image requests.
rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🦪 silver shellfish and patch quality is 🧂 unranked krab.
status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. The PR body shows local terminal output from a serialization script, but not a redacted live DashScope compatible-mode request proving the 400 is gone. After adding live output or logs (with secrets and private details redacted), update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Evidence reviewed

PR surface:

Source +33. Total +33 across 1 file.

Area	Files	Added	Removed	Net
Source	1	42	9	+33
Tests	0	0	0	0
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	1	42	9	+33

What I checked:

PR diff changes provider wire format: gh pr diff shows the branch adding isDashScopeEndpoint and returning { type: "image", image: ... } for DashScope-detected user-message and tool-result image content. (src/llm/providers/openai-completions.ts:975, fd4e295390e4)
Current main uses image_url: Current main serializes OpenAI-compatible user image content as ChatCompletionContentPartImage with type: "image_url" and image_url.url. (src/llm/providers/openai-completions.ts:980, a0b16f37e835)
Official DashScope compatible-mode contract: Alibaba Cloud's OpenAI-compatible Vision page says Qwen-VL models are OpenAI-compatible and that examples pass images via the image_url content type in messages; its Python and cURL examples use type: "image_url". (alibabacloud.com)
Related live provider evidence points the other way: The body of PR #92782 (fix #92688: [Bug]: Qwen vision models fail with 400 "Unexpected item type in content" on DashScope) reports redacted DashScope probes where image-capable Qwen models accepted OpenAI-compatible image_url, while the native image block shape returned Invalid value: image. (4cb00f1f2935)
Sibling PRs are not safe supersession yet: Search found #92770 (fix(media-understanding): place Qwen/DashScope image prompts in user content (#92688)) and #92782 (fix #92688: [Bug]: Qwen vision models fail with 400 "Unexpected item type in content" on DashScope) still open for the same bug; neither is merged, and both still have review/proof concerns in their live discussion.
Submitted head fails type/lint: Both check-prod-types and check-lint fail because DashScopeImageContentPart is declared but never used. (src/llm/providers/openai-completions.ts:95, fd4e295390e4)

Likely related people:

steipete: Recent live history shows repeated OpenAI-compatible completions provider maintenance, and the Qwen provider/media-understanding surface was introduced under the Qwen provider commit. (role: recent area contributor and Qwen feature-history contributor; confidence: high; commits: 439a9e97fd61, d6dffd6ef81a, e3ac0f43df3e; files: src/llm/providers/openai-completions.ts, extensions/qwen/media-understanding-provider.ts, extensions/qwen/openclaw.plugin.json)
hxy91819: The recent image setup/request timeout work touching src/media-understanding/image.ts was approved and coauthored by this account in the commit metadata. (role: recent adjacent image-runtime reviewer; confidence: medium; commits: 5854e0c8f6b5, 001dee3fb088; files: src/media-understanding/image.ts, src/media-understanding/image.test.ts)
vincentkoc: Recent provider-attribution metadata work is a likely safer seam for endpoint-family decisions than hard-coding DashScope request shapes inside the shared serializer. (role: adjacent provider-capability contributor; confidence: medium; commits: 4fbc490fcaee, f1340be05150; files: src/agents/provider-attribution.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

openclaw-clownfish · 2026-06-15T23:44:34Z

Thanks @sheyanmin for jumping on the Qwen/DashScope image failure in #92704.

I am closing this as superseded by #92770 because that PR keeps the fix on the narrower canonical path for #92688: placing the Qwen/DashScope image prompt in user content, with the focused media-understanding regression test and passing proof/checks. This PR's native-format serializer approach is still recorded as a source PR in the cluster, so Clownfish can preserve attribution and credit for the contributor context it added.

If this branch contains a distinct reproduction detail or provider behavior that #92770 does not cover, please reply here and we can reopen or split that follow-up back out.

openclaw-barnacle Bot added the labels size: S and triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup.) Jun 13, 2026

zhangguiping-xydt mentioned this pull request Jun 13, 2026

fix #92688: [Bug]: Qwen vision models fail with 400 "Unexpected item type in content" on DashScope #92782

Closed

clawsweeper Bot mentioned this pull request Jun 15, 2026

fix(media-understanding): place Qwen/DashScope image prompts in user content (#92688) #92770

Closed

openclaw-clownfish Bot closed this Jun 15, 2026

openclaw-clownfish Bot added the clownfish Tracked by Clownfish automation label Jun 15, 2026

vincentkoc mentioned this pull request Jun 16, 2026

fix(qwen): place DashScope image prompts in user content #93649

Merged

sheyanmin wants to merge 1 commit into openclaw:main from sheyanmin:fix/issue-92688-qwen-vision-content-format
