Build (main): 20260527-1-af40b91

Build Information

| Field | Value |
|---|---|
| 📦 Version | 20260527-1-af40b91 |
| 💾 Binary Size | Win: 49M, Linux: 49M, macOS: 50M |
| 🔗 Commit | af40b91 |
| 📅 Build Date | 2026-05-27 17:25:40 UTC |
| Trigger | push |

📋 What's Changed

(Note: These release notes have been consolidated from multiple development builds into unified categories for easier reading.)

🚀 Features & Enhancements

  • Provider & Model Support: Added DeepSeek provider with V4 reasoning support & disable logic, implemented Gemini 3.1 Pro preview support with default quota mapping, added Qwen 3.5 Plus (coder-model), Claude Opus 4.6, and Kimi K2.5 (iFlow) models. Added Anthropic API compatibility handler and supported thinking configuration for Kimi K2.* models (7a4648e, 00a3fcd, 26a6be7, 0364116, 87e484e, d73e85c, 570bf5b, 7ed4c9c, 99cbc5c) by @Mirrowel, @MasuRii, yassin
  • Reasoning & Payload Configurations: Supported reasoning effort configuration (including Mistral-specific settings), tracked reasoning to cache write tokens, and logged transformed requests if the payload is modified (3e1b91a, 5f0d06b, 4f1cceb, 60d7144) by @Mirrowel
  • Concurrency & Rate Limiting: Increased base concurrency limits for the Gemini CLI provider, supported unlimited concurrency (setting default rotation to sequential), and implemented capacity-phase key acquisition with optimal concurrency limits (9fcab3b, 0846b0c, ee732b5) by @Mirrowel
  • Quotas & Tracking Capabilities: Implemented dynamic tier-specific tracking windows, flexible quota thresholds with advanced duration parsing, offset/percentage-based custom caps, historical max request tracking, granular tracking with cost calculation, and fair cycle/custom cap visualizations. Enhanced the quota viewer with multi-window support and sorting logic, made window limits optional (restoring legacy logging), and parsed granular Google quota details (0af9591, 87d04d1, 35270ee, 06db1e2, 523ea31, 731e848, b212f50, b56dd6b, 1c86c22) by @Mirrowel
  • Auth, Fingerprinting & Emulation: Implemented comprehensive device fingerprinting, profile simulation and binding, and centralized CLI fingerprinting with env var overrides. Implemented cookie-based authentication, thinking mode, and reasoning preservation for iFlow (8ab7334, 5992105, 38563b9, b973db8, d68c147) by @Mirrowel
  • Session, Routing & Acquisition Flow: Implemented session inference and sticky credential routing, automatic GCP project discovery fallback, async credential waiting, and quota group sync. Prevented premature fair cycle resets via cooldown thresholds and cleared cooldowns on verified quota availability (2363215, d21b633, 26f2846, 1ebf508, b776a39) by @Mirrowel
  • Logging, Stats & Error Handling: Enhanced usage stats aggregation, enforced async client execution, implemented quota group aggregation with explicit initialization, and integrated advanced logging with dynamic usage config. Enhanced acquisition logging with provider context, tracked internal provider retries, implemented transient retry pacing with reasoning-safe stream retries, and added display logging with full tier name caching for setup messages (3450c73, 673ca0e, 01dffe7, 69ab47f, dfd2070, 9cc70c4, 5293b13, 63c4a3c) by @Mirrowel
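
The "flexible quota thresholds with advanced duration parsing" item above does not spell out the accepted syntax. As a rough illustration only, a compact duration parser might look like the following sketch. The unit letters, the `parse_duration` name, and the error behaviour are all assumptions made for this example, not the project's actual implementation.

```python
import re
from datetime import timedelta

# Hypothetical unit table: the units the project really accepts are not
# listed in these release notes.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)\s*([smhdw])")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '90m', '1h30m' or '2d 12h' into a timedelta."""
    cleaned = text.strip().lower()
    total_seconds = 0.0
    pos = 0
    for match in _TOKEN.finditer(cleaned):
        # Only whitespace may separate two number+unit tokens.
        if cleaned[pos:match.start()].strip():
            raise ValueError(f"unexpected text in duration: {text!r}")
        total_seconds += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    # pos == 0 means no token matched; leftover text means trailing garbage.
    if pos == 0 or cleaned[pos:].strip():
        raise ValueError(f"invalid duration: {text!r}")
    return timedelta(seconds=total_seconds)
```

Rejecting unknown text instead of silently ignoring it is a deliberate choice in this sketch: a mistyped window length in a quota config is easier to debug when it fails loudly.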

🐛 Bug Fixes

  • Provider Payloads & Streaming: Properly extracted text from multi-part system messages, yielded ModelResponseStream for chunks to fix object conversion, handled dictionary responses as 429 rate limit errors, awaited stream completion calls, and filtered transaction_context from litellm arguments. Added signed request headers for chat completions, handled null data in API responses during cookie auth, and removed tool_choice=auto to avoid 422 errors (55b9e91, 2da7c06, eea6f3d, 7947158, c9e14e8, 2265aa5, 99eb90d, dda8805) by @Mirrowel, yassin, @tiesworkman1025, @b3nw
  • Model-Specific Adjustments: Removed magistral-medium from model patterns, passed Mistral reasoning_effort via extra_body and assigned it directly to the payload root, normalized Qwen finish_reason to fix final chunk handling alongside usage chunks, removed default safety settings injection to prevent 400 errors, and updated emulation to gemini-cli v0.28.0 with onboarding (a9a7158, d808dc6, 3d92abf, 9363cd2, b345127, 04aa94c, 4291c10) by @Mirrowel
  • Tracking, Usage & Analytics: Normalized usage tokens to inclusive reasoning convention, prevented stale API data from overwriting local counts, excluded inactive providers from usage statistics, restricted selection and stats strictly to active credentials, exempted 503 retries from usage counting, purged stale tracking windows on tier changes, and removed duplicate tracking logic (f9f44e2, 22ef3f6, ee77de5, 1020f5a, 4fc6c2b, 4f68828, 2364ac5, bf8b2e7) by @Mirrowel
  • Quota & Window State Management: Lazily initialized window start and reset times, synced group timing to models while preserving window limits, refined quota exhaustion logic fetching contexts, enforced independent checks for model and group caps, made group and model quota updates mutually exclusive, and respected window limit configuration in the tracking engine (03a22ef, 55e1829, 2ce5478, cdded49, 7635027, 610fe8e) by @Mirrowel
  • Concurrency, Locks & Retries: Fixed a race condition in lock creation, reduced priority multipliers to enforce safe concurrency defaults, corrected the retry loop limit for antigravity (updating user-agent string), and fixed interactive auth tools with queue initialization (2da1c3f, a0a71e8, bdf6094, 043d35a, ba4e613) by @Mirrowel
  • Initialization & Edge Cases: Suppressed litellm debug info during initialization and allowed server-managed project creation for paid tiers (f9668d3, 627e10d) by @Mirrowel
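
The usage fix above mentions an "inclusive reasoning convention" without defining it. One plausible reading, assumed here, is that the reported completion count always includes reasoning tokens, regardless of how the upstream provider reports them. The sketch below illustrates that reading. The `Usage` class, the `normalize_usage` function, and the `reasoning_included` flag are hypothetical names for this example and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    prompt_tokens: int
    completion_tokens: int  # inclusive: visible output plus reasoning
    reasoning_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


def normalize_usage(prompt: int, completion: int, reasoning: int,
                    reasoning_included: bool) -> Usage:
    """Return usage under the (assumed) inclusive convention.

    `reasoning_included` is a hypothetical flag stating whether the
    upstream `completion` figure already counts reasoning tokens.
    """
    if min(prompt, completion, reasoning) < 0:
        raise ValueError("token counts must be non-negative")
    # Add reasoning only when the provider reported it separately,
    # so it is never counted twice.
    inclusive = completion if reasoning_included else completion + reasoning
    return Usage(prompt, inclusive, reasoning)
```

Under this convention two providers that spend the same tokens yield the same record: `normalize_usage(100, 40, 60, False)` and `normalize_usage(100, 100, 60, True)` compare equal.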

♻️ Architecture & Code Refactoring

  • Client & Usage Modularization: Finalized modular architecture, decoupled model/group usage statistics, centralized window aggregation to reconcile usage counts, centralized request execution setup logic, extracted executor helpers removing legacy modules, and preserved legacy client implementations (cd744cc, 2028c27, 39731b7, ba60834, 94231e0, 89f0358, 640be5a, 2136d98) by @Mirrowel
  • State Management & Helpers: Enforced a singleton pattern for provider instances so they are shared across client components, delegated 503 capacity handling to providers, transitioned availability checks to async, and used internal helpers for model group retrieval (aa744f1, 9e75529, bf9d4b4, 790c01e, ef0d096) by @Mirrowel
  • Configurations & Metadata: Centralized tier definitions, normalization logic, project metadata logic, and concurrency env var parsing with improved validation. Standardized window config with human-readable timestamps, standardized tier names to canonical uppercase formats, and updated the quota resolution strategy to prioritize limit size (c6b09bc, 3e2318b, 54908c0, effdd28, fa7d158, 3a046f5, ba35176) by @Mirrowel
  • Deprecations: Retired Antigravity, Qwen Code, and iFlow legacy providers, replaced interactive re-auth with permanent/manual credential expiry, and removed the check_expired_windows method (8e90444, f6e88ae, 4a13407, e0b54cf) by @Mirrowel
  • Visuals & Masking: Enhanced credential masking, sanitized usage logs, extracted credential logging callbacks, and replaced raw emojis with rich markup aliases (93882eb, bb6e627, dc763c0) by @Mirrowel
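
The notes above do not describe how credential masking works. A common approach, shown here purely as an assumption, keeps a short prefix and suffix so a key can still be recognised in logs while the middle stays hidden. The `mask_credential` name and the four-character default are illustrative, not taken from the project.

```python
def mask_credential(value: str, visible: int = 4) -> str:
    """Hide the middle of a secret, keeping `visible` chars at each end."""
    if visible < 1:
        raise ValueError("visible must be at least 1")
    # Short secrets are masked entirely: showing both ends would reveal
    # most or all of the value.
    if len(value) <= visible * 2:
        return "*" * len(value)
    return f"{value[:visible]}...{value[-visible:]}"
```

For example, `mask_credential("sk-abcdefghijklmnop1234")` returns `"sk-a...1234"`, while a five-character value is replaced by five asterisks.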

📦 Build System, Chores & Other Changes

  • Build Systems & Dependencies: Upgraded the Python base image to 3.12, updated project metadata and the Python requirement, and collected rich unicode data submodules (b84714f, a990d92, 54d2385) by @Mirrowel
  • Documentation & Persistence: Updated usage persistence mounts and documentation (f1c963c) by @Mirrowel
  • Version Control: Merged dev into main (af40b91) by @Mirrowel

💜 Community Contributions

Thank you to our community contributors!

  • feat(qwen): add coder-model (Qwen 3.5 Plus) to hardcoded models (#144) by @redzrush101
  • feat(gemini_cli): add Gemini 3.1 support (#141) by @MasuRii
  • fix(iflow): add signed headers for chat requests (#120) by @redzrush101
  • fix(dedaluslabs): Remove tool_choice=auto to avoid 422 error (#111) by @b3nw
  • Refactor of the library core (#98) by @Mirrowel

📁 Included Files

Each OS-specific archive contains the following files:

| File | Description |
|---|---|
| proxy_app.exe | Main application executable with built-in TUI launcher for Windows. |
| proxy_app | Main application executable with built-in TUI launcher for Linux and macOS. |
| .env.example | Example configuration file. Copy to .env and add your API keys. |
| README.md | Project overview and quick start guide. |
| DOCUMENTATION.md | Detailed configuration and usage documentation. |
| LICENSE | License. |

📦 Archives

  • Windows: LLM-API-Key-Proxy-Windows-main-20260527-1-af40b91.zip
  • Linux: LLM-API-Key-Proxy-Linux-main-20260527-1-af40b91.zip
  • macOS: LLM-API-Key-Proxy-macOS-main-20260527-1-af40b91.zip
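
The archive names above appear to follow the pattern `LLM-API-Key-Proxy-<OS>-<branch>-<version>.zip`, with the version made of a build date, a counter, and a short commit hash. For anyone scripting downloads, the sketch below parses such a version and rebuilds an archive name. Reading the middle field as a per-day build counter is an assumption, and the function names are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime

# YYYYMMDD-<counter>-<short commit hash>, e.g. 20260527-1-af40b91
_BUILD = re.compile(r"^(\d{8})-(\d+)-([0-9a-f]{7,40})$")


@dataclass(frozen=True)
class BuildVersion:
    build_date: date
    sequence: int  # assumed to be a per-day build counter
    commit: str


def parse_build_version(version: str) -> BuildVersion:
    match = _BUILD.match(version)
    if match is None:
        raise ValueError(f"unrecognised build version: {version!r}")
    # strptime also rejects impossible dates such as 20261340.
    day = datetime.strptime(match.group(1), "%Y%m%d").date()
    return BuildVersion(day, int(match.group(2)), match.group(3))


def archive_name(os_name: str, branch: str, version: str) -> str:
    parse_build_version(version)  # validate before building the name
    return f"LLM-API-Key-Proxy-{os_name}-{branch}-{version}.zip"
```

With this release's values, `archive_name("Linux", "main", "20260527-1-af40b91")` reproduces the Linux archive name listed above.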

Note: This is an automated build release.

Full Changelog: main/build-20260123-1-bf7ab7e...main/build-20260527-1-af40b91