refactor(rag): make the rag toolset opt-in to drop cgo from embedders#3174

Merged
dgageot merged 2 commits into main from refactor/decouple-rag-from-teamloader-runtime on Jun 19, 2026

@dgageot dgageot commented Jun 19, 2026

The rag toolset depends on go-tree-sitter, which is cgo-only and has no pure-Go fallback. Two static import edges dragged it into the binary of every embedder of this library: pkg/teamloader's registry hard-coded the rag creator, and pkg/runtime type-asserted the concrete *builtinrag.ToolSet to wire indexing-progress events into the UI. Because pkg/runtime, pkg/embeddedchat, and pkg/teamloader are exactly what downstream embedders import, they all linked tree-sitter even when rag is never configured. That forces those embedders to build with CGO_ENABLED=0 (or ship a C toolchain) when cross-compiling to targets like Windows, purely to satisfy a dependency they don't use.

This change makes rag opt-in without changing the CLI's behavior. teamloader now exposes RegisterToolsetCreator, and the rag package self-registers its rag creator from an init(), so a single blank import is enough to enable it. The runtime no longer references the concrete rag type; it asserts a small EventForwarder interface defined in the pure-Go pkg/rag/types package instead. cmd/root blank-imports the rag package, so the docker-agent CLI keeps rag exactly as before, while embedders that don't import it get a tree-sitter-free dependency graph and can build with cgo disabled.
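The registration flow described above can be sketched in a single file. This is only an illustration under assumptions: the `ToolsetCreator` signature, the `builtinCreators` map, `NewDefaultRegistry`, and `CreateTool` are hypothetical stand-ins, and the `init()` shown here would live in the rag package (pulled in by a blank import) rather than next to the registry.

```go
package main

import (
	"errors"
	"fmt"
	"maps"
)

// ToolsetCreator builds a toolset from its config (hypothetical signature).
type ToolsetCreator func(cfg map[string]any) (any, error)

// builtinCreators stands in for the pure-Go toolsets the registry always knows.
var builtinCreators = map[string]ToolsetCreator{
	"shell": func(map[string]any) (any, error) { return "shell toolset", nil },
}

// extraCreators collects creators registered from init() functions.
var extraCreators = map[string]ToolsetCreator{}

// RegisterToolsetCreator adds an opt-in creator and panics on a duplicate name.
func RegisterToolsetCreator(name string, c ToolsetCreator) {
	if _, dup := extraCreators[name]; dup {
		panic("toolset creator registered twice: " + name)
	}
	extraCreators[name] = c
}

// NewDefaultRegistry merges the built-in creators with the registered ones.
func NewDefaultRegistry() map[string]ToolsetCreator {
	reg := maps.Clone(builtinCreators)
	maps.Copy(reg, extraCreators)
	return reg
}

// CreateTool resolves a toolset type against the registry.
func CreateTool(reg map[string]ToolsetCreator, typ string, cfg map[string]any) (any, error) {
	c, ok := reg[typ]
	if !ok {
		return nil, fmt.Errorf("unknown toolset type: %s", typ)
	}
	return c(cfg)
}

// In the real layout this init() would sit in the rag package, so only
// binaries that blank-import that package ever link its dependencies.
func init() {
	RegisterToolsetCreator("rag", func(cfg map[string]any) (any, error) {
		if len(cfg) == 0 {
			return nil, errors.New("rag toolset requires either a rag_config block or a ref")
		}
		return "rag toolset", nil
	})
}

func main() {
	reg := NewDefaultRegistry()
	_, err := CreateTool(reg, "rag", nil)
	fmt.Println(err) // rag toolset requires either a rag_config block or a ref
	_, err = CreateTool(reg, "nope", nil)
	fmt.Println(err) // unknown toolset type: nope
}
```

An embedder that never imports the registering package simply never runs that `init()`, so the registry contains only the built-in creators.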

The one risk is that an entrypoint which needs rag forgets the blank import, silently turning rag into an unknown toolset type at runtime. A guard test in cmd/root resolves rag against the default registry so that mistake fails the build instead.
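A minimal sketch of the check such a guard could make, assuming the registry reports a missing creator with an error containing "unknown toolset type". The `createFunc` type and both fake registries are hypothetical; the real test lives in cmd/root and resolves rag against the default registry.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// createFunc is a hypothetical stand-in for resolving a toolset type
// with an empty config against a registry.
type createFunc func(toolsetType string) error

// ragIsRegistered reports whether "rag" resolves to a creator. An empty
// config is expected to fail, so any error other than the
// "unknown toolset type" one still proves the creator is present.
func ragIsRegistered(create createFunc) bool {
	err := create("rag")
	return err == nil || !strings.Contains(err.Error(), "unknown toolset type")
}

// withRag simulates a binary that blank-imports the rag package.
func withRag(typ string) error {
	if typ == "rag" {
		return errors.New("rag toolset requires either a rag_config block or a ref")
	}
	return fmt.Errorf("unknown toolset type: %s", typ)
}

// withoutRag simulates a binary where the blank import was forgotten.
func withoutRag(typ string) error {
	return fmt.Errorf("unknown toolset type: %s", typ)
}

func main() {
	fmt.Println(ragIsRegistered(withRag))    // true
	fmt.Println(ragIsRegistered(withoutRag)) // false
}
```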

The rag toolset pulls in go-tree-sitter (cgo). Because teamloader's
registry and pkg/runtime both imported the rag toolset statically, every
embedder of the runtime/embeddedchat/teamloader linked tree-sitter even
when rag was never used — forcing CGO_ENABLED=0 workarounds for targets
without a C toolchain (e.g. cross-compiling to Windows).

Break both static edges:
- teamloader gains RegisterToolsetCreator; the rag package self-registers
  the "rag" creator from an init(), so a blank import is enough to enable
  it. cmd/root blank-imports it, keeping the CLI's behavior unchanged.
- pkg/runtime no longer type-asserts the concrete *builtinrag.ToolSet to
  wire indexing-event forwarding; it asserts a new EventForwarder
  interface defined in the pure-Go pkg/rag/types.

A guard test in cmd/root asserts "rag" still resolves in the default
registry, turning a forgotten blank import into a build failure.
@dgageot dgageot requested a review from a team as a code owner June 19, 2026 14:29
@aheritier aheritier added area/rag For work/issues that have to do with the RAG features kind/refactor PR refactors code without behavior change labels Jun 19, 2026
- use t.Context() instead of context.Background() in the guard test
- annotate the intentional rag init() with a gochecknoinits nolint
- name the EventCallback param in the EventForwarder interface
- replace the extraCreators merge loop with maps.Copy

@docker-agent docker-agent left a comment

Assessment: 🟢 APPROVE

This refactor cleanly achieves its stated goal: making the rag toolset opt-in via init() self-registration to eliminate the cgo go-tree-sitter dependency from embedders that don't need RAG.

What was reviewed:

  • pkg/teamloader/registry.go — new RegisterToolsetCreator + extraCreators merge pattern
  • pkg/tools/builtin/rag/register.go — new init() self-registration
  • pkg/rag/types/types.go — new EventForwarder interface and EventCallback type alias
  • pkg/runtime/loop.go / pkg/runtime/rag.go — switch from concrete *builtinrag.ToolSet to ragtypes.EventForwarder interface assertion
  • cmd/root/toolsets.go — blank import to keep CLI behavior unchanged
  • cmd/root/toolsets_test.go — guard test to catch accidental removal of the blank import

Key findings: No bugs found. All hypotheses were verified and dismissed:

  • tools.As[ragtypes.EventForwarder] interface assertion — valid; Go generics handle interface type parameters correctly, and *ToolSet satisfies EventForwarder via the type-aliased EventCallback.
  • Guard test logic — sound; an empty-config rag call returns errors.New("rag toolset requires either a rag_config block or a ref"), which is non-nil and doesn't contain "unknown toolset type", correctly distinguishing the two cases.
  • extraCreators global state / duplicate registration panic — not a concern; init() runs exactly once per binary, no cross-binary shared state, no data race (init completes before any test/main runs).
  • Circular import — broken correctly; pkg/teamloader no longer imports pkg/tools/builtin/rag; the dependency now flows in the other direction.
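The first finding can be illustrated with a small sketch of a generic helper asserting an interface type parameter. Only the names `As`, `EventForwarder`, and `EventCallback` come from the PR; the `Event` struct, the `ToolSet` interface, the `SetEventCallback` method, and the `wireEvents` helper are assumptions made for illustration.

```go
package main

import "fmt"

// Event and the method names below are hypothetical.
type Event struct{ Message string }

// EventCallback is a type alias, so any func(Event) satisfies it directly.
type EventCallback = func(Event)

// EventForwarder is the small interface the runtime can assert instead of
// referencing the concrete rag toolset type.
type EventForwarder interface {
	SetEventCallback(cb EventCallback)
}

// ToolSet is a minimal stand-in for the toolset abstraction.
type ToolSet interface{ Name() string }

// As returns ts as T when its dynamic type implements (or is) T.
func As[T any](ts ToolSet) (T, bool) {
	v, ok := any(ts).(T)
	return v, ok
}

type ragToolSet struct{ cb EventCallback }

func (r *ragToolSet) Name() string                      { return "rag" }
func (r *ragToolSet) SetEventCallback(cb EventCallback) { r.cb = cb }

// index emits a progress event if a callback has been wired.
func (r *ragToolSet) index() {
	if r.cb != nil {
		r.cb(Event{Message: "indexing started"})
	}
}

type shellToolSet struct{}

func (shellToolSet) Name() string { return "shell" }

// wireEvents attaches forward to every toolset that can emit events and
// returns how many toolsets were wired.
func wireEvents(toolsets []ToolSet, forward EventCallback) int {
	wired := 0
	for _, ts := range toolsets {
		if f, ok := As[EventForwarder](ts); ok {
			f.SetEventCallback(forward)
			wired++
		}
	}
	return wired
}

func main() {
	rag := &ragToolSet{}
	n := wireEvents([]ToolSet{rag, shellToolSet{}}, func(e Event) {
		fmt.Println("event:", e.Message)
	})
	fmt.Println("wired:", n) // wired: 1
	rag.index()              // event: indexing started
}
```

Because the caller only names the interface, the package doing the wiring needs no import of the package that defines the concrete toolset.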

The design pattern (self-registration via init(), panic on duplicate) is idiomatic Go (the same pattern database/sql drivers use) and appropriate here.

@dgageot dgageot merged commit bb35215 into main Jun 19, 2026
9 checks passed
@dgageot dgageot deleted the refactor/decouple-rag-from-teamloader-runtime branch June 19, 2026 14:48
pull Bot pushed a commit to TheTechOddBug/cagent that referenced this pull request Jun 20, 2026
…ddedchat, RAG opt-in, optional providers

- PR docker#3171 (pkg/embeddedchat): Add 'Headless Embedded Chat' section to
  docs/guides/go-sdk/index.md documenting the Config, Session, Event
  types and the Send/Confirm/Restart/Close API with worked examples.
- PR docker#3174 (RAG opt-in): Add 'RAG Toolset (cgo-free builds)' section
  explaining that pkg/rag must be blank-imported to register the toolset,
  allowing embedders to omit it and avoid the cgo dependency.
- PR docker#3176 (optional providers): Add 'Optional Provider Build Tags'
  section listing docker_agent_no_{openai,anthropic,google,bedrock} tags
  with the major dependency each removes and the Anthropic+Google note.
- Add pkg/embeddedchat row to the Core Packages table.

Sources:
  docker#3171
  docker#3174
  docker#3176