
fix(cli): wait for all image pulls before starting containers #5681

Merged: Coly010 merged 2 commits into develop from fix/start-wait-for-image-pulls on Jun 24, 2026

Coly010 (Contributor) commented Jun 24, 2026

What changed

supabase start could start containers before their Docker images had finished downloading.

The command ran two uncoordinated image-acquisition paths:

  1. A best-effort concurrent pre-pull (pullImagesUsingCompose) using docker-compose's Pull with PullOptions{IgnoreFailures: true}. It only targets the primary registry and, by design, silently swallows per-image pull failures (the IgnoreFailures flag is the hook that lets the registry fallback recover).
  2. An authoritative lazy per-container pull inside utils.DockerStart, via DockerResolveImageIfNotCached (multi-registry fallback: ECR → GHCR → Docker Hub).

So any image the concurrent pre-pull failed to cache — a transient registry/network/rate-limit hiccup, common on a fresh machine pulling 10+ images at once — was pulled later, during the Starting database… / Starting containers… phase. That is the "start doesn't wait for pulls" behaviour from the issue. The pre-pull was added in #4394, matching the reporter's "last few versions" regression window.
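
The fallback behaviour of the second path can be sketched roughly as below. This is a minimal illustration, not the CLI's actual resolver: `resolveWithFallback`, `pullFunc`, `errUnavailable`, and the `*.example` URLs are all hypothetical names invented for the sketch; only the ECR → GHCR → Docker Hub ordering comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable is a hypothetical sentinel used to simulate a failed pull.
var errUnavailable = errors.New("registry unavailable")

// pullFunc is a hypothetical stand-in for pulling one image URL from one registry.
type pullFunc func(imageURL string) error

// resolveWithFallback tries each candidate URL in order and returns the first
// one that pulls successfully. If every candidate fails, the per-registry
// errors are joined so that none of them is silently swallowed.
func resolveWithFallback(candidates []string, pull pullFunc) (string, error) {
	if len(candidates) == 0 {
		return "", errors.New("no candidate registries")
	}
	var errs []error
	for _, url := range candidates {
		if err := pull(url); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", url, err))
			continue
		}
		return url, nil
	}
	return "", errors.Join(errs...)
}

func main() {
	// Placeholder URLs standing in for the ECR, GHCR, and Docker Hub candidates.
	candidates := []string{
		"ecr.example/supabase/image-a",
		"ghcr.example/supabase/image-a",
		"hub.example/supabase/image-a",
	}
	// Simulate a transient failure on the primary registry only.
	pull := func(url string) error {
		if url == candidates[0] {
			return errUnavailable
		}
		return nil
	}
	got, err := resolveWithFallback(candidates, pull)
	fmt.Println(got, err)
}
```

Under these assumptions, a hiccup on the primary registry is recovered by the next candidate, which is why this path is the authoritative one while the compose pre-pull is only best-effort.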

The fix

Add ensureImagesCached, a completeness pass that runs immediately after the best-effort pre-pull and before any container starts. It resolves every project image through the same multi-registry fallback resolver DockerStart already uses (DockerResolveImageIfNotCached), fanned out concurrently via the existing utils.WaitAll primitive.

After it returns, every required image is guaranteed present in the local cache, so the per-container DockerStart calls become pure cache hits and never pull mid-start. On the happy path it is just N cheap image inspects; an image that genuinely cannot be pulled from any registry now fails the start cleanly before any container is created, instead of limping into a half-pulled start. The compose pre-pull (and its IgnoreFailures) is kept as the fast concurrent progress UI — it is simply no longer relied on for completeness.
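
The shape of such a completeness pass might look like the sketch below. It is an approximation under stated assumptions: `ensureCachedSketch` and `resolveFunc` are hypothetical names, a plain `sync.WaitGroup` stands in for `utils.WaitAll` (whose real signature is not shown in this PR description), and the image names are placeholders.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errPullFailed is a hypothetical sentinel simulating an image that cannot be
// pulled from any registry.
var errPullFailed = errors.New("pull failed on every registry")

// resolveFunc is a hypothetical stand-in for DockerResolveImageIfNotCached:
// it returns nil once the image is present in the local cache.
type resolveFunc func(image string) error

// ensureCachedSketch resolves every image concurrently and returns only after
// all resolutions have finished. Each goroutine writes to its own slice slot,
// so no locking is needed; nil entries are dropped by errors.Join.
func ensureCachedSketch(images []string, resolve resolveFunc) error {
	errs := make([]error, len(images))
	var wg sync.WaitGroup
	for i, image := range images {
		wg.Add(1)
		go func(i int, image string) {
			defer wg.Done()
			if err := resolve(image); err != nil {
				errs[i] = fmt.Errorf("%s: %w", image, err)
			}
		}(i, image)
	}
	wg.Wait()
	return errors.Join(errs...)
}

func main() {
	images := []string{"image-a", "image-b", "image-c"} // placeholder names
	err := ensureCachedSketch(images, func(image string) error {
		if image == "image-b" {
			return errPullFailed
		}
		return nil
	})
	if err != nil {
		// Fail cleanly before any container is created.
		fmt.Println("aborting start:", err)
		return
	}
	fmt.Println("all images cached; starting containers")
}
```

The key property is the barrier: nothing after the call runs until every image has either been confirmed in the cache or reported as unpullable.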

TypeScript port

The native-TS port already pulls in a preparation phase that is awaited before startup and fails hard on pull errors, so it does not have this bug. This PR adds regression guards locking that contract in:

  • Stack.unit.test.ts: stack.start() aborts and starts zero containers when a docker pull fails.
  • prefetch.unit.test.ts: preparation fails with DockerPullError when the whole registry fallback chain fails.

Fixes #5068

Coly010 self-assigned this Jun 24, 2026

`supabase start` ran two uncoordinated image-pull paths: a best-effort
concurrent compose pre-pull (PullOptions.IgnoreFailures) and a lazy
per-container pull inside DockerStart. Any image the pre-pull failed to
cache was pulled later, during the "Starting..." phase, so containers
started before pulls had finished.

Add ensureImagesCached: a completeness pass that resolves every project
image through the same multi-registry fallback DockerStart uses, before
any container starts. The compose pre-pull is kept as the fast, best-
effort progress UI, but is no longer relied on for completeness.

Surface the Docker install hint from DockerResolveImageIfNotCached so a
"Docker not running" failure during the new pre-start resolve keeps the
helpful suggestion it previously only got from DockerStart.

Also add regression guards to the TypeScript port asserting that start
aborts (rather than deferring the pull) when image preparation fails.

Fixes #5068

Coly010 force-pushed the fix/start-wait-for-image-pulls branch from 4270a1f to d6d0e4e, June 24, 2026 12:14
Coly010 marked this pull request as ready for review, June 24, 2026 12:20
Coly010 requested a review from a team as a code owner, June 24, 2026 12:20

github-actions (bot) commented Jun 24, 2026

Supabase CLI preview

npx --yes https://pkg.pr.new/supabase/cli/supabase@a244947c339f4280c6e1d08a730e60afd9f4ef33

Preview package for commit a244947.

jgoux (Contributor) left a comment

Review summary

Well-constructed, correct fix. The diagnosis is accurate and the fix reuses existing primitives (WaitAll, DockerResolveImageIfNotCached, GetServices) rather than reinventing them. I verified the load-bearing assumption: GetRegistryImageUrls only uses the last path segment to rebuild the candidate set, so feeding it an already-normalized service.Image is idempotent and produces exactly the set DockerStart later resolves against. Coverage also matches — project.Services is the same notExcluded-filtered set that gets started. Test coverage (Go gock.IsDone() invariant check + TS regression guards) is strong.

One real-but-benign concern (CmdSuggestion concurrent write) and one minor UX note inline. Both non-blocking; approving in spirit pending a conscious decision on the first.
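
The idempotence argument can be illustrated with a small sketch. This is not the real GetRegistryImageUrls: `candidateURLs`, `registryPrefixes`, and the `*.example` prefixes are hypothetical, and the only behaviour taken from the review is that the candidate set is rebuilt from the last path segment of the input.

```go
package main

import (
	"fmt"
	"strings"
)

// registryPrefixes are placeholder prefixes, not the CLI's real registries.
var registryPrefixes = []string{
	"registry-a.example/supabase",
	"registry-b.example/supabase",
	"registry-c.example/supabase",
}

// candidateURLs keeps only the last path segment of image and rebuilds one
// candidate URL per registry prefix from it.
func candidateURLs(image string) []string {
	name := image
	if i := strings.LastIndex(image, "/"); i >= 0 {
		name = image[i+1:]
	}
	urls := make([]string, 0, len(registryPrefixes))
	for _, prefix := range registryPrefixes {
		urls = append(urls, prefix+"/"+name)
	}
	return urls
}

// equalStrings reports whether two slices hold the same strings in the same order.
func equalStrings(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	fromBare := candidateURLs("image-a:1.0")
	// Feed an already-normalized URL back in: the prefix is discarded again.
	fromNormalized := candidateURLs(fromBare[0])
	fmt.Println(equalStrings(fromBare, fromNormalized))
}
```

Because everything before the final slash is discarded, a bare name and an already-prefixed URL yield the same candidates, which is the property the pre-start pass relies on to resolve against the same set as DockerStart.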

Comment thread apps/cli-go/internal/utils/docker.go Outdated
Comment thread apps/cli-go/internal/start/start.go
… data race

ensureImagesCached resolves images via WaitAll (one goroutine per image), so
writing the package-global CmdSuggestion from inside DockerResolveImageIfNotCached
raced under `go test -race` when the daemon was down. Extract
SuggestDockerInstallIfConnectionFailed and set the hint once, sequentially, in the
caller after errors.Join — same user-facing hint, no race. DockerStart now routes
through the same helper.

review: PR #5681
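
A minimal sketch of the pattern this commit describes, assuming hypothetical names throughout: `cmdSuggestion` stands in for the package-global CmdSuggestion, `suggestIfConnectionFailed` for SuggestDockerInstallIfConnectionFailed, and the hint text and `errDaemonDown` sentinel are placeholders. Workers only record their own error; the shared global is written once, after all goroutines have finished.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDaemonDown is a hypothetical sentinel for "Docker daemon unreachable".
var errDaemonDown = errors.New("cannot connect to the Docker daemon")

// cmdSuggestion mimics a package-global hint shown to the user on failure.
var cmdSuggestion string

// suggestIfConnectionFailed sets the hint when err wraps errDaemonDown.
// It must only be called sequentially, never from the worker goroutines.
func suggestIfConnectionFailed(err error) {
	if errors.Is(err, errDaemonDown) {
		cmdSuggestion = "Docker does not appear to be running." // placeholder text
	}
}

// resolveAll runs resolve for every image concurrently. Each goroutine writes
// only to its own slot in errs; the global hint is set once after wg.Wait.
func resolveAll(images []string, resolve func(string) error) error {
	errs := make([]error, len(images))
	var wg sync.WaitGroup
	for i, image := range images {
		wg.Add(1)
		go func(i int, image string) {
			defer wg.Done()
			errs[i] = resolve(image)
		}(i, image)
	}
	wg.Wait()
	err := errors.Join(errs...)
	suggestIfConnectionFailed(err) // single write, no concurrent access
	return err
}

func main() {
	err := resolveAll([]string{"image-a", "image-b"}, func(string) error {
		return errDaemonDown
	})
	fmt.Println(err != nil, cmdSuggestion)
}
```

Moving the write out of the goroutines keeps the user-facing hint unchanged while removing the unsynchronized access that `go test -race` would flag.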

Coly010 added this pull request to the merge queue Jun 24, 2026
Merged via the queue into develop with commit dcb9e53 Jun 24, 2026
47 of 48 checks passed
Coly010 deleted the fix/start-wait-for-image-pulls branch June 24, 2026 14:24
Development

Successfully merging this pull request may close these issues.

"supabase start" doesn't wait for pulls to finish

2 participants