Add --build-resolution-cache to write a per-DSO library resolution note #647

Merged
Mic92 merged 3 commits into NixOS:master from cachix:resolution-cache-note on Jun 24, 2026

@domenkozar

Member

What

Adds a --build-resolution-cache option that records, for each DT_NEEDED
entry of an ELF, where the dynamic loader would find it, in a PT_NOTE
(.note.nixos.ldcache, owner NixOS, type 0x63a86cb6).

The descriptor is a sequence of NUL-terminated (needed, path-list) string
pairs, where each path-list element is either =<absolute-path> for a directly
resolved library or ?<dir> for a directory the loader must still search
itself (used for $ORIGIN-relative and glibc-hwcaps directories).
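
As a rough illustration of how such a note could be serialized, the sketch below follows the generic ELF note layout (a three-word Elf_Nhdr, then the owner name and the descriptor, each padded to 4 bytes). The owner and type values come from the description above; the helper names `build_note` and `parse_note`, the 4-byte padding, and the little-endian default are assumptions for illustration, not patchelf's code.

```python
import struct

NOTE_OWNER = b"NixOS"     # owner name from the description above
NOTE_TYPE = 0x63A86CB6    # note type from the description above


def pad4(data: bytes) -> bytes:
    """Pad with NUL bytes up to the next 4-byte boundary."""
    return data + b"\0" * (-len(data) % 4)


def build_note(desc: bytes, little_endian: bool = True) -> bytes:
    """Serialize one note: Elf_Nhdr, padded owner name, padded descriptor."""
    name = NOTE_OWNER + b"\0"  # namesz counts the trailing NUL
    fmt = "<III" if little_endian else ">III"
    header = struct.pack(fmt, len(name), len(desc), NOTE_TYPE)
    return header + pad4(name) + pad4(desc)


def parse_note(blob: bytes, little_endian: bool = True):
    """Inverse of build_note: return (owner, type, descriptor)."""
    fmt = "<III" if little_endian else ">III"
    namesz, descsz, ntype = struct.unpack_from(fmt, blob, 0)
    name_off = struct.calcsize(fmt)
    desc_off = name_off + namesz + (-namesz % 4)
    name = blob[name_off:name_off + namesz].rstrip(b"\0")
    return name, ntype, blob[desc_off:desc_off + descsz]
```

A reader that does not recognize the owner/type pair can skip the note using only `namesz` and `descsz`, which is consistent with an unpatched loader ignoring it.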

Why

On /nix/store-based systems every dependency lives in its own directory, so an
ELF's DT_RUNPATH lists one directory per dependency. To resolve each
DT_NEEDED soname the loader probes every DT_RUNPATH directory in turn,
producing a storm of failing openat()/stat() calls (roughly libraries times
directories) before each library is found. This measurably slows program
startup, especially on slow disks, network filesystems, and low-power hardware.
See NixOS/nixpkgs#481620.
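
To make the "libraries times directories" estimate concrete, here is a small model of the run-path walk. It is a simplification: the directory layout and the names `count_probes`, `libs`, and `dirs` are hypothetical, and a real loader may issue additional probes per directory (for example for hwcaps subdirectories), which this model ignores.

```python
def count_probes(needed, runpath, contents):
    """Walk the run path for each soname and count failed lookups.

    contents maps a directory to the set of file names present in it.
    """
    failed = 0
    resolved = {}
    for soname in needed:
        for directory in runpath:
            if soname in contents.get(directory, set()):
                resolved[soname] = f"{directory}/{soname}"
                break
            failed += 1  # one failing open/stat for this directory
    return resolved, failed


# Hypothetical layout: 20 dependencies, each in its own store directory.
libs = [f"lib{i}.so" for i in range(20)]
dirs = [f"/nix/store/dep{i}/lib" for i in range(20)]
contents = {d: {lib} for d, lib in zip(dirs, libs)}

resolved, failed = count_probes(libs, dirs, contents)
print(len(resolved), failed)  # 20 libraries found after 190 failed probes
```

With the libraries and directories in matching order, library i fails i times before it is found, so the total grows quadratically with the number of dependencies.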

A glibc loader that understands this note can resolve a binary's direct
dependencies straight from it, skipping the DT_RUNPATH walk entirely. The note
is consulted after LD_LIBRARY_PATH and before the DT_RUNPATH search, so
LD_LIBRARY_PATH / LD_PRELOAD / /run/opengl-driver overrides keep working;
an unpatched loader ignores the note.
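
The lookup order described above can be modelled roughly as follows. This is a sketch of the ordering only, not glibc's implementation: `resolve` and its parameters are hypothetical, LD_PRELOAD and the system default directories are left out, and `?<dir>` entries are assumed to be already expanded.

```python
def resolve(soname, ld_library_path, note, runpath, exists):
    """Return (path, source) for a soname under the described ordering.

    note maps a soname to its list of "=<path>" / "?<dir>" elements;
    exists is a callable that reports whether a path is present.
    """
    # 1. LD_LIBRARY_PATH is consulted first, so overrides keep working.
    for directory in ld_library_path:
        candidate = f"{directory}/{soname}"
        if exists(candidate):
            return candidate, "LD_LIBRARY_PATH"
    # 2. The note: direct paths, or directories that still need a search.
    for element in note.get(soname, []):
        kind, value = element[0], element[1:]
        candidate = value if kind == "=" else f"{value}/{soname}"
        if exists(candidate):
            return candidate, "note"
    # 3. The ordinary DT_RUNPATH walk remains as the fallback.
    for directory in runpath:
        candidate = f"{directory}/{soname}"
        if exists(candidate):
            return candidate, "DT_RUNPATH"
    return None, "not found"
```

An unpatched loader corresponds to calling this with an empty `note`, which leaves only steps 1 and 3.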

This is the writer half; the loader half is a separate glibc patch.

Behaviour

  • The note is placed in its own page-aligned PT_LOAD covered by a PT_NOTE,
    with a matching SHT_NOTE section; the program and section header tables are
    relocated through the existing rewriteSections() path (handles both
    ET_DYN and the page-aligned ET_EXEC case).
  • Resolves DT_RUNPATH, falling back to DT_RPATH when DT_RUNPATH is absent,
    mirroring the loader.
  • Rebuilding after the run path changed fails loudly rather than keeping a stale
    note; an empty run path warns and writes nothing.
  • Rejected together with --force-rpath (the cache only applies to
    DT_RUNPATH).
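
The rules in this list can be summarized as a small decision function. The sketch below is only a reading of the behaviour described here and in the commit message; `plan_build`, `CacheError`, and the returned action strings are hypothetical names, and the descriptors are treated as opaque byte strings.

```python
class CacheError(Exception):
    """Raised where the described behaviour is to fail loudly."""


def effective_run_path(dt_runpath, dt_rpath):
    """DT_RUNPATH wins; DT_RPATH is used only when DT_RUNPATH is absent."""
    return dt_runpath if dt_runpath is not None else dt_rpath


def plan_build(dt_runpath, dt_rpath, existing_desc, new_desc, force_rpath=False):
    """Decide what a --build-resolution-cache run should do."""
    if force_rpath:
        raise CacheError("cannot be combined with --force-rpath")
    if not effective_run_path(dt_runpath, dt_rpath):
        return "warn-and-skip"   # empty or missing run path: nothing written
    if existing_desc is None:
        return "write"           # no note yet
    if existing_desc == new_desc:
        return "skip"            # identical descriptor: harmless re-run
    raise CacheError("existing note is stale; run path changed")
```

Raising on a mismatch, rather than overwriting, matches the stated preference for failing loudly over keeping or silently replacing a stale note.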

Tests

tests/build-resolution-cache.sh (PIE, DT_RPATH fallback, stale-rebuild,
no-run-path, LD_LIBRARY_PATH override, idempotency, --force-rpath
rejection) and tests/build-resolution-cache-no-pie.sh (the non-PIE ET_EXEC
page-boundary case).

Credit

Based on the per-DSO resolution-cache approach originally proposed by pennae in
NixOS/nixpkgs#207893.


Disclosure: this change was developed with the assistance of Claude Code
(Claude Opus 4.8); the commit carries an Assisted-by: trailer, and the author
has reviewed and is accountable for it.

🤖 Generated with Claude Code

@domenkozar domenkozar force-pushed the resolution-cache-note branch from 72ae765 to 71f2aa1 Compare June 22, 2026 20:05
@kugel-

kugel- commented Jun 23, 2026


If you know the path, couldn't you just record the absolute path in DT_NEEDED? (I've done that on other occasions.)

@domenkozar

Member Author

Yes, glibc honors a / in DT_NEEDED and opens that path directly, but that bypasses the search entirely, which is what the note is trying to avoid. Three reasons it's not equivalent:

  • Overrides break. The note is consulted after LD_LIBRARY_PATH and before DT_RUNPATH, so LD_LIBRARY_PATH / LD_PRELOAD / /run/opengl-driver substitutions keep working. An absolute DT_NEEDED skips the search, so those overrides no longer apply (on NixOS this pins the build-time driver instead of the runtime one).
  • No fallback. A stale note just costs the normal RUNPATH walk, and an unpatched loader ignores it. A wrong absolute DT_NEEDED has no soname left to fall back on, so it fails hard.
  • No hwcaps. The ?<dir> entries let the loader still pick a glibc-hwcaps / $ORIGIN variant. A single absolute path can only point at the baseline.

Plus it keeps DT_NEEDED as the canonical soname list (tooling, ldd, RUNPATH shrinking) instead of baking store paths into it. The note is meant to be an advisory, overridable cache, not a hard pin.

@domenkozar domenkozar force-pushed the resolution-cache-note branch 3 times, most recently from aa17f1c to c6598da Compare June 23, 2026 19:28
Loading a /nix/store program is slow because the loader must search every
DT_RUNPATH directory for every DT_NEEDED soname, producing a storm of failing
openat()/stat() syscalls before it finds each library (see
NixOS/nixpkgs#481620). This precomputes that search
at patch time and records the result in the binary, so a loader that
understands the note can resolve a binary's direct dependencies without
walking the run path at all.

The result is stored in a PT_NOTE (".note.nixos.ldcache", type 0x63a86cb6,
owner "NixOS"), whose descriptor is a sequence of NUL-terminated
(needed, path-list) pairs terminated by an empty entry. A path-list element is
"=<absolute-path>" for a library resolved directly, or "?<dir>" for a directory
the loader must still search itself. For example, a binary with
DT_NEEDED {libfoo.so, libbar.so} and DT_RUNPATH
"/nix/store/aaa-A/lib:$ORIGIN/../lib" yields (";" marks each '\0'):

  libbar.so;=/nix/store/aaa-A/lib/libbar.so;
  libfoo.so;=/nix/store/aaa-A/lib/libfoo.so;?$ORIGIN/../lib;
  ;

so the loader opens libbar.so and libfoo.so directly from aaa-A and only falls
back to searching $ORIGIN/../lib.
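
The example above can be reproduced with a short encoder and decoder. This is a sketch under one assumption drawn from the example rather than from a specification: every string is NUL-terminated, a string starting with "=" or "?" belongs to the preceding soname, and any other non-empty string starts a new entry. The function names are hypothetical.

```python
def encode_descriptor(entries):
    """entries: list of (soname, [path-list elements]) in output order."""
    out = bytearray()
    for soname, elements in entries:
        out += soname.encode() + b"\0"
        for element in elements:
            out += element.encode() + b"\0"
    out += b"\0"  # the empty entry that terminates the descriptor
    return bytes(out)


def decode_descriptor(desc):
    """Inverse of encode_descriptor; stops at the first empty string."""
    entries = []
    for token in desc.split(b"\0"):
        if not token:
            break
        text = token.decode()
        if text[0] in "=?":
            if not entries:
                raise ValueError("path-list element before any soname")
            entries[-1][1].append(text)
        else:
            entries.append((text, []))
    return entries


example = [
    ("libbar.so", ["=/nix/store/aaa-A/lib/libbar.so"]),
    ("libfoo.so", ["=/nix/store/aaa-A/lib/libfoo.so", "?$ORIGIN/../lib"]),
]
print(encode_descriptor(example).replace(b"\0", b";").decode())
```

Printed with ";" in place of each NUL, the output is the three lines of the example joined together.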

The "?<dir>" form exists because the note is authoritative: since the loader
uses it to skip the run-path walk entirely, the descriptor must be the complete
per-soname answer. Some directories cannot be resolved at patch time --
$ORIGIN, $LIB and $PLATFORM expand to the binary's load-time location and
loader/CPU-dependent strings, and glibc-hwcaps subdirectories are chosen by the
running CPU -- so patchelf records them as a search hint instead of a path.
This makes detecting such directories load-bearing: it is the only way to tell
"I probed and the library is not here" (drop the directory) from "I could not
probe" (record a hint), since access() collapses both to a failed lookup and
treats "$ORIGIN/../lib/libfoo.so" as a literal path. Without it an unresolvable
directory would be dropped silently and any library reachable only through it
would fail to load. The note is thus a path cache with holes that preserves the
loader's normal behavior for the parts it cannot precompute, while the syscall
savings still apply to every resolved "=<path>" entry.
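
The three outcomes distinguished above (resolved, probed and absent, could not probe) might be classified along these lines. This is a minimal sketch, not the patch's detection logic: `classify` is a hypothetical helper, the token list includes the brace forms the loader also accepts, and treating a directory with a `glibc-hwcaps` subdirectory as unprobeable is an assumption about how such directories are recognized.

```python
DYNAMIC_TOKENS = (
    "$ORIGIN", "${ORIGIN}",
    "$LIB", "${LIB}",
    "$PLATFORM", "${PLATFORM}",
)


def classify(directory, soname, exists, isdir):
    """Return "=<path>", "?<dir>", or None for one run-path directory.

    exists and isdir are callables so the check can be run against a
    real filesystem or a fake one.
    """
    # Load-time tokens cannot be expanded at patch time: record a hint.
    if any(token in directory for token in DYNAMIC_TOKENS):
        return "?" + directory
    # Assumed detection: a glibc-hwcaps subdirectory means the running
    # CPU decides which variant is used, so leave the search to the loader.
    if isdir(f"{directory}/glibc-hwcaps"):
        return "?" + directory
    candidate = f"{directory}/{soname}"
    if exists(candidate):
        return "=" + candidate
    return None  # probed and absent: the directory can be dropped
```

Checking for tokens before touching the filesystem is what keeps a literal lookup of "$ORIGIN/../lib/libfoo.so" from being mistaken for a genuine miss.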

The note is placed in its own page-aligned PT_LOAD covered by a PT_NOTE, with a
matching SHT_NOTE section, and the program and section header tables are
relocated through the existing rewriteSections() path. Building is idempotent
(an exact existing descriptor is a harmless re-run and is skipped) and is
rejected together with --force-rpath, since the cache only applies to
DT_RUNPATH.

The patch also resolves several cases that previously produced wrong, missing,
or destroyed output with no signal:

* On non-PIE executables rewriteSections() rewrites the section header table in
  place at e_shoff, where it grows by the one section the note adds. When the
  table sits at the end of a page-aligned file, the note offset coincided with
  the old table end, so the table's extra entry overwrote the note's Elf_Nhdr
  and silently destroyed it. The note is now placed past the grown table rather
  than merely past the current end of file. The ET_DYN path is unaffected.

* DT_RPATH was ignored: only DT_RUNPATH was read, so a binary whose search path
  lives in DT_RPATH got no note and exit 0. patchelf now falls back to DT_RPATH
  when DT_RUNPATH is absent, matching the loader.

* A re-patch after the run path changed kept the stale note silently. The
  existing descriptor is now compared; a mismatch errors instead of resolving
  against the old paths.

* Every no-op exit (statically linked, no needed entries or run path, nothing
  resolved) returned silently. patchelf now warns so callers are not misled
  into thinking a cache was written.
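
The non-PIE overlap in the first item comes down to offset arithmetic, sketched below. The field names follow the ELF header (`e_shoff`, `e_shnum`, `e_shentsize`); the function names, the 4096-byte page size, and the sample numbers are assumptions chosen to reproduce the described layout, not values taken from the patch.

```python
PAGE = 4096  # assumed page size


def round_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def note_offset_naive(file_size, page=PAGE):
    """Old behaviour as described: place the note past the current end of file."""
    return round_up(file_size, page)


def note_offset_fixed(file_size, e_shoff, e_shnum, e_shentsize, added=1, page=PAGE):
    """Place the note past the section header table after it has grown."""
    grown_end = e_shoff + (e_shnum + added) * e_shentsize
    return round_up(max(file_size, grown_end), page)


# A page-aligned file whose section header table sits at the very end.
file_size, e_shnum, e_shentsize = 8192, 30, 64
e_shoff = file_size - e_shnum * e_shentsize            # 6272

naive = note_offset_naive(file_size)                    # 8192
grown_end = e_shoff + (e_shnum + 1) * e_shentsize       # 8256
fixed = note_offset_fixed(file_size, e_shoff, e_shnum, e_shentsize)  # 12288
```

In this layout the naive offset equals the old end of the table, so the table's extra 64-byte entry covers the first bytes of the note, which is where its Elf_Nhdr lives; the fixed offset lands on the next page instead.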

Add regression tests, including one that builds the cache on a non-PIE binary
padded to a page boundary, plus a pad-to-page helper to force that layout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@domenkozar domenkozar force-pushed the resolution-cache-note branch from c6598da to 89d1608 Compare June 23, 2026 20:48
Mic92 and others added 2 commits June 23, 2026 22:59
build-resolution-cache: trim comments and consolidate tests

No functional change. Hoist a duplicated removeResolutionCache() call in
the rpRemove path, drop comments that restate the code, and shorten the
remaining ones to just the non-obvious WHY.

Merge the four near-identical "edit drops the cache" / "stale note fails"
test scripts into two, and cut the long test preambles down to a few
lines each.
@Mic92 Mic92 added this pull request to the merge queue Jun 23, 2026
Merged via the queue into NixOS:master with commit ced7ec0 Jun 24, 2026
14 checks passed
@domenkozar domenkozar mentioned this pull request Jun 26, 2026