Add --build-resolution-cache to write a per-DSO library resolution note #647
Merged
If you know the path couldn't you just record the absolute path in |
Yes, glibc honors a
Plus it keeps |
Loading a /nix/store program is slow because the loader must search every DT_RUNPATH directory for every DT_NEEDED soname, producing a storm of failing openat()/stat() syscalls before it finds each library (see NixOS/nixpkgs#481620). This precomputes that search at patch time and records the result in the binary, so a loader that understands the note can resolve a binary's direct dependencies without walking the run path at all.

The result is stored in a PT_NOTE (".note.nixos.ldcache", type 0x63a86cb6, owner "NixOS"), whose descriptor is a sequence of NUL-terminated (needed, path-list) pairs terminated by an empty entry. A path-list element is "=<absolute-path>" for a library resolved directly, or "?<dir>" for a directory the loader must still search itself. For example, a binary with DT_NEEDED {libfoo.so, libbar.so} and DT_RUNPATH "/nix/store/aaa-A/lib:$ORIGIN/../lib" yields (";" marks each '\0'):

    libbar.so;=/nix/store/aaa-A/lib/libbar.so;
    libfoo.so;=/nix/store/aaa-A/lib/libfoo.so;?$ORIGIN/../lib;
    ;

so the loader opens libbar.so and libfoo.so directly from aaa-A and only falls back to searching $ORIGIN/../lib.

The "?<dir>" form exists because the note is authoritative: since the loader uses it to skip the run-path walk entirely, the descriptor must be the complete per-soname answer. Some directories cannot be resolved at patch time -- $ORIGIN, $LIB and $PLATFORM expand to the binary's load-time location and loader/CPU-dependent strings, and glibc-hwcaps subdirectories are chosen by the running CPU -- so patchelf records them as a search hint instead of a path. This makes detecting such directories load-bearing: it is the only way to tell "I probed and the library is not here" (drop the directory) from "I could not probe" (record a hint), since access() collapses both to a failed lookup and treats "$ORIGIN/../lib/libfoo.so" as a literal path. Without it an unresolvable directory would be dropped silently and any library reachable only through it would fail to load.

The note is thus a path cache with holes that preserves the loader's normal behavior for the parts it cannot precompute, while the syscall savings still apply to every resolved "=<path>" entry.

The note is placed in its own page-aligned PT_LOAD covered by a PT_NOTE, with a matching SHT_NOTE section, and the program and section header tables are relocated through the existing rewriteSections() path. Building is idempotent (an exact existing descriptor is a harmless re-run and is skipped) and is rejected together with --force-rpath, since the cache only applies to DT_RUNPATH.

The patch also resolves several cases that previously produced wrong, missing, or destroyed output with no signal:

* On non-PIE executables rewriteSections() rewrites the section header table in place at e_shoff, where it grows by the one section the note adds. When the table sits at the end of a page-aligned file, the note offset coincided with the old table end, so the table's extra entry overwrote the note's Elf_Nhdr and silently destroyed it. The note is now placed past the grown table rather than merely past the current end of file. The ET_DYN path is unaffected.
* DT_RPATH was ignored: only DT_RUNPATH was read, so a binary whose search path lives in DT_RPATH got no note and exit 0. patchelf now falls back to DT_RPATH when DT_RUNPATH is absent, matching the loader.
* A re-patch after the run path changed kept the stale note silently. The existing descriptor is now compared; a mismatch errors instead of resolving against the old paths.
* Every no-op exit (statically linked, no needed entries or run path, nothing resolved) returned silently. patchelf now warns so callers are not misled into thinking a cache was written.

Add regression tests, including one that builds the cache on a non-PIE binary padded to a page boundary, plus a pad-to-page helper to force that layout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
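The descriptor layout described in the commit message can be illustrated with a small Python sketch. It is not patchelf's implementation: the function names are hypothetical, and the decoder assumes that a string starting with "=" or "?" always belongs to the preceding soname, which is inferred from the worked example rather than stated in the patch.

```python
def encode_descriptor(entries):
    """Serialize [(needed, [path-list elements])] into descriptor bytes.

    Assumed layout (inferred from the commit message example): every string
    is NUL-terminated and a final empty string ends the descriptor.
    """
    out = bytearray()
    for needed, elements in entries:
        if not needed or needed[0] in "=?":
            raise ValueError(f"bad soname: {needed!r}")
        out += needed.encode() + b"\0"
        for element in elements:
            # "=<absolute-path>" is a resolved library, "?<dir>" a search hint.
            if not element.startswith(("=", "?")):
                raise ValueError(f"bad path-list element: {element!r}")
            out += element.encode() + b"\0"
    out += b"\0"  # the empty entry that terminates the descriptor
    return bytes(out)


def decode_descriptor(desc):
    """Inverse of encode_descriptor, under the same assumptions."""
    entries = []
    for raw in desc.split(b"\0"):
        if not raw:  # empty entry: end of descriptor
            break
        text = raw.decode()
        if text[0] in "=?":
            if not entries:
                raise ValueError("path-list element before any soname")
            entries[-1][1].append(text)
        else:
            entries.append((text, []))
    return entries


# The example from the commit message, with ";" replaced by real NUL bytes.
EXAMPLE = [
    ("libbar.so", ["=/nix/store/aaa-A/lib/libbar.so"]),
    ("libfoo.so", ["=/nix/store/aaa-A/lib/libfoo.so", "?$ORIGIN/../lib"]),
]
EXAMPLE_BLOB = encode_descriptor(EXAMPLE)
```

Round-tripping EXAMPLE through both functions returns the same pairs, which is a quick way to check that a reader and a writer agree on the layout.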
No functional change. Hoist a duplicated removeResolutionCache() call in the rpRemove path, drop comments that restate the code, and shorten the remaining ones to just the non-obvious WHY. Merge the four near-identical "edit drops the cache" / "stale note fails" test scripts into two, and cut the long test preambles down to a few lines each.
build-resolution-cache: trim comments and consolidate tests
What
Adds a --build-resolution-cache option that records, for each DT_NEEDED entry of an ELF, where the dynamic loader would find it, in a PT_NOTE (.note.nixos.ldcache, owner NixOS, type 0x63a86cb6).
The descriptor is a sequence of NUL-terminated (needed, path-list) string pairs, where each path-list element is either =<absolute-path> for a directly resolved library or ?<dir> for a directory the loader must still search itself (used for $ORIGIN-relative and glibc-hwcaps directories).
Why
On /nix/store-based systems every dependency lives in its own directory, so an ELF's DT_RUNPATH lists one directory per dependency. To resolve each DT_NEEDED soname the loader probes every DT_RUNPATH directory in turn, producing a storm of failing openat()/stat() calls (roughly libraries times directories) before each library is found. This measurably slows program startup, especially on slow disks, network filesystems, and low-power hardware. See NixOS/nixpkgs#481620.
A glibc loader that understands this note can resolve a binary's direct dependencies straight from it, skipping the DT_RUNPATH walk entirely. The note is consulted after LD_LIBRARY_PATH and before the DT_RUNPATH search, so LD_LIBRARY_PATH, LD_PRELOAD, and /run/opengl-driver overrides keep working; an unpatched loader ignores the note.
This is the writer half; the loader half is a separate glibc patch.
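To make the "roughly libraries times directories" estimate concrete, here is a minimal Python model of the run-path walk. It is an illustration only: it counts failed directory probes rather than real syscalls, and the store paths and function names are made up for the example.

```python
def count_failed_probes(needed, run_path, location):
    """Count directory probes that miss before each soname is found.

    Models only the in-order run-path walk: each soname is probed in every
    run-path directory until the one that holds it is reached.
    """
    failed = 0
    for soname in needed:
        for directory in run_path:
            if location[soname] == directory:
                break  # found: the walk for this soname stops here
            failed += 1
    return failed


def nix_style_layout(n):
    """Hypothetical layout: n libraries, each alone in its own directory."""
    needed = [f"lib{i}.so" for i in range(n)]
    run_path = [f"/nix/store/dep{i}/lib" for i in range(n)]
    location = dict(zip(needed, run_path))
    return needed, run_path, location


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, count_failed_probes(*nix_style_layout(n)))
```

In this model the number of misses grows quadratically with the number of dependencies, whereas a loader that opens every "=<path>" entry directly would incur none of them for the resolved libraries.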
Behaviour
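* The note is placed in its own page-aligned PT_LOAD covered by a PT_NOTE, with a matching SHT_NOTE section; the program and section header tables are relocated through the existing rewriteSections() path (handles both ET_DYN and the page-aligned ET_EXEC case).
* The search path is read from DT_RUNPATH, falling back to DT_RPATH when DT_RUNPATH is absent, mirroring the loader.
* Building is idempotent, and a rebuild after the run path changed errors instead of keeping the stale note; an empty run path warns and writes nothing.
* Rejected together with --force-rpath (the cache only applies to DT_RUNPATH).
Tests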
tests/build-resolution-cache.sh (PIE, DT_RPATH fallback, stale-rebuild, no-run-path, LD_LIBRARY_PATH override, idempotency, --force-rpath rejection) and tests/build-resolution-cache-no-pie.sh (the non-PIE ET_EXEC page-boundary case).
Credit
Based on the per-DSO resolution-cache approach originally proposed by pennae in
NixOS/nixpkgs#207893.
Disclosure: this change was developed with the assistance of Claude Code (Claude Opus 4.8); the commit carries an Assisted-by: trailer, and the author has reviewed and is accountable for it.
🤖 Generated with Claude Code