fix: Handle symbols with empty versions#1480
Conversation
There was a problem hiding this comment.
EDIT: fixed by ignoring symbols missing from one of the linkers.
This sort of duplicates symbol_diff.rs because missing symbols will be present in both cases, like in this case:
❯ WILD_REFERENCE_LINKER=ld.bfd ./run-with ~/Projects/wild/target/debug/wild
wild: ./bin
ref: ./bin.ref-linker
.gnu.hash
wild Dynamic symbol `fuse_new` hash lookup found 128, expected 129
ref Dynamic symbol `fuse_reply_statfs` hash lookup found 138, expected 139
.dynamic.DT_FLAGS_1.NOW
wild 1
ref
version.__fuse_exited
wild
ref global (hidden)
version.__fuse_loop_mt
wild
ref global (hidden)
version.__fuse_process_cmd
wild
ref global (hidden)
version.__fuse_read_cmd
wild
ref global (hidden)
version.__fuse_set_getcontext_func
wild
ref global (hidden)
version.__fuse_setup
wild
ref global (hidden)
version.__fuse_teardown
wild
ref global (hidden)
dynsym.__fuse_exited.section
wild
ref .text
dynsym.__fuse_loop_mt.section
wild
ref .text
dynsym.__fuse_process_cmd.section
wild
ref .text
dynsym.__fuse_read_cmd.section
wild
ref .text
dynsym.__fuse_set_getcontext_func.section
wild
ref .text
dynsym.__fuse_setup.section
wild
ref .text
dynsym.__fuse_teardown.section
wild
ref .text
| ); | ||
| } | ||
|
|
||
| values.sort_values(); |
There was a problem hiding this comment.
Unfortunately, removing it doesn't fail Wild's tests, ATM.
This is necessary for cases when the same symbol has multiple versions like here:
version.fuse_new
wild FUSE_2.5 (hidden),FUSE_2.6,FUSE_2.2 (hidden),global (hidden)
ref FUSE_2.5 (hidden),global (hidden),FUSE_2.6,FUSE_2.2 (hidden)
| // If we don't optimise a TLS access, then we'll have references to __tls_get_addr, | ||
| // when GNU ld doesn't. | ||
| "dynsym.__tls_get_addr.*", | ||
| "version.__tls_get_addr", |
There was a problem hiding this comment.
symbol_diff.rs uses a bit different formatting: dynsym.<section_name>.section, so there is an inconsistency here.
|
EDIT: This also happens on loongarch64 but I don't test it locally. @marxin any clue why dynsym is so much different between Wild and GNU ld only on RISC-V? Details |
| // TODO: this duplicates `symbol_diff.rs` | ||
| // On aarch64, GNU ld emits a dynamic symbol called "_stack", which it puts in some section | ||
| // or other that doesn't make sense. e.g. ".got.plt". It probably puts it in that section | ||
| // because it's closest to the value that it assigns to the symbol. It's not clear where | ||
| // this symbol comes from. It's neither in any input files, nor in GNU ld's built-in linker | ||
| // script. | ||
| if sym_name == b"_stack" { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
This should be decided what to do here before merging, leaving a comment.
There was a problem hiding this comment.
I was able to get GNU ld to emit "_start" in its linker script by running ld -m armelf --verbose on aarch64.
There was a problem hiding this comment.
I copied this as is from symbol_diff.rs. Both symbol_diff.rs and this pass hit the same issues because they both iterate through .dynsym (pass from this PR also iterates through .gnu.version).
There was a problem hiding this comment.
Ah, gotcha. The comment does seem written in the way I would write it :) I guess it could be good to have the shared logic in a single method - probably wouldn't be less code, but would make it clear that they're have the same logic.
There was a problem hiding this comment.
Added DiffMode::IgnoreMissingValues, so we can get rid of the duplicated conditions.
Also, this allows removing the ignores for cases where only one of the linkers outputs a specific symbol. So this pass no longer duplicates missing symbols found by symbol_diff.rs.
Not to confuse with symbols without versions, empty version is `foo@`. It has special properties, making the linker ignore version script lines regarding this symbol. Before this PR, fuse2 would fail to build with "Symbol fuse_new has undefined version".
1a5717b to
a123ba3
Compare
| // TODO: this duplicates `symbol_diff.rs` | ||
| // On aarch64, GNU ld emits a dynamic symbol called "_stack", which it puts in some section | ||
| // or other that doesn't make sense. e.g. ".got.plt". It probably puts it in that section | ||
| // because it's closest to the value that it assigns to the symbol. It's not clear where | ||
| // this symbol comes from. It's neither in any input files, nor in GNU ld's built-in linker | ||
| // script. | ||
| if sym_name == b"_stack" { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
I was able to get GNU ld to emit "_start" in its linker script by running ld -m armelf --verbose on aarch64.
| // return Ok(None); | ||
| // } | ||
|
|
||
| // There is a quirk that I couldn't find docs for. When a symbol has an empty version |
There was a problem hiding this comment.
Have you checked the behaviour of LLD? Does it do the same thing?
There was a problem hiding this comment.
It doesn't mark unversioned symbol as hidden version, so we get this diff with LLD:
version.fuse_new
wild FUSE_2.2 (hidden),FUSE_2.5 (hidden),FUSE_2.6,local or global (hidden)
ref FUSE_2.2 (hidden),FUSE_2.5 (hidden),FUSE_2.6,local or global
FUSE_2.6 is non-hidden with all three linkers, local or global is hidden with GNU ld and Wild.
Other than that, it seems to be the same.
That might be related to the fact that there are many (compared to other architectures) debug info relocations pointing to symbols: https://github.com/davidlattimore/wild/blob/f757450b8780f70a3a27334432ac93256191b9f4/libwild/src/layout.rs#L4925 |
|
Other linkers don't have them, that's what puzzling me. Thanks anyway, I'll split out as a new issue. |
davidlattimore
left a comment
There was a problem hiding this comment.
Thanks for fixing this and sorry about the delay reviewing, I didn't realise you were waiting for me
|
It's fine, if it was something important I'd have pinged you sooner. |
Not to confuse with symbols without versions, empty version is
foo@. It has special properties, making the linker ignore version script lines regarding this symbol.Before this PR, fuse2 would fail to build with "Symbol fuse_new has undefined version".