Rewrite rustc_span::symbol::Interner to reduce lock contention#157701
Rewrite rustc_span::symbol::Interner to reduce lock contention#157701heinwol wants to merge 1 commit into
rustc_span::symbol::Interner to reduce lock contention#157701Conversation
| { | ||
| Entry::Occupied(v) => v.get().1, | ||
| Entry::Vacant(view) => { | ||
| let arena_write_lock = self.arena.lock(); |
There was a problem hiding this comment.
The arena can be stored behind the same lock as the map, right? The arena will only be locked while the map is write locked.
There was a problem hiding this comment.
unfortunately, RwLock<T> Requires T: Sync, which DroplessArena is not
EDIT: hold on, i'm a bit lost here, rechecking
|
I'll run the single-threaded benchmarks after #157701 (comment) is addressed. In the meantime, @heinwol could you run the parallel benchmarks locally using the https://github.com/Zoxc/rcb tool? |
|
Reminder, once the PR becomes ready for a review, use |
3742d99 to
d311ef7
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
The original implementation used a single `rustc_data_structures::sync::lock::Lock` for both reads and writes, which _allegedly_ caused unnecessary lock contention for read-heavy highly-parallelized scenarios. Now we have 2 locks: `RwLock` for the symbol map and `Lock` for the arena
d311ef7 to
d51f1b7
Compare
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Rewrite `rustc_span::symbol::Interner` to reduce lock contention
|
There's a question, however: |
| pub(crate) struct Interner(Lock<InternerInner>); | ||
| pub(crate) struct Interner { | ||
| map: RwLock<InternerMap>, | ||
| arena: Lock<DroplessArena>, |
There was a problem hiding this comment.
It shouldn't be very hard to add an AtomicDroplessArena. Actually there's the VecCache from rustc_data_structures that functions almost exactly like it.
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (4b2184e): comparison URL. Overall result: ❌ regressions - please read:Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf. Next, please: If you can, justify the regressions found in this try perf run in writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -3.0%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary 4.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis perf run didn't have relevant results for this metric. Bootstrap: 517.006s -> 516.711s (-0.06%) |
The original implementation used a single
rustc_data_structures::sync::lock::Lockfor both reads and writes, which allegedly caused unnecessary lock contention for read-heavy highly-parallelized scenarios. Now we have 2 locks:RwLockfor the symbol map andLockfor the arenaBased on #157252
r? @petrochenkov
could you run the CI benches please?