Vec::dedup_by docs explicit function argument order#157995
Conversation
|
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
Why was this reviewer chosen?The reviewer was selected based on:
|
| /// If it returns `true`, the `current` element is removed from the vector. | ||
| /// | ||
| /// If the vector is sorted, this removes all duplicates. | ||
| /// The element `prev` occurs *before* `current` in the vector (`[.., prev, current, ..]`), |
There was a problem hiding this comment.
| /// The element `prev` occurs *before* `current` in the vector (`[.., prev, current, ..]`), | |
| /// The element `prev` occurs *before* `current` in the vector (`[.., prev, .., current, ..]`), |
It's not obvious to me whether / why we make the ordering guarantee, but I think it's probably clearer if we don't imply prev is directly before current? It's definitely not the case if you view this as the 'original' slice -- we batch move elements back into place, so there's dropped elements between prev and current.
There was a problem hiding this comment.
Yeah, I misread the source-code. The first part checks whether a duplicate element exists, and elements are in consecutive order there. But that's not true for the actual de-duplication part later on.
For me its not about the ordering guarantee, but mentioning that the order (of the current implementation) is unexpected. And since that ordering does exist, I think we now should also guarantee it. Any future re-implementation of dedup_by changing the order would break a lot of code.
There was a problem hiding this comment.
I think the ordering is guaranteed already:
The elements are passed in opposite order from their order in the slice
even if it's a bit more vaguely worded, I think this is the same guarantee you are making, right?
(happy to take a wording improvement but I'm not seeing a new guarantee here)
There was a problem hiding this comment.
Yes, its only a wording improvement.
| /// however, the order (ascending vs. descending) can matter. | ||
| /// | ||
| /// Both references passed to `same_bucket` are mutable. | ||
| /// This allows merging elements by mutating `prev` and returning `true`. |
There was a problem hiding this comment.
Are there simple examples for both the 'complicated predicates' and the 'merging' discussed here that we could add? It seems plausibly useful to show how that might look. Definitely the current example given seems like it would be better to stay as a, b arguments -- current, prev is just adding noise there.
There was a problem hiding this comment.
The reason I changed a, b in general was that (for me) they imply the order [.., a, b, ..], which is not the case. If you mean just change the naming in the example, then at least use maybe x, p or something.
There was a problem hiding this comment.
Maybe I can think of a simple example of complicated predicates and merging yet. The example I came across in my personal code was sorting Vec<Range<usize>> by .start and then merging intervals that were subsets of another.
|
Reminder, once the PR becomes ready for a review, use |
Co-authored-by: Mark Rousskov <mark.simulacrum@gmail.com>
|
I added a non-trivial example which depends on the sorting order and demonstrates how merging could look like. I also shortened the variable names. @rustbot ready |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Because they are essentially the same docs, I wanted to link from @rustbot ready |
Clarify and expand
Vec::dedup_byandslice::partition_dedup_bydocs.Fixes #157959