seq: Performance optimization#9557
Conversation
|
GNU testsuite comparison: |
|
Would be great to add your example to the benchmarks |
In a separate pr please :) |
anastygnome
left a comment
There was a problem hiding this comment.
Any reason for not using
n.checked_ilog10().unwrap_or(0) + 1Which should be available?
Seemed unnecessary given it is just a constant. If there is any benefit to this alternative I am happy to refactor. |
88d12d3 to
f026b7a
Compare
Merging this PR will improve performance by 58.82%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ⚡ | Simulation | seq_large_integers |
2.1 ms | 1.4 ms | +58.82% |
| ⚡ | Memory | seq_large_integers |
52.8 KB | 46.1 KB | +14.49% |
Comparing FidelSch:seq-optimization (43b0ca5) with main (bb91a5b)
Footnotes
-
38 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
|
any idea why tsort_input_parsing_heavy regressed ? |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
If I recall correctly the performance is similar to your solution and it's part of the language. Could you try sending a commit to trigger the benchmarks with this solution instead? |
|
note, upstream fails this way - what do you think we should do here? |
|
After some digging, the maximum value |
ok, could you please document this in docs/src/extensions.md ? thanks |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
@FidelSch a few changes seem to be unrelated, no ? |
|
Ah, they seem to have been introduced by my markdown formatter, did not notice until now. Should I roll them back? |
|
yes please |
02ac2b2 to
f1c2b37
Compare
Clarify seq output accuracy and value range limitations compared to GNU coreutils.
|
GNU testsuite comparison: |
sylvestre
left a comment
There was a problem hiding this comment.
The optimization idea is sound — avoiding to_string() on large numbers is a real win.
However, the bits() / LOG2_10 approximation can be off by 1 for some values (e.g., exact powers of 10). Since this is used for padding width, being off by one character could produce misaligned output. Have you verified against the GNU test suite?
Reading #6182 I noticed that most of the time spent running
cargo run seq 4e4000003 4e4000003was just BigUint::to_string()Reading the code I found it is being called twice, once to get the actual string representation of the first number, and again on the last number, but only to get its length; and discarding the actual result.
This seems fine on fairly small numbers, but its efficiency degrades significantly on larger ones.
In my machine, this change resulted in a ~2x speedup on the mentioned case, and a marginal but seemingly better performance on smaller cases.