crypto: Handle modexp with exponent of 1 separately #1424
Pull request overview
This PR optimizes the modular exponentiation (modexp) implementation by handling the exponent value of 1 as a special case, avoiding 2 Montgomery multiplications. The optimization initializes the result with the base in Montgomery form and reduces the loop iterations by 1.
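To see where the two saved multiplications come from, here is a short derivation for plain left-to-right square-and-multiply. The symbols $b$, $e$, $m$, $n$, and $r_i$ are notation introduced here for illustration, and the derivation ignores the Montgomery-form conversion that the real code performs. Write the nonzero exponent as $e = \sum_{i=0}^{n-1} e_i 2^i$ with $n$ its bit width, so the top bit $e_{n-1}$ is 1:

$$
r_n = 1, \qquad r_i = r_{i+1}^2 \cdot b^{e_i} \bmod m \quad (i = n-1, \dots, 0), \qquad b^e \bmod m = r_0
$$

$$
e_{n-1} = 1 \;\Longrightarrow\; r_{n-1} = 1^2 \cdot b = b
$$

Starting directly from $r_{n-1} = b$ skips the first iteration, which would otherwise spend one squaring and one multiplication to compute a value that is already known. The loop then runs $n - 1$ times instead of $n$.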
Changes:
- Modified `modexp_odd` to initialize the result with the base instead of 1, and to loop from `bit_width() - 1` instead of `bit_width()`
- Added special handling for exponent = 0 in `modexp_impl` to return 1 (or 0 if the modulus is 1) without computation
- Added test vectors covering edge cases, including exponent 0, exponent with modulus 1, and modulus 2
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| lib/evmone_precompiles/modexp.cpp | Optimized modexp_odd to skip first iteration, added exponent=0 special case, improved performance by 15-42% for small exponents |
| test/unittests/precompiles_expmod_test.cpp | Added test vectors for edge cases: exponent 0, exponent with modulus 0/1/2 to ensure correctness |
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #1424 +/- ##
=======================================
Coverage 81.68% 81.68%
=======================================
Files 152 152
Lines 13576 13584 +8
Branches 3217 3218 +1
=======================================
+ Hits 11089 11096 +7
Misses 343 343
- Partials 2144 2145 +1
This allows optimizing the main loop and avoiding 2 Montgomery
multiplications.
```
│ old │ new │
│ gas/s │ gas/s vs base │
modexp<expmod_execute>/mod_len:8/exp_bits:33-14 875.8M ± 7% 891.7M ± 0% +1.81% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:16/exp_bits:33-14 868.9M ± 0% 877.8M ± 1% +1.02% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:24/exp_bits:33-14 217.1M ± 0% 223.9M ± 0% +3.14% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:32/exp_bits:33-14 218.0M ± 0% 225.0M ± 0% +3.24% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:32/exp_bits:256-14 237.4M ± 0% 239.4M ± 0% +0.87% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:32/exp_bits:8192-14 472.1M ± 0% 475.8M ± 0% +0.77% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:40/exp_bits:11-14 182.7M ± 0% 196.5M ± 1% +7.52% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:48/exp_bits:8-14 239.6M ± 0% 265.1M ± 2% +10.63% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:48/exp_bits:256-14 336.5M ± 0% 337.5M ± 0% +0.30% (p=0.001 n=11)
modexp<expmod_execute>/mod_len:56/exp_bits:6-14 297.9M ± 0% 339.2M ± 0% +13.87% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:64/exp_bits:5-14 348.9M ± 1% 407.0M ± 0% +16.65% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:72/exp_bits:4-14 116.0M ± 2% 144.1M ± 0% +24.21% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:112/exp_bits:4-14 271.0M ± 0% 335.1M ± 1% +23.66% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:136/exp_bits:3-14 86.38M ± 0% 114.13M ± 0% +32.12% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:192/exp_bits:2-14 107.2M ± 1% 153.1M ± 1% +42.91% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:504/exp_bits:2-14 47.21M ± 1% 67.21M ± 1% +42.37% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:512/exp_bits:2-14 48.79M ± 0% 69.45M ± 1% +42.34% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:512/exp_bits:8192-14 348.0M ± 0% 347.5M ± 1% ~ (p=0.478 n=11)
modexp<expmod_execute>/mod_len:520/exp_bits:2-14 50.24M ± 0% 71.75M ± 1% +42.81% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:1016/exp_bits:2-14 188.3M ± 0% 267.9M ± 0% +42.25% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:1024/exp_bits:2-14 191.5M ± 0% 271.7M ± 1% +41.93% (p=0.000 n=11)
modexp<expmod_execute>/mod_len:1024/exp_bits:256-14 700.3M ± 0% 700.7M ± 0% ~ (p=0.748 n=11)
modexp<expmod_execute>/mod_len:1024/exp_bits:2048-14 1.327G ± 0% 1.320G ± 1% -0.52% (p=0.003 n=11)
geomean 231.5M 268.3M +15.89%
```