Skip to content

improve pkg_resources.Distribution.hashcmp performance#2090

Closed
gotcha wants to merge 3 commits into
pypa:masterfrom
gotcha:optimize-ditribution-hashcmp
Closed

improve pkg_resources.Distribution.hashcmp performance#2090
gotcha wants to merge 3 commits into
pypa:masterfrom
gotcha:optimize-ditribution-hashcmp

Conversation

@gotcha

@gotcha gotcha commented May 6, 2020

Copy link
Copy Markdown

by storing location without md5.

Turns Distribution.location into a property to make hashcmp more performant.

Closes #2089

Pull Request Checklist

  • Changes have tests
  • News fragment added in changelog.d. See documentation for details

by storing location without md5
@gotcha

gotcha commented May 6, 2020

Copy link
Copy Markdown
Author

As the fix is not supposed to change any semantics, I wonder which sort of tests should be added.

Comment thread pkg_resources/__init__.py Outdated
instead of `property(getter, setter)`

Co-authored-by: Nguyễn Gia Phong <mcsinyx@disroot.org>
@gotcha gotcha changed the title improve pkg_resources.Distribution.hascmp performance improve pkg_resources.Distribution.hashcmp performance May 6, 2020
@jaraco

jaraco commented May 10, 2020

Copy link
Copy Markdown
Member

Although this approach achieves the desired result (only calculate _location_without_md5 once), it causes quite a bit of entanglement with the location property, unnecessarily adding a new ._location property. How about instead having _location_without_md5 be a cached property or method to avoid it being calculated more than once for any given Distribution?

@gotcha

gotcha commented May 11, 2020

Copy link
Copy Markdown
Author

I make location a property as there is afaik nothing that avoids setting location twice.

@jaraco

jaraco commented May 12, 2020

Copy link
Copy Markdown
Member

That's fair, and I had the same concern.

After inspecting the code, it appears it's only ever set once during the constructor. I'd hate to add code that explicitly adds support for setting the value later when it will only ever be set once during construction.

I'll go with #2098 until a use-case demands a setter. Thanks for the contrib!

@jaraco jaraco closed this May 12, 2020
@jaraco jaraco reopened this May 12, 2020
@jaraco

jaraco commented May 12, 2020

Copy link
Copy Markdown
Member

I think more important than before accepting either PR will be to define a problem that needs to be solved, so I'll await an answer in #2089

@jaraco

jaraco commented May 13, 2020

Copy link
Copy Markdown
Member

I prefer #2108. Thanks again for the contrib.

@jaraco jaraco closed this May 13, 2020
@gotcha

gotcha commented May 13, 2020

Copy link
Copy Markdown
Author

It is always better to remove code when possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pkg_resources.Distribution.hashcmp is slow

3 participants