Hosted PyPI repository /simple interface does not provide SHA256 hashes

We are using Sonatype Nexus Repository Manager (OSS 3.21.1-01) to provide a hosted PyPI repository.

Is there a way to make the Nexus hosted PyPI repository provide SHA256 hashes in the /simple web interface’s href attributes that link to the packages being served, as described in PEP 503? Currently, the Nexus hosted PyPI repository’s /simple web interface defaults to providing MD5 hashes for packages.

From PEP 503, describing the /simple interface:

The href attribute MUST be a URL that links to the location of the file for download, and the text of the anchor tag MUST match the final path component (the filename) of the URL. The URL SHOULD include a hash in the form of a URL fragment with the following syntax: #<hashname>=<hashvalue>, where <hashname> is the lowercase name of the hash function (such as sha256) and <hashvalue> is the hex encoded digest.

Repositories SHOULD choose a hash function from one of the ones guaranteed to be available via the hashlib module in the Python standard library (currently md5, sha1, sha224, sha256, sha384, sha512). The current recommendation is to use sha256.

On client systems that are FIPS 140-2 compliant (the MD5 algorithm is disabled), ‘pip’ cannot download packages from the Nexus hosted PyPI repository because it is providing MD5 hashes in the href fragment.
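To check which hash function an index is advertising, the fragment can be pulled out of each href. Below is a minimal sketch, assuming hrefs follow the `#<hashname>=<hashvalue>` form quoted from PEP 503 above. The function name `hash_from_href` and the example paths are hypothetical illustrations, not actual Nexus output.

```python
from urllib.parse import urlsplit


def hash_from_href(href):
    """Return (hashname, hashvalue) from a PEP 503 style href, or None if absent."""
    fragment = urlsplit(href).fragment
    name, sep, value = fragment.partition("=")
    if not sep or not name or not value:
        return None
    return name.lower(), value.lower()


# Hypothetical hrefs, shaped like the anchors a /simple page might contain.
md5_href = "../../packages/example/1.0/example-1.0.tar.gz#md5=0123456789abcdef0123456789abcdef"
sha256_href = "../../packages/example/1.0/example-1.0.tar.gz#sha256=" + "ab" * 32

for href in (md5_href, sha256_href):
    parsed = hash_from_href(href)
    # A FIPS-restricted client would need something other than md5 here.
    print(parsed[0], "acceptable" if parsed[0] != "md5" else "md5 only")
```

Running this against the hrefs of a real /simple page would show whether any link still carries only an md5 fragment.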

@james.l.brophy Thanks for raising this. Could you please file this as an improvement ticket at http://issues.sonatype.org under the Nexus project?

Thank you for the suggestion.

I created issue NEXUS-24127.

Yeah, create an issue and be ignored for two years! Bravo!

Providing an MD5 sum as an integrity check is a security vulnerability. Together with CVE-2022-31289, a user could replace a legitimate package with a malicious one (see the corkami/collisions repository on GitHub, "Hash collisions and their exploitations"), injecting custom code and defeating the integrity check for tools (like Poetry) that rely on the integrity of Nexus. This one should have a CVE assigned to it.

Note that both the binary and the hash come from the same source, so the hash is only useful for determining whether there were errors downloading the file, not for security purposes.

If you are the victim of a man-in-the-middle attack, or if the server were compromised, then the attacker would control both the hash and the binary, rendering the particular algorithm moot.

Minor point, but please note that CVE-2022-31289 is not a vulnerability. The student researcher who originally claimed it went public before validating it with us, and misunderstood what he had found. The researcher’s blog post has since been removed.

You are incorrect. If you have requirements with hashes specified in git and somebody tampers with the Nexus repository in any way, the dependency check will fail. However, with MD5 you can create a zip collision in minutes. For that reason most “lock” files in git repos contain sha256 or sha512 checksums (like yarn.lock, Pipfile.lock, …).

Please read:
https://pip.pypa.io/en/latest/topics/secure-installs/#hash-checking-mode

However, weaker ones such as md5, sha1, and sha224 are excluded to avoid giving a false sense of security.

The next version of Poetry will calculate sha256 by itself for legacy repositories that still use md5, like Nexus.
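To illustrate the idea of checking a download against a hash pinned outside the repository, here is a minimal sketch. It is not pip's or Poetry's actual implementation. The helper names `sha256_of_file` and `matches_pinned_hash` are hypothetical, and the pinned value is assumed to use the `sha256:<hex>` form that pip requirement files accept via `--hash`.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, pinned):
    """Compare a local file against a pin of the form 'sha256:<hex>'."""
    algo, _, expected = pinned.partition(":")
    if algo != "sha256":
        # Mirror the stance quoted above: do not accept weaker algorithms.
        raise ValueError(f"unsupported hash algorithm: {algo!r}")
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected.lower())
```

Because the pin lives in the lock file under version control, a package swapped on the server after the pin was recorded would fail this comparison regardless of which hash the /simple page advertises.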

If I understand you correctly, the hash Nexus provides is only useful to determine whether there were errors downloading the file, not for security purposes. That’s a red flag in my book. Please check with your security team.

Again, consider that when both the hash and the binary have the same source, attackers can modify both using the same attack. Only when the hash and the binary are provided by two distinct systems can the hash be relied upon for security (e.g. metadata that links to binaries found on a CDN).

Well, you can easily validate hashes on PyPI in the case of an external library.

Lock files in git are an external system. Of course you are not protected from the start, when you add a new dependency, but you are protected if an attack happens later on, because the recorded hash preserves integrity. With md5, not so much.

Hi Matej,

Thank you for your comment. Our team is currently looking into the SHA256 hash request. We hope to have good news for you in the near future. Thanks!

Hi all,
Last week Sonatype released Nexus Repository 3.41.0, which now provides SHA256 hashes in the /simple interface of PyPI repositories (see the Nexus Repository 3.41.0 Release Notes).

From what I can see it looks like SHA256 is only being used for proxied repos. I think hosted repos are still based on MD5 sums…

Update: Found this detailed in [NEXUS-34950] pypi package versions published using twine before upgrading to 3.41.0 or later are missing from /simple index preventing discovery by clients - Sonatype JIRA. This is really unfortunate behaviour.