S3 Storage increased 10x over 8 months

nirfrankel.nf · February 29, 2024, 2:43pm

Previously our nexus repo bucket was stable at 2TB, early July we upgraded the version of Nexus to 3.32.0-03 from 3.17.0-01. The date of this coincides with the increase in storage.

I’ve used the S3 Storagelens to get a breakdown of the objects, we thought it might be failed multipart uploads, but it’s only a relatively small amount (20 GB), and versioning represents 16.8TB of the bloat. The bucket has ballooned from 2TB to 26.3TB. So there’s at least 7.5TB that is unaccounted for, also versioning was enabled prior and the storage was always consistent at 2TB so really it’s 24TB that is unexplained.
I’m not familiar with nexus at all, but I’ve been reviewing documentation and threads trying to understand what the volumes and chapters are. There’s a content directory with 43 vol directories and each has 47 chapter directories, from what I understand that’s a storage scheme of nexus. We also see the tmp directory there, with versioning represents a few TB. Finally there’s a directory called content with a subdirectory nexus-repository-docker, this is populated with thousands of directories named similar to this 00078b7e-d388-46a1-aa2d-355918d5cdfe, all with 2 .byte and 2 .properties files

mpiggott · February 29, 2024, 2:58pm

The main suggestions I would have:

Ensure you have not specified -1 in your blob store configuration for Expiration Days - Configuring Blob Stores
Ensure you have configured the tasks for Docker Delete Unused, and Docker Delete incomplete upload

Generally you should try to move to a significantly newer version given 3.32 was released in 2021.

nirfrankel.nf · February 29, 2024, 3:50pm

Thanks so much for responding Matthew, I’ve been looking into the deletion of unused docker images, there’s no cleanup policy but there is a cleanup task which executes daily to Cleanup repositories using their associated policies, after doing some reading I think this is the default task created by Nexus, when I check the Docker repo or any other repo for that matter there is no cleanup policy there.
I also read that:
The Admin - Compact blob store task does not apply to S3 blob stores, which are cleaned up using AWS Lifecycle.

I did see a disabled lifecycle rule for deleting objects that are older than 2 weeks, but it’s disabled and there’s no way to tell for how long. But ever since July the bucket has been increasing on a daily basis.
What could explain why we never saw the growth until July with the version being updated?

mpiggott · March 4, 2024, 6:27pm

Docker Delete incomplete upload

This job is responsible for removing partial uploads (the Docker client uploads layers in chunks over multiple requests) when uploads are not completed then you may have partial uploads left behind.

mpiggott · March 4, 2024, 6:29pm

https://help.sonatype.com/en/cleanup-policies.html#docker-cleanup-strategies

The other task referenced Docker Delete Unused identifies layers that aren’t referenced by tagged manifests and removes them.

nirfrankel.nf · March 11, 2024, 5:57pm

I’m not sure if this shows up on AWS but there is a filter for incomplete multipart uploads and this only represented about 20 GB, so not that significant in terms of the 28 TB we’re looking at.

nirfrankel.nf · March 11, 2024, 6:00pm

I haven’t been given the green light to delete unused docker images yet.

I was trying to look into how frequently Nexus communicates with AWS, perhaps limiting that. I’ve noticed that in the AWS bucket when we show the versions, there are thousands of versions for the same object, some being uploaded a few seconds a part. Versioning has always been enabled in the bucket and it has been around since 2018, but something happened when we updated the CFT template to the newer version of Nexus at the time.
Is there a way to look into how frequently Nexus uploads to AWS, or perhaps limit this to once a day or something?

mpiggott · March 12, 2024, 1:39pm

This is something internal to Nexus.

mpiggott · March 12, 2024, 1:47pm

For Nexus on the initial upload you should see 3 uploads:

Write properties
Write bytes
Rewrite properties

After the original upload the bytes are never re-written, properties files may be touched to mark as deleted depending on configuration. I’m aware of 2 single run tasks for pypi and rubygems that will update the properties file to add an additional hash.

nirfrankel.nf · March 12, 2024, 2:32pm

I just used the orientDB console and calculated the individual sizes of each repo.
Is it normal to see discrepancies between the total size of the blob store and the sum of the repos calculated using the orientDB console?
Or are the two not related?
For example we have a Docker repo: 10937684612293 → 10.9TB
But the size of the S3 blob store is a total of ~315GB
I expected the docker repo to the be the biggest one but didn’t expect the sum of the repos to be greater than what the GUI on the Nexus site was displaying.

mpiggott · March 12, 2024, 4:58pm

In some circumstances the blob store size can get out of sync.