Nexus3 Availability During OrientDB Backup

sonatype3 · October 17, 2020, 10:08am

Hello Community!

I’m relatively new to Nexus and I want to understand this tip from the documentation:

Write access to databases is temporarily suspended until a backup is complete. It’s advised to schedule backup tasks during off-hours.

(Prepare a Backup)

Does the above comment mean that users will get an error when trying to use the Nexus during the backup? Can users read artifacts during the backup? Can they publish / write content during the backup?

Our OrientDB database is 1.2GB (with an 8TB blob store). It took 19s to back up that 1.2GB using the Backup Task. This created a file of 100-200MB. I tried to write to the Nexus (using a test upload) at the same time as the backup, but I missed the window because the backup was so fast!

The above Nexus is a production instance of OSS, so ideally I need to build a DEV instance to perform backup and restore testing … and further my understanding.

Hoping you can help.
Kind Regards,
Edd

jgergel · October 17, 2020, 1:13pm

Hi Edd, Welcome to the Sonatype Community!

While the task that exports the databases for backup is running, Nexus Repository Manager will be in a read-only state. Users/processes will not be allowed to upload new artifacts.

Tip: remember to take a regular backup copy of the exported database files and all of the Sonatype-work directory for disaster recovery purposes.

Cheers!

sonatype3 · October 17, 2020, 6:45pm

Thank you Jerry!

Yes, we’re backing up that directory with an EBS snapshot in AWS. I’ve set the Backup Task to run just before the EBS [file-system] snapshot, so that the OrientDB backup output and the blob storage all get captured in the same backup.

I have one related question where the Community may be able to help …

We’ve recently found out that our users would like to move towards a Recovery Time Objective of 1hr, so I am thinking of running this OrientDB backup every hour and taking the EBS snapshot every hour too. I may need to implement a solution that stops our Bamboo pipelines coincidentally publishing to the Nexus each hour at the same time as the OrientDB backup, so they do not sporadically fail when they hit that exact 19 seconds of backup activity. Maybe the best solution is to modify our jobs so that they retry the activity … ??

Ideas most welcome.

Best Regards,
Edd

jgergel · October 19, 2020, 3:58pm

Hi Edd,

Thanks for reaching back out to us.

The frequency of the export is really driven by your Recovery Point Objective (RPO), that is, how much data you can afford to lose. The Recovery Time Objective (RTO) is how long it takes you to recover the system to a useful state. If your RPO is defined as 6 hours, then you only need to run the export every 6 hours.
Also, if you are trying to reduce the number of orphaned assets in between backups, the "Repair - Reconcile component database from blob store" task will help with that.

On the topic of retrying the upload, yes we have seen customers add a retry when the upload fails. The delay time can be whatever you feel is appropriate.
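
As a rough illustration of the retry idea, here is a minimal Python sketch. It is not a Nexus or Bamboo API: `retry_upload`, its parameters, and the `action` callable (whatever your pipeline does to publish) are all hypothetical names, and the 30-second default delay is just an assumption chosen to outlast a backup of about 19 seconds.

```python
import time


def retry_upload(action, attempts=5, delay_seconds=30.0, sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    action        -- hypothetical callable that performs one upload and
                     raises an exception on failure
    attempts      -- total number of tries, including the first one
    delay_seconds -- fixed wait between tries (assumed value, tune it)
    sleep         -- injectable so the wait can be skipped in tests
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            # Out of tries: let the last error reach the build log.
            if attempt == attempts:
                raise
            sleep(delay_seconds)
```

A job would wrap its existing publish step, for example `retry_upload(lambda: publish_artifact(path))`, where `publish_artifact` stands in for your own upload call. A fixed delay keeps the sketch simple; it also retries on every exception type, which you may want to narrow to the error your client raises when the repository is read-only.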

Also, the amount of time it takes to export the databases will grow over time. As you add more components or versions, the export will take a little longer. I suggest giving yourself a little buffer in the timing, reviewing the stats for growth, and making adjustments as needed.
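
One way to put a number on that buffer, sketched in Python under stated assumptions: take the slowest export you have observed and multiply it by a headroom factor. The function name, the sample durations, and the factor of 2 are all made up for illustration; only the 19-second figure comes from this thread.

```python
def buffered_window_seconds(observed_durations, headroom=2.0):
    """Return a quiet-window length: slowest observed export times headroom.

    observed_durations -- export durations in seconds, e.g. from task logs
    headroom           -- assumed safety multiplier to absorb growth
    """
    if not observed_durations:
        raise ValueError("need at least one observed duration")
    return max(observed_durations) * headroom


# Hypothetical recent export times in seconds; 19 is the one reported above.
window = buffered_window_seconds([17, 19, 18])
print(f"Keep pipelines quiet or retrying for about {window:.0f}s")
```

Re-running this against fresh durations every so often shows when the window, or the retry delay, needs to grow.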

Hope all of this helps. Cheers!