N3: blob store corruption

OK, I restored from the backup taken prior to the storage issue; however, the result wasn’t much different and there were still many issues.

If someone encounters something similar in the future, these are the steps I performed (on the restored backup) to recover:

  1. With Nexus stopped, launch the OrientDB console, connect to the component database, and remove duplicate asset entries from it (there were many, and they prevented the rebuild of the index asset_bucket_component_name_idx).
    Note: The SQL statement shown deletes the older duplicate entries. If you remove the DESC at the end, it removes the newer duplicate entries instead. After checking the data, I decided that deleting the older duplicates was the better choice.
/usr/lib/jvm/jre-1.8.0/bin/java -jar /opt/nexus3/lib/support/nexus-orient-console.jar
CONNECT PLOCAL:/sonatype-work/nexus3/db/component admin admin
DELETE FROM asset WHERE @rid NOT IN(SELECT FIRST(LIST(@rid)) FROM asset GROUP BY name,bucket ORDER BY @rid DESC)
  2. Still in the OrientDB console, drop the browse_node table (to force its full rebuild and avoid the org.sonatype.nexus.repository.browse.internal.orient.BrowseNodeCollisionException: Node already has an asset exceptions).
DROP CLASS browse_node
  3. Still in the OrientDB console, repair the component database and rebuild its indexes (mainly to verify that things are fine now).
REPAIR DATABASE --fix-graph
REPAIR DATABASE --fix-links
REPAIR DATABASE --fix-ridbags
REPAIR DATABASE --fix-bonsai
REBUILD INDEX *
  4. Still in the OrientDB console, export, drop, and reimport the database in order to start in a clean state and with the smallest possible database (export and import is currently the only way to compact an OrientDB database).
    Note: In some cases the drop isn’t successful; you will then need to DISCONNECT and EXIT from the OrientDB console, delete the database directory on the filesystem, launch the OrientDB console again, and continue.
EXPORT DATABASE /tmp/nexus3-component
DROP DATABASE
CREATE DATABASE PLOCAL:/sonatype-work/nexus3/db/component admin admin
IMPORT DATABASE /tmp/nexus3-component.json.gz -preserveClusterIDs=true
DISCONNECT
EXIT
  5. Start Nexus and let it rebuild the browse nodes (example from the log: Task 'Rebuild repository browse tree - (npm-internal,npm-external,npm-all,documentation,npmjs)' [create.browse.nodes] state change RUNNING -> OK).

  6. Log in to the Nexus admin console and manually start the following tasks, one at a time and in the order shown:
    Repair - Reconcile component database from blob store (on all blob stores)
    Repair - Reconcile npm /-/v1/search metadata (on all repositories)
    Repair - Rebuild repository browse (on all repositories)
    Repair - Rebuild repository search (on all repositories)
    Repair - Rebuild npm metadata (on all repositories)
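
Since the duplicate-removal DELETE in the first step is destructive, it may help to reason about what it is meant to do before running it. The following is only a minimal Python sketch of the intended selection as described in the note there: group assets by (name, bucket), keep one record per group, and delete the rest. It does not talk to OrientDB. The function name, the sample rows, and the assumption that a higher record ID means a newer record are all mine, for illustration.

```python
from collections import defaultdict


def rids_to_delete(assets, keep_newest=True):
    """Return the record IDs that a 'keep one per (name, bucket)' cleanup would delete.

    assets: iterable of (rid, name, bucket), where rid is a (cluster, position) tuple.
    Assumption: a higher rid means a newer record.
    """
    groups = defaultdict(list)
    for rid, name, bucket in assets:
        groups[(name, bucket)].append(rid)

    doomed = []
    for rids in groups.values():
        rids.sort()
        keeper = rids[-1] if keep_newest else rids[0]
        doomed.extend(r for r in rids if r != keeper)
    return sorted(doomed)


# Hypothetical sample data: two duplicates in npm-internal, one unique asset in npm-external.
sample = [
    ((25, 1), "left-pad/-/left-pad-1.0.0.tgz", "npm-internal"),
    ((25, 7), "left-pad/-/left-pad-1.0.0.tgz", "npm-internal"),
    ((25, 3), "left-pad/-/left-pad-1.0.0.tgz", "npm-external"),
]
print(rids_to_delete(sample))                     # older duplicate is deleted
print(rids_to_delete(sample, keep_newest=False))  # newer duplicate is deleted
```

With keep_newest=True only the older of the two npm-internal records is selected for deletion; the npm-external record is untouched because it has no duplicate within its bucket.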

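If you would rather not watch the log by eye while waiting for the browse node rebuild, something like the following Python sketch could pick out the relevant line. The regular expression is based solely on the single log excerpt quoted in the steps above, so treat the pattern as an assumption about the log format; the function name is hypothetical, and you would feed it lines from wherever your Nexus log lives.

```python
import re

# Pattern derived from the one log excerpt in this post; other Nexus versions may differ.
TASK_STATE = re.compile(
    r"Task '(?P<name>[^']*)' \[(?P<type>[^\]]+)\] "
    r"state change (?P<old>\w+) -> (?P<new>\w+)"
)


def finished_tasks(lines, task_type="create.browse.nodes"):
    """Return the names of tasks of the given type that changed state to OK."""
    done = []
    for line in lines:
        match = TASK_STATE.search(line)
        if match and match.group("type") == task_type and match.group("new") == "OK":
            done.append(match.group("name"))
    return done


# The log excerpt quoted in the steps above.
example = (
    "Task 'Rebuild repository browse tree - "
    "(npm-internal,npm-external,npm-all,documentation,npmjs)' "
    "[create.browse.nodes] state change RUNNING -> OK"
)
print(finished_tasks([example]))
```

An empty result simply means no matching "-> OK" line has been seen yet, so the manual repair tasks should wait.
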
Honestly, I hope that when Nexus 4 comes out one day, it uses a completely different storage backend and not this horrible mess that causes nothing but unnecessary trouble.
We still use Nexus 2 as our main repository system (we use Nexus 3 only where it’s unavoidable) for this very reason. Maintaining Nexus 2 and dealing with the consequences of crashes or other issues is a breeze compared to Nexus 3.
