Explain Repair Index and Update Index Options

Nexus Repository Manager OSS 2.15.1-02

In light of several indexing issues we are experiencing (see here, here and here), I am confused.

Building on some 3-year-old advice:

  1. The lucene search indexes in your Nexus Repo instance are corrupt.

  2. To fix this, shut down Nexus, and rename “sonatype-work/nexus/indexer” to “sonatype-work/nexus/indexer.old”. Then restart, and go to “administration/scheduled tasks” and create and run a repair indexes task against all repositories.
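The rename in step 2 can be scripted. Below is a minimal Python sketch, assuming Nexus is already shut down and the work directory is laid out as quoted above. The helper name set_aside_indexer is my own and not part of Nexus.

```python
import shutil
from pathlib import Path


def set_aside_indexer(work_dir: str, suffix: str = ".old") -> Path:
    """Rename <work_dir>/indexer to <work_dir>/indexer<suffix>.

    Assumes Nexus is stopped; renaming under a running instance is not safe.
    """
    indexer = Path(work_dir) / "indexer"
    backup = indexer.with_name(indexer.name + suffix)
    if not indexer.is_dir():
        raise FileNotFoundError(f"no indexer directory at {indexer}")
    if backup.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(f"{backup} already exists; move it away first")
    shutil.move(str(indexer), str(backup))
    return backup
```

It would be called as set_aside_indexer("sonatype-work/nexus") before restarting and running the repair task.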

We ran the “repair indexes task”, which as I understand it corresponds to the built-in Scheduled (manual) Task, Repair Repositories Index. After 7.5 hours it failed, but the error gave no indication of where it failed or how far along it was. So we are re-running it and expecting a different outcome :rofl:. I believe this is going to go through everything again.

Nevertheless, my question is about the menu options you get when you right-click in Repositories on:

  • a group: Repair Index | Update Index
  • a repository (or proxy): Repair Index | Update Index
  • a folder or item in the tree when you Browse Storage in a Group Repository: Update Index

I would assume Repair Index on a specific repository only repairs that specific repository (there’s also a Scheduled (manual) Task to that effect), correct?

What effect does Repair Index have if the indexer directory is not removed, versus when it is? Is there a complete rebuild if the indexer directory is removed?

What’s the difference between the indexer directory and the individual .index directories within each repository’s storage directory? Do they need to be removed too?

If the index is assumed to be corrupt, what is the effect of clicking Update Index?
Does Update Index on a group also trigger an Update Index on the member repositories first, or must one manually update the indexes of all member repositories (and proxies) before updating the group?

What happens when you click Update Index on a single item in Browse Storage, versus a folder, versus a folder with sub-folders?

There are also other Scheduled Task options:

  • Download Indexes
  • Publish Index
  • Optimize Repository Index
  • Update Repository Index
  • Repair Repository Index

So, how does “Optimize” interact with Repair or Update or Corrupted indexes?

I really just want to fix my problem, but I don’t understand why it’s broken or what the right sequence of steps is to fix it.

FTR: the indicators that “something is wrong” are:

  • Navigating “Browse Storage” reports “com/path (not found)”
  • Not being able to find known elements via Advanced Search

plus:


Task ID: 97
Task Name: PublishIndexAuthorized
Stack trace:
java.lang.IndexOutOfBoundsException: Index: 7650, Size: 35

Task ID: 18
Task Name: Optimize All Indexes
Stack trace:
java.io.IOException: Exception(s) happened during optimizeAllRepositoriesIndex()

Some additional information on actions taken …

After 11 days of waiting for RepairAllIndexes to complete, which it did not, I cancelled the task.
I then set about to manually reindex all the repositories, hosted and proxied.

It took just over an hour to reindex (repair index) “Central”, 10 mins for “sonatype-grid-releases” and 10 mins for the other proxies.
The largest hosted repos took about 20 mins each for snapshots and release, but the complete set of 17 hosted repos took about one hour. Details below.

Thus, I had no idea what it was doing for 11+ days. That is, until I started reindexing the 13 groups, reindexing group “hosted-2-all” AFTER I had done the hosted and the proxies. Then I discovered the following in the logs:

org.sonatype.nexus.index.tasks.RepairIndexTask - Scheduled task (RepairIndexTask) started :: Repairing repository index "hosted-2-all" from path / and below.
org.sonatype.nexus.index.NexusScanningListener - Scanning of repositoryID="hosted-2" started.
org.sonatype.nexus.index.NexusScanningListener - Scanning of repositoryID="hosted-2" finished: scanned=39245, added=39245, updated=0, removed=0, scanningDuration=0:10:51.617
org.sonatype.nexus.index.DefaultIndexerManager - Publishing index for repository hosted-2
org.sonatype.nexus.index.NexusScanningListener - Scanning of repositoryID="hosted-2-snapshots" started.
org.sonatype.nexus.index.NexusScanningListener - Scanning of repositoryID="hosted-2-snapshots" finished: scanned=499, added=499, updated=0, removed=0, scanningDuration=0:01:48.198
org.sonatype.nexus.index.DefaultIndexerManager - Publishing index for repository hosted-2-snapshots
org.sonatype.nexus.index.DefaultIndexerManager - Publishing index for repository hosted-2-all
org.sonatype.nexus.index.tasks.RepairIndexTask - Scheduled task (RepairIndexTask) finished :: Repairing repository index "hosted-2-all" from path / and below. (started 2022-08-31T08:50:17+00:00, runtime 0:13:09.912)
org.sonatype.nexus.configuration.application.DefaultNexusConfiguration - Applying Nexus Configuration due to changes in [Scheduled Tasks] made by *TASK...
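To see at a glance which repositories each repair task actually scans, the log can be grouped by task. This is a rough Python sketch that matches only the message formats shown in the excerpt above. It assumes repair tasks do not interleave in the log, and the function name scans_per_task is my own.

```python
import re
from collections import defaultdict

# Patterns taken from the log lines quoted above.
TASK_TARGET = re.compile(r'Repairing repository index "([^"]+)"')
SCAN_DONE = re.compile(r'Scanning of repositoryID="([^"]+)" finished')


def scans_per_task(lines):
    """Map each repair target to the repository IDs scanned while it ran."""
    scans = defaultdict(list)
    current = None  # repair target currently running, if any
    for line in lines:
        if "RepairIndexTask) started" in line:
            m = TASK_TARGET.search(line)
            current = m.group(1) if m else None
        elif "RepairIndexTask) finished" in line:
            current = None
        elif current is not None:
            m = SCAN_DONE.search(line)
            if m:
                scans[current].append(m.group(1))
    return dict(scans)
```

Feeding it the lines of the Nexus log (for example via open(path) on wherever your log file lives) gives one entry per repaired repository or group.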

It would appear that reindexing a group causes a reindex of all the repos in that group. This is problematic, as our largest repo pair (release + snapshot) appears in most of the groups, and Central appears in many as well.

So I’m led to believe that RepairAllIndexes in fact reindexes everything, including groups, and therefore reindexes the same repos over and over. This is not the expected outcome. I also have no idea what the impact is if these actions are executed in parallel on the same target repo from multiple groups.
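To put a number on the repeated work, one can count how many times each member would be scanned if every group repair rescans all of its direct members, as the log above suggests. This is a small Python sketch: the group membership below is made up for illustration and is not our real configuration, nested groups are ignored, and scans_per_repo is my own name.

```python
from collections import Counter


def scans_per_repo(groups):
    """Count scans per member repo, assuming each group repair rescans every direct member once."""
    counts = Counter()
    for members in groups.values():
        counts.update(set(members))  # a repo listed twice in one group counts once
    return counts


# Hypothetical membership, for illustration only.
example_groups = {
    "group-a": ["hosted-10", "hosted-10-snapshots", "proxy-central"],
    "group-b": ["hosted-10", "hosted-10-snapshots"],
    "group-c": ["proxy-central", "hosted-2"],
}

for repo, n in sorted(scans_per_repo(example_groups).items()):
    print(f"{repo}: scanned {n} time(s) via groups")
```

Multiplying each count by that repo's scan duration gives a rough estimate of how much time a repair-everything run spends on repeats.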

Sadly, at the end of this exercise I am still seeing errors in the logs, including the original error, so I am not sure I accomplished anything, and I am not much further ahead in understanding these actions.

repositoryID="proxy-central" finished: scanned=138930, added=1, updated=134662, removed=0, scanningDuration=1:11:48
repositoryID="proxy-releases" finished: scanned=10, added=2, updated=3, removed=0, scanningDuration=0:09:31
repositoryID="proxy-1" finished: scanned=1101, added=1101, updated=0, removed=0, scanningDuration=0:00:19
repositoryID="proxy-2" finished: scanned=943, added=410, updated=533, removed=0, scanningDuration=0:00:29
repositoryID="proxy-3" finished: scanned=1731, added=1231, updated=500, removed=0, scanningDuration=0:00:53
repositoryID="proxy-4" finished: scanned=8659, added=1771, updated=2443, removed=0, scanningDuration=0:04:01
repositoryID="proxy-6" finished: scanned=75, added=0, updated=43, removed=0, scanningDuration=0:00:04
repositoryID="proxy-7" finished: scanned=2959, added=8, updated=2951, removed=0, scanningDuration=0:01:02
repositoryID="proxy-8" finished: scanned=686, added=438, updated=204, removed=0, scanningDuration=0:04:02
repositoryID="hosted-1" finished: scanned=24, added=24, updated=0, removed=0, scanningDuration=0:00:01
repositoryID="hosted-1-snapshots" finished: scanned=0, added=0, updated=0, removed=0, scanningDuration=0:00:00
repositoryID="hosted-2" finished: scanned=39245, added=39245, updated=0, removed=0, scanningDuration=0:10:51
repositoryID="hosted-2-snapshots" finished: scanned=499, added=499, updated=0, removed=0, scanningDuration=0:01:48
repositoryID="hosted-3" finished: scanned=4889, added=4889, updated=0, removed=0, scanningDuration=0:01:32
repositoryID="hosted-3-snapshots" finished: scanned=0, added=0, updated=0, removed=0, scanningDuration=0:00:00
repositoryID="hosted-4" finished: scanned=331, added=331, updated=0, removed=0, scanningDuration=0:00:18
repositoryID="hosted-5" finished: scanned=174, added=174, updated=0, removed=0, scanningDuration=0:00:03
repositoryID="hosted-6" finished: scanned=10722, added=10722, updated=0, removed=0, scanningDuration=0:04:08
repositoryID="hosted-7" finished: scanned=1563, added=1563, updated=0, removed=0, scanningDuration=0:00:22
repositoryID="hosted-7-snapshots" finished: scanned=2, added=2, updated=0, removed=0, scanningDuration=0:00:00
repositoryID="hosted-8" finished: scanned=4555, added=4555, updated=0, removed=0, scanningDuration=0:01:39
repositoryID="hosted-8-snapshots" finished: scanned=962, added=962, updated=0, removed=0, scanningDuration=0:00:58
repositoryID="hosted-9" finished: scanned=6107, added=6107, updated=0, removed=0, scanningDuration=0:04:13
repositoryID="hosted-10" finished: scanned=84367, added=84367, updated=0, removed=0, scanningDuration=0:20:15
repositoryID="hosted-10-snapshots" finished: scanned=38761, added=38761, updated=0, removed=0, scanningDuration=0:18:48
repositoryID="hosted-11" finished: scanned=0, added=0, updated=0, removed=0, scanningDuration=0:00:12
repositoryID="hosted-12" finished: scanned=2469, added=2469, updated=0, removed=0, scanningDuration=0:01:13
repositoryID="hosted-13" finished: scanned=16, added=16, updated=0, removed=0, scanningDuration=0:00:01
repositoryID="hosted-14" finished: scanned=140, added=140, updated=0, removed=0, scanningDuration=0:00:15
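For anyone wanting to total these up, here is a short Python sketch that sums scanningDuration over summary lines in the format above, optionally filtered by a repository ID prefix such as "hosted-" or "proxy-". The function name total_scan_time is my own, and fractional seconds, where present, are dropped.

```python
import re
from datetime import timedelta

# Matches the summary format shown above; any fractional seconds are ignored.
SUMMARY = re.compile(
    r'repositoryID="(?P<repo>[^"]+)" finished:.*?'
    r'scanningDuration=(?P<h>\d+):(?P<m>\d+):(?P<s>\d+)'
)


def total_scan_time(lines, prefix=""):
    """Sum scanningDuration over lines whose repository ID starts with prefix."""
    total = timedelta()
    for line in lines:
        match = SUMMARY.search(line)
        if match and match.group("repo").startswith(prefix):
            total += timedelta(
                hours=int(match.group("h")),
                minutes=int(match.group("m")),
                seconds=int(match.group("s")),
            )
    return total
```

Calling total_scan_time(lines, "hosted-") and total_scan_time(lines, "proxy-") on the list above gives the per-category totals to compare against the multi-day run of the all-in-one task.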