Error in latest version 3.47.0 of Nexus

sanchhoker · February 8, 2023, 2:05pm

org.sonatype.nexus.repository.storage.MissingBlobException: Blob default@XXXX exists in metadata, but is missing from the blobstore

mysticdrew · February 8, 2023, 3:03pm

This happened to us as well, on all of our Nexus repos. We deleted our artifacts and republished.
Luckily we have a small number of artifacts; I feel for people with a lot.

mpiggott · February 8, 2023, 3:43pm

This suggests the database references a file that is missing from disk.

Common causes are:

A process outside of Nexus removed a file
A database backup was restored that is not in sync with your filesystem
Database corruption, typically caused by an improper shutdown or running out of disk space
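
To check whether a blob ID from the exception is really absent from disk, something like the following could be used. It is only a sketch: it assumes a file-based blob store in which each blob is kept as a `<blob-id>.bytes` file plus a `<blob-id>.properties` file somewhere below the store's directory. The function names are made up for illustration and are not part of Nexus.

```python
from pathlib import Path


def find_blob_files(blob_store_dir, blob_id):
    """Return the paths found for the two files expected for blob_id.

    Assumption: each blob is stored as <blob_id>.bytes and
    <blob_id>.properties somewhere below blob_store_dir.
    """
    root = Path(blob_store_dir)
    found = {"bytes": None, "properties": None}
    for suffix in found:
        # rglob walks the whole tree, so no particular sub-directory
        # layout is assumed here.
        matches = list(root.rglob(f"{blob_id}.{suffix}"))
        if matches:
            found[suffix] = matches[0]
    return found


def is_blob_missing(blob_store_dir, blob_id):
    """True if the content file or the properties file is absent."""
    found = find_blob_files(blob_store_dir, blob_id)
    return found["bytes"] is None or found["properties"] is None
```

It would be called with the blob store directory and the ID taken from the exception message (the part after `default@`). Walking a large blob store this way can be slow, so it is better suited to spot checks than to a full scan.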

ldurant · February 8, 2023, 4:25pm

Hi Sanjeet,

I’m the technical writer for Nexus Repository and not the most technical, but I felt kind of rude just posting the 3.47 announcement and not responding here. So, I just wanted to let you know that I’ve seen this and am digging around. Could you provide any more detail about how this error came up? Did you recently use the Repair metadata/blobstore task?

pba · February 8, 2023, 6:48pm

Hi,

I’m seeing the same issue on our Nexus OSS 3.
I upgraded our Nexus from 3.43.0 to 3.47.0 in the way we usually upgrade, and after starting it I’m getting these errors. I’m now running the task Repair - Reconcile component database from blob store for our blob stores, but it’s taking quite a long time, probably due to the large size of the repo.

Upgrade steps:
We have a symlink nexus-current which points to the current Nexus version we use.
Stop Nexus 3.43.0.
Unpack the 3.47.0 tar.
Change the symlink to point to the new Nexus 3.47.0 directory.
Start Nexus 3.47.0.
Nothing was changed in between and low disk space doesn’t seem to be the issue.
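
The symlink switch in the steps above could be sketched roughly as follows. This is an illustration, not the script actually used: the function name and the `nexus-3.47.0` directory name are assumptions, and stopping and starting Nexus is left out because it depends on the installation.

```python
import os


def repoint_symlink(link_path, new_target):
    """Point link_path at new_target, replacing any existing symlink.

    The new link is created under a temporary name and then renamed over
    the old one, so link_path never disappears in between (POSIX only).
    """
    tmp_link = f"{link_path}.tmp"
    if os.path.lexists(tmp_link):
        # Clean up a leftover from an earlier, interrupted run.
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link_path)


# Nexus has to be stopped before this call and started after it.
# The directory name below is hypothetical:
# repoint_symlink("nexus-current", "nexus-3.47.0")
```

Note that this only switches the application directory. The data directory (sonatype-work) is shared between versions, which is why pointing the link back at the old version does not undo changes made to the database.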

sanchhoker · February 8, 2023, 7:05pm

Hi Lisa

Appreciate you writing back on this.
The way we run Nexus is that we always use the latest version, and our Nexus broke last night after the upgrade.
We spent the whole day wondering what was happening.
We ended up running the Reconcile task against the blob store, which took 4 hours or so, and then the issue was resolved.

sanchhoker · February 8, 2023, 7:07pm

Yes, it took around 4 hours for us, but it did resolve the problem.
But the version definitely has breaking changes!

pba · February 8, 2023, 7:08pm

Thanks for letting me know!
Hope it also solves the problem for our Nexus. Will share the result here.

ldurant · February 8, 2023, 7:10pm

I’m glad the issue was resolved; I hope the task resolves it for Paul as well. I’m not sure why it happened, but I appreciate you bringing it to our attention and letting us know what worked. I’ll let you know if I hear anything more from the team.

djeanprost · February 8, 2023, 7:10pm

Hello, can you please tell us whether we should use Reconcile component database from blob store or Reconcile date metadata from blob store?
thank you
By the way, same problem here

ldurant · February 8, 2023, 7:12pm

I believe he is using the Repair - Reconcile component database from blob store task. (Link to docs because…tech writer)

sanchhoker · February 8, 2023, 7:24pm

Yes as confirmed by Lisa we used Reconcile component database from blob store option

plynch · February 8, 2023, 7:26pm

Without seeing much more detail about the specific errors (like full logs from the upgrade and a support zip) we won’t be able to advise what the issue is or what the correct solution is. In general, we do not advise arbitrarily running the Reconcile task to solve random or limited errors on upgrade. If you have a product license and are a paying customer, please open a support ticket at https://support.sonatype.com.

If you are using Nexus Repo OSS, please open a Jira issue in our NEXUS project, attaching the support zip and the full logs from the upgrade as compressed attachments. Keep in mind that only you and Sonatype employees will see the attachments in the Jira issue you create there. We have no SLA response times for Jira issues from OSS users, but we will try to help on a best-effort basis.

djeanprost · February 8, 2023, 10:08pm

Hello,

Here is a summary of my case: after an upgrade from 3.45 to 3.47, component downloads started to fail. Although the files are still on the file system, it looks like the database was corrupted.
I tried to execute the reconcile task, but it takes far too long to run against my 260GB blobstore. Judging from the task execution log, it takes 1 minute to process 4 elements. It won’t be done by tomorrow morning.
I contacted ops to restore a backup of Nexus, and we will revert to 3.45.0, which was running as expected.
I have collected a support zip and the Nexus logs. I will concentrate on fixing my production environment, and if I’m still OK after tomorrow, I’ll try to give as much information as possible.
For the moment, we don’t know what happened, but it looks like there is a critical issue around.
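
For a rough idea of what a rate of 4 elements per minute means, a back-of-the-envelope estimate can be written as below. The rate is the one read from the task log above; the element count of 100,000 is purely hypothetical, since the thread does not say how many elements the 260GB blobstore holds, and the rate is assumed to stay constant.

```python
def reconcile_eta_hours(total_elements, elements_per_minute=4.0):
    """Estimated hours to process total_elements at a constant rate."""
    if elements_per_minute <= 0:
        raise ValueError("elements_per_minute must be positive")
    return total_elements / elements_per_minute / 60.0


# 100_000 is a hypothetical element count, not a figure from this thread.
hours = reconcile_eta_hours(100_000)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")
# prints "416.7 hours, about 17.4 days"
```

At that rate even a modest number of elements puts the task well beyond an overnight run, which is consistent with restoring a backup being the faster route.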

pba · February 9, 2023, 6:04am

Thanks for sharing your situation @djeanprost.
Same situation here, the reconcile task is still running after 14 hours.
Our blob stores are about the same size.
Trying to get a backup to restore.
Can share our logfiles and support zip if needed.

drew.swine · February 9, 2023, 6:41am

Hi,
In our case, customers are blocked by this issue. My repair task is still running after 10 hours…
We are looking into restoring from a backup.

Update: In the end, we changed the tag in our docker-compose file and are now running version 3.46 to fix this issue. One customer is working fine.

But another customer, after starting the container on version 3.46, gets this error:
jenkins org.sonatype.nexus.repository.httpbridge.internal.ViewServlet - Failure servicing: GET /repository/maven-public-qa/org/springframework/boot/spring-boot-starter-parent/1.5.22.RELEASE/spring-boot-starter-parent-1.5.22.RELEASE.pom
java.lang.IllegalArgumentException: Not a valid blob reference

Do you have any clue?

Regards.

djeanprost · February 9, 2023, 8:41am

While backup is being restored, I created this issue : Log in - Sonatype JIRA

djeanprost · February 9, 2023, 9:00am

Hello Drew
As the problem might be database corruption or something like that, reverting to the previous version doesn’t fix it. That is why you still have this kind of problem.
Unless you restore the backup or use the reconcile repair task, I think you will still have the problem.

brummer · February 9, 2023, 9:27am

We faced the same issue. We are using the Nexus docker image. We restored the backup with 3.46. Basically, we were not able to download any artifact that had been deployed before the upgrade. New artifacts worked without any problem.

andrea · February 9, 2023, 10:14am

Hi all,

Due to the same error, I restored previous version doing the following:

Stop Nexus OSS 3.47
Restore sonatype-work directory
Start Nexus OSS 3.43

Now I have tons of WARN messages in nexus.log

2023-02-09 11:01:07,024+0100 WARN  [elasticsearch[81953454-69ECB210-9FAC3F16-38F00418-A66D7DF6][generic][T#14]]  *SYSTEM org.elasticsearch.cluster.action.shard - [81953454-69ECB210-9FAC3F16-38F00418-A66D7DF6] [3775d91831f0772efcb39b18617455a7fffa670f][0] received shard failed for target shard [[3775d91831f0772efcb39b18617455a7fffa670f][0], node[IeTnYAWSReunbMsJeABLyw], [P], v[5], s[INITIALIZING], a[id=s6TngNq9TIWokt18VP_-0Q], unassigned_info[[reason=ALLOCATION_FAILED], at[2023-02-09T10:01:06.998Z], details[failed recovery, failure IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: CorruptIndexException[file mismatch, expected id=1cvao2rwyu5xpm6yl5f9vwmlo, got=77flm8i6jy1zis9diev7y3jad (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/opt/nexus/sonatype-work/nexus3/elasticsearch/nexus/nodes/0/indices/3775d91831f0772efcb39b18617455a7fffa670f/0/index/_g.si")))]; ]]], indexUUID [QDNoE78qTtKuF3kfwLRrgg], message [failed recovery], failure [IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: CorruptIndexException[file mismatch, expected id=1cvao2rwyu5xpm6yl5f9vwmlo, got=77flm8i6jy1zis9diev7y3jad (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/opt/nexus/sonatype-work/nexus3/elasticsearch/nexus/nodes/0/indices/3775d91831f0772efcb39b18617455a7fffa670f/0/index/_g.si")))]; ]
org.elasticsearch.index.shard.IndexShardRecoveryException: failed to recovery from gateway
	at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:250)
	at org.elasticsearch.index.shard.StoreRecoveryService.access$100(StoreRecoveryService.java:56)
	at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:129)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: failed to create engine
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:152)
	at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25)
	at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1513)
	at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1497)
	at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:970)
	at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:942)
	at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:241)
	... 5 common frames omitted
Caused by: org.apache.lucene.index.CorruptIndexException: file mismatch, expected id=1cvao2rwyu5xpm6yl5f9vwmlo, got=77flm8i6jy1zis9diev7y3jad (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/opt/nexus/sonatype-work/nexus3/elasticsearch/nexus/nodes/0/indices/3775d91831f0772efcb39b18617455a7fffa670f/0/index/_g.si")))
	at org.apache.lucene.codecs.CodecUtil.checkIndexHeaderID(CodecUtil.java:266)
	at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:256)
	at org.apache.lucene.codecs.lucene50.Lucene50SegmentInfoFormat.read(Lucene50SegmentInfoFormat.java:86)
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:362)
	at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:171)
	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:949)
	at org.elasticsearch.index.engine.InternalEngine.createWriter(InternalEngine.java:1086)
	at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:146)
	... 11 common frames omitted
	Suppressed: org.apache.lucene.index.CorruptIndexException: checksum passed (93f84479). possibly transient resource issue, or a Lucene or JVM bug (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/opt/nexus/sonatype-work/nexus3/elasticsearch/nexus/nodes/0/indices/3775d91831f0772efcb39b18617455a7fffa670f/0/index/_g.si")))
		at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:379)
		at org.apache.lucene.codecs.lucene50.Lucene50SegmentInfoFormat.read(Lucene50SegmentInfoFormat.java:117)
		... 16 common frames omitted

and my nexus.log file is growing at 2MB/s…
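
To see which logger is responsible for most of that growth, the WARN entries can be counted per logger name. The sketch below assumes the log layout seen in the excerpt above (timestamp, level, thread, user, logger, then ` - ` and the message); the logging pattern is configurable, so this may need adjusting, and the function name is made up for illustration.

```python
import re
from collections import Counter

# A log entry starts with a timestamp such as "2023-02-09 11:01:07,024+0100";
# stack-trace lines do not, so they are skipped.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def count_by_logger(lines, level="WARN"):
    """Count log entries of the given level per logger name."""
    counts = Counter()
    for line in lines:
        if not ENTRY_START.match(line):
            continue
        header, sep, _message = line.partition(" - ")
        fields = header.split()
        if not sep or len(fields) < 4 or fields[2] != level:
            continue
        # The logger name is the last field before " - ".
        counts[fields[-1]] += 1
    return counts


# Example use against a log file (the path is an assumption):
# with open("nexus.log", encoding="utf-8", errors="replace") as f:
#     for logger, n in count_by_logger(f).most_common(5):
#         print(n, logger)
```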

I think it was caused by the backup being taken while the application was running, but I’m not sure about that.

Nexus OSS 3.43 is working fine at the moment, but I don’t know if (and when) the indexes will be rebuilt.

Can you help me, please?

Thanks a lot for any support

Kind regards,
Andrea.