When was the recommendation to use external PostgreSQL and the deprecation of k8s and OrientDB declared?

d3-ito · December 26, 2022, 4:31am

https://help.sonatype.com/repomanager3/product-information/system-requirements#SystemRequirements-DatabaseLimitations

Please allow me to ask a question about the following statement shown in the URL above.

We highly recommend that you have your Nexus Repository 3 instance use an external PostgreSQL database. Repository Pro for an external PostgreSQL database or on migrating an existing Nexus Repository 3 instance to a PostgreSQL database.

We strongly recommend against running Nexus Repository 3 on an embedded database within container orchestration environments such as Kubernetes. Doing so can lead to severe data corruption.

The above description was not there in the past.
I don't know the update history of the page, so if anyone knows when it was added, please let me know.

Since it is mentioned in the release notes for 3.31.0, was it added around that release?
https://help.sonatype.com/repomanager3/product-information/release-notes/2021-release-notes/nexus-repository-3.31.0---3.31.1-release-notes

d3-ito · December 28, 2022, 2:23pm

https://help.sonatype.com/repomanager3/product-information/system-requirements#SystemRequirements-DatabaseLimitations

I have a question regarding the following information in the URL above.
We are using Nexus Repository 3 (with the embedded database) in a Kubernetes environment.
Does data corruption happen suddenly, even during normal operation?

The other day, a Kubernetes node failed and the embedded database of Nexus Repository 3 was corrupted. (It has already been restored from backup data.)
I think data corruption can happen during node or pod failure.
However, I am very concerned that data corruption can occur during normal operation.
If anyone knows, please let me know.

dsawa · January 2, 2023, 9:41am

Hi Daiki,

I think data corruption can happen during node or pod failure.

Yes, you are correct.

Does data corruption happen suddenly, even during normal operation?
I am very concerned that data corruption can occur during normal operation.

No, data corruption does not happen spontaneously merely because Nexus Repository runs in a container. However, a container orchestration system may start or stop a pod at any time and may not give it sufficient time to complete all I/O operations. This can lead to an abrupt shutdown of the embedded database, resulting in a corrupted database.
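
To illustrate the general mechanism, here is a minimal Python sketch of how a write that is cut off partway leaves a record that a later read cannot trust. This is not Nexus Repository or OrientDB code; the record layout and the names `encode_record`, `read_record`, and `simulate_write` are hypothetical and chosen only for this example.

```python
import struct
import zlib


def encode_record(payload: bytes) -> bytes:
    """Frame a payload as: 4-byte length, 4-byte CRC32, then the payload."""
    return struct.pack(">II", len(payload), zlib.crc32(payload)) + payload


def read_record(data: bytes) -> bytes:
    """Return the payload, or raise ValueError if the record is incomplete."""
    if len(data) < 8:
        raise ValueError("truncated header")
    length, crc = struct.unpack(">II", data[:8])
    payload = data[8:8 + length]
    if len(payload) != length or zlib.crc32(payload) != crc:
        raise ValueError("torn or corrupted record")
    return payload


def simulate_write(payload: bytes, bytes_before_kill=None) -> bytes:
    """Return the bytes that reach storage.

    bytes_before_kill=None models a write that completes; an integer models
    the process being killed after only that many bytes were written.
    """
    record = encode_record(payload)
    return record if bytes_before_kill is None else record[:bytes_before_kill]
```

A completed write reads back cleanly, while `read_record(simulate_write(b"data", bytes_before_kill=10))` raises `ValueError`. A real database has many interdependent files and indexes, so an interrupted write there can do more damage than a single unreadable record.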

Regardless of when the information was added on that page, the information is valid for all Nexus Repository versions.

d3-ito · January 18, 2023, 1:43pm

Hello Dawid,

Thank you for your comment.
We now understand that DB corruption does not occur suddenly during normal operation. We are relieved to hear this.
As you said, in a container orchestration environment, data I/O and DB updates may still be in progress when a pod is stopped, so I understand that inconsistencies and corruption can occur.

dsawa · January 18, 2023, 2:06pm

Yes, Daiki, you are correct. One such hypothetical situation could look like this:

Nexus Repository with embedded OrientDB is running in a pod.
You receive a sudden spike in traffic, causing your existing pod to become unresponsive.
Depending on your configuration, your orchestrator may be unable to determine the status of the pod.
Depending on your configuration, your orchestrator may decide to shut down the slow pod.
This might happen while the pod is performing a write operation to the embedded database, resulting in corruption.

Due to the above, we recommend against using embedded databases in container environments.
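
For context on the shutdown step: Kubernetes normally sends SIGTERM to the container first and follows up with SIGKILL once the pod's grace period (`terminationGracePeriodSeconds`) runs out. The Python sketch below shows the general idea of a process that finishes its current write when asked to stop. It is an illustration under assumptions, not how Nexus Repository handles shutdown; the `GracefulWorker` class and its methods are hypothetical.

```python
import signal
import threading


class GracefulWorker:
    """Processes writes one at a time and stops cleanly between writes."""

    def __init__(self):
        self.stop_requested = threading.Event()
        self.committed = []

    def handle_sigterm(self, signum, frame):
        # Only record the request; the write loop decides when it is safe to stop.
        self.stop_requested.set()

    def install(self):
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGTERM, self.handle_sigterm)

    def run(self, pending_writes):
        for item in pending_writes:
            if self.stop_requested.is_set():
                break  # stop before starting another write
            self.committed.append(item)  # stands in for one complete DB write
        return self.committed
```

A handler like this only helps when the process receives SIGTERM and can finish within the grace period. SIGKILL and node failures give the process no chance to run any handler, which is why an external database is the safer choice here.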

d3-ito · January 18, 2023, 2:55pm

Hello Dawid,

Thank you for your comment.
I now understand the mechanism that leads to DB corruption.
We also recently had a node failure in our container orchestration environment (k8s) that corrupted the DB of our Nexus container, which matches the mechanism you described.

I will close this query as you have answered it. Thank you so much.