Problem with MongoDB 5 and Component Pack 8

Hi everybody,

I'm in the process of installing Component Pack 8 on my Kubernetes cluster, which runs the supported version v1.24.1 with the containerd CRI. The cluster has 2 worker nodes and 1 master node.

I built the MongoDB 5 image and am trying to get it running with Helm, but I always get the same error and can't really decipher what is wrong. Bootstrap and the PVCs are already deployed and working. The Kubernetes cluster has no problems and all workers are Ready.

Here is the error:

Events:
  Type     Reason               Age                   From               Message
  ----     ------               ----                  ----               -------
  Normal   Scheduled            3m29s                 default-scheduler  Successfully assigned connections/mongo5-0 to worker.softwerk.de
  Normal   Pulled               3m24s                 kubelet            Container image "ubuntu:20.04" already present on machine
  Normal   Created              3m24s                 kubelet            Created container init-chmod-data5
  Normal   Started              3m24s                 kubelet            Started container init-chmod-data5
  Normal   Pulled               3m23s                 kubelet            Container image "hclcr.io/cnx/middleware-mongodb5-sidecar:latest" already present on machine
  Normal   Started              3m22s                 kubelet            Started container mongo5-sidecar
  Normal   Created              3m22s                 kubelet            Created container mongo5-sidecar
  Normal   Pulled               3m5s (x3 over 3m23s)  kubelet            Container image "hclcr.io/cnx/middleware-mongodb5:0.1.1-20221013-124108" already present on machine
  Normal   Created              3m5s (x3 over 3m23s)  kubelet            Created container mongo5
  Normal   Started              3m4s (x3 over 3m23s)  kubelet            Started container mongo5
  Normal   Killing              3m4s (x3 over 3m23s)  kubelet            FailedPostStartHook
  Warning  FailedPostStartHook  3m4s (x3 over 3m23s)  kubelet            Exec lifecycle hook ([bash -c sleep 30 && mongo --tls --tlsCertificateKeyFile /etc/mongodb/x509/user_admin.pem --tlsCAFile /etc/mongodb/x509/mongo-CA-cert.crt --host mongo5-0.mongo5.connections.svc.cluster.local --eval "rs.initiate({_id:\"rs0\", members:[{_id:0, host:\"mongo5-0.mongo5.connections.svc.cluster.local:27017\"}]})"]) for Container "mongo5" in Pod "mongo5-0_connections(a72bd516-6f63-41bb-921f-39c36fd6bbe1)" failed - error: command 'bash -c sleep 30 && mongo --tls --tlsCertificateKeyFile /etc/mongodb/x509/user_admin.pem --tlsCAFile /etc/mongodb/x509/mongo-CA-cert.crt --host mongo5-0.mongo5.connections.svc.cluster.local --eval "rs.initiate({_id:\"rs0\", members:[{_id:0, host:\"mongo5-0.mongo5.connections.svc.cluster.local:27017\"}]})"' exited with 137: , message: ""
  Warning  BackOff              3m3s (x4 over 3m20s)  kubelet            Back-off restarting failed container
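
For reference, the events above are the output of kubectl describe; the log of the crashed mongo5 container itself can be pulled with kubectl logs --previous (pod and namespace names taken from the events):

  # full pod description, including the events shown above
  kubectl describe pod mongo5-0 -n connections

  # log of the last failed mongo5 container instance
  kubectl logs mongo5-0 -c mongo5 -n connections --previous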

Does anybody have the same problem, or did you deploy Component Pack 8 without problems?

Regards

Florian Stahl

It looks like it is failing because of exit code 137. I did a first check and found:

https://stackoverflow.com/questions/59296801/docker-compose-exit-code-is-137-when-there-is-no-oom-exception

So it could be that it fails because there is insufficient memory for the MongoDB container.

I already checked all the links about the 137 error, but I ruled that out.

The worker nodes have more than enough RAM, and I also experimented with resource limits in the deployment chart, but that didn't help either.
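
In case someone wants to double-check the memory theory themselves: if the kubelet had OOM-killed the container, its last terminated state would show OOMKilled. A quick way to look at that and at the node's allocated memory (pod, container and node names taken from the events above):

  # termination reason of the last mongo5 container instance; OOMKilled would point to memory
  kubectl get pod mongo5-0 -n connections \
    -o jsonpath='{.status.containerStatuses[?(@.name=="mongo5")].lastState.terminated.reason}{"\n"}'

  # memory requests/limits currently allocated on the worker node
  kubectl describe node worker.softwerk.de | grep -A 8 'Allocated resources'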

Hi Florian,
when checking the event history you shared, I noticed that MongoDB had already started and then the post-start hook ran. According to:

https://stackoverflow.com/questions/57592563/why-i-get-error-137-when-exec-lifecycle-hook-in-kubernetes-cronjob

it may be possible to prevent that error by letting the main container wait longer until the hook has finished; a rough sketch of how one could experiment with that is below.
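
I don't know whether the HCL chart exposes the hook timing as a configurable value, so as a rough experiment one could dump the rendered StatefulSet, adjust the postStart command of the mongo5 container by hand, and re-apply it (the StatefulSet name mongo5 is an assumption based on the pod name mongo5-0):

  # dump the StatefulSet that owns mongo5-0
  kubectl get statefulset mongo5 -n connections -o yaml > mongo5-statefulset.yaml

  # in the file, adjust lifecycle.postStart.exec.command of the mongo5 container,
  # e.g. change the "sleep 30" or wrap the rs.initiate call in a retry loop

  # re-apply the modified manifest
  kubectl apply -f mongo5-statefulset.yaml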

Hi Thorsten,

I don't really know how to do that.

And if this is necessary, it seems like a bug in the Helm chart provided by HCL.

Hi Florian,

yes, agreed. Maybe try out the hint from Christoph first. If that doesn't help, feel free to open a case with HCL so they can take a deeper look.

Thanks, Thorsten

Have you checked https://github.com/turnkeylinux/tracker/issues/1724?

In my case the default CPU in libvirt does not have this CPU extension. Please check whether your virtual CPU supports AVX.
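
A quick way to check on each worker node, assuming shell access to the nodes:

  # prints "avx" if the (virtual) CPU exposes the extension, nothing otherwise
  grep -o 'avx' /proc/cpuinfo | sort -u

  # alternatively, if lscpu prints the flags line
  lscpu | grep -i avx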

That could be the problem!

I'm running on ESXi and indeed don't have the AVX extension by default.

I will check how to enable it and come back to this thread.

That was indeed the problem.

After activating AVX for the involved VMs, everything works fine.
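
For completeness, the same checks as above confirm it on the workers:

  grep -o 'avx' /proc/cpuinfo | sort -u             # now prints avx
  kubectl get pods -n connections | grep mongo5     # mongo5-0 shows Running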

Thank you very much Christoph!

For Intel x86_64, MongoDB >= 5 requires Sandy Bridge or later.
For AMD x86_64, MongoDB >= 5 requires Bulldozer or later.
Starting in MongoDB 5.0, mongod, mongos, and the legacy mongo shell no longer support x86_64 platforms which do not meet this minimum microarchitecture requirement.
https://docs.mongodb.com/manual/administration/production-notes/#x86_64


The underlying requirement for the MongoDB 5.0 binary server packages is CPUs with AVX instructions. These are broadly Sandy Bridge or newer Intel CPUs, but there is a caveat:
Not all CPUs from the listed families support AVX. Generally, CPUs with the commercial denomination Core i3/i5/i7/i9 support them, whereas Pentium and Celeron CPUs do not.
https://www.mongodb.com/community/forums/t/mongodb-5-0-cpu-intel-g4650-compatibility/116610/2

On AWS, AVX, AVX2, and Enhanced Networking are only available on instances launched with hardware virtual machine (HVM) AMIs.
https://aws.amazon.com/ec2/instance-types/