Startup of replicaset takes very long

Hello

I have replicas = 3 and during startup, why does one pod after the other get started?

Let's say one pod takes 30 min to start (after a “forced” shutdown); then you have to wait 3 x 30 min. Are there any other options?

Regards
John

Hey @jamoser .

why does one pod after the other get started

It is the default behaviour of a StatefulSet. It uses OrderedReady Pod Management.

It is possible to change it by setting .spec.podManagementPolicy to Parallel on the StatefulSet. But:

  1. I’m not sure that it is safe for a replica set. I will need to consult with our experts.
  2. I don’t see it being supported in the Operator right now (you can’t set this option in the custom resource).
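
For illustration, this is roughly what the setting looks like on a plain StatefulSet. It is only a sketch: the names and image below are hypothetical, and this is not the manifest the Operator generates. As far as I know, the field only affects scaling operations (such as going from 0 to 3 Pods), not rolling updates, and it cannot be changed on an existing StatefulSet.

```yaml
# Minimal sketch of a StatefulSet with parallel Pod management.
# All names and the image are hypothetical placeholders.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-rs0
spec:
  serviceName: example-rs0
  replicas: 3
  # Default is OrderedReady: Pods start one by one, each waiting
  # for the previous one to become Ready. Parallel starts them all at once.
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      app: example-rs0
  template:
    metadata:
      labels:
        app: example-rs0
    spec:
      containers:
        - name: mongod
          image: example/mongodb:latest   # hypothetical image
```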

I’m also curious: why does it take 30 min to start a single Pod? What is the main driver there?

Hello

There is the functionality

pause: true | false

to shut down / start the cluster. Very often it just does not work. The only way to “gracefully” shut down is to set the replicas on the Kubernetes StatefulSet to 0.
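
For reference, a sketch of where that flag sits in the custom resource, assuming the usual PerconaServerMongoDB layout (the cluster name is a placeholder):

```yaml
# Sketch only: pausing the cluster through the custom resource.
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: my-cluster-name   # placeholder name
spec:
  pause: true   # set back to false to start the cluster again
```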

I would assume that the newly introduced terminationGracePeriod (on Percona level) would be active and would therefore wait until MongoDB is able to shut down cleanly. It seems that is not the case.
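
To be concrete about which setting I mean, here is a sketch. It assumes the option is the replica-set-level terminationGracePeriodSeconds field in the custom resource; the replset name and the value are only examples, not our real configuration.

```yaml
# Sketch only: grace period per replica set in the custom resource.
# Field placement is an assumption; name and value are illustrative.
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: my-cluster-name   # placeholder name
spec:
  replsets:
    - name: rs0
      size: 3
      # Time Kubernetes waits after SIGTERM before it kills the container.
      terminationGracePeriodSeconds: 300
```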

Also it seems you do not do a flush before the MongoDB shutdown.

As a result, MongoDB in the new pod first has to do a crash recovery. In our case, for example on a balanced disk with 100’000 collections, this can take a very long time.

So IMO something is not clean in the shutdown of the replica set pods.

Regards
John
