Non-sharding mode, how to keep the pods "sticky" to a node to avoid complete outage

jamoser · July 13, 2023, 4:30pm

Hello

It’s more a Kubernetes question …

Assume you run the MongoDB cluster on multiple nodes. There are a lot of writes, which all go to the master/writer pod of the StatefulSet. In some cases Kubernetes might try to reschedule the "writer pod" to a different node if it creates too much load on the node. The pod gets rescheduled and the "writer" role moves to another pod, but then the problem arises again on that pod's node, it gets rescheduled too, and so forth until ALL of the pods have been rescheduled. Since the startup of each pod/MongoDB instance takes some time, at some point the whole cluster is NOT AVAILABLE any more.

Is there an elegant and RECOMMENDED way to make the MongoDB pods sticky, so that the other pods on the node get rescheduled instead?

Sergey_Pronin · July 14, 2023, 7:33am

Hey @jamoser ,

Well, there are a few ways that I can think of, but it depends on which case you are hitting.

If the Pod is getting killed by the out-of-memory (OOM) killer, then there is not much you can do except ensure that it has enough resources.
If the node comes under resource pressure and the kubelet starts evicting Pods, the eviction does not take PodDisruptionBudgets or termination grace periods into account. At the same time, it first evicts Pods whose usage exceeds their requests, which in practice means BestEffort and Burstable Pods go before Guaranteed ones. So the recommendation here would be to use Guaranteed Pods (requests = limits) for MongoDB.
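
To illustrate what "requests = limits" looks like, here is a minimal sketch of the `resources` block on a container. The container name and the CPU and memory values are placeholders, not sizing advice, and I'm assuming you set the equivalent block through the `resources` section of your replica set in cr.yaml, so check the field names against your operator version.

```yaml
# Hypothetical container fragment; the name and values are placeholders.
containers:
  - name: mongod
    resources:
      requests:
        cpu: "2"        # must equal limits.cpu
        memory: 8Gi     # must equal limits.memory
      limits:
        cpu: "2"
        memory: 8Gi
```

Keep in mind that a Pod only gets the Guaranteed QoS class if every container in it, sidecars and init containers included, sets both CPU and memory with requests equal to limits. You can check the result with `kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'`.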

Hope this helps. I'm happy to jump on a call to discuss it. You know where to find me.
