3-node sharded MongoDB cluster on a budget

Damiano_Albani · November 22, 2021, 11:56am

Hello,

I would like to have your opinion on the feasibility of running a 3-node, sharded MongoDB cluster on Kubernetes.
The use case is to be able to run a MongoDB database which has larger storage requirements than a single server (that I have) can meet.
This is for a hobby project so the budget is limited, hence the 3-node constraint.
I was thinking of the following setup:

3 replicasets, each composed of 2 mongod and 1 arbiter
sharding activated, with 3 config server pods
1 mongos on each node

In practice, a deployment would look something like this (Primary / Secondary):

node A would run: rs0-0 (P), rs1-arbiter-0, rs2-1 (S), cfg-0, mongos
node B would run: rs0-1 (S), rs1-0 (P), rs2-arbiter-0, cfg-1, mongos
node C would run: rs0-arbiter-0, rs1-1 (S), rs2-0 (P), cfg-2, mongos

Each node would thus host:

a primary of a first replicaset;
a secondary of a second replicaset;
and the arbiter of a third replicaset.
Let’s say that each node has 1TB disk capacity and documents are perfectly distributed across the shards.
That would mean that each shard could be as big as 0.5 TB, and the total available space for the MongoDB cluster would be 0.5 x 3 = 1.5 TB.
With the added benefit of redundancy, where one of the 3 nodes may go down without losing all the data.
Am I correct?
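
The arithmetic above can be checked with a minimal sketch. It assumes exactly what the post states (3 nodes, 1 TB each, evenly distributed documents, two data-bearing members per node) and ignores config server data, the oplog, and filesystem overhead, so real usable space would be somewhat lower. The function and variable names are hypothetical.

```python
# Assumptions taken from the post: 3 nodes, 1 TB of disk each, documents
# evenly distributed across shards. Overheads are deliberately ignored.
NODE_DISK_TB = 1.0
NODES = 3
DATA_MEMBERS_PER_NODE = 2   # one primary + one secondary; an arbiter stores no data
DATA_MEMBERS_PER_SHARD = 2  # each shard is a replica set of 2 mongod + 1 arbiter


def usable_capacity_tb(node_disk_tb, nodes, members_per_node, members_per_shard):
    """Return (max size of one shard, number of shards, total usable capacity)."""
    # Every data-bearing member holds a full copy of its shard, so a node's
    # disk is split between the members it hosts.
    max_shard_tb = node_disk_tb / members_per_node
    # Total data-bearing members divided by the copies each shard needs.
    shards = nodes * members_per_node // members_per_shard
    return max_shard_tb, shards, max_shard_tb * shards


max_shard_tb, shards, total_tb = usable_capacity_tb(
    NODE_DISK_TB, NODES, DATA_MEMBERS_PER_NODE, DATA_MEMBERS_PER_SHARD
)
print(max_shard_tb, shards, total_tb)  # 0.5 3 1.5
```

Under these assumptions the result matches the 0.5 TB per shard and 1.5 TB total estimated above.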

I’ve already experimented with the operator and I’ve managed to get something running (with “allowUnsafeConfigurations=true”).
Although I haven’t succeeded in making Kubernetes schedule the pods exactly as I wanted.
With podAntiAffinity rules, I could ensure that no node hosts more than 1 pod of each replicaset.
But I couldn’t figure out how to enforce a “perfect” spread, i.e. excluding the possibility of having rs0-X, rs1-X and rs2-X running on a single node.

What do you think of this architecture? Will it work? Is it sensible?
Thanks!

Damiano

Sergey_Pronin · December 6, 2021, 9:20am

Hello @Damiano_Albani ,

sorry for not coming back to you earlier. Here are my thoughts:

1. Using one arbiter and two nodes is unsafe. You cannot get a write guarantee since MongoDB 4.0 with such a setup. We treat 4 nodes + 1 arbiter as a safe setup. That is why you need to set the unsafe flag to true.
2. Getting such a spread is kinda tricky.
2.1 There is still a chance that your pods will be reshuffled in case of a node failure. (I hope you will set the preferredDuringSchedulingIgnoredDuringExecution flag.)
2.2 To achieve this distribution you will need to manually set affinity and labels on the nodes and Pods. This is tricky and not recommended. In Operators we set affinity rules per Deployment or StatefulSet and this provides safe distribution.

Damiano_Albani · December 10, 2021, 8:38pm

Hi @Sergey_Pronin, thanks for your feedback.

My original idea to use 2 nodes and 1 arbiter came from https://docs.mongodb.com/manual/core/replica-set-architecture-three-members/ and https://docs.mongodb.com/manual/core/replica-set-arbiter/.
Although this setup is by no means production grade, I’m curious what the basis is for your statement that “you cannot get write guarantee since MongoDB v 4.0 with such a setup”.
Such a spread is indeed tricky, and in the end I did what you suggested, i.e. added labels on the nodes.
In case someone on the interweb ever finds it interesting, here is a snippet of my Helmfile-based setup:

releases:
  - name: mongodb-sharded
    chart: percona-1/psmdb-db
    version: 1.10.0
    values:
      - allowUnsafeConfigurations: true
        replsets:
          {{ range $_, $rs := (list "rs0" "rs1" "rs2") }}
          - name: {{ $rs }}
            size: 2
            arbiter:
              enabled: true
              size: 1
          {{ end }}
        sharding:
          enabled: true
          mongos:
            size: 3
    jsonPatches:
      - target:
          group: psmdb.percona.com
          version: v1-10-0
          kind: PerconaServerMongoDB
          name: mongodb-sharded-psmdb-db
        patch:
          {{ range $i, $rs := (list "rs0" "rs1" "rs2") }}
          - op: add
            path: /spec/replsets/{{ $i }}/nodeSelector
            value:
              mongodb.project.xyz/{{ $rs }}: mongod
          - op: add
            path: /spec/replsets/{{ $i }}/arbiter/nodeSelector
            value:
              mongodb.project.xyz/{{ $rs }}: arbiter
          {{ end }}

Where I set such labels on 3 Kubernetes nodes:

  node-a:
    mongodb.project.xyz/rs0: mongod
    mongodb.project.xyz/rs1: mongod
    mongodb.project.xyz/rs2: arbiter
  node-b:
    mongodb.project.xyz/rs0: mongod
    mongodb.project.xyz/rs1: arbiter
    mongodb.project.xyz/rs2: mongod
  node-c:
    mongodb.project.xyz/rs0: arbiter
    mongodb.project.xyz/rs1: mongod
    mongodb.project.xyz/rs2: mongod
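
A label layout like this is easy to get wrong by hand, so here is a minimal sanity check, written as a sketch. It assumes the three nodes are named node-a, node-b and node-c, shortens the label keys to the replica set name, and uses a hypothetical helper `check_layout`. It only validates the label table itself, not what Kubernetes actually schedules.

```python
from collections import Counter

# Label layout from the post, with keys shortened to the replica set name.
# The node names are assumptions for illustration.
LABELS = {
    "node-a": {"rs0": "mongod", "rs1": "mongod", "rs2": "arbiter"},
    "node-b": {"rs0": "mongod", "rs1": "arbiter", "rs2": "mongod"},
    "node-c": {"rs0": "arbiter", "rs1": "mongod", "rs2": "mongod"},
}

EXPECTED = Counter({"mongod": 2, "arbiter": 1})


def check_layout(labels):
    """Return a list of problems; an empty list means the spread is 'perfect'."""
    problems = []
    # Each node should be eligible for 2 data-bearing members and 1 arbiter.
    for node, roles in labels.items():
        if Counter(roles.values()) != EXPECTED:
            problems.append(f"{node}: unexpected roles {dict(roles)}")
    # Each replica set should have 2 mongod nodes and 1 arbiter node.
    replsets = sorted({rs for roles in labels.values() for rs in roles})
    for rs in replsets:
        per_rs = Counter(roles.get(rs) for roles in labels.values())
        if per_rs != EXPECTED:
            problems.append(f"{rs}: unexpected spread {dict(per_rs)}")
    return problems


print(check_layout(LABELS))  # []
```

Note that a nodeSelector only restricts which nodes a pod may land on; keeping the two mongod pods of one replica set on different nodes still relies on the anti-affinity rules discussed earlier in the thread.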

Sergey_Pronin · December 13, 2021, 7:30pm

Hello @Damiano_Albani ,

With 2 nodes and 1 arbiter you run only 2 data-bearing nodes. If the Primary fails and your Secondary becomes the new Primary, then while the old Primary is doing an initial sync or a resync, anything that happens to the new Primary leaves you hosed: you would not have a complete data-bearing node available to become the new Primary. It is a race condition and a rare one, but it can happen.
Might work! But don’t forget to set anti-affinity to preferredDuringSchedulingIgnoredDuringExecution. Otherwise you may end up with a degraded cluster until node-X gets back.
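
To make the risk above concrete, here is a small sketch that simulates losing one Kubernetes node under the label layout posted earlier. It assumes the nodes are named node-a, node-b and node-c, that every member (arbiter included) has one vote, and that a replica set needs 2 of its 3 votes to elect a primary. The helper `after_node_failure` is hypothetical.

```python
# Which role each node plays for each replica set (layout from the thread;
# node names are assumptions for illustration).
LAYOUT = {
    "node-a": {"rs0": "mongod", "rs1": "mongod", "rs2": "arbiter"},
    "node-b": {"rs0": "mongod", "rs1": "arbiter", "rs2": "mongod"},
    "node-c": {"rs0": "arbiter", "rs1": "mongod", "rs2": "mongod"},
}


def after_node_failure(layout, failed_node):
    """Summarise each replica set once `failed_node` is gone."""
    replsets = sorted({rs for roles in layout.values() for rs in roles})
    summary = {}
    for rs in replsets:
        survivors = [
            roles[rs] for node, roles in layout.items() if node != failed_node
        ]
        summary[rs] = {
            # A majority of the 3 voting members (2) is needed for an election.
            "can_elect_primary": len(survivors) >= 2,
            # Arbiters vote but hold no data.
            "data_members_up": survivors.count("mongod"),
        }
    return summary


for node in LAYOUT:
    print(node, after_node_failure(LAYOUT, node))
```

In this model every single-node failure leaves all three replica sets able to elect a primary, but two of them are down to a single data-bearing member. That single copy is the window in which a second failure would lose data.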

Vivek_Shinde · May 3, 2024, 4:49am

In such cases, is it possible to take backup using backup-agent sidecar?

Sergey_Pronin · May 3, 2024, 8:06am

@Vivek_Shinde wow, 2 years later.

It is possible
