Percona MongoDB Operator Multi-Cluster Switchover/Failover

Description:

I went through the tutorial to set up Percona MongoDB multi-cluster with the K8S Operator: About Multi-cluster - Percona Operator for MongoDB

Everything went smoothly. I now want to simulate a switchover/failover, but I was not able to find instructions.

What I tried:
Start with:

  • 2 K8S clusters, A and B
  • Percona MongoDB operator deployed in each
  • Deploy an RS with 3 nodes in B, unmanaged, exposed with type NodePort. Sharding disabled.
  • Deploy an RS with 3 nodes in A, managed, exposed with type NodePort. Sharding disabled. Use B's nodes as externalNodes (using host IPs and NodePort ports); a rough cr.yaml sketch follows this list.
  • Make sure the credentials and certificates are the same
  • The setup is successful: I have a 6-node replica set and both PSMDB CRs are "ready"
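
For reference, the relevant parts of the two cr.yaml files look roughly like this (a sketch, not the full manifests; secrets/TLS settings that must match between clusters are omitted, and exact field names can vary slightly between operator versions, e.g. expose.type vs expose.exposeType):

# site A cr.yaml (managed / active)
spec:
  unmanaged: false
  sharding:
    enabled: false
  replsets:
    - name: rs0
      size: 3
      expose:
        enabled: true
        type: NodePort
      externalNodes:        # site B's nodes, addressed by host IP and NodePort
        - host: 1.2.3.7
          port: 30308
          priority: 1
          votes: 1
        - host: 1.2.3.8
          port: 32118
          priority: 1
          votes: 1
        - host: 1.2.3.9
          port: 30498
          priority: 0
          votes: 0

# site B cr.yaml (unmanaged / passive)
spec:
  unmanaged: true
  sharding:
    enabled: false
  replsets:
    - name: rs0
      size: 3
      expose:
        enabled: true
        type: NodePort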

As a switchover process:

  • Change the PSMDB CR in A from managed to unmanaged (see the sketch after this list)
  • Connect to MongoDB and remove A's nodes from the ReplicaSet. Now the replica set has 3 nodes, the ones from B
  • Change the PSMDB CR in B from unmanaged to managed
  • The CR in B moves to an error state with the following message: 'Error: failed to update config members: delete: write mongo config: replSetReconfig: (NodeNotFound) No host described in new configuration with {version: 131364, term: 2} for replica set rs0 maps to this node'
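
For clarity, the managed/unmanaged flips in the first and third bullets above are just the spec.unmanaged field in each cr.yaml, roughly:

# site A cr.yaml (demote)
spec:
  unmanaged: true

# site B cr.yaml (promote)
spec:
  unmanaged: false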

Hi, please take a look at the blog post Disaster Recovery for MongoDB on Kubernetes

Thanks for sharing the link. The failover (disaster recovery) process worked as described.

Now, in the case of a switchover, where both clusters are healthy but I just want to switch the PRIMARY/ACTIVE cluster from A to B, I did not manage to make it work.

What I did:

  • Started with the same setup as described in the blog post
  • Changed the CR in A from managed to unmanaged
  • Left the ReplicaSet config as it was (6 nodes, all healthy)
  • Changed the CR in B from unmanaged to managed
  • The CR in B moves to error state:
  message: 'Error: failed to update config members: fix member hostname: write mongo
    config: replSetReconfig: (NewReplicaSetConfigurationIncompatible) Found two member
    configurations with same host field, members.1.host == members.4.host == <host IP>:<node port>'
  mongoImage: percona/percona-server-mongodb:7.0.15-9-multi

Later edit: Removing the tags from all Mongo nodes and moving the primary to B instead of A resulted in a successful switchover. However, the process involved quite a few manual steps, and I'm unsure whether it is entirely deterministic.

What you did looks good to me. Can you share the output of rs.conf()? Also, what is the value of clusterServiceDNSMode?

clusterServiceDNSMode: "External"

rs.conf() before switchover:

{
  _id: 'rs0',
  version: 78580,
  term: 3,
  members: [
    {
      _id: 0,
      host: '1.2.3.4:30864',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        serviceName: 'mongodb',
        nodeName: 'mw-3',
        podName: 'mongodb-rs0-0'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 1,
      host: '1.2.3.5:31383',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        serviceName: 'mongodb',
        nodeName: 'mw-1',
        podName: 'mongodb-rs0-1'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 2,
      host: '1.2.3.6:32351',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        serviceName: 'mongodb',
        nodeName: 'mw-2',
        podName: 'mongodb-rs0-2'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 3,
      host: '1.2.3.7:30308',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 1,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 4,
      host: '1.2.3.8:32118',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 1,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 5,
      host: '1.2.3.9:30498',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 0,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 0
    }
  ],
  protocolVersion: Long('1'),
  writeConcernMajorityJournalDefault: true,
  settings: {
    chainingAllowed: true,
    heartbeatIntervalMillis: 2000,
    heartbeatTimeoutSecs: 10,
    electionTimeoutMillis: 10000,
    catchUpTimeoutMillis: -1,
    catchUpTakeoverDelayMillis: 30000,
    getLastErrorModes: {},
    getLastErrorDefaults: { w: 1, wtimeout: 0 },
    replicaSetId: ObjectId('67dbf1520289a4b6b57b904b')
  }
}

Clean up the tags and switch the primary (in mongosh):

// fetch the current replica set config
cfg = rs.config()
// clear the operator-set tags on all six members
cfg.members[0].tags = {}
cfg.members[1].tags = {}
cfg.members[2].tags = {}
cfg.members[3].tags = {}
cfg.members[4].tags = {}
cfg.members[5].tags = {}
// raise the priority of member 3 (a site-B node) so it wins the next election
cfg.members[3].priority = 10
// force the reconfig since several members change at once
rs.reconfig(cfg, {force: true})

rs.conf() after switchover:

{
  _id: 'rs0',
  version: 148264,
  term: 4,
  members: [
    {
      _id: 0,
      host: '1.2.3.4:30864',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 1,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 1,
      host: '1.2.3.5:31383',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 1,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 2,
      host: '1.2.3.6:32351',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 0,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 0
    },
    {
      _id: 3,
      host: '1.2.3.7:30308',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        podName: 'mongodb-rs0-0',
        serviceName: 'mongodb',
        nodeName: 'mw-3'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 4,
      host: '1.2.3.8:32118',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        podName: 'mongodb-rs0-1',
        serviceName: 'mongodb',
        nodeName: 'mw-2'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 5,
      host: '1.2.3.9:30498',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        nodeName: 'mw-1',
        podName: 'mongodb-rs0-2',
        serviceName: 'mongodb'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    }
  ],
  protocolVersion: Long('1'),
  writeConcernMajorityJournalDefault: true,
  settings: {
    chainingAllowed: true,
    heartbeatIntervalMillis: 2000,
    heartbeatTimeoutSecs: 10,
    electionTimeoutMillis: 10000,
    catchUpTimeoutMillis: -1,
    catchUpTakeoverDelayMillis: 30000,
    getLastErrorModes: {},
    getLastErrorDefaults: { w: 1, wtimeout: 0 },
    replicaSetId: ObjectId('67dbf1520289a4b6b57b904b')
  }
}

Tag cleanup should be performed by the operator when you change site B to managed. I believe there might be an issue with the cr.yaml on site B. Anyway, we will update the documentation with clear steps for switchover. Feel free to subscribe to the Jira ticket for updates.

In case of a failover, when I manually remove A's nodes from the replica set config, as described in the blog post, I can confirm the operator refreshes the tags. In case of a switchover, where I want to keep all the nodes but switch the "active" cluster (i.e., the active operator and the primary node), the operator gets confused by the existing tags and fails as described above.

Anyway, it could indeed be a config error on my end, so a reproducible procedure would certainly help me identify any misconfigurations.

Hi @Ivan_Groenewold, is this still on your radar? I still can't perform the switchover unless I manually clean up the tags.

@Laurentiu_Soica I will check this sometime this week.

I think the issue with the tags might be due to your manually removing nodes from the replica set.

Can you try instead letting the operator handle the changes? Basically, add nodes from site B as external on site A, and add nodes from site A as external on site B.

At switchover time, you set the cluster on A to “unmanaged”, and then on B to “managed”. The operator should take care of promoting a node on B site as primary.
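
Roughly, the idea is that each replset's externalNodes list in one site's cr.yaml points at the other site's endpoints, along these lines (host IPs and NodePorts are placeholders):

# site A cr.yaml, replset rs0
externalNodes:            # site B's nodes
  - host: <B-node-1-IP>
    port: <B-node-1-NodePort>
  - host: <B-node-2-IP>
    port: <B-node-2-NodePort>
  - host: <B-node-3-IP>
    port: <B-node-3-NodePort>

# site B cr.yaml, replset rs0
externalNodes:            # site A's nodes
  - host: <A-node-1-IP>
    port: <A-node-1-NodePort>
  - host: <A-node-2-IP>
    port: <A-node-2-NodePort>
  - host: <A-node-3-IP>
    port: <A-node-3-NodePort>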

I think that’s the process I follow (I do node removal on failover only, not switchover).

In case of a switchover, this is the process (the corresponding CR field changes are sketched after the list):

  • Start with cluster A with unmanaged: false, updateStrategy: SmartUpdate, 3 local nodes, 3 externalNodes; and cluster B with unmanaged: true, updateStrategy: OnDelete, 3 local nodes, 3 externalNodes
  • Demote cluster A: unmanaged: true, updateStrategy: OnDelete, 3 local nodes, 3 externalNodes
  • Promote cluster B: unmanaged: false, updateStrategy: SmartUpdate, 3 local nodes, 3 externalNodes
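
The demote/promote in the last two bullets amounts to flipping these fields in the two cr.yaml files (everything else is left as is):

# site A cr.yaml (demote)
spec:
  unmanaged: true
  updateStrategy: OnDelete

# site B cr.yaml (promote)
spec:
  unmanaged: false
  updateStrategy: SmartUpdate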

At this point, cluster B CR reports:

message: 'Error: failed to update config members: fix member hostname: write mongo
  config: replSetReconfig: (NewReplicaSetConfigurationIncompatible) Found two member
  configurations with same host field, members.1.host == members.4.host == IP:PORT'
mongoImage: percona/percona-server-mongodb:8.0.8-3
mongoVersion: 8.0.8-3
observedGeneration: 4
ready: 3
replsets:
  rs0:
    initialized: true
    ready: 3
    size: 3
    status: ready
size: 3
state: error
I couldn't reproduce the same behavior. In my case the promotion of cluster B works fine. Would you mind opening a bug at https://perconadev.atlassian.net/ and providing the full cr.yaml for both clusters? That way the dev team can take a look.

I tried to add the details here: Jira

I wasn't able to edit the description or add files, so I've added the details as comments.