Percona MongoDB Operator Multi-Cluster Switchover/Failover

Description:

I went through the tutorial to set up Percona MongoDB multi-cluster with the K8s Operator: About Multi-cluster - Percona Operator for MongoDB

Everything went smoothly. I now want to simulate a switchover/failover, but I was not able to find instructions.

What I tried:
Start with:

  • 2 K8S clusters, A and B
  • Percona MongoDB operator deployed in each
  • Deploy an RS with 3 nodes in B, unmanaged, exposed with type NodePort, sharding disabled.
  • Deploy an RS with 3 nodes in A, managed, exposed with type NodePort, sharding disabled. Use B's nodes as externalNodes (using host IPs and NodePort ports).
  • Make sure the credentials and certificates are the same in both clusters.
  • The setup is successful: I have a 6-node ReplicaSet and both PSMDB CRs are "ready" (a quick verification sketch follows this list).
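
A quick way to verify the cross-site membership from mongosh (a minimal sketch, assuming a connection to any of the exposed NodePort endpoints with cluster-admin credentials):

// print each member's address, replica set state and health; all six should be healthy
rs.status().members.forEach(m => print(m.name, m.stateStr, m.health))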

As a switchover process:

  • Change PSMDB CR in A from managed to unmanaged
  • Connect to MongoDB and remove A's nodes from the ReplicaSet (see the sketch after this list). Now the ReplicaSet has 3 nodes, the nodes from B
  • Change PSMDB CR in B from unmanaged to managed
  • The CR in B moves to an error state with:
    message: 'Error: failed to update config members: delete: write mongo config: replSetReconfig:
      (NodeNotFound) No host described in new configuration with {version: 131364, term: 2}
      for replica set rs0 maps to this node'
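
For step 2, the removal can be done with something like the following (a sketch, assuming A's nodes are the members with _id 0, 1 and 2, as in the rs.conf() shown later, and that a forced reconfig is acceptable):

// keep only B's members (_id 3, 4 and 5) in the configuration
cfg = rs.conf()
cfg.members = cfg.members.filter(m => m._id >= 3)
rs.reconfig(cfg, {force: true})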

Hi, please take a look at Disaster Recovery for MongoDB on Kubernetes

Thanks for sharing the link. The failover (disaster recovery) process worked as described.

Now, in the case of a switchover, where both clusters are healthy and I just want to switch the PRIMARY/ACTIVE from A to B, I did not manage to make it work.

What I did:

  • Started with the same setup as described in the blog post
  • Changed the CR in A from managed to unmanaged
  • Left the ReplicaSet config as it was (6 nodes, all healthy)
  • Changed the CR in B from unmanaged to managed
  • The CR in B moves to an error state:
  message: 'Error: failed to update config members: fix member hostname: write mongo
    config: replSetReconfig: (NewReplicaSetConfigurationIncompatible) Found two member
    configurations with same host field, members.1.host == members.4.host == <host IP>:<node port>'
  mongoImage: percona/percona-server-mongodb:7.0.15-9-multi
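
To see the duplicate host condition the error refers to, a simple check from mongosh on any member is (a sketch; each external address should appear only once):

// list the host field of every member in the current configuration
rs.conf().members.map(m => m.host)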

Later edit: Removing the tags from all Mongo nodes and moving the primary to B instead of A resulted in a successful switchover. However, the process involved quite a few manual steps, and I'm unsure whether it is entirely deterministic.

What you did looks good to me. Can you share the output of rs.conf()? Also, what is the value of clusterServiceDNSMode?

clusterServiceDNSMode: "External"

rs.conf() before switchover:

{
  _id: 'rs0',
  version: 78580,
  term: 3,
  members: [
    {
      _id: 0,
      host: '1.2.3.4:30864',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        serviceName: 'mongodb',
        nodeName: 'mw-3',
        podName: 'mongodb-rs0-0'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 1,
      host: '1.2.3.5:31383',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        serviceName: 'mongodb',
        nodeName: 'mw-1',
        podName: 'mongodb-rs0-1'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 2,
      host: '1.2.3.6:32351',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        serviceName: 'mongodb',
        nodeName: 'mw-2',
        podName: 'mongodb-rs0-2'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 3,
      host: '1.2.3.7:30308',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 1,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 4,
      host: '1.2.3.8:32118',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 1,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 5,
      host: '1.2.3.9:30498',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 0,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 0
    }
  ],
  protocolVersion: Long('1'),
  writeConcernMajorityJournalDefault: true,
  settings: {
    chainingAllowed: true,
    heartbeatIntervalMillis: 2000,
    heartbeatTimeoutSecs: 10,
    electionTimeoutMillis: 10000,
    catchUpTimeoutMillis: -1,
    catchUpTakeoverDelayMillis: 30000,
    getLastErrorModes: {},
    getLastErrorDefaults: { w: 1, wtimeout: 0 },
    replicaSetId: ObjectId('67dbf1520289a4b6b57b904b')
  }
}

Clean up the tags and switch the primary:

// read the current replica set configuration
cfg = rs.config()
// clear the operator-managed tags from all six members
cfg.members[0].tags = {}
cfg.members[1].tags = {}
cfg.members[2].tags = {}
cfg.members[3].tags = {}
cfg.members[4].tags = {}
cfg.members[5].tags = {}
// raise the priority of member _id 3 (a node in B) so it wins the next election
cfg.members[3].priority = 10
// apply the changes as a forced reconfiguration
rs.reconfig(cfg, {force: true})
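
To confirm the primary actually moved to a node in B before flipping the CRs, a check like the following can be run from any member (a minimal sketch):

// print the member currently reporting the PRIMARY state
rs.status().members.filter(m => m.stateStr === 'PRIMARY').map(m => m.name)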

rs.conf() after switchover:

{
  _id: 'rs0',
  version: 148264,
  term: 4,
  members: [
    {
      _id: 0,
      host: '1.2.3.4:30864',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 1,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 1,
      host: '1.2.3.5:31383',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 1,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 2,
      host: '1.2.3.6:32351',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 0,
      tags: { external: 'true' },
      secondaryDelaySecs: Long('0'),
      votes: 0
    },
    {
      _id: 3,
      host: '1.2.3.7:30308',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        podName: 'mongodb-rs0-0',
        serviceName: 'mongodb',
        nodeName: 'mw-3'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 4,
      host: '1.2.3.8:32118',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        podName: 'mongodb-rs0-1',
        serviceName: 'mongodb',
        nodeName: 'mw-2'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 5,
      host: '1.2.3.9:30498',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {
        nodeName: 'mw-1',
        podName: 'mongodb-rs0-2',
        serviceName: 'mongodb'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    }
  ],
  protocolVersion: Long('1'),
  writeConcernMajorityJournalDefault: true,
  settings: {
    chainingAllowed: true,
    heartbeatIntervalMillis: 2000,
    heartbeatTimeoutSecs: 10,
    electionTimeoutMillis: 10000,
    catchUpTimeoutMillis: -1,
    catchUpTakeoverDelayMillis: 30000,
    getLastErrorModes: {},
    getLastErrorDefaults: { w: 1, wtimeout: 0 },
    replicaSetId: ObjectId('67dbf1520289a4b6b57b904b')
  }
}

Tag cleanup should be performed by the operator when you change site B to managed. I believe there might be an issue with the cr.yaml on site B. Anyway, we will update the documentation with clear steps for switchover. Feel free to subscribe to Jira for updates.

In the case of a failover, when I manually remove A's nodes from rs.conf() as described in the blog post, I can confirm the operator refreshes the tags. In the case of a switchover, where I want to keep all the nodes but switch the "active" cluster (that is, the active operator and the primary node), the operator gets confused by the existing tags and fails as described above.

Anyway, it could indeed be a config error on my end, so a documented, reproducible procedure would certainly help me identify any misconfiguration.