GTID-based async replication from Primary to DR XtraDB cluster breaks replication on DR nodes

Hi Team,
I changed log file/position-based async replication to GTID-based replication from the primary XtraDB cluster to the DR XtraDB cluster. After doing this, the DR slave started getting changes from the primary XtraDB source node.

But the issue is that the other 2 nodes in the DR XtraDB cluster stop replicating with the DR slave node.
Steps I followed:

On 1 Primary XtraDB MASTER node

  1. Enabled GTID on the master node

  2. Below are the entries in my.cnf
    server-id=1
    log-bin=mysql-bin
    gtid_mode=on
    enforce-gtid-consistency
    log-slave-updates

  3. Created a replication user

  4. Took a backup from the master using the command below (a verification sketch follows):
mysqldump --all-databases --single-transaction --triggers --routines --host=127.0.0.1 --port=3306 --user=msandbox --password=msandbox > dump.sql
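
For reference, a minimal sketch of how to verify the GTID setup before and after taking the dump (same host and credentials as above; with gtid_mode=ON, mysqldump embeds a SET @@GLOBAL.GTID_PURGED statement in the dump by default):

    # Confirm GTID mode is active on the source
    mysql --host=127.0.0.1 --port=3306 --user=msandbox --password=msandbox \
      -e "SELECT @@gtid_mode, @@enforce_gtid_consistency, @@server_id;"

    # Confirm the dump captured the GTID state
    grep GTID_PURGED dump.sql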

On 1 DR XtraDB SLAVE node

  1. Below are the entries in my.cnf
    server-id=2
    log-bin=mysql-bin
    relay-log=relay-log-server
    gtid_mode=on
    enforce-gtid-consistency
    log-slave-updates

  2. Restored the SQL dump on the slave: source dump.sql

  3. Able to view the GTIDs on the slave

  4. CHANGE MASTER TO MASTER_HOST='primary_master_IP', MASTER_USER='user', MASTER_PASSWORD='password', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_SSL=1, MASTER_SSL_CA='/certs/ca.pem', MASTER_SSL_CERT='/certs/client-cert.pem', MASTER_SSL_KEY='/certs/client-key.pem';

  5. START SLAVE

  6. SHOW SLAVE STATUS shows no errors and the node replicates from the primary, but the other 2 nodes in the DR XtraDB cluster stop replicating with the DR slave node (a sketch of the check follows).
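
For reference, a minimal sketch of the check implied in step 6 (run on the DR slave node):

    -- Both replication threads should report Yes and Last_Error should be empty
    SHOW SLAVE STATUS\G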

Thanks
Adi

Hi Adi,

Thank you for being an active member of the Percona community. Could you explain what you mean by “the other two nodes in DR xtradb cluster stops replicating with DR slave node”?

Do you mean that the cluster is broken? What is the error that you are seeing?

Thank you!

Pep

Hi Sir,
Both my DR and primary environment clusters have 3 nodes each, and I chose only one node in DR to be a slave of a node in the primary environment.

After implementing GTID-based replication between 1 primary node and 1 DR node using the steps above, I was expecting the other 2 nodes in DR to also replicate from the DR slave node, which replicates from the primary node.

But on the DR node configured as slave, the wsrep_cluster_size value became 0, and on the other 2 nodes of DR the wsrep_cluster_size value became 2.

I want the slave node in DR to replicate the data coming from the primary master node on to the other 2 nodes in the DR environment.
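
For reference, the cluster split described above can be confirmed on each DR node with the standard Galera status variables; a minimal sketch:

    SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';        -- members in this node's component
    SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';      -- Primary vs non-Primary
    SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'; -- Synced, Donor/Desynced, etc.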

Logs on non-slave node of DR:

2022-10-18T07:01:03.602543Z 11 [Warning] [MY-000000] [WSREP] Event 1 Gtid apply failed: 1, seqno 1534
2022-10-18T07:01:03.608928Z 0 [Note] [MY-000000] [Galera] Member 0(hostnamet7) initiates vote on e3828439-4552-11ed-b097-f7e0d163013e:1534,a39f651ed2a1bfaf:  Can't create database 'stess1'; database exists, Error_code: 1007; @@SESSION.GTID_NEXT cannot be set to UUID:NUMBER when @@GLOBAL.GTID_MODE = OFF., Error_code: 1781;
2022-10-18T07:01:03.609006Z 0 [Note] [MY-000000] [Galera] Votes over e3828439-4552-11ed-b097-f7e0d163013e:1534:
   a39f651ed2a1bfaf:   1/3
Waiting for more votes.
2022-10-18T07:01:03.609044Z 0 [Note] [MY-000000] [Galera] Member 1(hostnamet9) initiates vote on e3828439-4552-11ed-b097-f7e0d163013e:1534,a39f651ed2a1bfaf:  Can't create database 'stess1'; database exists, Error_code: 1007; @@SESSION.GTID_NEXT cannot be set to UUID:NUMBER when @@GLOBAL.GTID_MODE = OFF., Error_code: 1781;
2022-10-18T07:01:03.609076Z 0 [Note] [MY-000000] [Galera] Votes over e3828439-4552-11ed-b097-f7e0d163013e:1534:
   a39f651ed2a1bfaf:   2/3
Winner: a39f651ed2a1bfaf
2022-10-18T07:01:03.643994Z 0 [Note] [MY-000000] [Galera] declaring 3942a4bd-905d at ssl://x.xx.xx.245:4567 stable
2022-10-18T07:01:03.644145Z 0 [Note] [MY-000000] [Galera] forgetting 6bb5cb25-aaa1 (ssl://x.xx.xx.225:4567)
2022-10-18T07:01:03.645270Z 0 [Note] [MY-000000] [Galera] Node 1dab87a5-acee state primary
2022-10-18T07:01:03.646042Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,1dab87a5-acee,71)
memb {
	1dab87a5-acee,0
	3942a4bd-905d,0
	}
joined {
	}
left {
	}
partitioned {
	6bb5cb25-aaa1,0
	}
)
2022-10-18T07:01:03.646096Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2022-10-18T07:01:03.651067Z 0 [Note] [MY-000000] [Galera] forgetting 6bb5cb25-aaa1 (ssl://x.xx.xx.225:4567)
2022-10-18T07:01:03.651333Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
2022-10-18T07:01:03.651539Z 0 [Note] [MY-000000] [Galera] STATE_EXCHANGE: sent state UUID: a159c5cc-4eb2-11ed-9e95-37ff67e85fb1
2022-10-18T07:01:03.656616Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: sent state msg: a159c5cc-4eb2-11ed-9e95-37ff67e85fb1
2022-10-18T07:01:03.657647Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: a159c5cc-4eb2-11ed-9e95-37ff67e85fb1 from 0 (hostnamet7)
2022-10-18T07:01:03.657710Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: a159c5cc-4eb2-11ed-9e95-37ff67e85fb1 from 1 (hostnamet9)
2022-10-18T07:01:03.657742Z 0 [Note] [MY-000000] [Galera] Quorum results:
	version    = 6,
	component  = PRIMARY,
	conf_id    = 21,
	members    = 2/2 (primary/total),
	act_id     = 1534,
	last_appl. = 1510,
	protocols  = 2/10/4 (gcs/repl/appl),
	vote policy= 0,
	group UUID = e3828439-4552-11ed-b097-f7e0d163013e
2022-10-18T07:01:03.657929Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [141, 141]
2022-10-18T07:01:03.658276Z 12 [Note] [MY-000000] [Galera] ####### processing CC 1535, local, ordered
2022-10-18T07:01:03.658327Z 12 [Note] [MY-000000] [Galera] Maybe drain monitors from 1534 upto current CC event 1535 upto:1534
2022-10-18T07:01:03.658354Z 12 [Note] [MY-000000] [Galera] Drain monitors from 1534 up to 1534
2022-10-18T07:01:03.658383Z 12 [Note] [MY-000000] [Galera] ####### My UUID: 1dab87a5-455c-11ed-acee-e7925eac4b07
2022-10-18T07:01:03.658408Z 12 [Note] [MY-000000] [Galera] Skipping cert index reset
2022-10-18T07:01:03.658432Z 12 [Note] [MY-000000] [Galera] REPL Protocols: 10 (5)
2022-10-18T07:01:03.658456Z 12 [Note] [MY-000000] [Galera] ####### Adjusting cert position: 1534 -> 1535
2022-10-18T07:01:03.658574Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2022-10-18T07:01:03.664011Z 12 [Note] [MY-000000] [Galera] ================================================
View:
  id: e3828439-4552-11ed-b097-f7e0d163013e:1535
  status: primary
  protocol_version: 4
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: no
  own_index: 0
  members(2):
	0: 1dab87a5-455c-11ed-acee-e7925eac4b07, hostnamet7
	1: 3942a4bd-4953-11ed-905d-0a5568cb87fa, hostnamet9
=================================================
2022-10-18T07:01:03.664077Z 12 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2022-10-18T07:01:03.675993Z 12 [Note] [MY-000000] [Galera] Recording CC from group: 1535
2022-10-18T07:01:03.676057Z 12 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from group: 1511
2022-10-18T07:01:03.676086Z 12 [Note] [MY-000000] [Galera] Min available from gcache for CC from group: 926
2022-10-18T07:01:09.104801Z 0 [Note] [MY-000000] [Galera]  cleaning up 6bb5cb25-aaa1 (ssl://x.xx.xx.225:4567)


Logs on the slave DR node (the LOCK TABLE/FLUSH TABLE errors below occurred while restoring the SQL dump from master to slave; see the note after the log); replication between the primary master node and the slave node is working fine.

2022-10-18T06:58:17.940141Z 27 [System] [MY-010916] [Server] @@GLOBAL.GTID_PURGED was changed from '' to '10130818-45f9-11ed-924c-27f08e6f88e4:1-3, 38df0cdf-4d1a-11ed-a6b3-0b92abb9a557:1-10'.
2022-10-18T06:58:17.940217Z 27 [System] [MY-010917] [Server] @@GLOBAL.GTID_EXECUTED was changed from 'e3828439-4552-11ed-b097-f7e0d163013e:1' to '10130818-45f9-11ed-924c-27f08e6f88e4:1-3, 38df0cdf-4d1a-11ed-a6b3-0b92abb9a557:1-10, e3828439-4552-11ed-b097-f7e0d163013e:1'.
2022-10-18T06:58:18.019463Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:18.103127Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:18.195188Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:18.301681Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:18.377869Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:18.468862Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:18.537406Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:18.661763Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:18.780342Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:18.919704Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.061906Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.278519Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.353623Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.447235Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.523316Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.609538Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.674726Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.745211Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.832070Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:19.944945Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:20.016321Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:20.116453Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
2022-10-18T06:58:20.222985Z 27 [ERROR] [MY-000000] [WSREP] Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK/FOR EXPORT with pxc_strict_mode = ENFORCING
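For reference, the LOCK TABLE/FLUSH TABLE errors above are raised because pxc_strict_mode = ENFORCING rejects the table-locking statements that mysqldump writes into its output. A common workaround (a sketch, to be used with care) is to relax strict mode only for the duration of the restore:

    SET GLOBAL pxc_strict_mode = PERMISSIVE;
    -- restore the dump here, then re-enable strict mode:
    SET GLOBAL pxc_strict_mode = ENFORCING;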

Hi Adi,

Is gtid_mode the same across all the nodes?

2022-10-18T07:01:03.609044Z 0 [Note] [MY-000000] [Galera] Member 1(hostnamet9) initiates vote on e3828439-4552-11ed-b097-f7e0d163013e:1534,a39f651ed2a1bfaf: Can't create database 'stess1'; database exists, Error_code: 1007; @@SESSION.GTID_NEXT cannot be set to UUID:NUMBER when @@GLOBAL.GTID_MODE = OFF., Error_code: 1781;

Looks like you had a partition. I recommend that you ensure all the nodes have the same parameters that affect replication and binary logging. Also make sure that every node has a different server_id (a quick comparison is sketched below).
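
A minimal sketch of that comparison, to be run on every node (primary and DR); server_id must be unique, while the GTID and logging settings should match:

    SELECT @@server_id,
           @@gtid_mode,
           @@enforce_gtid_consistency,
           @@log_slave_updates,
           @@log_bin;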

Thank you!

Pep

Hi Sir,
So when I enabled gtid_mode on every node of DR, the complete DR environment crashed and I was unable to even log in to any mysql instance. And yes, I hadn't set the other slave-related parameters in my.cnf.

I will add all the parameters exactly as in the slave's my.cnf to the other 2 nodes of DR (a sketch of those settings follows) and will also restore the SQL dump on the other 2 nodes. I will test and report back the results.
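
For reference, a minimal sketch of the replication-related my.cnf entries to align on all three DR nodes, based on the settings earlier in this thread (the server-id value is illustrative and must be unique per node):

    server-id=3                # unique on every node
    log-bin=mysql-bin
    log-slave-updates
    gtid_mode=on
    enforce-gtid-consistency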
Thanks
Adi

Hi Adi,

Do not rebuild the other nodes by using a dump; once you know that all the nodes have the same configuration, bootstrap the cluster using the DR replica node. Then add the remaining nodes to the cluster and wait for the SST to complete. A rough sketch of the sequence follows.
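
A minimal sketch of that sequence, assuming PXC managed by systemd (service names may differ in your setup):

    # On all three DR nodes: stop MySQL
    systemctl stop mysql

    # On the DR replica node only: bootstrap the cluster from it
    systemctl start mysql@bootstrap.service

    # On each of the other two DR nodes: start normally and wait for SST
    systemctl start mysql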

Pep

Hi @Pep_Pla,
Thanks, now the other 2 nodes in DR are also able to replicate from the DR slave node.

But can something be done so that if the master in primary goes down, the slave in DR automatically switches to another node in primary, considers it the master, and starts replicating from it?

Secondly, if the slave node goes down, a new slave node is automatically assigned within the DR environment.

Are either of the above 2 scenarios possible, and if yes, can you please guide me on how to achieve them?

Thanks
Adi

Hi Adi,

Glad to know that your issue was fixed. I’ll mark this thread as solved.

Regarding managing replication, you have two open source options:

[GitHub - peppla/replication_managerng: Stateless async replication manager for PXC Clusters]
and
[Mysql-tools/PXC at master · y-trudeau/Mysql-tools · GitHub]

Thank you!
