Percona xtradb Cluster 5.6 for Disaster Recovery on LAN/WAN problems

Hi, we have a Percona XtraDB Cluster with 3 nodes on our environment.

We need to create other two Disaster Recovery Percona XtraDB Clusters starting from first Cluster, so we decide to use the MySQL replication from one node of first Cluster to the first node of the second Cluster, we know that we can have that node out from Cluster, but we don’t know other way to do it.

So this is that we would like to have (or if there is another way, we would like to know)

Cluster One

Node 1 (Master) , Node 2 (Master), Node 3 (Master, Master for second Percona Cluster)

Cluster Two

Node 1 (Master, Slave of first Percona Cluster) Node 2 (Master) Node 3 (Master, Master for third Percona Cluster)

Cluster Three

Node 1 (Master, Slave of second Percona Cluster) Node 2 (Master) Node 3 (Master)

The node 3 of first and second Cluster have

log_slave_updates = 1

else we had inconsistence data replication on the first node of DR Clusters (error on updates records, not found).

After setting up the first node of the second Cluster like Slave, we started Percona XtraDB cluster on the other two nodes and all seemed right, all the 3 nodes were synced, updates seemed to work with no problems.

But we discovered that when we are adding a new user/grant on the first Cluster, it is only replicated on the Slave of the second Cluster, but not on the other Cluster nodes of second Cluster, we have verified that also on the third Cluster no user/grant was created (because data was not existing on the node 3 of second Cluster).

Is there a way to fix it ?

Is it a bug ?

Is there a best practice to do Disaster Recovery on N (2 or mote) Percona XtraDB Clusters ?

Thanks for your time and best regards.

Claudio

Hi, would just like to quickly check if you saw this page on limitations? Just want to make sure that the first paragraph doesn’t apply in your case.
https://www.percona.com/doc/percona-xtradb-cluster/LATEST/limitation.html

Hi lorraine,

yes i know the limitation page, but i used CREATE USER command that would works in multi-master environment, infact it works on the First Cluster, works on the First node of Second Cluster but not on the other nodes of second Cluster.

We use innoDB and we have on all nodes

default_storage_engine = InnoDB
enforce_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2 (on all nodes of three Clusters)
binlog_format = ROW
wsrep_replicate_myisam = ON

Anyway i forgot to mention that, the user/grant is not replicated but the Cluster shows all the nodes in sync and at the same level.

This is the strange thing, i was thinking to have not synced nodes, if something is not supported.

Claudio

Hi Claudio,

I’ve created a bug in jira (our bug-tracking system for this issue). I’ll be trying to see if I can repro this behavior.

https://jira.percona.com/browse/PXC-2064

I have some questions

(1) What’s the exact version of PXC that you are using? you can run “SELECT @@version” to check
(2) What version of linux are you running?
(3) Can you provide the cnf files for the nodes?

You can also check the logs to see if there are any errors with the CREATE USER.

Hi Kenn, these are the infos you asked for

1: mysql> SELECT @@version;
±-------------------+
| @@version |
±-------------------+
| 5.6.38-83.0-56-log |
±-------------------+
1 row in set (0.00 sec)

2: more /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)

3: i attached the config files for nodes of Cluster1 and Cluster2

On the cluster1-node1 i issued

mysql> create user ‘sst_test2’@‘localhost’ IDENTIFIED BY ‘password’;
Query OK, 0 rows affected (0.00 sec)

In /var/log/messages of

  • cluster1-node1 (where the data is created)
  • cluster2-node1 (which is the slave and the data is replicated)
  • cluster2-node2 (which is the 2nd master, where the data is not replicated)

we can’t find any messages.

Hope it can help you

Claudio

config_file.zip (8.43 KB)

I add that the cluster2-node3 is the master for the cluster3 where node1 will be the slave and the other nodes will be masters

Claudio

I have found a workaround, you will have to enable log-slave-updates on both ends of the async channels.

So, in your setup, enable log-slave-updates on
cluster1-node3 async-master
cluster2-node1 async-slave
cluster2-node3 async-master
cluster3-node1 async-slave

Enabling log-slave-updates on the async-master nodes allows the CREATE USER to replicate from galera to the async channel.
Enabling log-slave-updates on the async-slave nodes allows the CREATE USER to replicate from the async channel to galera

Note that the config files provided have log-slave-updates enabled on the async master nodes, not the async slave nodes.

I have not fully tested this (just looked at CREATE USER), so you will have to ensure that things are working correctly for your setup.

Hi kenn we added the log_slave_updates = 1 like you recomended but as soon as we have done it, we got with the command

SHOW SLAVE STATUS \G;

the following sql error in the Last_SQL_Error field

Could not execute Write_rows event on table XXXX.j3_2017_session; Got error 122 from storage engine, Error_code: 1030; handler error No Error!; the event’s master log master3-bin.000063, end_log_pos 137918825

so the slave results out of sync from master

Let us know if you have some hints.

Thanks
Claudio

Hi kenn,

we have updated today at the last version 5.6.39-26.25 hoping the bug was solved, but instead it is still not fixed.

And adding

log_slave_updates=1 on the async-slave too break the mysql replication from master to slave node for commands that DELETE ROWS with the following error

Could not execute Delete_rows event on table XXXX_j3.qf4rl_session; Got error -1 from storage engine, Error_code: 1030; handler error No Error!; the event’s master log master3-bin.000066, end_log_pos 43204831

Hope it will be fixed as soon as possible, because a Disaster Recovery in this way is not possible.

Thanks for your time
Claudio

Hi all,

i solved my problem following the new documentation at

https://www.percona.com/blog/2018/03/19/percona-xtradb-cluster-mysql-asynchronous-replication-and-log-slave-updates/

all nodes of the first, second and third PXC must have

log_slave_updates=1

in this way the replication works, no more create user problems from any nodes of first PXC

Thanks Ken

Claudio