multi-master with 3 nodes - force old node to be primary, bootstrap

So I was experimenting with an initial multi-master two-node setup (node1, node2) and it was working. I added a new node (node3) to the cluster and, for whatever reason, didn’t notice that node1 had shut down or was not running (not sure which). As soon as I started node1, it synced with node3, and my old databases on node1 were gone.

Is there any way I can force node2 (which still has the old data) to bootstrap as primary, so that node1 and node3 sync from it?

I’m not sure if the scenario makes sense, but I’m now left with my old databases gone. :frowning:

Thanks.

Hello there, I’ll try to bring this to the team’s attention, but could you please update the post with version and environment information and any other details (logs maybe, my.cnf) that the team will likely ask for before making their observations? Thanks…

Hi Lorraine. All nodes are on Ubuntu 16.x LTS, running Percona XtraDB Cluster 5.7.22-22-57-log. They all use the following config:


# /etc/mysql/my.cnf
[mysqld]
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_name=pxc-cluster
wsrep_cluster_address=gcomm://192.168.9.5,192.168.9.6
wsrep_node_name=udb[node_number]
wsrep_node_address=192.168.9.5
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:passw0rd
pxc_strict_mode=ENFORCING
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
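One thing worth noting about the config above: with three nodes, wsrep_cluster_address should normally list all cluster members, otherwise a node that starts alone may bootstrap a new cluster instead of joining the existing one. A sketch of the adjusted line, where 192.168.9.7 is only a guess for node3’s address (substitute the real one):

```
# Sketch: list all three nodes in the cluster address.
# 192.168.9.7 is a hypothetical address for node3 -- replace with yours.
wsrep_cluster_address=gcomm://192.168.9.5,192.168.9.6,192.168.9.7
```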

I’ll try to gather logs.

I have tried:

- editing node2’s grastate.dat, setting “safe_to_bootstrap” to “1”
- modifying the cnf and creating a new cluster name
- removing the cluster address

With all of these, the service fails to start using “sudo /etc/init.d/mysql bootstrap-pxc”.
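For reference, the grastate.dat edit mentioned above is typically just flipping one flag. A minimal sketch, using a sample file in the current directory (the real file lives in the datadir, usually /var/lib/mysql/grastate.dat, and the UUID below is a placeholder, not a real cluster UUID):

```shell
# Create a sample grastate.dat so the edit can be shown end to end.
# On a real node you would edit /var/lib/mysql/grastate.dat instead.
cat > grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
safe_to_bootstrap: 0
EOF

# Flip the flag so "bootstrap-pxc" will accept this node as the first one.
sed -i 's/^safe_to_bootstrap: 0$/safe_to_bootstrap: 1/' grastate.dat

grep '^safe_to_bootstrap' grastate.dat   # prints: safe_to_bootstrap: 1
```

If the service still fails to start after this, the error log (not grastate.dat) is usually where the actual blocker shows up.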

I am afraid your scenario description is a bit confusing.
It looks like you did not actually “add” node3, but rather bootstrapped a new cluster (with no data); node1 then joined it later and simply took the empty dataset from node3.
Was node2 down at that time? That is not clear at this point. But if node2 is the last node with good data, you should first stop the other nodes (the ones with wrong data), and then try to recover node2. If it doesn’t start in bootstrap mode, you may try to start it as standalone (with wsrep_provider commented out) and see whether you can get access to the data.
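The standalone rescue mentioned above is just a temporary my.cnf change, a sketch of which could look like this (same file as posted earlier; only the wsrep_provider line changes, the rest stays as it was):

```
# /etc/mysql/my.cnf -- temporary standalone rescue (sketch)
[mysqld]
# wsrep_provider=/usr/lib/libgalera_smm.so
#   ^ commented out: mysqld starts as a plain MySQL server, no Galera,
#     so you can inspect and dump the data (e.g. with mysqldump)
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
```

Once the data is dumped somewhere safe, you can restore the original config and rebuild the cluster from that node.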

Please remember that the bootstrapping process has to be handled with great care; done wrongly, it can lead to data loss or split-brain situations, which are usually very hard to resolve later. Basically: always make sure the cluster is up and healthy before you try to add a new node, make sure no node is up before you try to bootstrap the cluster again, and make sure you bootstrap the right node.
You may refer to this article for more examples and details: https://www.percona.com/blog/2014/09/01/galera-replication-how-to-recover-a-pxc-cluster/
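Picking “the right node” usually comes down to comparing the seqno values in each node’s grastate.dat and bootstrapping the one that is furthest ahead. A small sketch of that comparison, using sample files standing in for the grastate.dat copied from each node (the seqno values are made up for illustration):

```shell
# Sample grastate.dat excerpts standing in for the three nodes.
# In practice, read seqno from /var/lib/mysql/grastate.dat on each node.
printf 'seqno:   42\n' > node1.grastate
printf 'seqno:   57\n' > node2.grastate
printf 'seqno:   -1\n' > node3.grastate  # -1 means unclean shutdown

# Pick the file with the highest seqno: that node should be bootstrapped.
best=""; best_seqno=-2
for f in node1.grastate node2.grastate node3.grastate; do
  s=$(awk '/^seqno:/ {print $2}' "$f")
  if [ "$s" -gt "$best_seqno" ]; then best_seqno=$s; best=$f; fi
done
echo "bootstrap candidate: $best (seqno $best_seqno)"
```

A node showing seqno -1 was not shut down cleanly; its real position has to be recovered first (e.g. via mysqld’s wsrep-recover mode) before it can be compared.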

If you have problems getting the node with good data up, please attach its error log (as a file attachment).

Sorry if it is not clear. I’ll try to explain, line by line, the sequence of what happened:

node1 was down
node2 was down
node3 was up that time, after setting up my.cnf
node1 was brought up - synced to the bootstrapped node2
node1 - show database = back to scratch

So I was trying to bring up node2 with these steps:
sudo /etc/init.d/mysql bootstrap-pxc – failed to start
changed the config to make it a new cluster – failed to start
edited the safe_to_bootstrap flag in grastate.dat – failed to start
also deleted the sst_something_something file (forgot the exact filename) – failed to start

I am sorry, but I still don’t quite understand this sequence:

OK, so at this point your original cluster is all down…

So, with node1 and node2 down, if you managed to start node3, that means it bootstrapped a new cluster. Or do you mean it actually joined one of the nodes above, i.e. it was started while at least one of them was still running?

Sorry - node2 was down, so how could node1 have synced with node2?

You mean no databases there? Then probably node1 synced from node3, which was brought up as a new cluster.

Can you upload the error log from this node? This is the primary source of information on what really blocks it from starting.

I’m unable to retrieve the logs from my test environment; they got deleted or something. Basically, I left node2 running, probably bootstrapped or so… node1 and node2 were both down, and when I brought up node1 (likely not bootstrapped) I realized I had lost the DBs I created.