Thanxs @CTutte
What version are you in ?
percona-xtradb-cluster-full-57
Are both nodes same version?
Yes.
and how big your data set is ?
To my understanding it has to sync the entire DB … so like 750Gb or so.
innobackup.log from DONOR:
211214 10:32:32 >> log scanned up to (14740248342320)
211214 10:32:33 >> log scanned up to (14740248351203)
211214 10:32:34 >> log scanned up to (14740248359240)
xtrabackup: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
xb_stream_write_data() failed.
compress: write to the destination stream failed.
xtrabackup: Error writing file 'UNOPENED' (Errcode: 32 - Broken pipe)
[04] xtrabackup: Error: xtrabackup_copy_datafile() failed.
[04] xtrabackup: Error: failed to copy datafile.
mysql.log from DONOR
2021-12-14T09:30:07.821070Z 0 [ERROR] WSREP: Process was aborted.
2021-12-14T09:30:07.821182Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.0.2.1:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --mysqld-version '5.7.33-36-57' --binlog '/var/log/mysql/mysql-bin' --gtid 'f4b2cd0b-1721-11e4-9cfc-830b046aee12:12424599779' : 2 (No such file or directory)
2021-12-14T09:30:07.821295Z 0 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.0.2.1:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --mysqld-version '5.7.33-36-57' --binlog '/var/log/mysql/mysql-bin' --gtid 'f4b2cd0b-1721-11e4-9cfc-830b046aee12:12424599779'
2021-12-14T09:30:07.821336Z 0 [Warning] WSREP: Could not find peer: 73b39a0b-5bc8-11ec-b90b-eb514a870cac
2021-12-14T09:30:07.821402Z 0 [Warning] WSREP: 0.0 (17db3): State transfer to -1.-1 (left the group) failed: -2 (No such file or directory)
2021-12-14T09:37:37.161125Z WSREP_SST: [INFO] Streaming with xbstream
2021-12-14T09:37:47.231616Z WSREP_SST: [INFO] Streaming the backup to joiner at 10.0.2.1 4444
2021/12/14 09:49:06 socat[3394852] E write(6, 0x55cf54eb47b0, 8192): Connection reset by peer
2021-12-14T10:12:38.415636Z 0 [ERROR] WSREP: Process was aborted.
2021-12-14T10:12:38.415777Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.0.2.1:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --mysqld-version '5.7.33-36-57' --binlog '/var/log/mysql/mysql-bin' --gtid 'f4b2cd0b-1721-11e4-9cfc-830b046aee12:12425100275' : 2 (No such file or directory)
2021-12-14T10:12:38.415876Z 0 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '10.0.2.1:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --mysqld-version '5.7.33-36-57' --binlog '/var/log/mysql/mysql-bin' --gtid 'f4b2cd0b-1721-11e4-9cfc-830b046aee12:12425100275'
2021-12-14T10:12:38.416533Z 0 [Warning] WSREP: Could not find peer: 5d6d3dbe-5be7-11ec-8fe7-ba5747d51203
2021-12-14T10:12:38.416592Z 0 [Warning] WSREP: 1.0 (17db3): State transfer to -1.-1 (left the group) failed: -2 (No such file or directory)
2021-12-14T10:14:01.662563Z WSREP_SST: [INFO] Streaming with xbstream
2021-12-14T10:14:11.729361Z WSREP_SST: [INFO] Streaming the backup to joiner at 10.0.2.1 4444
2021/12/14 10:32:34 socat[3399629] E write(6, 0x558d6c53f7b0, 8192): Connection reset by peer
mysql.log from JOINER
Log of wsrep recovery (--wsrep-recover):
2021-12-14T09:37:24.620190Z 0 [Warning] The syntax '--log_warnings/-W' is deprecated and will be removed in a future release. Please use '--log_error_verbosity' instead.
2021-12-14T09:37:24.620420Z 0 [Warning] options --log-slow-admin-statements, --log-queries-not-using-indexes and --log-slow-slave-statements have no effect if --slow-query-log is not set
2021-12-14T09:37:24.620427Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2021-12-14T09:37:31.336308Z 0 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
mysqld: Table 'mysql.plugin' doesn't exist
2021-12-14T09:37:31.337330Z 0 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
2021-12-14T09:37:31.337984Z 0 [Warning] WSREP: Read non-wsrep XID from storage engines.
2021-12-14T09:37:35.156842Z 0 [Warning] The syntax '--log_warnings/-W' is deprecated and will be removed in a future release. Please use '--log_error_verbosity' instead.
2021-12-14T09:37:35.157057Z 0 [Warning] options --log-slow-admin-statements, --log-queries-not-using-indexes and --log-slow-slave-statements have no effect if --slow-query-log is not set
2021-12-14T09:37:35.157064Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2021-12-14T09:37:35.683869Z 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
2021-12-14T09:37:36.155851Z WSREP_SST: [INFO] Streaming with xbstream
2021-12-14T09:37:36.167044Z WSREP_SST: [WARNING] Found a stale sst_in_progress file: /var/lib/mysql//sst_in_progress
2021-12-14T09:37:37.311431Z WSREP_SST: [INFO] Streaming with xbstream
2021-12-14T09:37:37.322334Z WSREP_SST: [INFO] WARNING: Stale temporary SST directory: /var/lib/mysql//.sst from previous state transfer. Removing
2021-12-14T09:37:37.341404Z WSREP_SST: [INFO] Proceeding with SST.........
2021-12-14T09:37:37.371811Z WSREP_SST: [INFO] ............Waiting for SST streaming to complete!
2021-12-14T09:49:06.052017Z WSREP_SST: [ERROR] Killing SST (3632138) with SIGKILL after stalling for 120 seconds
/usr/bin/wsrep_sst_xtrabackup-v2: line 195: 3632140 Killed socat -u TCP-LISTEN:4444,reuseaddr,retry=30 stdio
3632141 | xbstream -x $xbstream_eopts
2021-12-14T09:49:06.075142Z WSREP_SST: [ERROR] ******************* FATAL ERROR **********************
2021-12-14T09:49:06.077741Z WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 137 137
2021-12-14T09:49:06.080179Z WSREP_SST: [ERROR] ******************************************************
2021-12-14T09:49:06.084031Z WSREP_SST: [ERROR] Cleanup after exit with status:32
2021-12-14T09:49:06.110694Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.0.2.1' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '3631529' --mysqld-version '5.7.35-38-57' --binlog '/var/log/mysql/mysql-bin' : 32 (Broken pipe)
2021-12-14T09:49:06.110774Z 0 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2021-12-14T09:49:06.110788Z 0 [ERROR] WSREP: SST script aborted with error 32 (Broken pipe)
2021-12-14T09:49:06.110882Z 0 [ERROR] WSREP: SST failed: 32 (Broken pipe)
2021-12-14T09:49:06.110911Z 0 [ERROR] Aborting
make sure that wsrep_cluster_status is “Primary”
Yes … the donor returns “Primary” on
show status like 'wsrep_cluster_status';
wsrep_cluster_status Primary
As said, our donor is stalled on “wsrep_local_state_comment Donor/Desynced” state even after the SST has failed and it’s aborted. We cannot push it out of that state so we can try again or do anything else. Its operational R/W and providing service but we don’t know how to clean that status so we can attempt a new rejoin
P.D.
I have also realized about wsrep_cluster_size 2
… so primary it’s thinking there is another node in the cluster … but the other node is down.
P.D.2
We DO have an arbitrator node as well, yes … but it’s not a data node.