We are also facing similar problem when SST.
We maintain CI tests which just create a cluster with 3 machines, but donor can’t send fresh database snapshot with xtrabackup.
Here is a donor log:
"2022-02-20T23:57:34.140203-00:00 0 [Note] [MY-000000] [Galera] Member 1.0 (bionic-ansible210-c) requested state transfer from '*any*'. Selected 0.0 (bionic-ansible210-a)(SYNCED) as donor.",
"2022-02-20T23:57:34.140250-00:00 0 [Note] [MY-000000] [Galera] Shifting SYNCED -> DONOR/DESYNCED (TO: 4)",
"2022-02-20T23:57:34.140299-00:00 9 [Note] [MY-000000] [Galera] Detected STR version: 1, req_len: 130, req: STRv1",
"2022-02-20T23:57:34.140361-00:00 9 [Note] [MY-000000] [Galera] Cert index preload: 4 -> 4",
"2022-02-20T23:57:34.140721-00:00 9 [Note] [MY-000000] [WSREP] Server status change synced -> donor",
"2022-02-20T23:57:34.140727-00:00 0 [Note] [MY-000000] [Galera] async IST sender starting to serve tcp://172.29.95.33:4568 sending 4-4, preload starts from 4",
"2022-02-20T23:57:34.140746-00:00 9 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.",
"2022-02-20T23:57:34.140817-00:00 0 [Note] [MY-000000] [WSREP] Initiating SST/IST transfer on DONOR side (wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.29.95.33:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib/mysql/plugin/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --mysqld-version '8.0.26-16.1' '' --gtid 'cee08bf0-92a8-11ec-83c0-63b922f830ea:4' )",
"2022-02-20T23:57:34.144327-00:00 0 [Note] [MY-000000] [Galera] IST sender 4 -> 4",
"2022-02-20T23:57:34.147471-00:00 9 [Note] [MY-000000] [WSREP] DONOR thread signaled with 0",
"2022-02-20T23:57:35.970628-00:00 0 [Note] [MY-000000] [Galera] (cedfb3c6-a977, 'tcp://0.0.0.0:4567') turning message relay requesting off",
"2022-02-20T23:57:35.970693-00:00 0 [Note] [MY-000000] [Galera] cleaning up d2aa0b8d-b8b7 (tcp://172.29.95.32:4567)",
"2022-02-20T23:57:44.567521-00:00 0 [Note] [MY-000000] [WSREP-SST] Streaming the backup to joiner at 172.29.95.33 4444",
Warning: 2022-02-20T23:57:44.669525-00:00 18 [Warning] [MY-013712] [Server] No suitable 'keyring_component_metadata_query' service implementation found to fulfill the request.",
"2022-02-20T23:57:47.181008-00:00 18 [Note] [MY-010914] [Server] Aborted connection 18 to db: 'unconnected' user: 'mysql.pxc.sst.user' host: 'localhost' (Got an error reading communication packets).",
Error: "2022-02-20T23:57:47.190979-00:00 0 [ERROR] [MY-000000] [WSREP-SST] ******************* FATAL ERROR ********************** ",
Error: "2022-02-20T23:57:47.191020-00:00 0 [ERROR] [MY-000000] [WSREP-SST] xtrabackup finished with error: 1. Check /var/lib/mysql//innobackup.backup.log",
Error: "2022-02-20T23:57:47.191028-00:00 0 [ERROR] [MY-000000] [WSREP-SST] Line 1948",
Error: "2022-02-20T23:57:47.191070-00:00 0 [ERROR] [MY-000000] [WSREP-SST] ------------ innobackup.backup.log (START) ------------",
"\txtrabackup: recognized server arguments: --datadir=/var/lib/mysql --innodb_adaptive_hash_index=0 --innodb_buffer_pool_size=1073741824 --innodb_checksum_algorithm=strict_crc32 --innodb_data_file_path=ibdata1:12M:autoextend --innodb_flush_log_at_trx_commit=0 --innodb_flush_method=O_DIRECT_NO_FSYNC --innodb_io_capacity=2000 --innodb_log_buffer_size=33554432 --innodb_log_file_size=134217728 --innodb_log_files_in_group=2 --innodb_read_io_threads=2 --innodb_write_io_threads=2 --open_files_limit=100000 --defaults_group=mysqld ",
"\txtrabackup: recognized client arguments: --socket=/var/run/mysqld/mysqld.sock --parallel=4 --user=mysql.pxc.sst.user --*** --socket=/var/run/mysqld/mysqld.sock --lock-ddl=1 --backup=1 --galera-info=1 --stream=xbstream --xtrabackup-plugin-dir=/usr/bin/pxc_extra/pxb-8.0/lib/plugin --target-dir=/tmp/pxc_sst_ZuAV/donor_xb_C48f ",
"\t/usr/bin/pxc_extra/pxb-8.0/bin/xtrabackup version 8.0.26-18 based on MySQL server 8.0.26 Linux (x86_64) (revision id: 4aecf82)",
"\t220220 23:57:44 version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;mysql_socket=/var/run/mysqld/mysqld.sock' as 'mysql.pxc.sst.user' (using password: YES).",
"\t220220 23:57:44 version_check Connected to MySQL server",
"\t220220 23:57:44 version_check Executing a version check against the server...",
"\t220220 23:57:44 version_check Done.",
"\t220220 23:57:44 Connecting to MySQL server host: localhost, user: mysql.pxc.sst.user, password: set, port: not set, socket: /var/run/mysqld/mysqld.sock",
"\tUsing server version 8.0.26-16.1",
"\t220220 23:57:44 Executing LOCK TABLES FOR BACKUP...",
"\txtrabackup: uses posix_fadvise().",
"\txtrabackup: cd to /var/lib/mysql",
"\txtrabackup: open files limit requested 100000, set to 16364",
"\txtrabackup: using the following InnoDB configuration:",
"\txtrabackup: innodb_data_home_dir = .",
"\txtrabackup: innodb_data_file_path = ibdata1:12M:autoextend",
"\txtrabackup: innodb_log_group_home_dir = ./",
"\txtrabackup: innodb_log_files_in_group = 2",
"\txtrabackup: innodb_log_file_size = 134217728",
"\txtrabackup: using O_DIRECT_NO_FSYNC",
"\tNumber of pools: 1",
"\txtrabackup: inititialize_service_handles suceeded",
"\t220220 23:57:44 Connecting to MySQL server host: localhost, user: mysql.pxc.sst.user, password: set, port: not set, socket: /var/run/mysqld/mysqld.sock",
"\txtrabackup: Redo Log Archiving is not set up.",
"\tWARNING: unknown option --binlog-info=ON",
"\t220220 23:57:44 >> log scanned up to (18245831)",
"\txtrabackup: Generating a list of tablespaces",
"\txtrabackup: Generating a list of tablespaces",
"\tScanning './'",
"\tCompleted space ID check of 2 files.",
"\tAllocated tablespace ID 1 for sys/sys_config, old maximum was 0",
"\tUsing undo tablespace './undo_001'.",
"\tUsing undo tablespace './undo_002'.",
"\tOpened 2 existing undo tablespaces.",
"\txtrabackup: Starting 4 threads for parallel data files transfer",
"\t220220 23:57:45 [01] Streaming ./ibdata1",
"\t[01] xtrabackup: Database page corruption detected at page 41, retrying...",
"\t220220 23:57:45 [02] Streaming ./sys/sys_config.ibd",
"\t220220 23:57:45 [03] Streaming ./mysql/wsrep_cluster.ibd",
"\t220220 23:57:45 [04] Streaming ./mysql/wsrep_cluster_members.ibd",
"\t220220 23:57:45 [04] ...done",
"\t220220 23:57:45 [03] ...done",
"\t220220 23:57:45 [02] ...done",
"\t220220 23:57:45 [04] Streaming ./mysql.ibd",
"\t220220 23:57:45 [02] Streaming ./mysql/wsrep_streaming_log.ibd",
"\t220220 23:57:45 [03] Streaming ./undo_002",
"\t220220 23:57:45 [02] ...done",
"\t220220 23:57:45 [02] Streaming ./undo_001",
"\t[04] xtrabackup: Database page corruption detected at page 992, retrying...",
"\t[01] xtrabackup: Database page corruption detected at page 41, retrying...",
"\t220220 23:57:45 [03] ...done",
"\t[04] xtrabackup: Database page corruption detected at page 992, retrying...",
"\t220220 23:57:45 [02] ...done",
"\t[01] xtrabackup: Database page corruption detected at page 41, retrying...",
"\t[04] xtrabackup: Database page corruption detected at page 992, retrying...",
"\t[01] xtrabackup: Database page corruption detected at page 41, retrying...",
"\t[04] xtrabackup: Database page corruption detected at page 992, retrying...",
"\t[01] xtrabackup: Database page corruption detected at page 41, retrying...",
"\t[04] xtrabackup: Database page corruption detected at page 992, retrying...",
"\t[01] xtrabackup: Database page corruption detected at page 41, retrying...",
"\t220220 23:57:45 >> log scanned up to (18245831)",
"\t[04] xtrabackup: Database page corruption detected at page 992, retrying...",
"\t[01] xtrabackup: Database page corruption detected at page 41, retrying...",
"\t[04] xtrabackup: Database page corruption detected at page 992, retrying...",
"\t[01] xtrabackup: Database page corruption detected at page 41, retrying...",
"\t[04] xtrabackup: Database page corruption detected at page 992, retrying...",
"\t[01] xtrabackup: Database page corruption detected at page 41, retrying...",
"\t[04] xtrabackup: Database page corruption detected at page 992, retrying...",
"\t[01] xtrabackup: Error: failed to read page after 10 retries. File ./ibdata1 seems to be corrupted.",
"\t[01] xtrabackup: Error: xtrabackup_copy_datafile() failed.",
"\t[01] xtrabackup: Error: failed to copy datafile.",
"\t[04] xtrabackup: Error: failed to read page after 10 retries. File ./mysql.ibd seems to be corrupted.",
"\t[04] xtrabackup: Error: xtrabackup_copy_datafile() failed.",
"\t[04] xtrabackup: Error: failed to copy datafile.",
"\t220220 23:57:46 >> log scanned up to (18245831)",
"",
Error: "2022-02-20T23:57:47.191101-00:00 0 [ERROR] [MY-000000] [WSREP-SST] ------------ innobackup.backup.log (END) ------------",
Error: "2022-02-20T23:57:47.191108-00:00 0 [ERROR] [MY-000000] [WSREP-SST] ****************************************************** ",
Error: "2022-02-20T23:57:47.191115-00:00 0 [ERROR] [MY-000000] [WSREP-SST] Cleanup after exit with status:22",
Error: "2022-02-20T23:57:47.200042-00:00 0 [ERROR] [MY-000000] [WSREP] Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.29.95.33:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib/mysql/plugin/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --mysqld-version '8.0.26-16.1' '' --gtid 'cee08bf0-92a8-11ec-83c0-63b922f830ea:4' : 22 (Invalid argument)",
"2022-02-20T23:57:47.202144-00:00 0 [Note] [MY-000000] [Galera] SST sending failed: -22",
"2022-02-20T23:57:47.202178-00:00 0 [Note] [MY-000000] [WSREP] Server status change donor -> joined",
"2022-02-20T23:57:47.202198-00:00 0 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.",
Error: "2022-02-20T23:57:47.202266-00:00 0 [ERROR] [MY-000000] [WSREP] Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.29.95.33:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib/mysql/plugin/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --mysqld-version '8.0.26-16.1' '' --gtid 'cee08bf0-92a8-11ec-83c0-63b922f830ea:4' ",
Warning: 2022-02-20T23:57:47.202879-00:00 0 [Warning] [MY-000000] [Galera] 0.0 (bionic-ansible210-a): State transfer to 1.0 (bionic-ansible210-c) failed: -22 (Invalid argument)",
Here is a joiner log:
"2022-02-20T23:57:33.482509-00:00 0 [Note] [MY-000000] [WSREP] Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address '172.29.95.33' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib/mysql/plugin/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '6829' --mysqld-version '8.0.26-16.1' '' )",
"2022-02-20T23:57:34.138097-00:00 2 [Note] [MY-000000] [WSREP] Prepared SST request: xtrabackup-v2|172.29.95.33:4444/xtrabackup_sst//1",
"2022-02-20T23:57:34.138394-00:00 2 [Note] [MY-000000] [Galera] Check if state gap can be serviced using IST",
"2022-02-20T23:57:34.138607-00:00 2 [Note] [MY-000000] [Galera] Local UUID: 00000000-0000-0000-0000-000000000000 != Group UUID: cee08bf0-92a8-11ec-83c0-63b922f830ea",
"2022-02-20T23:57:34.138836-00:00 2 [Note] [MY-000000] [Galera] ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 4, STRv: 3",
"2022-02-20T23:57:34.139136-00:00 2 [Note] [MY-000000] [Galera] IST receiver addr using tcp://172.29.95.33:4568",
"2022-02-20T23:57:34.139499-00:00 2 [Note] [MY-000000] [Galera] Prepared IST receiver for 0-4, listening at: tcp://172.29.95.33:4568",
"2022-02-20T23:57:34.140186-00:00 0 [Note] [MY-000000] [Galera] Member 1.0 (bionic-ansible210-c) requested state transfer from '*any*'. Selected 0.0 (bionic-ansible210-a)(SYNCED) as donor.",
"2022-02-20T23:57:34.141174-00:00 0 [Note] [MY-000000] [Galera] Shifting PRIMARY -> JOINER (TO: 4)",
"2022-02-20T23:57:34.141464-00:00 2 [Note] [MY-000000] [Galera] Requesting state transfer: success, donor: 0",
"2022-02-20T23:57:34.141705-00:00 2 [Note] [MY-000000] [Galera] Resetting GCache seqno map due to different histories.",
"2022-02-20T23:57:34.141956-00:00 2 [Note] [MY-000000] [Galera] GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> cee08bf0-92a8-11ec-83c0-63b922f830ea:4",
"2022-02-20T23:57:34.966609-00:00 0 [Note] [MY-000000] [WSREP-SST] Proceeding with SST.........",
"2022-02-20T23:57:34.989606-00:00 0 [Note] [MY-000000] [WSREP-SST] ............Waiting for SST streaming to complete!",
"2022-02-20T23:57:35.974475-00:00 0 [Note] [MY-000000] [Galera] (de612b05-91ed, 'tcp://0.0.0.0:4567') turning message relay requesting off",
Error: "2022-02-20T23:57:47.185896-00:00 0 [ERROR] [MY-000000] [WSREP-SST] ******************* FATAL ERROR ********************** ",
Error: "2022-02-20T23:57:47.186470-00:00 0 [ERROR] [MY-000000] [WSREP-SST] xtrabackup_checkpoints missing. xtrabackup/SST failed on DONOR. Check DONOR log",
Error: "2022-02-20T23:57:47.186486-00:00 0 [ERROR] [MY-000000] [WSREP-SST] Line 2264",
Error: "2022-02-20T23:57:47.186499-00:00 0 [ERROR] [MY-000000] [WSREP-SST] ****************************************************** ",
Error: "2022-02-20T23:57:47.186512-00:00 0 [ERROR] [MY-000000] [WSREP-SST] Cleanup after exit with status:2",
Warning: 2022-02-20T23:57:47.203111-00:00 0 [Warning] [MY-000000] [Galera] 0.0 (bionic-ansible210-a): State transfer to 1.0 (bionic-ansible210-c) failed: -22 (Invalid argument)",
Error: "2022-02-20T23:57:47.203151-00:00 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():1214: Will never receive state. Need to abort.",
We did not run innochecksum
at all, so it will not be the cause.