Node Not Performing IST

I am working with a 3-node cluster that is already having SST issues and recently came up with a plan to run the cluster with a combination of IST and running Xtrabackup manually while we find a solution for SST (likely via version change, since we are on 5.7).

Gcache size and gcache.recover option have both been appropriately set.

After performing the Xtrabackup process and successfully streaming data from to the target node, IST does not take place. This is after performing all the necessary data unpacking and preparation steps. The grastate.dat file is also in place with the proper values to pick up from that position.

The service on the target node starts and incorrectly finds a saved state of 00000000-0000-0000-0000-000000000000:-1. Following that, it initiates SST and fails.

Not sure why the correct position is not being picked up, but all the data is ready to go from there, and it just gets deleted once the SST process begins. Might I be missing a step?

PXC version: 2.7.34-37
Donor Xtrabackup version: 2.4.26
Joiner Xtrabackup version: 2.4.24
Hashicorp Vault version: 1.11.0

Joiner error log:

2023-01-25T15:14:12.415766Z 0 [Warning] The syntax '--language/-l' is deprecated and will be removed in a future release. Please use '--lc-messages-dir' instead.
2023-01-25T15:14:12.415888Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2023-01-25T15:14:12.416773Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.34-37-57-log) starting as process 2726 ...
2023-01-25T15:14:12.416799Z 0 [Warning] Using pre 5.5 semantics to load error messages from /usr/share/mysql/english/.
2023-01-25T15:14:12.416801Z 0 [Warning] If this is not intended, refer to the documentation for valid usage of --lc-messages-dir and --language parameters.
2023-01-25T15:14:12.419777Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2023-01-25T15:14:12.419784Z 0 [Note] WSREP: Skipping automatic SSL certificate generation (enabled only in bootstrap mode)
2023-01-25T15:14:12.419789Z 0 [Note] WSREP: Setting wsrep_ready to false
2023-01-25T15:14:12.419795Z 0 [Note] WSREP: No pre-stored wsrep-start position found. Skipping position initialization.
2023-01-25T15:14:12.419798Z 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera3/libgalera_smm.so'
2023-01-25T15:14:12.435882Z 0 [Note] WSREP: wsrep_load(): Galera 3.51(rd992932) by Codership Oy <info@codership.com> loaded successfully.
2023-01-25T15:14:12.436008Z 0 [Note] WSREP: CRC-32C: using 64-bit x86 acceleration.
2023-01-25T15:14:12.438142Z 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0
2023-01-25T15:14:12.511581Z 0 [Note] WSREP: Skipped GCache ring buffer recovery: could not determine history UUID.
2023-01-25T15:14:12.513586Z 0 [Note] WSREP: Passing config to GCS: base_dir = /mydatadir/mydir; base_host = 172.31.255.53; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 10; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 4; evs.view_forget_timeout = PT24H; gcache.dir = /mydatadir/mydir; gcache.freeze_purge_at_seqno = -1; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /mydatadir/mydir/galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 50G; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 100; gcs.fc_master_slave = 1; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; ist.recv_addr = 172.31.255.52; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = true; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 9; socket.checksum = 2; socket.recv_buf_size = auto; socket.send_buf_size = auto; socket.ssl_ca = ca.pem; socket.ssl_cert = server-cert.pem; socket.ssl_cipher = ; socket.ssl_compression = YES; socket.ssl_key = server-key.pem; 
2023-01-25T15:14:12.520115Z 0 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 00000000-0000-0000-0000-000000000000:-1
2023-01-25T15:14:12.520173Z 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2023-01-25T15:14:12.520181Z 0 [Note] WSREP: Preparing to initiate SST/IST
2023-01-25T15:14:12.520184Z 0 [Note] WSREP: Starting replication
2023-01-25T15:14:12.520189Z 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2023-01-25T15:14:12.520277Z 0 [Note] WSREP: Using CRC-32C for message checksums.
2023-01-25T15:14:12.520300Z 0 [Note] WSREP: initializing ssl context
2023-01-25T15:14:12.520474Z 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 
2023-01-25T15:14:12.520922Z 0 [Warning] WSREP: Fail to access the file (/mydatadir/mydir/gvwstate.dat) error (No such file or directory). It is possible if node is booting for first time or re-booting after a graceful shutdown
2023-01-25T15:14:12.520928Z 0 [Note] WSREP: Restoring primary-component from disk failed. Either node is booting for first time or re-booting after a graceful shutdown
2023-01-25T15:14:12.521023Z 0 [Note] WSREP: GMCast version 0
2023-01-25T15:14:12.521121Z 0 [Note] WSREP: (ec963753, 'ssl://0.0.0.0:4567') listening at ssl://0.0.0.0:4567
2023-01-25T15:14:12.521125Z 0 [Note] WSREP: (ec963753, 'ssl://0.0.0.0:4567') multicast: , ttl: 1
2023-01-25T15:14:12.521339Z 0 [Note] WSREP: EVS version 0
2023-01-25T15:14:12.521381Z 0 [Note] WSREP: gcomm: connecting to group 'my_pxc_cluster', peer '172.31.255.51:,172.31.255.52:,172.31.255.53:'
2023-01-25T15:14:12.523767Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://172.31.255.53:4567 local endpoint ssl://172.31.255.53:55970 cipher: TLS_AES_256_GCM_SHA384 compression: none
2023-01-25T15:14:12.523987Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://172.31.255.53:55970 local endpoint ssl://172.31.255.53:4567 cipher: TLS_AES_256_GCM_SHA384 compression: none
2023-01-25T15:14:12.524073Z 0 [Note] WSREP: (ec963753, 'ssl://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address ssl://172.31.255.53:4567
2023-01-25T15:14:12.529906Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://172.31.255.51:4567 local endpoint ssl://172.31.255.53:56928 cipher: TLS_AES_256_GCM_SHA384 compression: none
2023-01-25T15:14:12.530680Z 0 [Note] WSREP: SSL handshake successful, remote endpoint ssl://172.31.255.52:4567 local endpoint ssl://172.31.255.53:35058 cipher: TLS_AES_256_GCM_SHA384 compression: none
2023-01-25T15:14:12.531503Z 0 [Note] WSREP: (ec963753, 'ssl://0.0.0.0:4567') connection established to d4caa4f4 ssl://172.31.255.51:4567
2023-01-25T15:14:12.531599Z 0 [Note] WSREP: (ec963753, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 
2023-01-25T15:14:12.532340Z 0 [Note] WSREP: (ec963753, 'ssl://0.0.0.0:4567') connection established to 61e09acf ssl://172.31.255.52:4567
2023-01-25T15:14:13.022601Z 0 [Note] WSREP: declaring 61e09acf at ssl://172.31.255.52:4567 stable
2023-01-25T15:14:13.022634Z 0 [Note] WSREP: declaring d4caa4f4 at ssl://172.31.255.51:4567 stable
2023-01-25T15:14:13.023036Z 0 [Note] WSREP: Node 61e09acf state primary
2023-01-25T15:14:13.023382Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(PRIM,61e09acf,81)
memb {
	61e09acf,0
	d4caa4f4,0
	ec963753,0
	}
joined {
	}
left {
	}
partitioned {
	}
)
2023-01-25T15:14:13.023400Z 0 [Note] WSREP: Save the discovered primary-component to disk
2023-01-25T15:14:13.521960Z 0 [Note] WSREP: gcomm: connected
2023-01-25T15:14:13.522067Z 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2023-01-25T15:14:13.522170Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2023-01-25T15:14:13.522194Z 0 [Note] WSREP: Waiting for SST/IST to complete.
2023-01-25T15:14:13.522222Z 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2023-01-25T15:14:13.522277Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: ece8ca65-9cc2-11ed-9874-2f19660b1540
2023-01-25T15:14:13.522291Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: ece8ca65-9cc2-11ed-9874-2f19660b1540 from 0 (node2srvr)
2023-01-25T15:14:13.522313Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: ece8ca65-9cc2-11ed-9874-2f19660b1540 from 1 (srdcmysql01)
2023-01-25T15:14:13.522826Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: ece8ca65-9cc2-11ed-9874-2f19660b1540 from 2 (node3srvr)
2023-01-25T15:14:13.522841Z 0 [Note] WSREP: Quorum results:
	version    = 6,
	component  = PRIMARY,
	conf_id    = 79,
	members    = 2/3 (primary/total),
	act_id     = 606943709,
	last_appl. = -1,
	protocols  = 0/9/3 (gcs/repl/appl),
	group UUID = 0e59dc44-61da-11ed-bb91-e7c08fe77924
2023-01-25T15:14:13.522869Z 0 [Note] WSREP: Flow-control interval: [100, 100]
2023-01-25T15:14:13.522877Z 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 606943709)
2023-01-25T15:14:13.522963Z 2 [Note] WSREP: State transfer required: 
	Group state: 0e59dc44-61da-11ed-bb91-e7c08fe77924:606943709
	Local state: 00000000-0000-0000-0000-000000000000:-1
2023-01-25T15:14:13.523004Z 2 [Note] WSREP: REPL Protocols: 9 (4, 2)
2023-01-25T15:14:13.523014Z 2 [Note] WSREP: REPL Protocols: 9 (4, 2)
2023-01-25T15:14:13.523034Z 2 [Note] WSREP: New cluster view: global state: 0e59dc44-61da-11ed-bb91-e7c08fe77924:606943709, view# 80: Primary, number of nodes: 3, my index: 2, protocol version 3
2023-01-25T15:14:13.523043Z 2 [Note] WSREP: Setting wsrep_ready to true
2023-01-25T15:14:13.523051Z 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
2023-01-25T15:14:13.523058Z 2 [Note] WSREP: Setting wsrep_ready to false
2023-01-25T15:14:13.523212Z 0 [Note] WSREP: Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address '172.31.255.53' --datadir '/mydatadir/mydir/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '2726' --mysqld-version '5.7.34-37-57'   '' )
	2023-01-25T15:14:13.765941Z WSREP_SST: [DEBUG] The xtrabackup version is 2.4.24
	2023-01-25T15:14:13.904584Z WSREP_SST: [WARNING] WARNING: xtrabackup-plugin-dir (under xtrabackup section) missing
	2023-01-25T15:14:13.906044Z WSREP_SST: [WARNING] xtrabackup installation will lookout for plugin at default location
	2023-01-25T15:14:14.105100Z WSREP_SST: [DEBUG] pxc_encrypt_cluster_traffic is enabled, using PXC auto-ssl configuration
	2023-01-25T15:14:14.107081Z WSREP_SST: [DEBUG] with encrypt=4  ssl_ca=/mydatadir/mydir//ca.pem  ssl_cert=/mydatadir/mydir//server-cert.pem  ssl_key=/mydatadir/mydir//server-key.pem
	2023-01-25T15:14:14.121800Z WSREP_SST: [INFO] Streaming with xbstream
	2023-01-25T15:14:14.123350Z WSREP_SST: [DEBUG] Using socat as streamer
	2023-01-25T15:14:14.148562Z WSREP_SST: [DEBUG] Using openssl based encryption with socat: with key, crt, and ca
	2023-01-25T15:14:14.173982Z WSREP_SST: [DEBUG] ssl_ca (absolute) : /mydatadir/mydir/ca.pem
	2023-01-25T15:14:14.176746Z WSREP_SST: [DEBUG] ssl_cert (absolute) : /mydatadir/mydir/server-cert.pem
	2023-01-25T15:14:14.180067Z WSREP_SST: [DEBUG] ssl_key (absolute) : /mydatadir/mydir/server-key.pem
	2023-01-25T15:14:14.219262Z WSREP_SST: [DEBUG] Decrypting with CERT: /mydatadir/mydir/server-cert.pem, KEY: /mydatadir/mydir/server-key.pem, CA: /mydatadir/mydir/ca.pem
	2023-01-25T15:14:14.266170Z WSREP_SST: [DEBUG] 1208: Using /proc/pid/net/tcp for socat/nc discovery
	2023-01-25T15:14:14.274080Z WSREP_SST: [DEBUG] 1239: local_address index is 2
	2023-01-25T15:14:14.281589Z WSREP_SST: [DEBUG] 1252: Entering loop body : 1
	2023-01-25T15:14:14.295153Z WSREP_SST: [DEBUG] Evaluating socat -u openssl-listen:4444,reuseaddr,cert=/mydatadir/mydir/server-cert.pem,key=/mydatadir/mydir/server-key.pem,cafile=/mydatadir/mydir/ca.pem,verify=1,retry=30 stdio | xbstream -x  $xbstream_eopts
	2023-01-25T15:14:14.505610Z WSREP_SST: [DEBUG] 1252: Entering loop body : 2
	2023-01-25T15:14:14.529100Z WSREP_SST: [DEBUG] 1261: Examining pid: 3249
	2023-01-25T15:14:14.697420Z WSREP_SST: [DEBUG] 1278: sockets: 50351|50352|50353
	2023-01-25T15:14:14.706634Z WSREP_SST: [DEBUG] 1302: found a match for pid: 3249
	2023-01-25T15:14:14.709543Z WSREP_SST: [DEBUG] 1318: wait_for_listen() exiting
2023-01-25T15:14:14.709913Z 2 [Note] WSREP: Prepared SST/IST request: xtrabackup-v2|172.31.255.53:4444/xtrabackup_sst//1
2023-01-25T15:14:14.709946Z 2 [Note] WSREP: Auto Increment Offset/Increment re-align with cluster membership change (Offset: 1 -> 3) (Increment: 1 -> 3)
2023-01-25T15:14:14.709999Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2023-01-25T15:14:14.710030Z 2 [Note] WSREP: Assign initial position for certification: 606943709, protocol version: 4
2023-01-25T15:14:14.710111Z 0 [Note] WSREP: Service thread queue flushed.
2023-01-25T15:14:14.710165Z 2 [Note] WSREP: Check if state gap can be serviced using IST
2023-01-25T15:14:14.710205Z 2 [Note] WSREP: Local UUID: 00000000-0000-0000-0000-000000000000 != Group UUID: 0e59dc44-61da-11ed-bb91-e7c08fe77924
2023-01-25T15:14:14.710264Z 2 [Note] WSREP: State gap can't be serviced using IST. Switching to SST
2023-01-25T15:14:14.710274Z 2 [Note] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (0e59dc44-61da-11ed-bb91-e7c08fe77924): 1 (Operation not permitted)
	 at galera/src/replicator_str.cpp:prepare_for_IST():516. IST will be unavailable.
2023-01-25T15:14:14.710793Z 0 [Note] WSREP: Member 2.0 (node3srvr) requested state transfer from 'node2srvr'. Selected 0.0 (node2srvr)(SYNCED) as donor.
2023-01-25T15:14:14.710810Z 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 606943734)
2023-01-25T15:14:14.710829Z 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2023-01-25T15:14:14.710849Z 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 0e59dc44-61da-11ed-bb91-e7c08fe77924:606943709
2023-01-25T15:14:16.022150Z 0 [Note] WSREP: (ec963753, 'ssl://0.0.0.0:4567') turning message relay requesting off
	2023-01-25T15:14:16.362807Z WSREP_SST: [INFO] Streaming with xbstream
	2023-01-25T15:14:16.373821Z WSREP_SST: [INFO] WARNING: Stale temporary SST directory: /mydatadir/mydir//.sst from previous state transfer. Removing
	2023-01-25T15:14:17.039283Z WSREP_SST: [INFO] Proceeding with SST.........
	2023-01-25T15:14:17.039752Z WSREP_SST: [DEBUG] Evaluating (@ Joiner-Recv-SST) socat -u openssl-listen:4444,reuseaddr,cert=/mydatadir/mydir/server-cert.pem,key=/mydatadir/mydir/server-key.pem,cafile=/mydatadir/mydir/ca.pem,verify=1,retry=30 stdio | xbstream -x  $xbstream_eopts; RC=( ${PIPESTATUS[@]} )
	2023-01-25T15:14:17.042854Z WSREP_SST: [DEBUG] Cleaning the existing datadir and innodb-data/log directories
	2023-01-25T15:14:17.618416Z WSREP_SST: [DEBUG] Cleaning the binlog directory /mydatadir/mydir_binlogs as well
	2023-01-25T15:14:17.627876Z WSREP_SST: [INFO] ............Waiting for SST streaming to complete!
2023-01-25T16:14:45.648079Z 0 [Note] WSREP: 0.0 (node2srvr): State transfer to 2.0 (node3srvr) complete.
	2023-01-25T16:14:45.682213Z WSREP_SST: [INFO] Preparing the backup at /mydatadir/mydir//.sst
	2023-01-25T16:14:45.685953Z WSREP_SST: [DEBUG] Evaluating (@ Xtrabackup prepare stage) xtrabackup   --use-memory=200G --prepare --binlog-info=ON $rebuildcmd $keyringapplyopt $encrypt_prepare_options --target-dir=${DATA} &>${DATA}/innobackup.prepare.log
	2023-01-25T16:14:45.724470Z WSREP_SST: [ERROR] Cleanup after exit with status:1
2023-01-25T16:14:50.627512Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '172.31.255.53' --datadir '/mydatadir/mydir/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '2726' --mysqld-version '5.7.34-37-57'   '' : 1 (Operation not permitted)
2023-01-25T16:14:50.627563Z 0 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2023-01-25T16:14:50.627575Z 0 [ERROR] WSREP: SST script aborted with error 1 (Operation not permitted)
2023-01-25T16:14:50.627652Z 0 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
2023-01-25T16:14:50.627686Z 0 [ERROR] Aborting

2023-01-25T16:14:50.627697Z 0 [Note] WSREP: Signalling cancellation of the SST request.
2023-01-25T16:14:50.627718Z 0 [Note] WSREP: SST request was cancelled
2023-01-25T16:14:50.627739Z 0 [Note] Giving 2 client threads a chance to die gracefully
2023-01-25T16:14:50.627812Z 2 [Note] WSREP: Closing send monitor...
2023-01-25T16:14:50.627847Z 2 [Note] WSREP: Closed send monitor.
2023-01-25T16:14:50.627868Z 2 [Note] WSREP: gcomm: terminating thread
2023-01-25T16:14:50.627889Z 2 [Note] WSREP: gcomm: joining thread
2023-01-25T16:14:50.627992Z 2 [Note] WSREP: gcomm: closing backend
2023-01-25T16:14:50.628478Z 2 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(NON_PRIM,61e09acf,81)
memb {
	ec963753,0
	}
joined {
	}
left {
	}
partitioned {
	61e09acf,0
	d4caa4f4,0
	}
)
2023-01-25T16:14:50.628532Z 2 [Note] WSREP: Current view of cluster as seen by this node
view ((empty))
2023-01-25T16:14:50.628875Z 2 [Note] WSREP: gcomm: closed
2023-01-25T16:14:50.628904Z 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2023-01-25T16:14:50.628938Z 0 [Note] WSREP: Flow-control interval: [100, 100]
2023-01-25T16:14:50.628948Z 0 [Note] WSREP: Received NON-PRIMARY.
2023-01-25T16:14:50.628956Z 0 [Note] WSREP: Shifting JOINER -> OPEN (TO: 607156625)
2023-01-25T16:14:50.628966Z 0 [Note] WSREP: Received self-leave message.
2023-01-25T16:14:50.628974Z 0 [Note] WSREP: Flow-control interval: [100, 100]
2023-01-25T16:14:50.628982Z 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2023-01-25T16:14:50.628989Z 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 607156625)
2023-01-25T16:14:50.629004Z 0 [Note] WSREP: RECV thread exiting 0: Success
2023-01-25T16:14:50.629099Z 2 [Note] WSREP: recv_thread() joined.
2023-01-25T16:14:50.629112Z 2 [Note] WSREP: Closing replication queue.
2023-01-25T16:14:50.629119Z 2 [Note] WSREP: Closing slave action queue.
2023-01-25T16:14:50.629149Z 2 [Note] WSREP: Ignorng trx(606943710) due to SST failure
2023-01-25T16:14:50.629164Z 2 [Note] WSREP: Ignorng trx(606943711) due to SST failure
2023-01-25T16:14:51.375121Z 2 [Note] WSREP: Ignorng trx(607156625) due to SST failure
2023-01-25T16:14:51.375140Z 2 [Note] WSREP: New cluster view: global state: 00000000-0000-0000-0000-000000000000:0, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version -1
2023-01-25T16:14:51.375144Z 2 [Note] WSREP: Setting wsrep_ready to false
2023-01-25T16:14:51.375147Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2023-01-25T16:14:51.375151Z 2 [Note] WSREP: applier thread exiting (code:6)
2023-01-25T16:14:52.627876Z 0 [Note] WSREP: Waiting for active wsrep applier to exit
2023-01-25T16:14:52.627924Z 1 [Note] WSREP: rollbacker thread exiting
2023-01-25T16:14:52.628009Z 0 [Note] WSREP: Service disconnected.
2023-01-25T16:14:52.628023Z 0 [Note] WSREP: Waiting to close threads......
2023-01-25T16:14:57.628152Z 0 [Note] WSREP: Some threads may fail to exit.
2023-01-25T16:14:57.639508Z 0 [Note] WSREP: Service thread queue flushed.
2023-01-25T16:14:57.639625Z 0 [Note] WSREP: MemPool(SlaveTrxHandle): hit ratio: 0.999995, misses: 1, in use: 0, in pool: 1
2023-01-25T16:14:57.647259Z 0 [Note] Binlog end
2023-01-25T16:14:57.647382Z 0 [Note] /usr/sbin/mysqld: Shutdown complete

Donor backup log

xtrabackup: recognized server arguments: --datadir=/mydatadir/mydir --tmpdir=/rmds --server-id=11 --open_files_limit=65536 --innodb_buffer_pool_size=200G --innodb_log_file_size=1G --innodb_log_buffer_size=4M --innodb_flush_method=O_DIRECT --innodb_flush_log_at_trx_commit=1 --innodb_file_per_table=1 --log_bin=/mydatadir/mydir_binlogs/mysql-bin.log --open_files_limit=10000 --defaults_group=mysqld --parallel=4 --transition-key=* 
xtrabackup: recognized client arguments: --user=sstuser --password=* --socket=/var/run/mysqld/mysqld.sock --lock-ddl=1 --backup=1 --galera-info=1 --binlog-info=ON --stream=xbstream --target-dir=/rmds/pxc_sst_VfVS/donor_xb_BT4Y 
230125 09:14:25  version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;mysql_socket=/var/run/mysqld/mysqld.sock' as 'sstuser'  (using password: YES).
230125 09:14:25  version_check Connected to MySQL server
230125 09:14:25  version_check Executing a version check against the server...
230125 09:14:25  version_check Done.
230125 09:14:25 Connecting to MySQL server host: localhost, user: sstuser, password: set, port: not set, socket: /var/run/mysqld/mysqld.sock
Using server version 5.7.34-37-57-log
230125 09:14:25 Executing LOCK TABLES FOR BACKUP...
xtrabackup version 2.4.26 based on MySQL server 5.7.35 Linux (x86_64) (revision id: 19de43b)
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /mydatadir/mydir
xtrabackup: open files limit requested 10000, set to 65536
xtrabackup: using the following InnoDB configuration:
xtrabackup:   innodb_data_home_dir = .
xtrabackup:   innodb_data_file_path = ibdata1:12M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 1073741824
xtrabackup: using O_DIRECT
InnoDB: Number of pools: 1
230125 09:14:25 Added plugin 'keyring_vault.so' to load list.
Plugin keyring_vault reported: 'Probing my-mount-point for being a mount point unsuccessful - skipped.'
230125 09:14:30 >> log scanned up to (36420223097455)
xtrabackup: Generating a list of tablespaces
InnoDB: Allocated tablespace ID 140692 for SCn/Conxn, old maximum was 0
230125 09:14:31 >> log scanned up to (36420223106018)
xtrabackup: Starting 4 threads for parallel data files transfer
230125 09:14:32 [01] Streaming ./ibdata1
230125 10:14:44 [01] Streaming ./aaa/bbb.frm to <STDOUT>
230125 10:14:44 [01]        ...done
230125 10:14:44 Finished backing up non-InnoDB tables and files
230125 10:14:44 Executing LOCK BINLOG FOR BACKUP...
230125 10:14:44 [00] Streaming <STDOUT>
230125 10:14:44 [00]        ...done
230125 10:14:44 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
xtrabackup: The latest check point (for incremental): '36420276927857'
xtrabackup: Stopping log copying thread.
.230125 10:14:44 >> log scanned up to (36420276927866)

230125 10:14:45 Executing UNLOCK BINLOG
230125 10:14:45 Executing UNLOCK TABLES
230125 10:14:45 All tables unlocked
230125 10:14:45 [00] Streaming ib_buffer_pool to <STDOUT>
230125 10:14:45 [00]        ...done
230125 10:14:45 Backup created in directory '/rmds/pxc_sst_VfVS/donor_xb_BT4Y/'
MySQL binlog position: filename 'mysql-bin.006930', position '214829759'
230125 10:14:45 [00] Streaming <STDOUT>
230125 10:14:45 [00]        ...done
230125 10:14:45 [00] Streaming <STDOUT>
230125 10:14:45 [00]        ...done
230125 10:14:45 Saving xtrabackup_keys.
xtrabackup: Transaction log of lsn (36419911736592) to (36420276927866) was copied.
Shutting down plugin 'keyring_vault'
230125 10:14:45 completed OK!

grastate.dat

# GALERA saved state
version: 2.1
uuid:    0e23zz44-61fa-22cf-bb91-e7c08fe77924
seqno:   238604481
safe_to_bootstrap: 0

Hi, as far as I can tell grastate.dat is properly set, so you should get an IST. Did you check that all of the files are owned by mysql user?

Yes sir.

We actually ran the chown command shortly before starting the service to make sure.

Hi,

It looks like you are having some permission issues. There are some operation not permitted errors.
Is SElinux enabled? If so, could you try disabling it?

Do you have the correct permissions and ownership for the data directory?

After some more exhaustive testing, we found that the two different versions of Xtrabackup were causing the issue.

We dropped the version on the donor server to 2.4.24, and that resolved our issue.

Thank you all for your input!

1 Like