Replication to Percona XtraDB Cluster node stalls

Hi All,

I am trying to migrate my existing Master-Slave setup to a Percona XtraDB Cluster setup.

I have created a 3-node Percona XtraDB Cluster from the existing environment and configured one of these nodes as a partial slave (specific schema using replicate-ignore-db) of the existing Master database (from the master-slave setup).
This seems similar to this setup.

However, the replication (from Master to Cluster node) just stalls, and I see the warning below printed continuously in the error log. Seconds_Behind_Source keeps increasing every second, while Replica_IO_Running and Replica_SQL_Running both still show Yes.

[Warning] [MY-000000] [WSREP] Pending to replicate MySQL GTID event (probably a stale event). Discarding it now.

I see the Replica Status as below.

mysql@pxc3 > show replica status\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for source to send event
                  Source_Host: 172.31.xxx.xxx
                  Source_User: repl_username
                  Source_Port: 3306
                Connect_Retry: 60
              Source_Log_File: mysql-bin-master.001041
          Read_Source_Log_Pos: 248200155
               Relay_Log_File: mysql-pxcdb1-relay-bin.000049
                Relay_Log_Pos: 35938933
        Relay_Source_Log_File: mysql-bin-master.001041
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
              Replicate_Do_DB: db1,db2,db3
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Source_Log_Pos: 170825266
              Relay_Log_Space: 147704027
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Source_SSL_Allowed: No
           Source_SSL_CA_File:
           Source_SSL_CA_Path:
              Source_SSL_Cert:
            Source_SSL_Cipher:
               Source_SSL_Key:
        Seconds_Behind_Source: 85653
Source_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Source_Server_Id: 17231xxxxxx
                  Source_UUID: b6e2bbb9-5363-11ec-a1ee-xxxxxxxxxxxx
             Source_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
    Replica_SQL_Running_State: Waiting for dependent transaction to commit
           Source_Retry_Count: 86400
                  Source_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Source_SSL_Crl:
           Source_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Source_TLS_Version:
       Source_public_key_path:
        Get_Source_public_key: 0
            Network_Namespace:
1 row in set (0.00 sec)

The purpose of doing this is to minimize the downtime during the actual switchover.

Does this setup work? Are there any prerequisites that we need to take care of?
All the versions are 8.0.

Thanks in advance.

Hello Abhijit,
Welcome to the Percona Community.

Could you please check the link below? It may contain a solution for you.
Related PXC Bug: [URL][PXC-2591] Percona cluster with async gtid replication - Percona JIRA

Regards,
Denis Subbota.
Managed Services, Percona.

Hello @Abhijit_Buchake
The event you describe seems to be related to multi-threaded replication (MTR), which means you have more than one SQL (applier) thread on the replica node. You can verify this by checking the current value of slave_parallel_workers. Having these messages is not a problem in itself; it is a sign that some transactions depend on other transactions. Can you share the output of the following from the replica node?

show global variables like '%parallel%';
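
If it helps, the state of each applier worker can also be inspected directly. This is only a sketch, assuming performance_schema is enabled on the replica node (the default in 8.0); the column selection is just one reasonable choice:

```sql
-- One row per applier worker thread on the replica.
-- SERVICE_STATE shows whether the worker is running, and
-- APPLYING_TRANSACTION shows what it is currently working on.
SELECT WORKER_ID,
       SERVICE_STATE,
       APPLYING_TRANSACTION,
       LAST_ERROR_NUMBER,
       LAST_ERROR_MESSAGE
FROM performance_schema.replication_applier_status_by_worker;
```

If every worker stays idle while Seconds_Behind_Source keeps growing, that would suggest the applier is blocked rather than just slow.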

Can you please also share the current values of the variables innodb_flush_log_at_trx_commit and sync_binlog on your replica node? Relaxing those variables can improve apply performance and help reduce replication lag faster, but be advised that with relaxed values the node is no longer fully ACID compliant: the most recent transactions can be lost if the server or host crashes.
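
For reference, a minimal sketch of how these two variables could be checked and temporarily relaxed on the replica node. The values 2 and 0 are only an assumption about what "relaxed" might mean here, not a recommendation for your workload:

```sql
-- Check the current durability settings on the replica node.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('innodb_flush_log_at_trx_commit', 'sync_binlog');

-- Assumed relaxed values, intended only while the replica catches up.
-- 2: write the redo log at each commit, but flush it to disk about once per second.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
-- 0: leave binary log syncing to the operating system.
SET GLOBAL sync_binlog = 0;
```

Both variables are dynamic, so no restart is needed, and SET GLOBAL changes do not survive a restart. Once the lag is gone, set them back to the values reported by the first statement.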