so i dumped mysql-prod-replica-02-relay-bin.007426 — and pasted an excerpt around “826986907”
i am not sure where to go from here to fix this…
thanks for any ideas people can help with
# at 826986732
#221124 18:06:25 server id 1 end_log_pos 826986667 CRC32 0x4c28e931 Update_rows: table id 86 flags: STMT_END_F
BINLOG '
obJ/YxMBAAAASgAAABvQSjEAAFYAAAAAAAEABXNsaWNlAA1lbWFpbF9kb21haW5zAAUIDw8PEgf6
APoA+gAAHgEBgAIBCB0CNsc=
obJ/Yx8BAAAAkAAAAKvQSjEAAFYAAAAAAAEAAgAF//8AnGgXUgAAAAAIem9oby5jb20Pc210cGlu
LnpvaG8uY29tDjEzNi4xNDMuMTkxLjIzma5xISMAnGgXUgAAAAAIem9oby5jb20Pc210cGluLnpv
aG8uY29tDjEzNi4xNDMuMTkxLjIzma5xIZkx6ShM
'/*!*/;
# at 826986876
#221124 18:06:25 server id 1 end_log_pos 826986698 CRC32 0x66eb27ad Xid = 31457734691
COMMIT/*!*/;
# at 826986907
#221124 18:05:51 server id 1 end_log_pos 826986783 CRC32 0x16ea18b7 GTID last_committed=136821 sequence_number=136822 rbr_only=yes original_committed_timestamp=16693131852318
13 immediate_commit_timestamp=1669313185231813 transaction_length=1518829485
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
# original_commit_timestamp=1669313185231813 (2022-11-24 18:06:25.231813 GMT)
# immediate_commit_timestamp=1669313185231813 (2022-11-24 18:06:25.231813 GMT)
/*!80001 SET @@session.original_commit_timestamp=1669313185231813*//*!*/;
/*!80014 SET @@session.original_server_version=80026*//*!*/;
/*!80014 SET @@session.immediate_server_version=80026*//*!*/;
SET @@SESSION.GTID_NEXT= '42053cc7-d6d9-11ec-a377-0200170e2487:5152282533'/*!*/;
# at 826986992
#221124 18:05:51 server id 1 end_log_pos 826986863 CRC32 0x0567319e Query thread_id=1713651 exec_time=0 error_code=0
SET TIMESTAMP=1669313151/*!*/;
BEGIN
/*!*/;
# at 826987072
#221124 18:05:51 server id 1 end_log_pos 826987124 CRC32 0x716e1e1d Table_map: `scrles`.`trade_algo_ta_append_finalcsv_ec0ya` mapped to number 44173693
# at 826987333
#221124 18:05:51 server id 1 end_log_pos 826994442 CRC32 0x504a2c01 Update_rows: table id 44173693
# at 826994651
#221124 18:05:51 server id 1 end_log_pos 827002162 CRC32 0x39ca7d06 Update_rows: table id 44173693
# at 827002371
#221124 18:05:51 server id 1 end_log_pos 827009634 CRC32 0xf413dc95 Update_rows: table id 44173693
Are your relay logs still growing, or has growth also halted? If growth of relay logs has also halted, then there’s something on the source side that isn’t sending to the replica. This seems quite likely since both of your replicas have halted.
/!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED//!/;
This line concerns me. I’m not sure why this would be present in the replication stream. Are you using GTID AUTO_POSITION=1? If so, I would skip to the next position and see if replication resumes.
i was trying to execute the above – but the ititial command seemed to hang… so i ctrl-c’d out of it – figuring maybe i should stop the replica first… so i tried a stop replica…
Woah woah, hold on. What is that query that’s been running for 353,026 seconds?!!! That’s 4 days! You have some MASSIVE update query which is running and that is the reason replication is not progressing. That’s also the reason why your STOP SLAVE appeared hung. Use the master’s binlog and exec_master_pos to find that query in the master and see what that is. Whatever that query is that is updating so many rows is locking up your replicas.
/!80001 SET @@session.original_commit_timestamp=1669313185231813//!/;
/!80014 SET @@session.original_server_version=80026//!/;
/!80014 SET @@session.immediate_server_version=80026//!/;
SET @@SESSION.GTID_NEXT= ‘42053cc7-d6d9-11ec-a377-0200170e2487:5152282533’/!/;
at 826986783
#221124 18:05:51 server id 1 end_log_pos 826986863 CRC32 0x0567319e Query thread_id=1713651 exec_time=0 error_code=0
SET TIMESTAMP=1669313151/!/;
BEGIN
/!/;
at 826986863
#221124 18:05:51 server id 1 end_log_pos 826987124 CRC32 0x716e1e1d Table_map: scrub_tables.trade_algo_ta_append_finalcsv_ec0ya mapped to number 44173693
at 826987124
#221124 18:05:51 server id 1 end_log_pos 826994442 CRC32 0x504a2c01 Update_rows: table id 44173693