I have a primary P with sync_binlog=0 and two replicas, R1 and R2, with sync_binlog=1. The topology is P->R1->R2, and I use Orchestrator to monitor these servers. When I try to relocate R2 under P to get the topology P->(R2,R1), I receive an error: Orchestrator finds different events in the binary logs of P and R2. I'm using automatic Pseudo-GTID, and I presume that with sync_binlog=0 the primary is not syncing everything to its binary logs, and that this is the reason for the relocation error. Has anybody had the same issue before?
Hello @Dario_Rigolin,
Please share the Orchestrator logs and the errors you get during the failover process. Checking the logs will help us understand the issue. Thanks.
Hi, I have found the issue. Pseudo-GTID requires sync_binlog=1; after changing this, the slave migration went well. According to the Orchestrator documentation, sync_binlog=1 and slave_preserve_commit_order need to be set to guarantee the same order of events on all slaves, and I presume also on the master. We also have some MyISAM tables in some databases, and we are using the MIXED binlog format. Sometimes MyISAM operations take more time, which increases lag on the slaves. Moving the bigger tables from MyISAM to InnoDB helped.
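For anyone who wants to check their own servers, here is a minimal sketch of the two settings mentioned above. It assumes you have already copied the output of SHOW GLOBAL VARIABLES into a plain dict; the function name `check_pseudo_gtid_settings` is made up for this example and is not part of Orchestrator. It also assumes that commit order only matters when slave_parallel_workers is greater than 0.

```python
def check_pseudo_gtid_settings(variables):
    """Return warnings for settings that may break Pseudo-GTID relocation.

    `variables` is a dict of MySQL global variable names to values,
    for example built by hand from SHOW GLOBAL VARIABLES on one server.
    This is a hypothetical helper, not an Orchestrator API.
    """
    warnings = []

    def is_on(name):
        # MySQL reports booleans as ON/OFF; some tools show them as 1/0.
        return str(variables.get(name, "OFF")).upper() in ("ON", "1")

    # The thread found that Pseudo-GTID relocation needed sync_binlog=1.
    if str(variables.get("sync_binlog", "0")) != "1":
        warnings.append("sync_binlog should be 1")

    # Assumption: commit order is only a concern with parallel appliers.
    workers = int(variables.get("slave_parallel_workers", 0))
    if workers > 0 and not is_on("slave_preserve_commit_order"):
        warnings.append(
            "slave_preserve_commit_order should be ON with parallel workers"
        )
    return warnings


# Example with the primary from this thread (sync_binlog=0).
for warning in check_pseudo_gtid_settings({"sync_binlog": "0"}):
    print("P:", warning)
```

On newer MySQL versions the replica variables have been renamed (replica_parallel_workers, replica_preserve_commit_order), so the keys would need adjusting there.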
From my understanding of the replication process, the master can work in parallel on many different databases at the same time and commit many transactions, whereas replicas have to execute events in order and cannot work in parallel if we want to keep the same binlog events on all replicas.
If we relax sync_binlog and slave_preserve_commit_order, replicas can work a lot faster, but Orchestrator will not be able to relocate replicas.
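To illustrate why event order matters for relocation, here is a toy model. It is not Orchestrator's actual code: binary logs are reduced to Python lists of strings, "pgtid:1" stands in for a Pseudo-GTID marker, and the function names are invented for this example. The idea is that once the same marker is found in both logs, the events the replica recorded after it must appear in the same order in the target's log.

```python
def events_after_marker(binlog, marker):
    """Return the events that follow `marker` in a binlog (a list of strings)."""
    idx = binlog.index(marker)
    return binlog[idx + 1:]


def can_relocate(replica_binlog, target_binlog, marker):
    """Toy check: the replica's events after the marker must be a prefix
    of the target's events after the same marker."""
    replica_tail = events_after_marker(replica_binlog, marker)
    target_tail = events_after_marker(target_binlog, marker)
    return target_tail[:len(replica_tail)] == replica_tail


# Hypothetical logs: the primary committed t1, t2, t3 after the marker.
primary = ["pgtid:1", "t1", "t2", "t3"]
in_order = ["pgtid:1", "t1", "t2"]    # replica kept the primary's order
reordered = ["pgtid:1", "t2", "t1"]   # replica committed out of order

print(can_relocate(in_order, primary, "pgtid:1"))
print(can_relocate(reordered, primary, "pgtid:1"))
```

In this model the reordered replica cannot be matched against the primary even though it holds the same transactions, which is roughly the kind of mismatch reported in the first post.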