mysql replication picks wrong Relay_Master_Log_File

Hello,

Today i faced strange error on slave. I restarted mysql slave and suddenly lot of duplicate entries errors arises on differenet tables while it shouldn’t be. I found the cause of that but not sure why it happened.

STEP1: (ON MASTER)
shell] ls -la master* (master server log files in mysql directory)

master-bin.000021
master-bin.000022
master-bin.000023


master-bin.000033

STEP 2: (ON SLAVE)

mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: xxx.xxx.x.xx
Master_User: xxxx
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000033
Read_Master_Log_Pos: 215255756
Relay_Log_File: slave-relay-bin.000016
Relay_Log_Pos: 215255907
Relay_Master_Log_File: master-bin.000033
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: database1
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 215255756
Seconds_Behind_Master: 0
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.01 sec)

At this point Master_Log_File & Relay_Master_Log_File are at on same position.

STEP 3: (ON SLAVE)

I restarted mysql and checked replication status. and found that wrong Relay_Master_Log_File picked i.e. master-bin.000021 which is already replicated. Before mysql restart Relay_Master_Log_File was on
master-bin.000033 means SQL thread executed queries till than.

shell] /etc/init.d/mysql restart

mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: xxx.xxx.x.xx
Master_User: xxxx
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000033
Read_Master_Log_Pos: 215255756
Relay_Log_File: slave-relay-bin.000016
Relay_Log_Pos: 215255907
Relay_Master_Log_File: master-bin.000021
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB: database1
Last_Errno: 1062
Last_Error: Error ‘Duplicate entry for key ‘PRIMARY’’ on query.
Skip_Counter: 0
Exec_Master_Log_Pos: 215255756
Seconds_Behind_Master: 0
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.01 sec)

Also there are many relay logs created on slave. I know duplicate entries errors came because of wrong (older) Relay_Master_Log_File. Can someone please point why mysql restart picked wrong (older) Relay_Master_Log_File file ?

Thanks in advance.

Check the log for clues. Just before it says it started replication, does it say anything else?

Hello,

Baron, thanks for your reply.
I already checked the error log and didn’t find any errors,warnings.
It really driving me crazy that how mysql replication thread picked wrong Relay_Master_Log_File value. Master/Slaves version are same. After this i tried one another slave by restarting falls under same master server but other slave restarted properly without any issues.
I am using MySQL 5.1.37

Thank you.