Backup stopped working !! Slave_SQL_Running: No

systemali · October 8, 2013, 3:19am

Hello everyone,

My latest issues are just piling up and i am not able to get them all sorted

To add to my chaos, now my backup server too is giving me issues.

When i run the command :-

TheSlave|mysql> SHOW SLAVE STATUS \G
...
Slave_IO_Running: Yes
Slave_SQL_Running: No

see that "Slave_SQL_Running" is “NO” !!

i tried to stop and start the replication service, but in vein

My log files are throwing up this error :-


tail -f /var/log/messages

Oct 8 13:32:37 DatabackupSRV xinetd[3396]: EXIT: nrpe status=0 pid=3295 duration=0(sec)
Oct 8 13:34:19 DatabackupSRV xinetd[3396]: START: nrpe pid=3300 from=10.222.32.22
Oct 8 13:34:19 DatabackupSRV xinetd[3396]: EXIT: nrpe status=0 pid=3300 duration=0(sec)
Oct 8 13:34:29 DatabackupSRV xinetd[3396]: START: nrpe pid=3305 from=10.222.32.22
Oct 8 13:34:29 DatabackupSRV xinetd[3396]: EXIT: nrpe status=0 pid=3305 duration=0(sec)

It was working all this while, But i guess the recent start / stop of mysql services and the server of the master is causing this issue.

Any suggestions to rectify this would be of great help.

Thank you

systemali · October 8, 2013, 3:28am

Thought for a moment and decided why not post the whole output of the command : SHOW SLAVE STATUS \G

mysql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.222.1.218
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: newcrmdb1-bin.000061
Read_Master_Log_Pos: 315098143
Relay_Log_File: DatabackupSRV-relay-bin.000088
Relay_Log_Pos: 667796113
Relay_Master_Log_File: newcrmdb1-bin.000054
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1146
Last_Error: Error 'Table 'asteriskcdr.bpleadcf' doesn't exist' on query. Default database: '
newcrmdb'. Query: 'INSERT INTO `newcrmdb`.`bpleadcf` SELECT * FROM `asteriskcdr`.`bpleadcf`'
Skip_Counter: 0
Exec_Master_Log_Pos: 667795964
Relay_Log_Space: 1134162270
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1146
Last_SQL_Error: Error 'Table 'asteriskcdr.bpleadcf' doesn't exist' on query. Default database: '
newcrmdb'. Query: 'INSERT INTO `newcrmdb`.`bpleadcf` SELECT * FROM `asteriskcdr`.`bpleadcf`'
1 row in set (0.00 sec)

Thank you everyone for your assistance.

systemali · October 8, 2013, 4:05am

Ahhh…

With regards to my similar post earlier, I tried to trouble shoot the issue, but in vein…

Now when i run the command :- “show slave status \G”

mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.222.1.218
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000055
Read_Master_Log_Pos: 315098143
Relay_Log_File: DatabackupSRV-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.000055
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 315098143
Relay_Log_Space: 106
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not f
ind first log file name in binary log index file'
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)

I’ll let everyone know, what i did :-

This is the first change i did :-

mysql> change master to MASTER_LOG_FILE='mysql-bin.000055', Master_Log_Pos=4;

Later i changed to this :-

mysql> change master to MASTER_LOG_FILE='mysql-bin.000055', Master_Log_Pos=315098143;

Those are the 2 chnages i have done and now i get the status as shown above

i hope i have not messed it up too much !!

Thank you

scott.nemes · October 8, 2013, 2:39pm

Yeah you stuck it to yourself on this one.

The first issue is that you changed the replication position instead of fixing the error, and used an incorrect binlog file name format (you likely just used the one from that post you linked I’d guess). To get back to where you started, you need to find the binlog file and position that the slave sql_thread stopped at. Based on your slave status output, it looks like the slave is reading from a new binlog file (you can see that the Read_Master_Log_Pos value is smaller than the Exec_Master_Log_Pos value, which means it has to be reading a newer binlog file than where the slave sql_thread stopped at), so you need to find the binlog file that the slave sql_thread actually failed on. So look in the error log for something like the below:


2013-10-08 12:48:51 37545 [ERROR] Slave SQL: Error 'Table 'testdb.test2' doesn't exist' on query. Default database: 'testdb'. Query: 'insert into test1 select * from test2', Error_code: 1146
2013-10-08 12:48:51 37545 [Warning] Slave: Table 'testdb.test2' doesn't exist Error_code: 1146
2013-10-08 12:48:51 37545 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000001' position 3427

This is a sample I re-created, so yours will be a bit different. Note the ERROR is similar to what you see in your slave status. So find your specific error message in the error log file, and then locate the end part where is gives you the file name and position (“We stopped at log ‘mysql-bin.000001’ position 3427” in my example). The position should be 315098143 based on your show slave status, as that is when it the slave sql_thread stopped executing events (Exec_Master_Log_Pos ) but the io_thread kept reading in new ones (Read_Master_Log_Pos).

Once you find the correct binlog file name and position, re-run your change master statement on your slave using the information you located in the error log. Note that your file name should be something like “newcrmdb1-bin.XXXXXX”, not mysql-bin.XXXXXX (you can see this naming convention your show slave status above).


mysql> change master to MASTER_LOG_FILE='newcrmdb1-bin.XXXXXX', Master_Log_Pos=315098143;

Once you get pointed back to the original replication location where the slave sql_thread failed, you need to then fix the error that it was complaining about to start with.

The initial replication error appears to be telling you that the table asteriskcdr.bpleadcf does not exist on the slave, so the insert statement is failing when it attempts to select the data from that table. So the problem there is that your slave appears to be already out of sync with your master. If the table in question on the master is static or mostly static, you could likely solve this by exporting the data from just that table on the master using mysqldump and loading it into the slave. If that is not possible, or you do not care about that data, you could always just skip the replication statement with sql_slave_skip_counter, but then the slave would be further out of sync with the master.

And if all else fails, you can always rebuild the slave from the master as a last resort as well. =)

systemali · October 9, 2013, 9:39am

Hello Scott,

Phewww…My apologies for causing you such a pain :(, but none the less the above explanation was very very useful and i did learn a few things extra

But, I could not salvage the data from its last position and hence i had to redo the whole replication stuff once again, which i have just completed

All thanks to your guidance …

Keep it up !!!

Gnite

scott.nemes · October 9, 2013, 9:52am

Glad you got it going again anyway!

If you want some practice, I’d recommend downloading MySQL Sandbox and setting up a quick replication environment on a test server. You can easily test situations like this, and practice recovering from them.

I.e. in your test environment:

Drop a table on the slave (not the master)
Do a “insert into … select from” statement on the master that uses the table you dropped on the slave in the FROM part (like your actual issue above)
Use mysqldump to dump the missing table from the master and insert it on the slave
Restart replication, which should then work now that the correct table exists again on the slave

MySQL Sandbox:
https://launchpad.net/mysql-sandbox/…-3.0.42.tar.gz

Guide for installation / setup:
[URL=“MySQL::Sandbox::Recipes - A cookbook for MySQL Sandbox - metacpan.org”]http://search.cpan.org/~gmax/MySQL-S...box/Recipes.pm[/URL]

systemali · October 9, 2013, 11:23pm

Thank you so much for the head’sup Scott…

You have been a life saver !!

Topic		Replies	Views
Slave_SQL_Running: No Percona XtraBackup	4	1391	March 21, 2013
Replication troubleshooting with Slave_SQL Percona Server for MySQL 5.6	1	900	July 4, 2019
binlog positions "off" a bit after restore Percona XtraBackup	1	919	April 4, 2015
XtraBackup \| Replication \| Table 'mysql.slave_master_info' cannot be opened. Percona XtraBackup	3	6956	July 14, 2015
i [Failed] How to setup a slave for replication ... please help Percona XtraBackup	2	887	February 15, 2018

Backup stopped working !! Slave_SQL_Running: No

Related topics