We are using master slave replication and we have 1 master instance and 2 slave instances.
The application is currently configured to use master only for writing operations and configured VIP for slaves for reading operation.
Last week we face one problem that our both the slaves are unresponsive due to migration on master server. We have change the data type of some column from VARCHAR(255) to TEXT using ALTER command.
We have executed below SQL on master:
SET FOREIGN_KEY_CHECKS=0; ALTER TABLE table1 CHANGE device1 device11 VARCHAR(255) DEFAULT ‘‘NULL’’; ALTER TABLE table2 CHANGE device2 device22 VARCHAR(255) DEFAULT ‘‘NULL’’; SET FOREIGN_KEY_CHECKS=1; SET FOREIGN_KEY_CHECKS=0; ALTER TABLE links CHANGE redirect redirect TEXT NOT NULL COLLATE utf8_unicode_ci, CHANGE redirect_query redirect_query TEXT DEFAULT NULL COLLATE utf8_unicode_ci; SET FOREIGN_KEY_CHECKS=1; After executing the above SQL , slave server becomes unresponsive(stop).
I don’t know why this happened.
Due to running ALTER on the table, the statistics on the slave became corrupted which causes queries to perform poorly and cause issue on the server. this leads to timeout and which leads to inability to access the server.
Then we tried to reset the statistics on slave server by running "ANALYZE TABLE " and it fixed the issue but it takes some time.
Our system is 24/7 so please suggest me what wrong we did?
I am wondering how this problem occurred and how to avoid such problem in future.
What was the root cause of this problem