Not the answer you need?
Register and ask your own question!

Percona Master-Master lag issues

blacksoulblacksoul EntrantInactive User Role Beginner
I have 2 servers with master-master replication using Percona MySQL 5.6 and I am having replication lag in the second server: db-02, however db-01 never had a lag situation.

What could be causing this lag? It rises sometimes up to >1000 secs of lag and sunddenly it fells to 0 secs. I have checked it with Percona's pt-heartbeat tool and is real lag.

This are the server configurations, both have 8 cores, 32GB RAM and SSD disks:

DB-01 (Xeon E3-1245 V2 @ 3.40GHz)
    [mysqld]
    server-id                = 1
    log_bin                   = /var/lib/mysql/mysql-bin.log
    binlog-ignore-db       = mysql
    binlog-ignore-db       = test
    datadir                = /var/lib/mysql
    
    expire-logs-days = 10
    
    auto_increment_increment= 2
    auto_increment_offset   = 1
    
    max_connections=3000
    
    query_cache_size=0
    query_cache_type=0
    
    innodb_file_per_table
    innodb_file_format=barracuda
    innodb_flush_method=O_DIRECT
    innodb_log_file_size=128M
    innodb_thread_concurrency=0
    
    innodb_flush_log_at_trx_commit=1
    sync_binlog=1
    
    # ~90% of memory
    innodb_buffer_pool_size=27000M
    
    # 4 * # of CPU cores
    innodb_read_io_threads=32
    innodb_write_io_threads=32
    
    innodb_io_capacity = 5000


DB-02 (i7-4770 CPU @ 3.40GHz)
// same that db-01 but with this options different:
    server-id                 = 2
    auto_increment_offset   = 2

The output of SHOW SLAVE STATUS on each server:

DB-01
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: db-02
                  Master_Log_File: mysql-bin.000022
              Read_Master_Log_Pos: 3348962
                   Relay_Log_File: db-01-relay-bin.000042
                    Relay_Log_Pos: 3349125
            Relay_Master_Log_File: mysql-bin.000022
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                       Last_Errno: 0
                       Last_Error:
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 3348962
                  Relay_Log_Space: 3349462
                  Until_Condition: None
                   Until_Log_File:
                    Until_Log_Pos: 0
            Seconds_Behind_Master: 0
                    Last_IO_Errno: 0
                    Last_IO_Error:
                   Last_SQL_Errno: 0
                   Last_SQL_Error:
      Replicate_Ignore_Server_Ids:
                 Master_Server_Id: 2
                 Master_Info_File: /var/lib/mysql/master.info
                        SQL_Delay: 0
              SQL_Remaining_Delay: NULL
          Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
               Master_Retry_Count: 86400
          Last_IO_Error_Timestamp:
         Last_SQL_Error_Timestamp:

DB-02
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: db-01
                  Master_Log_File: mysql-bin.000019
              Read_Master_Log_Pos: 38905060
                   Relay_Log_File: db-02-relay-bin.000048
                    Relay_Log_Pos: 17834447
            Relay_Master_Log_File: mysql-bin.000019
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                       Last_Errno: 0
                       Last_Error:
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 28196753
                  Relay_Log_Space: 28542928
                  Until_Condition: None
                   Until_Log_File:
                    Until_Log_Pos: 0
            Seconds_Behind_Master: 310
                    Last_IO_Errno: 0
                    Last_IO_Error:
                   Last_SQL_Errno: 0
                   Last_SQL_Error:
      Replicate_Ignore_Server_Ids:
                 Master_Server_Id: 1
                 Master_Info_File: /var/lib/mysql/master.info
                        SQL_Delay: 0
              SQL_Remaining_Delay: NULL
          Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
               Master_Retry_Count: 86400
          Last_IO_Error_Timestamp:
         Last_SQL_Error_Timestamp:

Comments

  • scott.nemesscott.nemes MySQL Sage Current User Role Patron
    Hi blacksoul;

    Likely the lag is being caused by long running DML statements on the master (db-01). For instance a large UPDATE runs on db-01, and then once it finishes and flows through to db-02 the slave lag jumps up once it realizes it has a lot of work to do to catch up. If this is the case, whatever is doing the long running DML probably only hits db01 for whatever reason (might be hard coded there, might always get sent there by whatever load balancing logic you have).

    To get around this, generally you would want to try and split up these long running DMLs, i.e. instead of deleting 1 million rows, delete the rows in small chunks so all of it does not flow through to the slave at once.

    -Scott
  • blacksoulblacksoul Entrant Inactive User Role Beginner
    Hi blacksoul;

    Likely the lag is being caused by long running DML statements on the master (db-01). For instance a large UPDATE runs on db-01, and then once it finishes and flows through to db-02 the slave lag jumps up once it realizes it has a lot of work to do to catch up. If this is the case, whatever is doing the long running DML probably only hits db01 for whatever reason (might be hard coded there, might always get sent there by whatever load balancing logic you have).

    To get around this, generally you would want to try and split up these long running DMLs, i.e. instead of deleting 1 million rows, delete the rows in small chunks so all of it does not flow through to the slave at once.

    -Scott

    Hi Scott, thanks for your answer. So, I could know that is caused by long running DML statements just logging slow queries on DB-01 and studying them. Right?
  • scott.nemesscott.nemes MySQL Sage Current User Role Patron
    Hi blacksoul;

    Yeah you can see long running DML in the slow query log as long as they meet the long query time requirement (10 seconds by default). You may also want to enable "log-slow-admin-statements" in-case it happens to be getting caused by one of those.

    http://dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_log-slow-admin-statements

    -Scott
Sign In or Register to comment.

MySQL, InnoDB, MariaDB and MongoDB are trademarks of their respective owners.
Copyright ©2005 - 2020 Percona LLC. All rights reserved.