Why is MySQL 5.7 replication very slow on my new server?

I have a problem with MySQL replication and would like some advice or information about it.

Let me show you the problem.

Observed replication delays

Master
The master is currently working on file mysql-bin.024080.

show master status\G
*************************** 1. row ***************************
             File: mysql-bin.024080
         Position: 62186738

Slave
The slave has received all the binary logs (up to mysql-bin.024080), but the SQL thread is still executing mysql-bin.024075. The gap between these files increases over time; the SQL_THREAD is currently 5 binary logs behind.

show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.55.55.21
                  Master_User: slave_replicator
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.024080
          Read_Master_Log_Pos: 60111446
               Relay_Log_File: mysql-relay-bin.002952
                Relay_Log_Pos: 154669142
        Relay_Master_Log_File: mysql-bin.024075
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 154668929
              Relay_Log_Space: 2631436265
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 68158
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 195390
                  Master_UUID: 361633ab-6bb7-11e7-829c-1a76636a64a1
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Reading event from the relay log
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 

When I set sync_binlog = 0 and innodb_flush_log_at_trx_commit = 2, server performance improves greatly and the slave's replication catches up. However, I want to keep sync_binlog = 1 and innodb_flush_log_at_trx_commit = 1.
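
Both parameters are dynamic in MySQL 5.7, so the relaxed values can be tested at runtime without a restart. A minimal sketch, assuming a local client session with SUPER privileges:

mysql -e "SET GLOBAL sync_binlog = 0; SET GLOBAL innodb_flush_log_at_trx_commit = 2"
mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN ('sync_binlog','innodb_flush_log_at_trx_commit')"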

MySQL Global

MySQL Global status: pastebin.com/pnVxjtBi
MySQL Global variables: pastebin.com/f80JJb0u


I think my problem must be something like fsync performance:
https://www.percona.com/blog/2018/02/08/fsync-performance-storage-devices/

How can I make my MySQL slave catch up with the master?

The sysbench result below seems poor for a server like this. What do you think?

sysbench
sudo sysbench fileio --time=60 --file-num=1 --file-extra-flags= --file-total-size=4096 --file-block-size=4096 --file-fsync-all=on --file-test-mode=rndwr --file-fsync-freq=0 --file-fsync-end=0  --percentile=99 run
sysbench 1.0.11 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Extra file open flags: 0
1 files, 4KiB each
4KiB total file size
Block size 4KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Calling fsync() after each write operation.
Using synchronous I/O mode
Doing random write test
Initializing worker threads...

Threads started!


File operations:
    reads/s:                      0.00
    writes/s:                     117.51
    fsyncs/s:                     117.51

Throughput:
    read, MiB/s:                  0.00
    written, MiB/s:               0.46

General statistics:
    total time:                          60.0081s
    total number of events:              7052

Latency (ms):
         min:                                  0.42
         avg:                                  8.51
         max:                                 17.87
         99th percentile:                      8.74
         sum:                              59994.48

Threads fairness:
    events (avg/stddev):           7052.0000/0.00
    execution time (avg/stddev):   59.9945/0.00

Hardware details of the machines

Master:

HOST
Host OS: XenServer
CPU: Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz
RAM on Host: 128 GB
Space on Host: RAID 10 SSD, 4x744.687 GB (1.454 TB)
VM (MySQL)
VM OS: Ubuntu 16.04
RAM on VM: 120 GB
Space on VM: 1.3 TB

SLAVE

HOST (MySQL)
Host OS: Ubuntu 18.04
CPU: 2x Intel(R) Xeon(R) Silver 4214R CPU @ 2.40GHz
RAM on Host: 192 GB
Space on Host: RAID 10 SSD (4x3.492 TB), 6.984 TB

The number of CPUs in the system may limit this, but try setting the following on the master/slave:
slave_parallel_type=LOGICAL_CLOCK
slave_parallel_workers=8
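
A minimal sketch of applying this on the replica at runtime (assumes a local mysql client session with sufficient privileges; the SQL thread has to be stopped before these variables can be changed):

mysql -e "STOP SLAVE SQL_THREAD"
mysql -e "SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK'"
mysql -e "SET GLOBAL slave_parallel_workers = 8"
mysql -e "START SLAVE SQL_THREAD"
# also add both settings under [mysqld] in my.cnf so they survive a restart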


The change does not resolve the issue, because the additional workers are often in the state "Waiting for an event from Coordinator":

+-------+-------------+--------------------+------+---------+-------+-----------------------------------------------+------------------+
| Id    | User        | Host               | db   | Command | Time  | State                                         | Info             |
+-------+-------------+--------------------+------+---------+-------+-----------------------------------------------+------------------+
|     7 | slave       | 192.168.1.1:38402  | NULL | Sleep   |     5 |                                               | NULL             |
|    12 | slave       | 192.168.1.2:57048  | NULL | Sleep   |     1 |                                               | NULL             |
| 11551 | slave       | 192.168.1.1:34200  | NULL | Sleep   |  1514 |                                               | NULL             |
| 11640 | slave       | 192.168.1.1:34388  | NULL | Query   |     0 | starting                                      | show processlist |
| 11821 | system user |                    | NULL | Connect |   226 | Waiting for master to send event              | NULL             |
| 11822 | system user |                    | NULL | Connect |     0 | Waiting for Slave Worker to release partition | NULL             |
| 11823 | system user |                    | NULL | Connect | 56491 | Waiting for an event from Coordinator         | NULL             |
| 11824 | system user |                    | NULL | Connect |   226 | Waiting for an event from Coordinator         | NULL             |
| 11825 | system user |                    | NULL | Connect | 56494 | Waiting for an event from Coordinator         | NULL             |
| 11826 | system user |                    | NULL | Connect | 56491 | Waiting for an event from Coordinator         | NULL             |
| 11827 | system user |                    | NULL | Connect | 56491 | Executing event                               | NULL             |
| 11828 | system user |                    | NULL | Connect | 56491 | Executing event                               | NULL             |
| 11829 | system user |                    | NULL | Connect | 56491 | Executing event                               | NULL             |
| 11830 | system user |                    | NULL | Connect | 56491 | Waiting for an event from Coordinator         | NULL             |
| 11866 | slave       | 192.168.1.1:34820  | NULL | Sleep   |     5 |                                               | NULL             |
| 11869 | slave       | 192.168.1.1:34826  | NULL | Sleep   |     5 |                                               | NULL             |
| 11870 | slave       | 192.168.1.1:34828  | NULL | Sleep   |     2 |                                               | NULL             |
| 11872 | slave       | 192.168.1.1:34832  | NULL | Sleep   |     2 |                                               | NULL             |
+-------+-------------+--------------------+------+---------+-------+-----------------------------------------------+------------------+

Disable the doublewrite buffer along with the parallel workers; that will improve performance.
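
If you do try this, note that innodb_doublewrite is not dynamic in MySQL 5.7, so it has to be set in the configuration and requires a restart. A sketch, where the drop-in config path and service name are assumptions for a typical Ubuntu install:

# disable the doublewrite buffer via a drop-in config file, then restart mysqld
printf '[mysqld]\ninnodb_doublewrite = 0\n' | sudo tee /etc/mysql/mysql.conf.d/zz-doublewrite.cnf
sudo systemctl restart mysql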


Hi,

Do consider that disabling the doublewrite buffer, along with relaxing sync_binlog and/or innodb_flush_log_at_trx_commit, will improve performance at the cost of relaxed durability, which in the event of a crash can cause data loss.

Do you have the same configuration in both environments, i.e. the same io_capacity, innodb-log-file-size, etc.? You can compare the configs with pt-config-diff (see the Percona Toolkit documentation).
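
A sketch of how the comparison can be run, either on two saved config files or against a live server via a DSN (file names and credentials below are placeholders):

pt-config-diff ./mysql-master.cnf ./mysql-slave.cnf
pt-config-diff /etc/mysql/my.cnf h=localhost --user=root --ask-pass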

Can you also check the disk schedulers and the filesystem? For example, "cfq" has lower performance compared to other schedulers (see "Linux schedulers in tpcc like benchmark" on the Percona Database Performance Blog).
Filesystem selection and mount flags are important too; it's suggested you use either xfs or ext4. Please double-check that you have the same filesystem and flags in both environments as well.
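
Quick ways to check both, as a sketch (the device name sdb and the datadir path are assumptions; the active scheduler is shown in brackets):

cat /sys/block/sdb/queue/scheduler                             # e.g. "noop deadline [cfq]"
findmnt -n -o SOURCE,FSTYPE,OPTIONS --target /var/lib/mysql    # filesystem and mount flags for the datadir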

Please confirm whether everything is the same in both environments.

Regards


Hi,
Thanks for your help.

Here are the results of pt-config-diff:

5 config differences
Variable                  ./mysql-slave.cnf ./mysql-master.cnf
========================= ================= ==================
bind_address              192.168.1.1       192.168.1.3
innodb_buffer_pool_size   161061273600      118111600640
max_binlog_size           2147483648        524288000
max_connections           5000              500
server_id                 194643            195390

I changed the disk scheduler to noop on my replica and performance has improved greatly. Replication has even caught up.

I don’t know how noop works, but I will continue to learn about it.

Are there any risks with the noop scheduler? Do you advise me to keep this mode?


Hi again Snake,

Thanks for the heads up.
noop and deadline are the suggested disk schedulers, since cfq adds unneeded overhead by checking other processes' disk queues, which are nearly always empty on a dedicated database server.
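
For reference, a sketch of switching the scheduler at runtime (the device name sdb is an assumption); the echo is lost on reboot, so persisting it needs a udev rule or a kernel parameter:

echo noop | sudo tee /sys/block/sdb/queue/scheduler
# one way to persist on these kernels: add elevator=noop to GRUB_CMDLINE_LINUX_DEFAULT
# in /etc/default/grub, then run: sudo update-grub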

You should also check for filesystem differences. The filesystem used is suggested to be xfs or ext4.

Regards


You are facing a typical replication issue: concurrency on the replica is lower, so the impact of fsync latency is larger. Unless you are on ZFS, I wouldn't disable the doublewrite buffer; setting sync_binlog = 0 and innodb_flush_log_at_trx_commit = 0 is much safer. Why do you want to keep these at 1 on a replica?

The only way of getting more concurrency on the replica and leveraging the LOGICAL_CLOCK parallel mode is to force more group commit on the primary. When you see many threads in show processlist in the "COMMIT" state on the primary, those transactions (and only those) can be applied in parallel on the replica. Try raising, carefully, binlog_group_commit_sync_delay. The value is in microseconds and delays the commit to add a grouping opportunity. Start at 500 (0.5 ms) and raise it slowly until the replica gets faster. Raising the value will affect only new binlog files, so you'll need to wait for the replica to be executing them. Be aware that doing so adds latency to every transaction on the primary.
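
A minimal sketch of that change on the primary (500 is only the suggested starting point, not a tuned value):

# delay each binlog group commit sync by 0.5 ms (value is in microseconds)
mysql -e "SET GLOBAL binlog_group_commit_sync_delay = 500"
# confirm the new value, then watch Seconds_Behind_Master on the replica before raising it further
mysql -e "SHOW GLOBAL VARIABLES LIKE 'binlog_group_commit_sync_delay'"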


My replica uses ext4 as the filesystem.

I want to keep sync_binlog = 1 and innodb_flush_log_at_trx_commit = 1 because the replica will eventually replace my master.
Before the changes, with replication lagging, I had doubts about this server's ability to handle the traffic. With all the information you gave me, I think I have several possibilities for improvement, but changing the disk scheduler to noop seems to have solved my problem.


Sure, but those settings are dynamic; just switch them back to 1 as part of the promotion to primary. You need to enable writes on the node at that point anyway. There is no need to keep them at 1 while the node is a replica.
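
A sketch of that promotion step on the replica (assuming it also runs with read_only enabled while it is a replica):

mysql -e "SET GLOBAL sync_binlog = 1; SET GLOBAL innodb_flush_log_at_trx_commit = 1"
mysql -e "SET GLOBAL read_only = OFF"
# update my.cnf as well so the durable settings survive the next restart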
