MySQL replication slave parallel workers only work 1 worker

I have mysql replication using gtid on and set slave_parallel_workers to 4 because very slow apply lagging on slave servers.

Topology : 1 master server with 3 slaves server

Mysql percona version : 8.0.18

Test replication performance using insert 10 rows of 1.2MB and nohup in shell script looping 100x concurrently.

When check on show processlist, 3 workers almost always having event : waiting for coordinator.

Is there any config at /etc/my.cnf i need to add ?

1 Like

In our scenario, replication lag with 100 concurrent sessions and each insert 10 rows of 1.2MB, delay lag could be ~ 500-700 seconds on second_behind_master (show slave status\G), which is too long / very slow replication performance.

Innodb_buffer_pool_size set to 3GB on each 7GB memory on each of 4 servers.

slave_parallel_type = LOGICAL_CLOCK;
slave_parallel_workers = 4;
slave_preserve_commit_order = OFF;

Really frustated since try on mariadb, the slave parallel workers is working. Why on percona mysql NOT WORKING :sob:

1 Like

Hello @Hany_Widodo,
What are your other InnoDB parameters? What is your flush_log_at_trx_commit? What is your log_buffer_size? All of these parameters, and the server hardware itself was the same when you tested on MariaDB?

What does the processlist show on the master?

1 Like

Thanks @matthewb for the reply.

Master : flush_log_at_trx_commit = 1
Slave : flush_log_at_trx_commit = 2

All servers has innodb_log_file_size=150M

On Mariadb, I haven’t tested yet, since the Mariadb environments not on my side. I just have percona environments. My colleague told me on Mariadb the slave workers running well on parallel.

1 Like

@Hany_Widodo,
Your innodb_log_file_size is small. You should set this to 1G and restart MySQL. This small log file could be bottlenecking writes to disk.

1 Like

Hi @matthewb

Thanks so so so much … will try to test increase it tommorow, test in UAT environment with stresstest. After delay lag issue solved, then deploy to production env.

1 Like

Hi @matthewb

We have increase innodb_log_file_size to 1G from 150M. But the workers only working 1 worker.

If there’s no solution, we think temporary to using rsync. Or the last option, migrate the database to mysql enterprise (mysql oracle)

1 Like

Our add new parameter on my.cnf

Key_buffer_size=2G
Key_cache_age_threshold=7200
Key_cache_block_size=16384
Key_cache_division_limit=50
Slave_allow_batching=ON
Log_slave_updates=0
Slave_parallel_type=LOGICAL_CLOCK
slave_parallel_workers=4
Slave_transaction_retries=80
Slave_pending_job_size_max=2G
max_allowed_packet=1G
Mysqlx_max_allowed_packet=1G
Thread_handling=pool-of-threads
Thread_pool_size=4
Mysqlx_min_worker_threads=10
Slave_checkpoint_period=60
Slave_checkpoint_group=2048
wait_timeout=60
Innodb_flush_log_at_trx_commit=2
Innodb_thread_concurrency=8
Innodb_buffer_pool_size=4G
innodb_log_file_size=1G
innodb_io_capacity=30000
Innodb_write_io_threads=8
Innodb_log_buffer_size=1G

1 Like

Lots of problems here:

Key_buffer_size=2G

This is only for MyISAM tables. If you have no MyISAM tables, remove this parameter and all other Key_* params.

Thread_handling=pool-of-threads

Unless you have 20,000 connections on a regular basis, remove this and Thread_pool_size. Thread pooling is only beneficial when you have massive amount of connections.

Slave_checkpoint_*

Remove these as these place limits on parallel replication

Innodb_thread_concurrency

This should be 0 unless you are facing CPU/context issues.

innodb_io_capacity=30000

This parameter is wrong. There are many blog posts on why increasing this value can have BAD performance. Please change this to 800.

Make the above changes and restart MySQL.

Or the last option, migrate the database to mysql enterprise (mysql oracle)

Because Percona is a rebase of MySQL-Oracle, you will find the same behavior. There is no reason to switch to Enterprise, unless you like paying more $$$ for MySQL.

1 Like

Will try your recommendation @matthewb … updated you soon 12 hours again

1 Like

Parallel workers on slave still didn’t working with ur recommendation @matthewb

1 Like

Therefore data lagging still happen :frowning:

1 Like

Hello @Hany_Widodo
Unfortunately, without access to your system I cannot further diagnose what the issue might be. LOGICAL_CLOCK replication creates a dependency tree based on transactions. If txn B is dependent on tnx A, then B and A cannot be processed in parallel and must be serialized. You can do more diagnosis on this by using mysqlbinlog tool on your binary logs. The sequence_no field in the binlogs is this dependency tree. If every binlog event is a new number, then your transactions are forcing serialization.

1 Like

Hi @matthewb … update : at last parallel workers running after few parameter changing

Thanks so much for the help :pray:t2:

Regards,
Hany Haryadi Widodo

1 Like

Hello @Hany_Widodo,
What parameters did you change, so that others can benefit from this post?

1 Like

at @Hany_Widodo .
I’m facing similar problem , what parameters were changed in your case ?