Haha, I'll tell you a little about my replication problems … hope it helps …
I had a lot of timeout problems:
1 - systemd timeout on MySQL start
2 - With Rene's help I discovered that the next bottleneck was my firewall: it caused timeouts whenever its processor hit 100%, and at that point it dropped all the connections.
3 - I set up a VPC with AWS
4 - timeout settings within my.cnf (wsrep_provider_options = " gcs.max_packet_size=1048576; evs.send_window=512; evs.user_send_window=512; evs.inactive_timeout = PT90S; evs.suspect_timeout = PT30S; evs.install_timeout = PT60S; evs.keepalive_period = PT6S; evs.max_install_timeouts = 8 ")
5 - memory configuration problems in the joiner server's my.cnf
6 - to run without crashes I upgraded the internet link from 10Mb to 50Mb.
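To make the timeouts in item 4 easier to read, here is a minimal Python sketch that splits the wsrep_provider_options string into key/value pairs and converts the PT…S durations into seconds. The function names are made up for illustration, and the duration parser only handles the simple hours/minutes/seconds form used above, not every ISO 8601 variant.

```python
import re

# Option string copied from item 4 above.
PROVIDER_OPTIONS = (
    "gcs.max_packet_size=1048576; evs.send_window=512; "
    "evs.user_send_window=512; evs.inactive_timeout = PT90S; "
    "evs.suspect_timeout = PT30S; evs.install_timeout = PT60S; "
    "evs.keepalive_period = PT6S; evs.max_install_timeouts = 8"
)

# Hypothetical helper: only the PT<h>H<m>M<s>S subset of ISO 8601 durations.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def parse_provider_options(text):
    """Split 'key = value; key = value' into a dict of stripped strings."""
    options = {}
    for item in text.split(";"):
        if not item.strip():
            continue
        key, _, value = item.partition("=")
        options[key.strip()] = value.strip()
    return options


def duration_seconds(value):
    """Convert a duration such as 'PT90S' or 'PT1M30S' to seconds."""
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


options = parse_provider_options(PROVIDER_OPTIONS)
timeouts = {
    name: duration_seconds(value)
    for name, value in options.items()
    if value.startswith("PT")
}
for name, secs in sorted(timeouts.items()):
    print(f"{name}: {secs:g} s")
```

With the values above, the suspect timeout (30 s) comes out shorter than the inactive timeout (90 s), and the keepalive period (6 s) is well below both.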
I think that was all … haha, but it solved my problems … today my 80G database takes 240 minutes to replicate everything, and it doesn't generate a single warning line in the logs.
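As a rough sanity check on those numbers, the average transfer rate can be worked out like this. It is only a back-of-the-envelope sketch: it assumes "80G" means 80 GB (decimal), that "10Mb" and "50Mb" mean Mbit/s, and it ignores protocol overhead and any compression.

```python
DB_SIZE_BYTES = 80 * 10**9      # assumption: "80G" = 80 GB, decimal
TRANSFER_SECONDS = 240 * 60     # 240 minutes
NEW_LINK_MBIT = 50              # assumption: "50Mb" = 50 Mbit/s
OLD_LINK_MBIT = 10              # assumption: "10Mb" = 10 Mbit/s


def average_mbit_per_s(size_bytes, seconds):
    """Average rate in Mbit/s needed to move size_bytes in the given time."""
    return size_bytes * 8 / seconds / 10**6


def minimum_minutes(size_bytes, link_mbit):
    """Best-case transfer time in minutes on a fully used link."""
    return size_bytes * 8 / (link_mbit * 10**6) / 60


rate = average_mbit_per_s(DB_SIZE_BYTES, TRANSFER_SECONDS)
utilisation = rate / NEW_LINK_MBIT
old_link_minutes = minimum_minutes(DB_SIZE_BYTES, OLD_LINK_MBIT)

print(f"average rate: {rate:.1f} Mbit/s ({utilisation:.0%} of the 50 Mbit/s link)")
print(f"best case on the 10 Mbit/s link: {old_link_minutes:.0f} minutes")
```

Under these assumptions the transfer averages about 44 Mbit/s, which is close to saturating the 50 Mbit/s link, and the same transfer could not have finished in under roughly 18 hours on the old 10 Mbit/s link.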
Besides that, I tuned the operating system:
net.core.somaxconn = 1024
net.core.netdev_max_backlog = 5000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_max_syn_backlog = 8096
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240 65535
fs.file-max=200000
kernel.sem=250 32000 100 1024
kernel.shmmax=4294967295
net.ipv4.tcp_retries2 = 2
#net.ipv4.tcp_syn_retries = 0
net.ipv4.tcp_synack_retries = 0
net.ipv4.tcp_keepalive_time = 30
net.ipv4.tcp_keepalive_intvl = 1
net.ipv4.tcp_keepalive_probes = 2
vm.swappiness = 0
vm.dirty_ratio = 80
vm.dirty_background_ratio = 5
vm.dirty_expire_centisecs = 12000
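If you want to check whether settings like these are actually active on a server, a small Python sketch such as the one below can compare a sysctl-style list against the live values under /proc/sys. The function names and the short SAMPLE text are only illustrative; the sketch assumes Linux and the usual mapping of dots in a key to slashes in the /proc/sys path.

```python
from pathlib import Path

# A few of the lines from the list above, used here as sample input.
SAMPLE = """\
net.core.somaxconn = 1024
net.ipv4.tcp_wmem = 4096 12582912 16777216
fs.file-max=200000
#net.ipv4.tcp_syn_retries = 0
"""


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blanks and '#' or ';' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Collapse runs of whitespace so multi-value settings compare cleanly.
        settings[key.strip()] = " ".join(value.split())
    return settings


def proc_path(key, root="/proc/sys"):
    """Map a key such as net.core.somaxconn to its file under /proc/sys."""
    return Path(root) / key.replace(".", "/")


def diff_against_kernel(settings, root="/proc/sys"):
    """Return {key: (wanted, current)} for every value that differs."""
    mismatches = {}
    for key, wanted in settings.items():
        try:
            current = " ".join(proc_path(key, root).read_text().split())
        except OSError:
            current = None  # key missing or unreadable on this system
        if current != wanted:
            mismatches[key] = (wanted, current)
    return mismatches


for key, (wanted, current) in diff_against_kernel(parse_sysctl(SAMPLE)).items():
    print(f"{key}: wanted {wanted!r}, kernel has {current!r}")
```

The script only reads values and changes nothing; applying the settings is still done the usual way, through the sysctl configuration files.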