Replication problem

The XtraDB cluster was fine yesterday, but replication stopped working today; I noticed it after I finished installing the proxy server. Here are the my.cnf and /var/log/mysqld.log files from both the proxy and the DB servers.

mysqld1.txt (12.3 KB)

mysqlproxy2.txt (22.5 KB)

mysqld2.txt (14.2 KB)

mysqlproxy1.txt (51.3 KB)

Hello lance.jud, can I just get some more information from you?
- version numbers of Percona XtraDB Cluster (is that what you have?), plus OS name and version information
- how many nodes are there in the cluster? Can we have the .conf files for all nodes?
- was this installation working well before adding the proxy server, or is it a new installation all around?
- are there any error logs being written that provide more information?
- can you also confirm you are installing ProxySQL, and which version you are using?

If you could provide this information around the issue, I will see if anyone can take a look for you. Please be aware that the team can only talk in generic terms about how things should work; they can't go into great detail about things that are very specific to your environment.
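For reference, commands along these lines should collect most of the above (paths and names here are typical defaults, not taken from your setup; adjust to your install):

# server version and cluster state, as each node sees it
mysql -e "SELECT @@version, @@version_comment;"
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep%';"
# OS release and recent errors (use the log path from your my.cnf)
cat /etc/*release
tail -n 100 /var/log/mysqld.log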

I might move the post to the Percona XtraDB Cluster thread, but I’ll wait for your reply.

Hi Lorraine,
Thanks for your reply. Node 2 can't join the cluster. Please help me check whether there is any problem with the configuration. The software I installed is:
Percona XtraDB Cluster 5.7 on Red Hat 6.9, with 2 nodes: 172.18.23.12 and 172.18.23.13. Both are VMs without a firewall. ProxySQL has not been installed yet.
Here is the procedure I used to start the cluster:
mysql@NODE1: service mysql bootstrap-pxc
/etc/init.d/mysql stop
/etc/init.d/mysql start
SET GLOBAL wsrep_cluster_address='gcomm://';
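(For context, status output like the tables below is what a query along these lines returns in the mysql client, which is presumably how these were captured:

SHOW GLOBAL STATUS LIKE 'wsrep%';

)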

+---------------------------+-------------------+
| Variable_name             | Value             |
+---------------------------+-------------------+
| wsrep_local_state_comment | Synced            |
| wsrep_incoming_addresses  | 172.18.23.12:3306 |
| wsrep_cluster_conf_id     | 1                 |
| wsrep_cluster_size        | 1                 |
| wsrep_cluster_status      | Primary           |
| wsrep_connected           | ON                |
| wsrep_ready               | ON                |
+---------------------------+-------------------+

mysql@NODE2: service mysql start
SET GLOBAL wsrep_cluster_address='gcomm://172.18.23.12';

+---------------------------+----------------------+
| Variable_name             | Value                |
+---------------------------+----------------------+
| wsrep_local_state_comment | Initialized          |
| wsrep_cluster_conf_id     | 18446744073709551615 |
| wsrep_cluster_size        | 0                    |
| wsrep_cluster_status      | Disconnected         |
| wsrep_ready               | OFF                  |
+---------------------------+----------------------+

setup20181213.txt (10.5 KB)

lance.jud, the problem seems to be the section your system variables are placed under in your my.cnf: the wsrep_* settings below end up under [mysqld_safe] rather than [mysqld]:

[mysqld_safe]
log-error=/var/log/mysqld.err
pid-file=/var/run/mysqld/mysqld.pid
wsrep_on=ON
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_provider_options="gcache.size=1G"
#wsrep_cluster_address=gcomm://172.18.23.12,172.18.23.13
wsrep_cluster_address=gcomm://

default_storage_engine=InnoDB
wsrep_slave_threads=4
wsrep_cluster_name=frameworkdb
wsrep_node_name=devlsql03
...

I would suggest moving these three lines to the bottom of the file, then shutting down node1 and re-bootstrapping:

[mysqld_safe]
log-error=/var/log/mysqld.err
pid-file=/var/run/mysqld/mysqld.pid

Make the same changes to node2's my.cnf and then start node2 normally.
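For clarity, a corrected layout (sketched only from the lines quoted above) keeps the wsrep_* and related settings under [mysqld] and leaves [mysqld_safe] with just its own two options:

[mysqld]
default_storage_engine=InnoDB
wsrep_on=ON
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_provider_options="gcache.size=1G"
# list both nodes; bootstrap node1 with "service mysql bootstrap-pxc" instead of an empty gcomm://
wsrep_cluster_address=gcomm://172.18.23.12,172.18.23.13
wsrep_slave_threads=4
wsrep_cluster_name=frameworkdb
wsrep_node_name=devlsql03

[mysqld_safe]
log-error=/var/log/mysqld.err
pid-file=/var/run/mysqld/mysqld.pid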

Hi Lorraine and jrivera,

Thanks for your prompt reply. After disabling the hardening in the firewall settings, I ran the cluster setup and found different messages in the error log file. Here is the information:

Node 1

netstat -ant | grep LISTEN

tcp        0      0 0.0.0.0:4567        0.0.0.0:*           LISTEN
tcp        0      0 :::3306             :::*                LISTEN

+---------------------------+-------------------+
| Variable_name             | Value             |
+---------------------------+-------------------+
| wsrep_local_state_comment | Synced            |
| wsrep_incoming_addresses  | 172.18.23.12:3306 |
| wsrep_cluster_conf_id     | 3                 |
| wsrep_cluster_size        | 1                 |
| wsrep_cluster_status      | Primary           |
| wsrep_connected           | ON                |
| wsrep_ready               | ON                |
+---------------------------+-------------------+

# cat /var/log/mysqld.err

2018-12-17T08:14:07.034545Z 1166144 [Note] Access denied for user 'proxysql'@'devlsql01.poc.et' (using password: YES)
2018-12-17T08:14:07.063258Z 1062963 [Note] Aborted connection 1062963 to db: 'unconnected' user: 'sstuser' host: 'dt-poc-72.poc.et' (Got an error reading communication packets)
2018-12-17T08:14:07.064512Z 1062962 [Note] Aborted connection 1062962 to db: 'unconnected' user: 'sstuser' host: 'dt-poc-72.poc.et' (Got an error reading communication packets)

======================================================
Node 2

[root@devlsql04 ~]# netstat -ant | grep LISTEN
tcp        0      0 0.0.0.0:22          0.0.0.0:*           LISTEN
tcp        0      0 :::3306             :::*                LISTEN

+---------------------------+----------------------+
| Variable_name             | Value                |
+---------------------------+----------------------+
| wsrep_local_state_comment | Initialized          |
| wsrep_incoming_addresses  |                      |
| wsrep_cluster_conf_id     | 18446744073709551615 |
| wsrep_cluster_size        | 0                    |
| wsrep_cluster_status      | Disconnected         |
| wsrep_connected           | OFF                  |
| wsrep_ready               | OFF                  |
+---------------------------+----------------------+

I have attached the log file from node 2. Please help me solve the cluster configuration problem.

I hope our first MySQL cluster launches before Christmas. Thank you very much!

Rgds,
Lance

Here is the error log on node 2.

node20181217.txt (10.7 KB)

I checked the firewall with service iptables status; it reports "Firewall is not running". When I run telnet devsql102 3306 or telnet devsql102 4567 from mysql@devmysql01, it returns "No route to host".

The error log from the second node is not complete, but it shows that the node connected to the first one and apparently ran for some time; later it needed another state transfer, but IST was not possible:

2018-12-17T01:39:49.324180Z 5 [Note] WSREP: Setting wsrep_ready to true
2018-12-17T01:39:49.324187Z 5 [Warning] WSREP: Gap in state sequence. Need state transfer.
2018-12-17T01:39:49.324193Z 5 [Note] WSREP: Setting wsrep_ready to false
2018-12-17T01:39:49.324976Z 5 [Note] WSREP: You have configured 'xtrabackup-v2' state snapshot transfer method which cannot be performed on a running server. Wsrep provider won't be able to fall back to it if other means of state transfer are unavailable. In that case you will need to restart the server.

After the restart, the log is missing the main WSREP entries; was it filtered? The node should not reach the "ready for connections" state without syncing with the cluster.

Btw, "No route to host" means the network connection between the two nodes, at least via the hostnames, is not configured properly. Why don't you try the telnet command against the IP addresses instead?
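For example, using the node IPs given earlier in this thread:

telnet 172.18.23.13 3306
telnet 172.18.23.13 4567

If the IP works where the hostname does not, the problem is name resolution; if neither works, it is routing or a firewall.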

Dear Przemek,

Thanks for your reply. I have to set up 16 pairs of MySQL clusters in the next couple of days. You are right: they synced for a while, but then returned "Error while getting data from donor node: exit codes: 137 0". Attached please find the log files from both nodes for your study. Please let me know if any more information is required.

You have a typo in your configuration file; node2 has an inconsistent IP address set for itself:

# Node IP address
wsrep_node_address=172.19.101.108:4567
wsrep_sst_receive_address=172.19.101.107

Just comment out the second one and it will inherit the value from wsrep_node_address.
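So node2's section would become, keeping only the first line active:

# Node IP address
wsrep_node_address=172.19.101.108:4567
#wsrep_sst_receive_address=172.19.101.107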

Dear Przemek,

Thanks for your tips. The fatal error is solved, but now there is another error at node 2: "Received self-leave message. State transfer request failed unrecoverably: 113 (No route to host). Most likely it is due to inability to communicate with the cluster primary component. Restart required…[ERROR] WSREP: Requesting state transfer failed: -113 (No route to host)". Attached please find the error log file for your investigation.

At node 1, the error is the same: "[Warning] WSREP: Member 0.0 (pld1fmk0114) requested state transfer from '172.19.101.107', but it is impossible to select State Transfer donor: No route to host"

Our infra team and system team said the MySQL config file can open the corresponding ports to solve this communication error. Please let me know if you have any ideas or queries.

node1.txt (12.5 KB)

node2.txt (25.7 KB)

node220190104.txt (19.7 KB)

node120190104.txt (5.54 KB)

lance.jud, on your node2:
wsrep_sst_receive_address must match your local IP, and the SST donor IP should also be commented out if you are passing that parameter on the command line. SST donor selection is only needed the first time.

Dear Jazikh,
Thanks for your reply. I followed your suggestion and found another error from the donor node. BTW, ports 4444 and 4568 are not listening on either server, although our infra team and system team confirmed that the ports are available. Attached please find the log files for your investigation. Thanks!

Rgds,
Lance

node120190107.txt (16.6 KB)

node220190107.txt (45.5 KB)
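One thing worth noting here: ports 4444 (SST) and 4568 (IST) normally only listen while a transfer is actually in progress, so an idle netstat will not show them. A quick way to separate a firewall problem from "nothing is listening yet" is to test from the other node with nc (the IP here is taken from the configs earlier in the thread; adjust to yours):

nc -zv 172.19.101.107 4567
nc -zv 172.19.101.107 4444

"Connection refused" means the network path is fine and nothing is listening; "No route to host" points back at routing or a firewall.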

Hello lance.jud

Start your first machine like this:

service mysql@bootstrap.service start

NOTE: Do not forget to create the user "sstuser" with a password in your MySQL database and grant it access.
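For example, the usual grants for the xtrabackup-v2 SST user look like this (the password is a placeholder matching the wsrep_sst_auth line below; use your own):

CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 'p@ssw0rd';
GRANT RELOAD, LOCK TABLES, PROCESS, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
FLUSH PRIVILEGES;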

Second machine:

service mysql start

This is my server configuration. Please set it according to your environment.


[mysqld]

default_storage_engine=InnoDB
log_bin
binlog_format = ROW
innodb_buffer_pool_size = 4096M
innodb_flush_log_at_trx_commit = 1
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 2G
innodb_file_per_table = 1
datadir = /data/mysql

wsrep_cluster_address = gcomm://10.2.2.11,10.2.2.12,10.2.2.13
wsrep_provider = /usr/lib64/galera3/libgalera_smm.so

wsrep_provider_options="gmcast.listen_addr=tcp://0.0.0.0:4567;gcache.size=2G"
wsrep_slave_threads = 24
wsrep_cluster_name = database-cluster
wsrep_node_name = dba3.cluster.local # This is my local machine hostname. Change it on every machine.

max_binlog_files = 20
expire_logs_days = 14
skip-name-resolve

sync-binlog = 1

wsrep_auto_increment_control = OFF # I have set this OFF because I am only using a single machine for both read and write.

wsrep_node_address=10.2.2.13 # This is my local machine IP address. Change it on every machine.
# SST method
wsrep_sst_method=xtrabackup-v2

# Authentication for SST method
wsrep_sst_auth="sstuser:p@ssw0rd"

innodb_autoinc_lock_mode = 2

[mysqld_safe]
pid-file = /run/mysqld/mysql.pid

log-error = /var/log/mysql-error.log

!includedir /etc/my.cnf.d
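Once both machines are up, you can confirm the join from either node; wsrep_cluster_size should reach the number of nodes you started (2 in your case) and the local state should be Synced:

mysql -e "SHOW GLOBAL STATUS WHERE Variable_name IN ('wsrep_cluster_size','wsrep_local_state_comment');"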