Node/1/2/3 and HAproxy node. The cluster is up and running, but if I see “show processlist” on all 3 nodes, then it shows connection from application
only on one node. If other nodes leave the cluster and join back, then they are synced in time. If I stop/reboot the node on which I see connection, then
as expected the application loses connection and nothing gets added to table. In effect the application has crashed.
So, should I see connections from the applications only on one node? Since if this is so, then cluster concept is invalidated. Am I missing something here?
If I understand correctly your concern is that the “show processlist” output run on a node shows only the client connections from that single node?
Note that cluster concept in Galera based - Percona XtraDB Cluster (PXC) is completely different than MySQL Cluster using NDB storage engine.
In case of PXC - each MySQL instance being part of a cluster has a complete data copy and runs separate mysqld instance that provide direct access to only this single node. The way the cluster works here is more about how the write transactions are being handled rather then how you organize and access data. The sole of PXC idea is synchronous replication between equal nodes. In turn in NDB cluster - one SQL node represents access to data located on many data nodes. So both concepts are much different.
That’s why it’s recommended to use external load balancers, like HAProxy if you want common connection point to your PXC cluster. Also HAProxy web console gives you a view to the statistics of connections established to all nodes.
Przemek, thanks for your response. The problem is this…
I have 3 nodes and one HAproxy node. The application connects to the HAproxy and then HAproxy distributes to the back-end nodes.
In operation, while application is writing, HAproxy operates in round robin mode and all nodes get the data correctly.
When I do a “show processlist” I see connection from application on a single node. Now if any other node which doesnt show connection fails, then application/cluster works correctly. I can remove a node and when I bring it back, it syncs correctly.
Now if the node which shows the application connection fails, then the application thinks that MYSQL connection is lost and then no more rows are added to tables.
Application retries but does not connect. ( even though HAproxy is working and I can connect using command line mysql to the cluster)
e.g this is what I see in processlist…
±----±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
±----±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| 1 | system user | | NULL | Sleep | 42522 | committed 220053 | NULL | 0 | 0 | 0 |
| 2 | system user | | NULL | Sleep | 61997 | wsrep aborter idle | NULL | 0 | 0 | 0 |
| 470 | root | chi2restore01.gelbergroup.com:60052 | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 0 |
±----±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
±-------±------------±------------------------------------±---------±--------±------±-------------------±-----------------±----------±--------------±----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
±-------±------------±------------------------------------±---------±--------±------±-------------------±-----------------±----------±--------------±----------+
| 1 | system user | | NULL | Sleep | 33002 | committed 119974 | NULL | 0 | 0 | 0 |
| 2 | system user | | NULL | Sleep | 80230 | wsrep aborter idle | NULL | 0 | 0 | 0 |
| 86230 | gv_user | testvictim2.gelbergroup.com:57023 | audit_db | Sleep | 24442 | | NULL | 0 | 0 | 0 |
| 114321 | root | chi2restore01.gelbergroup.com:54167 | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 0 |
±-------±------------±------------------------------------±---------±--------±------±-------------------±-----------------±----------±--------------±----------+
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| 1 | system user | | NULL | Sleep | 24430 | committed 220053 | NULL | 0 | 0 | 0 |
| 2 | system user | | NULL | Sleep | 31030 | wsrep aborter idle | NULL | 0 | 0 | 0 |
| 14054 | root | chi2restore01.gelbergroup.com:32834 | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 0 |
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| 1 | system user | | NULL | Sleep | 24442 | committed 220053 | NULL | 0 | 0 | 0 |
| 2 | system user | | NULL | Sleep | 61545 | wsrep aborter idle | NULL | 0 | 0 | 0 |
| 28820 | root | chi2restore01.gelbergroup.com:42656 | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 0 |
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
I recently added a fourth node and it works fine. See the connection “gv_user” from host testvictim2. testvictim2 is my haproxy node. Now if the mysql node with this connection dies, then the application cannot handle it. It does not try another node, even though the haproxy node and other nodes are working.
So, is this problem with the application? ( it’s a C++ application)
thanks very much for your time and help in advance…
HAproxy is doing simple TCP load balancing, it will not (nor will the mysql client) transparently fail you to another node if the one you connect to fails.
It’s up to your application to reconnect and/or do something intelligent if it gets disconnected.