Przemek, thanks for your response. The problem is this…
- I have 3 nodes and one HAproxy node. The application connects to the HAproxy and then HAproxy distributes to the back-end nodes.
- In operation, while application is writing, HAproxy operates in round robin mode and all nodes get the data correctly.
- When I do a “show processlist” I see connection from application on a single node. Now if any other node which doesnt show connection fails, then application/cluster works correctly. I can remove a node and when I bring it back, it syncs correctly.
- Now if the node which shows the application connection fails, then the application thinks that MYSQL connection is lost and then no more rows are added to tables.
Application retries but does not connect. ( even though HAproxy is working and I can connect using command line mysql to the cluster)
e.g this is what I see in processlist…
±----±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
±----±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| 1 | system user | | NULL | Sleep | 42522 | committed 220053 | NULL | 0 | 0 | 0 |
| 2 | system user | | NULL | Sleep | 61997 | wsrep aborter idle | NULL | 0 | 0 | 0 |
| 470 | root | chi2restore01.gelbergroup.com:60052 | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 0 |
±----±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
±-------±------------±------------------------------------±---------±--------±------±-------------------±-----------------±----------±--------------±----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
±-------±------------±------------------------------------±---------±--------±------±-------------------±-----------------±----------±--------------±----------+
| 1 | system user | | NULL | Sleep | 33002 | committed 119974 | NULL | 0 | 0 | 0 |
| 2 | system user | | NULL | Sleep | 80230 | wsrep aborter idle | NULL | 0 | 0 | 0 |
| 86230 | gv_user | testvictim2.gelbergroup.com:57023 | audit_db | Sleep | 24442 | | NULL | 0 | 0 | 0 |
| 114321 | root | chi2restore01.gelbergroup.com:54167 | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 0 |
±-------±------------±------------------------------------±---------±--------±------±-------------------±-----------------±----------±--------------±----------+
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| 1 | system user | | NULL | Sleep | 24430 | committed 220053 | NULL | 0 | 0 | 0 |
| 2 | system user | | NULL | Sleep | 31030 | wsrep aborter idle | NULL | 0 | 0 | 0 |
| 14054 | root | chi2restore01.gelbergroup.com:32834 | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 0 |
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read |
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
| 1 | system user | | NULL | Sleep | 24442 | committed 220053 | NULL | 0 | 0 | 0 |
| 2 | system user | | NULL | Sleep | 61545 | wsrep aborter idle | NULL | 0 | 0 | 0 |
| 28820 | root | chi2restore01.gelbergroup.com:42656 | NULL | Query | 0 | sleeping | show processlist | 0 | 0 | 0 |
±------±------------±------------------------------------±-----±--------±------±-------------------±-----------------±----------±--------------±----------+
I recently added a fourth node and it works fine. See the connection “gv_user” from host testvictim2. testvictim2 is my haproxy node. Now if the mysql node with this connection dies, then the application cannot handle it. It does not try another node, even though the haproxy node and other nodes are working.
So, is this problem with the application? ( it’s a C++ application)
thanks very much for your time and help in advance…