Description:
We are running a Percona XtraDB Cluster managed by the Percona Operator for MySQL, with 3 PXC replicas and 3 HAProxy replicas. The cluster serves a PHP application that connects using PDO to the <cluster-name>-haproxy service.
The cluster appears to be healthy; however, every few minutes we see an error like this (as reported from PHP):
SQLSTATE[HY000]: General error: 2006 MySQL server has gone away
Sometimes, this message is seen instead:
SQLSTATE[08S01]: Communication link failure: 1158 Got an error reading communication packets
Debugging attempted so far:
We disabled HAProxy and connected directly to the <cluster-name>-pxc service instead. With this change applied, no errors occurred for the entire time it was live. On switching back to <cluster-name>-haproxy, the errors began again.
We are using the default HAProxy config provided by the operator, and I’m unsure where to start in resolving this problem. I checked the HAProxy logs and, as far as I can see, they show a mixture of CD and SD values for termination_state, e.g.:
[pod/mysql-haproxy-0/haproxy] {"time":"16/May/2025:07:52:20.337", "client_ip": "10.244.6.140", "client_port":"35524", "backend_source_ip": "10.244.7.208", "backend_source_port": "34858", "frontend_name": "galera-in", "backend_name": "galera-nodes", "server_name":"mysql-pxc-0", "tw": "1", "tc": "1", "Tt": "2", "bytes_read": "83", "termination_state": "SD", "actconn": "233", "feconn" :"232", "beconn": "231", "srv_conn": "231", "retries": "0", "srv_queue": "0", "backend_queue": "0" }
[pod/mysql-haproxy-0/haproxy] {"time":"16/May/2025:07:47:07.227", "client_ip": "10.244.6.140", "client_port":"46498", "backend_source_ip": "10.244.7.208", "backend_source_port": "40846", "frontend_name": "galera-in", "backend_name": "galera-nodes", "server_name":"mysql-pxc-0", "tw": "1", "tc": "148", "Tt": "318405", "bytes_read": "1189188", "termination_state": "CD", "actconn": "245", "feconn" :"244", "beconn": "243", "srv_conn": "243", "retries": "0", "srv_queue": "0", "backend_queue": "0" }
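To see how the two termination states are distributed, the log entries can be tallied with a short script. The following is only a minimal sketch in Python, assuming every log line carries a JSON object after the pod prefix, as in the two entries above. The names `parse_line`, `summarize`, and `SAMPLE` are hypothetical, and `SAMPLE` holds trimmed copies of those two entries. For reference, my understanding of the HAProxy documentation is that CD means the client side aborted during the data phase, and SD means the server side closed or failed during the data phase.

```python
import json
from collections import defaultdict

# Trimmed copies of the two log entries quoted above (most fields omitted).
SAMPLE = [
    '[pod/mysql-haproxy-0/haproxy] {"time":"16/May/2025:07:52:20.337", '
    '"server_name":"mysql-pxc-0", "Tt": "2", "bytes_read": "83", '
    '"termination_state": "SD"}',
    '[pod/mysql-haproxy-0/haproxy] {"time":"16/May/2025:07:47:07.227", '
    '"server_name":"mysql-pxc-0", "Tt": "318405", "bytes_read": "1189188", '
    '"termination_state": "CD"}',
]


def parse_line(line):
    """Return the JSON object embedded in one log line, or None if absent."""
    start = line.find("{")
    if start == -1:
        return None
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return None


def summarize(lines):
    """Group sessions by termination_state, keeping a count and the Tt values (ms)."""
    summary = defaultdict(lambda: {"count": 0, "tt_ms": []})
    for line in lines:
        entry = parse_line(line)
        if entry is None or "termination_state" not in entry:
            continue
        bucket = summary[entry["termination_state"]]
        bucket["count"] += 1
        # Tt is logged as a string; a missing value is treated as 0 here.
        bucket["tt_ms"].append(int(entry.get("Tt", 0)))
    return dict(summary)


if __name__ == "__main__":
    for state, info in sorted(summarize(SAMPLE).items()):
        print(state, info["count"], info["tt_ms"])
```

Replacing `SAMPLE` with lines read from the full HAProxy pod logs would show whether the SD sessions are consistently short-lived (the one above lasted 2 ms with 83 bytes read) and whether the CD sessions cluster around a particular duration (the one above lasted 318405 ms, a little over five minutes).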
Any guidance on what to change to solve this would be very much appreciated, as these errors are occurring regularly.
Version:
- Operator: 1.15.0
- PXC: percona/percona-xtradb-cluster:8.0.35
- HAProxy: percona/haproxy:2.8.5