SmartUpdate breaks pxc pod when "applying changes"

Description:

I’m running the operator and a cluster deployed via Helm. Initially the cluster runs without problems, but after a few moments the operator decides to “apply changes to secondary pod” according to its logs. This restarts the targeted pod, which then never reaches a working state: the mysqld --wsrep_start_position=... command fails.
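
For reference, this is how I’m following the operator logs when it happens (the namespace and deployment name below are placeholders for my release):

# Follow the operator logs; this is where the "apply changes to secondary pod" message shows up
kubectl -n <operator-namespace> logs deploy/<release-name>-pxc-operator -f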

In the pod logs, I see these errors:

pxc pod logs
{"log":"2024-08-23T10:48:09.639919Z 0 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2024-08-23T10:48:09.640023Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node\nview ((empty))\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2024-08-23T10:48:09.647832Z 0 [ERROR] [MY-000000] [Galera] failed to open gcomm backend connection: 110: failed to reach primary view (pc.wait_prim_timeout): 110 (Connection timed out)\n\t at /mnt/jenkins/workspace/pxc80-autobuild-RELEASE/test/rpmbuild/BUILD/Percona-XtraDB-Cluster-8.0.36/percona-xtradb-cluster-galera/gcomm/src/pc.cpp:connect():176\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2024-08-23T10:48:09.653509Z 0 [ERROR] [MY-000000] [Galera] /mnt/jenkins/workspace/pxc80-autobuild-RELEASE/test/rpmbuild/BUILD/Percona-XtraDB-Cluster-8.0.36/percona-xtradb-cluster-galera/gcs/src/gcs_core.cpp:gcs_core_open():219: Failed to open backend connection: -110 (Connection timed out)\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2024-08-23T10:48:10.658832Z 0 [Note] [MY-000000] [Galera] gcomm: terminating thread\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2024-08-23T10:48:10.658894Z 0 [Note] [MY-000000] [Galera] gcomm: joining thread\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2024-08-23T10:48:10.658996Z 0 [ERROR] [MY-000000] [Galera] /mnt/jenkins/workspace/pxc80-autobuild-RELEASE/test/rpmbuild/BUILD/Percona-XtraDB-Cluster-8.0.36/percona-xtradb-cluster-galera/gcs/src/gcs.cpp:gcs_open():1880: Failed to open channel 'testapp-db-pxc' at 'gcomm://testapp-db-pxc-0.testapp-db-pxc,testapp-db-pxc-1.testapp-db-pxc': -110 (Connection timed out)\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2024-08-23T10:48:10.665159Z 0 [ERROR] [MY-000000] [Galera] gcs connect failed: Connection timed out\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2024-08-23T10:48:10.665196Z 0 [ERROR] [MY-000000] [WSREP] Provider/Node (gcomm://testapp-db-pxc-0.testapp-db-pxc,testapp-db-pxc-1.testapp-db-pxc) failed to establish connection with cluster (reason: 7)\n","file":"/var/lib/mysql/mysqld-error.log"}

I don’t understand what kind of update the operator is trying to apply here. Any ideas what it might be, and why it causes MySQL to fail to start?

Version:

Helm charts “pxc-operator” and “pxc-db” 1.15.0

Helm values:

pxc-operator:

watchAllNamespaces: true

pxc-db:

pxc:
  persistence:
    storageClass: rook-ceph-block
    size: 2Gi
  disableTLS: true

Hi @dkorbginski,

I don’t understand what kind of update the operator is trying to apply here. Any ideas what it might be, and why it causes MySQL to fail to start?

The cluster starts the MySQL processes with the wsrep_start_position variable, which tells mysqld the Galera position (cluster UUID and last committed sequence number) the node should recover from. That part is normal; the failure in your logs happens later, when the node tries to join the other cluster members.
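
If you want to see which position the failing pod will try to start from, you can look at the Galera state file in its data directory. A minimal check, using the container name from your cluster (the namespace and failing pod name are placeholders you need to fill in):

# grastate.dat holds the last known cluster uuid:seqno; this is (roughly) what ends up in --wsrep_start_position
kubectl -n <namespace> exec <failing-pxc-pod> -c pxc -- cat /var/lib/mysql/grastate.dat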

Regarding this line in the pod logs:
{"log":"2024-08-23T10:48:10.665159Z 0 [ERROR] [MY-000000] [Galera] gcs connect failed: Connection timed out\n","file":"/var/lib/mysql/mysqld-error.log"}
This means the node timed out while opening the gcomm connection to the other cluster members, so it never reached the primary component (pc.wait_prim_timeout). You should ensure all nodes can communicate with each other over the Galera ports.
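
One quick way to check this is to verify that the Galera group-communication port (4567) of the other members is reachable from the failing pod. A rough sketch using the pod/service names from your logs (namespace and failing pod name are placeholders; bash’s /dev/tcp is used because nc may not be present in the image):

# Try to open a TCP connection from the failing pod to the Galera port of another member
kubectl -n <namespace> exec <failing-pxc-pod> -c pxc -- \
  bash -c 'timeout 3 bash -c "</dev/tcp/testapp-db-pxc-0.testapp-db-pxc/4567" && echo reachable || echo not reachable'

Also make sure the headless service names resolve from inside the pod, and that no NetworkPolicy blocks ports 4567 (group communication), 4568 (IST) and 4444 (SST).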