One node is running out of space after cluster crash

Description:

Hi, we had a cluster crash 8 days ago which I thought was resolved after running kubectl -n pxc exec cluster1-pxc-2 -c pxc -- sh -c 'kill -s USR1 1', but something is still not quite right.
It’s a 3-node cluster: cluster1-pxc-0 and cluster1-pxc-1 have their /var/lib/mysql folder 66% full, but on cluster1-pxc-2 it is 100% full. Looking in /var/lib/mysql on cluster1-pxc-2, there are many binlog, cluster1-pxc-2-relay-bin, and GRA_ files. The other two nodes only have 7 binlog files, but node 3 has 18.
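For anyone wanting to reproduce the check, something along these lines should show the usage and the log files inside the pod (just a sketch, adjust pod name and paths to your setup):

> kubectl -n pxc exec cluster1-pxc-2 -c pxc -- df -h /var/lib/mysql
> kubectl -n pxc exec cluster1-pxc-2 -c pxc -- sh -c 'ls -lh /var/lib/mysql/binlog.0* | tail -n 20'
> kubectl -n pxc exec cluster1-pxc-2 -c pxc -- sh -c 'ls /var/lib/mysql/GRA_*.log 2>/dev/null | wc -l'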

Earlier today I found the cluster not accepting more connections, complaining about “too many connections”.
Connecting locally as root on cluster1-pxc-0 and running show processlist; showed 138 connections, most of them trying to insert data into a table and reporting the status message “wsrep: replicating and certifying write set”.
I then realized that cluster1-pxc-2 had the disk full.
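(Something like this, run in the mysql client as root, shows how close the server is to the connection limit; I include it only as a sketch:)

mysql> SHOW STATUS LIKE 'Threads_connected';
mysql> SHOW VARIABLES LIKE 'max_connections';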
I tried increasing spec.pxc.volumeSpec.persistentVolumeClaim.resources.requests.storage in deploy/cr.yaml from 10G to 12G and applied it with kubectl, but the PVCs for the cluster are still set to 10G.
I terminated the cluster1-pxc-2 pod and when it restarted 148 MB were free in /var/lib/mysql, so right now the cluster is up, but the free space is slowly going down.
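A quick way to confirm what size the PVCs actually are (the datadir-* naming is what the operator uses for the data volumes, as far as I can tell):

> kubectl -n pxc get pvc
> kubectl -n pxc get pvc datadir-cluster1-pxc-2 -o jsonpath='{.status.capacity.storage}'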

Version:

Percona Operator for MySQL based on Percona XtraDB Cluster v1.16.1

Logs:

> kubectl get pxc -n pxc
NAME       ENDPOINT       STATUS   PXC   PROXYSQL   HAPROXY   AGE
cluster1   10.22.208.29   ready    3                3         337d
> kubectl get pods -n pxc
NAME                                               READY   STATUS      RESTARTS         AGE
cluster1-haproxy-0                                 2/2     Running     24 (7d21h ago)   29d
cluster1-haproxy-1                                 2/2     Running     0                7d20h
cluster1-haproxy-2                                 2/2     Running     24 (7d21h ago)   29d
cluster1-pxc-0                                     1/1     Running     1 (7d6h ago)     7d20h
cluster1-pxc-1                                     1/1     Running     2 (7d6h ago)     29d
cluster1-pxc-2                                     1/1     Running     1 (29m ago)      29m
percona-xtradb-cluster-operator-6d75d68c9d-s7v8s   1/1     Running     0                7d20h
xb-cron-cluster1-fs-pvc-202521211016-cv29g-g7zcm   0/1     Completed   0                7d6h
xb-cron-cluster1-fs-pvc-202521311016-cv29g-2dtfb   0/1     Completed   0                6d10h
xb-cron-cluster1-fs-pvc-202521411016-cv29g-xk92t   0/1     Completed   0                5d10h
xb-cron-cluster1-fs-pvc-202521511016-cv29g-qhqch   0/1     Completed   0                4d10h
xb-cron-cluster1-fs-pvc-202521611016-cv29g-b4xt7   0/1     Completed   0                3d10h
xb-cron-cluster1-fs-pvc-202521711016-cv29g-hq8t2   0/1     Completed   0                2d10h
xb-cron-cluster1-fs-pvc-202521811016-cv29g-4xtpl   0/1     Completed   0                34h
xb-cron-cluster1-fs-pvc-202521911016-cv29g-lvqlq   0/1     Completed   0                10h
> kubectl get events -n pxc
LAST SEEN   TYPE      REASON                OBJECT                     MESSAGE
40m         Normal    Killing               pod/cluster1-pxc-2         Stopping container pxc
30m         Warning   Unhealthy             pod/cluster1-pxc-2         Readiness probe failed: + [[ '' == \P\r\i\m\a\r\y ]]...
30m         Normal    Scheduled             pod/cluster1-pxc-2         Successfully assigned pxc/cluster1-pxc-2 to k8s10444-workers-b279j-6xp7w-xpk2f
30m         Normal    Pulling               pod/cluster1-pxc-2         Pulling image "percona/percona-xtradb-cluster-operator:1.16.1"
30m         Normal    Pulled                pod/cluster1-pxc-2         Successfully pulled image "percona/percona-xtradb-cluster-operator:1.16.1" in 566ms (566ms including waiting)
30m         Normal    Created               pod/cluster1-pxc-2         Created container pxc-init
30m         Normal    Started               pod/cluster1-pxc-2         Started container pxc-init
29m         Normal    Pulling               pod/cluster1-pxc-2         Pulling image "docker-remote.binrepo.example.com/percona/percona-xtradb-cluster:8.0.39-30.1"
30m         Normal    Pulled                pod/cluster1-pxc-2         Successfully pulled image "docker-remote.binrepo.example.com/percona/percona-xtradb-cluster:8.0.39-30.1" in 1.664s (1.664s including waiting)
29m         Normal    Created               pod/cluster1-pxc-2         Created container pxc
29m         Normal    Started               pod/cluster1-pxc-2         Started container pxc
29m         Normal    Pulled                pod/cluster1-pxc-2         Successfully pulled image "docker-remote.binrepo.example.com/percona/percona-xtradb-cluster:8.0.39-30.1" in 1.093s (1.093s including waiting)
29m         Warning   Unhealthy             pod/cluster1-pxc-2         Readiness probe failed: ERROR 2003 (HY000): Can't connect to MySQL server on '192.168.5.180:33062' (111)...
30m         Warning   RecreatingFailedPod   statefulset/cluster1-pxc   StatefulSet pxc/cluster1-pxc is recreating failed Pod cluster1-pxc-2
30m         Normal    SuccessfulDelete      statefulset/cluster1-pxc   delete Pod cluster1-pxc-2 in StatefulSet cluster1-pxc successful
30m         Normal    SuccessfulCreate      statefulset/cluster1-pxc   create Pod cluster1-pxc-2 in StatefulSet cluster1-pxc successful

pxc0-logs.txt (17.4 KB)
pxc2-logs.txt (76.9 KB)

output of SHOW STATUS LIKE 'wsrep_%' on the 3 nodes:
show status wsrep_.txt (50.0 KB)

contents of /var/lib/mysql in node 0 vs node 2:

Could you please try this documentation: Horizontal and vertical scaling - Percona Operator for MySQL

Thank you, I was able to expand the PVCs, so that bought me some time. I was missing the enableVolumeExpansion: true option.
Now it remains to figure out why cluster1-pxc-2 is not deleting the old logs, which is what caused it to run out of space.
Or were you recommending to follow the “Manual scaling without Volume Expansion capability” steps as a way to delete cluster1-pxc-2 and its storage and have it recreated?
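In case it helps anyone else, the change amounts to setting both fields on the custom resource; an equivalent one-off patch would look roughly like this (untested, adjust sizes and names to your own setup):

> kubectl -n pxc patch pxc cluster1 --type=merge -p '{"spec":{"enableVolumeExpansion":true,"pxc":{"volumeSpec":{"persistentVolumeClaim":{"resources":{"requests":{"storage":"12G"}}}}}}}'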

If you are using an operator version higher than 1.14.0, you can use the automated scaling with the Volume Expansion capability. Manual scaling is also an option, for sure.
As I understand it, you expanded the PVCs on all nodes. Do you still have the problem with the logs on cluster1-pxc-2?
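If you go the manual route, the core of the documented procedure is deleting the node’s PVC and pod so the operator recreates them and the node rejoins via SST. Very roughly, and assuming the usual datadir-* PVC naming, it looks like this; please double-check the linked documentation before running it on a production cluster:

> kubectl -n pxc delete pvc datadir-cluster1-pxc-2
> kubectl -n pxc delete pod cluster1-pxc-2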

Yes, I expanded the PVCs on all nodes using the automated scaling, but I still have the problem with the logs on cluster1-pxc-2. There are many more binlog.000XXX files on that node.
The oldest binlog on the other nodes is from Feb 12 2025, but on cluster1-pxc-2 it’s from Feb 4 2025.

As far as I can tell, replication is fine (I connected to each node individually and ran select queries and they all show the same data, and I can’t spot anything wrong in the show status like ‘wsrep_%’ output).
So perhaps if I delete the cluster1-pxc-2 node and its PVC using the procedure explained in “Manual scaling without Volume Expansion capability” it will fix itself… or create a singularity that will wipe everything :)

Deleting the pod cluster1-pxc-2 might help, but before taking that step, could you first verify the replication status on this node?

Please check the output of the following commands:
- SHOW MASTER STATUS;
- SHOW SLAVE STATUS\G

If everything looks good and replication is functioning correctly, you can proceed with purging old binlogs.
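For example, something along these lines, where binlog.000123 is just a placeholder; take the file name from your own SHOW BINARY LOGS output and make sure no node still needs it:

mysql> SHOW BINARY LOGS;
mysql> PURGE BINARY LOGS TO 'binlog.000123';

or, purging by age instead:

mysql> PURGE BINARY LOGS BEFORE NOW() - INTERVAL 14 DAY;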

Let me know if the issue persists.

I ran the two commands on each node, here’s the output:
cluster1-pxc-0

mysql> show master status;
+---------------+-----------+--------------+------------------+-------------------------------------------------+
| File          | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                               |
+---------------+-----------+--------------+------------------+-------------------------------------------------+
| binlog.000350 | 702866262 |              |                  | e225fbd5-e580-11ee-8dad-bec6e432df2c:1-49138997 |
+---------------+-----------+--------------+------------------+-------------------------------------------------+
1 row in set (0.00 sec)

mysql> SHOW SLAVE STATUS\G
Empty set, 1 warning (0.00 sec)

cluster1-pxc-1

mysql> show master status;
+---------------+-----------+--------------+------------------+-------------------------------------------------+
| File          | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                               |
+---------------+-----------+--------------+------------------+-------------------------------------------------+
| binlog.000348 | 795451669 |              |                  | e225fbd5-e580-11ee-8dad-bec6e432df2c:1-49139015 |
+---------------+-----------+--------------+------------------+-------------------------------------------------+
1 row in set (0.00 sec)

mysql> show slave status\G
Empty set, 1 warning (0.00 sec)

cluster1-pxc-2

mysql> show master status;
+---------------+-----------+--------------+------------------+-------------------------------------------------+
| File          | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                               |
+---------------+-----------+--------------+------------------+-------------------------------------------------+
| binlog.000354 | 326699984 |              |                  | e225fbd5-e580-11ee-8dad-bec6e432df2c:1-49139030 |
+---------------+-----------+--------------+------------------+-------------------------------------------------+
1 row in set (0.00 sec)

mysql> show slave status\G
Empty set, 1 warning (0.00 sec)

The warning is about SHOW SLAVE STATUS being deprecated.
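Side note: on 8.0.22 and later the non-deprecated equivalent is

mysql> SHOW REPLICA STATUS\G

which returns the same information without the warning.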

From what I understand, you are using a PXC cluster, right? Then why did you enable binlogs on all the nodes?

If you have an explicit replica, check the replica status on that node; otherwise you can simply purge the binary logs and enable binary log expiry.
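If expiry is not configured yet, something like this sets it at runtime (1209600 seconds = 14 days); on an operator-managed cluster you would also want the same value in the cr.yaml so it survives restarts:

mysql> SET GLOBAL binlog_expire_logs_seconds = 1209600;
mysql> SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';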


Thanks for looking into this.
Yes, using the PXC cluster.
I didn’t explicitly enable binlogs on it, I thought they came standard. Months ago I reduced binlog_expire_logs_seconds to 14 days instead of the default 30 because I didn’t see the usefulness of keeping more. The first two nodes hold 14 days of logs, with a new log file created roughly every two days, while the 3rd node holds many more days.
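To rule out per-node configuration drift, the setting can be compared on each node in turn; the value itself checks out (14 days × 86400 s/day = 1209600 s). A sketch of the check:

mysql> SELECT @@hostname, @@binlog_expire_logs_seconds;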
This is the deploy/cr.yaml I’ve been using, with minimal redactions

apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  name: cluster1
  finalizers:
    - percona.com/delete-pxc-pods-in-order
spec:
  crVersion: 1.16.1
  enableVolumeExpansion: true
  tls:
    enabled: true
  updateStrategy: SmartUpdate
  upgradeOptions:
    versionServiceEndpoint: https://check.percona.com
    apply: disabled
    schedule: "0 4 * * *"
  pxc:
    size: 3
    image: docker-remote.example.com/percona/percona-xtradb-cluster:8.0.39-30.1
    autoRecovery: true
    configuration: |
      [mysqld]
      sort_buffer_size = 256000000
      # Change automatic default purge expiry from 30 days to 14
      binlog_expire_logs_seconds = 1209600
      [sst]
      xbstream-opts=--decompress
      [xtrabackup]
      compress=lz4
    imagePullSecrets:
      - name: pxc-pull-secrets
    resources:
      requests:
        memory: 1200M
        cpu: 200m
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
    podDisruptionBudget:
      maxUnavailable: 1
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 12G
    gracePeriod: 600
  haproxy:
    enabled: true
    size: 3
    image: docker-remote.example.com/percona/haproxy:2.8.11
    imagePullSecrets:
      - name: pxc-pull-secrets
    exposePrimary:
      enabled: true
      type: LoadBalancer
      loadBalancerIP: 10.1.2.3
    envVarsSecret: ha-proxy-secrets
    resources:
      requests:
        memory: 1G
        cpu: 200m
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
    podDisruptionBudget:
      maxUnavailable: 1
    gracePeriod: 30
  proxysql:
    enabled: false
    size: 3
    image: percona/proxysql2:2.7.1
    resources:
      requests:
        memory: 1G
        cpu: 600m
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 2G
    podDisruptionBudget:
      maxUnavailable: 1
    gracePeriod: 30
  logcollector:
    enabled: false
    image: percona/percona-xtradb-cluster-operator:1.16.1-logcollector-fluentbit3.2.2
    resources:
      requests:
        memory: 100M
        cpu: 200m
  pmm:
    enabled: false
    image: percona/pmm-client:2.44.0
    serverHost: monitoring-service
    resources:
      requests:
        memory: 150M
        cpu: 300m
  backup:
    image: percona/percona-xtradb-cluster-operator:1.16.1-pxc8.0-backup-pxb8.0.35
    pitr:
      enabled: false
      storageName: fs-pvc
      timeBetweenUploads: 300
    storages:
      fs-pvc:
        type: filesystem
        volume:
          persistentVolumeClaim:
            storageClassName: offsite-smb
            accessModes: [ "ReadWriteOnce" ]
            resources:
              requests:
                storage: 12G
    schedule:
    - name: "daily-backup-iad"
      schedule: "0 11 * * *"
      keep: 8
      storageName: fs-pvc

You can use the parameter below as well and set the value to 1 or 2:

expire_logs_days=2

Got it - looks like binlog_expire_logs_seconds is the replacement for expire_logs_days (https://dev.mysql.com/worklog/task/?id=10924), so I should be all set on that front (and it works as expected on two out of the three nodes).

you can try to run

flush logs

on the 3rd node.

Before that, check the binary logs with the command below:

show binary logs;

If the files on disk are listed there and the required parameters are in place, flush logs will clear all the obsolete files. If they are not listed, they have been removed from the binlog index file (mysql-bin.index) but not from disk, and you will need to remove them manually.
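Concretely, on the 3rd node that would be along these lines (rotation also triggers the expiry-based purge, so files older than binlog_expire_logs_seconds should disappear):

mysql> SHOW BINARY LOGS;
mysql> FLUSH LOGS;
mysql> SHOW BINARY LOGS;

and the disk usage can be re-checked from outside the pod, for example:

> kubectl -n pxc exec cluster1-pxc-2 -c pxc -- df -h /var/lib/mysql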

Thank you!
flush logs deleted the excess logs, and in the three days since running that command the server has kept deleting the oldest ones as expected.
