TimM
November 19, 2022, 10:17am
Firstly, thank you to the devs for an excellent operator - it has been a pleasure setting it up over the last couple of days.
However, I cannot get backups to a local filesystem working. I am using Helm (via Ansible) to run the operator on a 3-node bare-metal K3S test cluster, and I am testing local backup at this stage. At the scheduled backup time, the db-pxc-2 pod enters a crash loop, erroring in stage 3.
I just tried to add attachments, but as a new user I'm not permitted to, so I'll try to add them in a second post.
TimM
November 19, 2022, 10:21am
It looks like I still cannot add attachments, so here at least is the relevant section of the log for the db-pxc-2 pod showing the errors:
{"log":"2022-11-19T09:53:56.861821Z 0 [ERROR] [MY-000000] [Galera] failed to open gcomm backend connection: 110: failed to reach primary view (pc.wait_prim_timeout): 110 (Connection timed out)\n\t at gcomm/src/pc.cpp:connect():161\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-11-19T09:53:56.861884Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs_core.cpp:gcs_core_open():219: Failed to open backend connection: -110 (Connection timed out)\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-11-19T09:53:57.862114Z 0 [Note] [MY-000000] [Galera] gcomm: terminating thread\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-11-19T09:53:57.862199Z 0 [Note] [MY-000000] [Galera] gcomm: joining thread\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-11-19T09:53:57.862365Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs.cpp:gcs_open():1758: Failed to open channel 'vaultwarden-db-pxc-db-pxc' at 'gcomm://vaultwarden-db-pxc-db-pxc-0.vaultwarden-db-pxc-db-pxc,vaultwarden-db-pxc-db-pxc-1.vaultwarden-db-pxc-db-pxc': -110 (Connection timed out)\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-11-19T09:53:57.862411Z 0 [ERROR] [MY-000000] [Galera] gcs connect failed: Connection timed out\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-11-19T09:53:57.862445Z 0 [ERROR] [MY-000000] [WSREP] Provider/Node (gcomm://vaultwarden-db-pxc-db-pxc-0.vaultwarden-db-pxc-db-pxc,vaultwarden-db-pxc-db-pxc-1.vaultwarden-db-pxc-db-pxc) failed to establish connection with cluster (reason: 7)\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-11-19T09:53:57.862479Z 0 [ERROR] [MY-010119] [Server] Aborting\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-11-19T09:53:57.862869Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.27-18.1) Percona XtraDB Cluster (GPL), Release rel18, Revision ac35177, WSREP version 26.4.3.\n","file":"/var/lib/mysql/mysqld-error.log"}
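(As an aside, each mysqld-error.log line above is wrapped in a JSON object by the log collector. A minimal Python sketch for filtering such a capture down to just the [ERROR] entries, using two abbreviated sample lines from above:)

```python
import json

# Each mysqld-error.log line is wrapped in a JSON object with "log" and
# "file" keys; keep only the entries whose message contains [ERROR].
raw = '''{"log":"2022-11-19T09:53:57.862114Z 0 [Note] [MY-000000] [Galera] gcomm: terminating thread\\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2022-11-19T09:53:57.862411Z 0 [ERROR] [MY-000000] [Galera] gcs connect failed: Connection timed out\\n","file":"/var/lib/mysql/mysqld-error.log"}'''

errors = [json.loads(line)["log"].strip()
          for line in raw.splitlines()
          if "[ERROR]" in json.loads(line)["log"]]
for e in errors:
    print(e)
```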
TimM
November 19, 2022, 10:25am
And here is the backup section from kubectl describe of the pxc resource:
Spec:
  Backup:
    Image: percona/percona-xtradb-cluster-operator:1.11.0-pxc8.0-backup
    Pitr:
      Enabled: false
    Schedule:
      Keep: 3
      Name: daily-backup
      Schedule: 09 53 * * *
      Storage Name: backup-vaultwarden-db
    Storages:
      Backup - Vaultwarden - Db:
        Type: filesystem
        Volume:
          Persistent Volume Claim:
            Access Modes:
              ReadWriteOnce
            Resources:
              Requests:
                Storage: 4Gi
      Fs - Pvc:
        Type: filesystem
        Volume:
          Persistent Volume Claim:
            Access Modes:
              ReadWriteOnce
            Resources:
              Requests:
                Storage: 6Gi
  Cr Version: 1.11.0
  Enable CR Validation Webhook: false
  Haproxy:
    Affinity:
      Anti Affinity Topology Key: kubernetes.io/hostname
    Annotations:
    Enabled: true
    Grace Period: 30
    Image: percona/percona-xtradb-cluster-operator:1.11.0-haproxy
    Labels:
    Liveness Delay Sec: 300
    Liveness Probes:
      Failure Threshold: 4
      Initial Delay Seconds: 60
      Period Seconds: 30
      Success Threshold: 1
      Timeout Seconds: 5
    Node Selector:
    Pod Disruption Budget:
      Max Unavailable: 1
    Readiness Delay Sec: 15
    Readiness Probes:
      Failure Threshold: 3
      Initial Delay Seconds: 15
      Period Seconds: 5
      Success Threshold: 1
      Timeout Seconds: 1
    Replicas Service Enabled: true
    Resources:
      Limits:
        Cpu: 1
        Memory: 1Gi
      Requests:
        Cpu: 500m
        Memory: 0.5Gi
    Sidecar PV Cs:
    Sidecar Resources:
      Limits:
      Requests:
    Sidecar Volumes:
    Sidecars:
    Size: 3
    Tolerations:
    Volume Spec:
      Empty Dir:
  Init Image: percona/percona-xtradb-cluster-operator:1.11.0
  Log Collector Secret Name: vaultwarden-db-pxc-db-log-collector
  Logcollector:
    Enabled: true
    Image: percona/percona-xtradb-cluster-operator:1.11.0-logcollector
    Resources:
      Limits:
      Requests:
        Cpu: 200m
        Memory: 100M
  Pause: false
  Pmm:
    Enabled: false
  Proxysql:
    Enabled: false
  Pxc:
    Affinity:
      Anti Affinity Topology Key: kubernetes.io/hostname
    Annotations:
    Auto Recovery: true
    Grace Period: 600
    Image: percona/percona-xtradb-cluster:8.0.27-18.1
    Labels:
    Liveness Delay Sec: 300
    Liveness Probes:
      Failure Threshold: 3
      Initial Delay Seconds: 300
      Period Seconds: 10
      Success Threshold: 1
      Timeout Seconds: 5
    Node Selector:
    Pod Disruption Budget:
      Max Unavailable: 1
    Readiness Delay Sec: 15
    Readiness Probes:
      Failure Threshold: 5
      Initial Delay Seconds: 15
      Period Seconds: 30
      Success Threshold: 1
      Timeout Seconds: 15
    Resources:
      Limits:
        Cpu: 1
        Memory: 1Gi
      Requests:
        Cpu: 500m
        Memory: 0.5Gi
    Sidecar PV Cs:
    Sidecar Resources:
      Limits:
      Requests:
    Sidecar Volumes:
    Sidecars:
    Size: 3
    Tolerations:
    Volume Spec:
      Persistent Volume Claim:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage: 2Gi
        Storage Class Name: local-path
  Secrets Name: vaultwarden-db-pxc-db
  Ssl Internal Secret Name: vaultwarden-db-pxc-db-ssl-internal
  Ssl Secret Name: vaultwarden-db-pxc-db-ssl
  Update Strategy: SmartUpdate
  Upgrade Options:
    Apply: 8.0-recommended
    Schedule: 0 4 * * *
    Version Service Endpoint: https://check.percona.com
  Vault Secret Name: vaultwarden-db-pxc-db-vault
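(For reference, a standard 5-field cron expression is ordered minute, hour, day-of-month, month, day-of-week, so a daily 09:53 run would normally be written 53 09 * * *. A minimal, generic field-range sanity check, not the operator's own parser, and ignoring range/step syntax:)

```python
# Minimal sanity check for a standard 5-field cron expression
# (minute hour day-of-month month day-of-week). Generic sketch:
# it handles only "*" and comma-separated integers.
def cron_field_ok(field, lo, hi):
    if field == "*":
        return True
    try:
        return all(lo <= int(p) <= hi for p in field.split(","))
    except ValueError:
        return False  # ranges/steps not handled in this sketch

def cron_ok(expr):
    parts = expr.split()
    if len(parts) != 5:
        return False
    bounds = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]
    return all(cron_field_ok(f, lo, hi) for f, (lo, hi) in zip(parts, bounds))

print(cron_ok("53 09 * * *"))  # True: fires daily at 09:53
print(cron_ok("09 53 * * *"))  # False: 53 is not a valid hour
```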
TimM
November 19, 2022, 11:04am
I just tried explicitly adding backup.storages.volume.persistentVolumeClaim.storageClassName, as it was missing from the Helm chart, but I still get the same crash loop in the db-pxc-2 pod.
Hey @TimM ,
Can you please share the values.yaml that you use to deploy the cluster?
What do you use for storage?
TimM
November 28, 2022, 10:02am
Hi Sergey, thank you for your response.
Here is the values.yaml. I am currently testing with in-cluster storage but will ultimately use S3. On the last attempt I was using Longhorn, but I also tried with the (Rancher) local-path storageClass. Have I missed or misconfigured something?
pxc:
  size: 3
  resources:
    requests:
      cpu: 250m
      memory: 0.5Gi
    limits:
      cpu: 1
      memory: 1Gi
  persistence:
    enabled: true
    storageClass: local-path
    accessMode: ReadWriteOnce
    size: 2Gi
haproxy:
  size: 3
  resources:
    requests:
      cpu: 100m
      memory: 0.5Gi
    limits:
      cpu: 1
      memory: 1Gi
backup:
  enabled: true
  image: "percona/percona-xtradb-cluster-operator:1.11.0-pxc8.0-backup"
  schedule:
    - name: "daily-backup"
      schedule: "12 15 * * *"
      keep: 3
      storageName: backup-vaultwarden-db
  storages:
    backup-vaultwarden-db:
      type: filesystem
      volume:
        persistentVolumeClaim:
          storageClassName: longhorn
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 4Gi
Hello @TimM ,
I cannot see any issues with your config.
The operator creates backup pods for every backup run. Can you please show the log from one of them?