Hello,
I am following the instructions from this document to restore my PXC cluster from a backup stored on S3.
As a first step, I created a secret and installed PMM into the freshly created namespace:
helm install pmm --namespace percona --set secret.create=false --set secret.name=pmm-secret percona/pmm
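For completeness, the pmm-secret referenced above was created beforehand, roughly like this (if I read the chart documentation correctly it expects a PMM_ADMIN_PASSWORD key; the actual password is redacted):
kubectl create secret generic pmm-secret \
  --namespace percona \
  --from-literal=PMM_ADMIN_PASSWORD='********'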
Then I installed the Percona Operator for MySQL (pxc-operator) into the same namespace:
helm install percona-operator percona/pxc-operator --namespace percona
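For what it is worth, the operator CRDs (perconaxtradbclusters, perconaxtradbclusterbackups and perconaxtradbclusterrestores in the pxc.percona.com group) all appear to be registered; I verified that with something like:
kubectl get crd -o name | grep pxc.percona.com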
And finally the Percona XtraDB Cluster itself (the percona/pxc-db chart) alongside it, with the following values:
--set haproxy.nodeSelector.cluster=percona-cluster-node \
--set pxc.nodeSelector.cluster=percona-cluster-node \
--set pxc.resources.requests.cpu=2500m \
--set pxc.resources.requests.memory=6Gi \
--set pxc.resources.limits.cpu=2500m \
--set pxc.resources.limits.memory=6Gi \
--set pxc.persistence.enabled=true \
--set pxc.persistence.size=15Gi \
--set pxc.persistence.storageClass=hcloud-volumes \
--set pmm.enabled=true \
--set pmm.resources.requests.cpu=200m \
--set pmm.resources.requests.memory=150M \
--set pmm.resources.limits.cpu=200m \
--set pmm.resources.limits.memory=150M \
--set haproxy.resources.requests.cpu=300m \
--set haproxy.resources.requests.memory=512M \
--set haproxy.resources.limits.cpu=300m \
--set haproxy.resources.limits.memory=512M \
--set logcollector.resources.requests.cpu=100m \
--set logcollector.resources.requests.memory=100M \
--set backup.enabled=false \
--set secrets.passwords.root=$(openssl rand -base64 32 | tr -d '\n') \
--set secrets.passwords.pmmserverkey=************************************** \
--set secrets.passwords.xtrabackup=$(openssl rand -base64 32 | tr -d '\n') \
--set secrets.passwords.replication=$(openssl rand -base64 32 | tr -d '\n') \
--set secrets.passwords.proxyadmin=$(openssl rand -base64 32 | tr -d '\n') \
--set secrets.passwords.operator=$(openssl rand -base64 32 | tr -d '\n') \
--set secrets.passwords.monitor=$(openssl rand -base64 32 | tr -d '\n') \
--set secrets.passwords.clustercheck=$(openssl rand -base64 32 | tr -d '\n')
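For readability, the same overrides can also be written as a values file; this is roughly equivalent to the --set flags above (the generated passwords and the PMM server key are shown as placeholders here, since I passed those via --set):
haproxy:
  nodeSelector:
    cluster: percona-cluster-node
  resources:
    requests:
      cpu: 300m
      memory: 512M
    limits:
      cpu: 300m
      memory: 512M
pxc:
  nodeSelector:
    cluster: percona-cluster-node
  resources:
    requests:
      cpu: 2500m
      memory: 6Gi
    limits:
      cpu: 2500m
      memory: 6Gi
  persistence:
    enabled: true
    size: 15Gi
    storageClass: hcloud-volumes
pmm:
  enabled: true
  resources:
    requests:
      cpu: 200m
      memory: 150M
    limits:
      cpu: 200m
      memory: 150M
logcollector:
  resources:
    requests:
      cpu: 100m
      memory: 100M
backup:
  enabled: false
secrets:
  passwords:
    root: "<generated>"
    pmmserverkey: "<redacted>"
    xtrabackup: "<generated>"
    replication: "<generated>"
    proxyadmin: "<generated>"
    operator: "<generated>"
    monitor: "<generated>"
    clustercheck: "<generated>"
The rendered custom resource can then be inspected with kubectl get pxc percona-cluster-pxc-d -n percona -o yaml.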
At that point the whole setup seemed healthy and functional to me: I was able to log in to the PMM dashboard and see some cluster data there, and I could not spot any restarting pods on the Kubernetes dashboard either.
root@master-001:~/k8s/pxc# kubectl get all -n percona
NAME READY STATUS RESTARTS AGE
pod/percona-cluster-pxc-d-haproxy-0 3/3 Running 0 179m
pod/percona-cluster-pxc-d-haproxy-1 3/3 Running 0 179m
pod/percona-cluster-pxc-d-haproxy-2 3/3 Running 0 3h
pod/percona-cluster-pxc-d-pxc-0 4/4 Running 0 3h
pod/percona-cluster-pxc-d-pxc-1 4/4 Running 0 3h1m
pod/percona-cluster-pxc-d-pxc-2 4/4 Running 0 3h2m
pod/percona-operator-pxc-operator-687f56d59d-4hkqj 1/1 Running 0 18h
pod/pmm-0 1/1 Running 0 46h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/monitoring-service NodePort 10.xxx.xxx.211 <none> 443:31958/TCP,80:32217/TCP 46h
service/percona-cluster-pxc-d-haproxy ClusterIP 10.xxx.xxx.59 <none> 3306/TCP,3309/TCP,33062/TCP,33060/TCP 3h4m
service/percona-cluster-pxc-d-haproxy-replicas ClusterIP 10.xxx.xxx.92 <none> 3306/TCP 3h4m
service/percona-cluster-pxc-d-pxc ClusterIP None <none> 3306/TCP,33062/TCP,33060/TCP 3h4m
service/percona-cluster-pxc-d-pxc-unready ClusterIP None <none> 3306/TCP,33062/TCP,33060/TCP 3h4m
service/percona-xtradb-cluster-operator ClusterIP 10.xxx.xxx.172 <none> 443/TCP 18h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/percona-operator-pxc-operator 1/1 1 1 18h
NAME DESIRED CURRENT READY AGE
replicaset.apps/percona-operator-pxc-operator-687f56d59d 1 1 1 18h
NAME READY AGE
statefulset.apps/percona-cluster-pxc-d-haproxy 3/3 3h4m
statefulset.apps/percona-cluster-pxc-d-pxc 3/3 3h4m
statefulset.apps/pmm 1/1 46h
As for my source environment, it is a regular Ubuntu 22.04 server with mysql-server installed. I installed Percona XtraBackup 8.0 there and made a successful backup to my AWS S3 bucket.
wget https://repo.percona.com/apt/percona-release_latest.$(lsb_release -sc)_all.deb
dpkg -i percona-release_latest.$(lsb_release -sc)_all.deb
percona-release setup ps80
sudo apt update
sudo apt install percona-xtrabackup-80
sudo apt install lz4
xtrabackup
xtrabackup --backup --stream=xbstream --extra-lsndir=/tmp --target-dir=/tmp \
  | xbcloud put --storage=s3 --s3-endpoint='s3.amazonaws.com' --s3-region='eu-central-1' \
      --s3-access-key=$AWS_ACCESS_KEY_ID --s3-secret-key=$AWS_SECRET_ACCESS_KEY \
      --s3-bucket='****************-percona-db-backups' --parallel=10 $(date -I)-full_backup
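In case it is useful, the same backup can be pulled back down and extracted locally as a sanity check by reversing the pipeline, roughly like this (the target directory is just an example):
xbcloud get --storage=s3 --s3-endpoint='s3.amazonaws.com' --s3-region='eu-central-1' \
  --s3-access-key=$AWS_ACCESS_KEY_ID --s3-secret-key=$AWS_SECRET_ACCESS_KEY \
  --s3-bucket='****************-percona-db-backups' --parallel=10 \
  2024-05-08-full_backup | xbstream -x -C /tmp/restore-check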
This is what I can see inside my S3 bucket:
% aws s3 ls ****************-percona-db-backups/2024-05-08-full_backup/
PRE app/
PRE mysql/
PRE performance_schema/
PRE sys/
2024-05-09 00:17:50 496 backup-my.cnf.00000000000000000000
2024-05-09 00:17:50 27 backup-my.cnf.00000000000000000001
2024-05-09 00:17:50 244 binlog.000150.00000000000000000000
2024-05-09 00:17:50 27 binlog.000150.00000000000000000001
2024-05-09 00:17:50 62 binlog.index.00000000000000000000
2024-05-09 00:17:50 26 binlog.index.00000000000000000001
2024-05-09 00:17:50 759599 ib_buffer_pool.00000000000000000000
2024-05-09 00:17:50 28 ib_buffer_pool.00000000000000000001
2024-05-09 00:17:23 10485801 ibdata1.00000000000000000000
2024-05-09 00:17:23 10485801 ibdata1.00000000000000000001
2024-05-09 00:17:23 10485801 ibdata1.00000000000000000002
2024-05-09 00:17:23 10485801 ibdata1.00000000000000000003
2024-05-09 00:17:23 10485801 ibdata1.00000000000000000004
2024-05-09 00:17:23 10485801 ibdata1.00000000000000000005
2024-05-09 00:17:23 10485801 ibdata1.00000000000000000006
2024-05-09 00:17:23 6291497 ibdata1.00000000000000000007
2024-05-09 00:17:23 21 ibdata1.00000000000000000008
2024-05-09 00:17:49 10485803 mysql.ibd.00000000000000000000
2024-05-09 00:17:49 10485803 mysql.ibd.00000000000000000001
2024-05-09 00:17:49 10485803 mysql.ibd.00000000000000000002
2024-05-09 00:17:49 10485803 mysql.ibd.00000000000000000003
2024-05-09 00:17:49 4194347 mysql.ibd.00000000000000000004
2024-05-09 00:17:49 23 mysql.ibd.00000000000000000005
2024-05-09 00:17:50 10485802 undo_001.00000000000000000000
2024-05-09 00:17:50 6291498 undo_001.00000000000000000001
2024-05-09 00:17:50 22 undo_001.00000000000000000002
2024-05-09 00:17:50 10485802 undo_002.00000000000000000000
2024-05-09 00:17:50 6291498 undo_002.00000000000000000001
2024-05-09 00:17:50 22 undo_002.00000000000000000002
2024-05-09 00:17:50 116 xtrabackup_binlog_info.00000000000000000000
2024-05-09 00:17:50 36 xtrabackup_binlog_info.00000000000000000001
2024-05-09 00:17:50 199 xtrabackup_checkpoints.00000000000000000000
2024-05-09 00:17:50 36 xtrabackup_checkpoints.00000000000000000001
2024-05-09 00:17:51 623 xtrabackup_info.00000000000000000000
2024-05-09 00:17:51 29 xtrabackup_info.00000000000000000001
2024-05-09 00:17:50 3124 xtrabackup_logfile.00000000000000000000
2024-05-09 00:17:50 32 xtrabackup_logfile.00000000000000000001
2024-05-09 00:17:51 95 xtrabackup_tablespaces.00000000000000000000
2024-05-09 00:17:51 36 xtrabackup_tablespaces.00000000000000000001
In my target environment, I created a secret to store my AWS credentials, identical to the ones used for the backup above:
root@master-001:~/k8s/pxc# kubectl get secret ****************-percona-db-backups -n percona -o yaml
apiVersion: v1
data:
  AWS_ACCESS_KEY_ID: ************************
  AWS_SECRET_ACCESS_KEY: ********************
kind: Secret
metadata:
  creationTimestamp: "2024-05-09T14:55:28Z"
  name: ****************-percona-db-backups
  namespace: percona
  resourceVersion: "5024887"
  uid: 2b468225-2d6f-490f-899d-67725b4f0d23
type: Opaque
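The secret itself was created with the two keys the operator expects for S3 credentials, roughly like this (values redacted):
kubectl create secret generic ****************-percona-db-backups \
  --namespace percona \
  --from-literal=AWS_ACCESS_KEY_ID='************************' \
  --from-literal=AWS_SECRET_ACCESS_KEY='********************'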
I then created the following Kubernetes manifest to request a restore:
root@master-001:~/k8s/pxc# cat initial-restore.yaml
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterRestore
metadata:
  name: initial-restore
spec:
  pxcCluster: percona-cluster-pxc-d
  backupSource:
    destination: s3://****************-percona-db-backups/2024-05-08-full_backup
    s3:
      credentialsSecret: ****************-percona-db-backups
      endpointUrl: https://s3.eu-central-1.amazonaws.com
      region: eu-central-1
  pitr:
    type: latest
    backupSource:
      s3:
        bucket: "****************-percona-db-backups"
        credentialsSecret: ****************-percona-db-backups
        endpointUrl: https://s3.eu-central-1.amazonaws.com
        region: eu-central-1
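(As far as I understand the pitr section is optional; a plain restore from the same backup would be just the following, but I do want point-in-time recovery to the latest available position.)
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBClusterRestore
metadata:
  name: initial-restore
spec:
  pxcCluster: percona-cluster-pxc-d
  backupSource:
    destination: s3://****************-percona-db-backups/2024-05-08-full_backup
    s3:
      credentialsSecret: ****************-percona-db-backups
      endpointUrl: https://s3.eu-central-1.amazonaws.com
      region: eu-central-1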
Applying initial-restore.yaml also seemed to go fine:
root@master-001:~/k8s/pxc# kubectl apply -f initial-restore.yaml -n percona
perconaxtradbclusterrestore.pxc.percona.com/initial-restore created
root@master-001:~/k8s/pxc# kubectl describe pxc-restore/initial-restore -n percona
Name:         initial-restore
Namespace:    percona
Labels:       <none>
Annotations:  <none>
API Version:  pxc.percona.com/v1
Kind:         PerconaXtraDBClusterRestore
Metadata:
  Creation Timestamp:  2024-05-09T17:42:18Z
  Generation:          1
  Resource Version:    5052630
  UID:                 6917c770-e75d-45f8-b4d0-60fda2f9374b
Spec:
  Backup Source:
    Destination:  s3://****************-percona-db-backups/2024-05-08-full_backup
    s3:
      Credentials Secret:  ****************-percona-db-backups
      Endpoint URL:        https://s3.eu-central-1.amazonaws.com
      Region:              eu-central-1
  Pitr:
    Backup Source:
      s3:
        Bucket:              ****************-percona-db-backups
        Credentials Secret:  ****************-percona-db-backups
        Endpoint URL:        https://s3.eu-central-1.amazonaws.com
        Region:              eu-central-1
    Type:  latest
  Pxc Cluster:  percona-cluster-pxc-d
Status:
  Comments:  You can view xtrabackup log:
$ kubectl logs job/restore-job-initial-restore-percona-cluster-pxc-d
If everything is fine, you can cleanup the job:
$ kubectl delete pxc-restore/initial-restore
  Completed:  2024-05-09T17:42:18Z
  State:      Succeeded
Events:       <none>
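Despite the Succeeded state, no restore job ever showed up; I checked roughly like this (both commands come back empty / not found):
root@master-001:~/k8s/pxc# kubectl get jobs -n percona
root@master-001:~/k8s/pxc# kubectl logs job/restore-job-initial-restore-percona-cluster-pxc-d -n percona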
To find the reason, I started tracing the percona-operator logs:
root@master-001:~# kubectl logs pod/percona-operator-pxc-operator-687f56d59d-4hkqj -n percona
2024-05-09T16:34:18.148Z INFO setup Runs on {"platform": "kubernetes", "version": "v1.27.1"}
2024-05-09T16:34:18.149Z INFO setup Manager starting up {"gitCommit": "c85a021f2a21441500b02a2c0b3d17e8a8b25996", "gitBranch": "release-1-14-0", "buildTime": "2024-03-01T09:01:29Z", "goVersion": "go1.21.7", "os": "linux", "arch": "amd64"}
2024-05-09T16:34:18.149Z INFO setup Registering Components.
2024-05-09T16:34:19.300Z INFO controller-runtime.webhook Registering webhook {"path": "/validate-percona-xtradbcluster"}
2024-05-09T16:34:19.300Z INFO setup Starting the Cmd.
2024-05-09T16:34:19.300Z INFO controller-runtime.metrics Starting metrics server
2024-05-09T16:34:19.300Z INFO controller-runtime.metrics Serving metrics server {"bindAddress": ":8080", "secure": false}
2024-05-09T16:34:19.300Z INFO starting server {"kind": "health probe", "addr": "[::]:8081"}
2024-05-09T16:34:19.301Z INFO controller-runtime.webhook Starting webhook server
2024-05-09T16:34:19.301Z INFO controller-runtime.certwatcher Updated current TLS certificate
2024-05-09T16:34:19.301Z INFO controller-runtime.webhook Serving webhook server {"host": "", "port": 9443}
2024-05-09T16:34:19.301Z INFO controller-runtime.certwatcher Starting certificate watcher
2024-05-09T16:34:19.302Z INFO attempting to acquire leader lease percona/08db1feb.percona.com...
2024-05-09T16:34:35.256Z INFO successfully acquired lease percona/08db1feb.percona.com
2024-05-09T16:34:35.256Z INFO Starting EventSource {"controller": "pxc-controller", "source": "kind source: *v1.PerconaXtraDBCluster"}
2024-05-09T16:34:35.256Z INFO Starting Controller {"controller": "pxc-controller"}
2024-05-09T16:34:35.256Z INFO Starting EventSource {"controller": "pxcrestore-controller", "source": "kind source: *v1.PerconaXtraDBClusterRestore"}
2024-05-09T16:34:35.256Z INFO Starting Controller {"controller": "pxcrestore-controller"}
2024-05-09T16:34:35.257Z INFO Starting EventSource {"controller": "pxcbackup-controller", "source": "kind source: *v1.PerconaXtraDBClusterBackup"}
2024-05-09T16:34:35.257Z INFO Starting Controller {"controller": "pxcbackup-controller"}
2024-05-09T16:34:35.376Z INFO Starting workers {"controller": "pxcbackup-controller", "worker count": 1}
2024-05-09T16:34:35.405Z INFO Starting workers {"controller": "pxc-controller", "worker count": 1}
2024-05-09T16:34:35.407Z INFO Starting workers {"controller": "pxcrestore-controller", "worker count": 1}
2024-05-09T16:34:37.130Z INFO add new job {"controller": "pxc-controller", "namespace": "percona", "name": "percona-cluster-pxc-d", "reconcileID": "d4b88387-4e1c-42a8-899a-026b61667557", "name": "ensure-version/percona/percona-cluster-pxc-d", "schedule": "0 4 * * *"}
2024-05-09T16:34:37.130Z INFO add new job {"controller": "pxc-controller", "namespace": "percona", "name": "percona-cluster-pxc-d", "reconcileID": "d4b88387-4e1c-42a8-899a-026b61667557", "name": "telemetry/percona/percona-cluster-pxc-d", "schedule": "47 * * * *"}
root@master-001:~# kubectl logs pod/percona-operator-pxc-operator-687f56d59d-4hkqj -n percona --follow
[... same startup lines as above, then:]
2024-05-09T16:36:54.361Z INFO backup restore request {"controller": "pxcrestore-controller", "namespace": "percona", "name": "initial-restore", "reconcileID": "30883363-e4e0-4dd6-93f1-6f8c00c85be1"}
2024-05-09T16:36:54.392Z INFO Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference {"controller": "pxcrestore-controller", "namespace": "percona", "name": "initial-restore", "reconcileID": "30883363-e4e0-4dd6-93f1-6f8c00c85be1"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x1634635]
goroutine 197 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:116 +0x1e5
panic({0x1aa5fe0?, 0x2e707b0?})
/usr/local/go/src/runtime/panic.go:914 +0x21f
github.com/percona/percona-xtradb-cluster-operator/pkg/apis/pxc/v1.(*PXCBackupStatus).GetStorageType(...)
/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/apis/pxc/v1/pxc_backup_types.go:140
github.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup.RestoreJob(0xc000e901a0, 0xc000353200, 0xc000ad0a00, {0xc000c4ca00, 0x45}, 0x0)
/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/pxc/backup/restore.go:140 +0x75
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxcrestore.(*s3).Job(0xc00065e5d0?)
/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxcrestore/restorer.go:38 +0x32
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxcrestore.(*ReconcilePerconaXtraDBClusterRestore).validate(0x1a8a960?, {0x204fa08, 0xc000f3ef90}, 0xc000e901a0, 0x204f998?, 0xc000ad0a00?)
/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxcrestore/restore.go:80 +0x4b
github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxcrestore.(*ReconcilePerconaXtraDBClusterRestore).Reconcile(0xc00065e5d0, {0x204fa08, 0xc000f3ef90}, {{{0xc000992606?, 0x5?}, {0xc000992610?, 0xc000ec6d08?}}})
/go/src/github.com/percona/percona-xtradb-cluster-operator/pkg/controller/pxcrestore/controller.go:190 +0xf14
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x2053448?, {0x204fa08?, 0xc000f3ef90?}, {{{0xc000992606?, 0xb?}, {0xc000992610?, 0x0?}}})
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:119 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0003f20a0, {0x204fa40, 0xc000607a90}, {0x1b4eb40?, 0xc000bf4560?})
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:316 +0x3cc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0003f20a0, {0x204fa40, 0xc000607a90})
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:266 +0x1af
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:227 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 130
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.0/pkg/internal/controller/controller.go:223 +0x565
Could you please tell me whether I am doing something wrong here, or whether this is a defect in the operator? Thanks in advance!