Can XtraBackup be used in Kubernetes with a shared data volume?

We are looking to use the percona-xtradb-cluster Helm chart for deploying and managing a PXC cluster in Kubernetes. As part of our design, we’re trying to figure out the best way to use XtraBackup. From what I’ve been able to find, every solution I’ve read about follows one of two patterns:

  1. Execute XtraBackup within a running instance and then send the backup somewhere (e.g., kubectl exec pxc xtrabackup …)
  2. Expose a service/port from a running instance that can be used to trigger XtraBackup from within the container (e.g., ssh pxc -e xtrabackup …)

There are a few variations of both, but everything I’ve read falls into one of these two categories. Even the percona-xtradb-cluster-operator, from what I can tell, is doing something like #2, but using ncat.

What I’m trying to understand is if there is another option that more closely follows a typical Kubernetes pattern. Before I go too far, I wanted to pose this question as I don’t know anything about the inner workings of XtraBackup itself. In particular, why can’t I launch a Job on the same Kubernetes node as a running PXC instance (Pod), mount the same data volume, and then run XtraBackup in that Job Pod (providing the appropriate connection params, etc.) ?

For what it’s worth, I have tested this and it does “work” in that I get a backup – I just don’t know if there is something special about XtraBackup that actually does need local access to the files such that this is not going to produce a proper backup. If this does work, with the growth of containers and Kubernetes, I’m surprised I couldn’t find anyone else doing something like this (or at least sharing their experience about it).
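To make this concrete, the kind of Job I have in mind looks roughly like the following (the node name, claim names, image, and credentials are just placeholders, not a finished design). The idea is simply that the Job lands on the same node as the PXC pod and mounts the same data volume:

apiVersion: batch/v1
kind: Job
metadata:
  name: pxc-backup
spec:
  template:
    spec:
      restartPolicy: Never
      nodeName: worker-1                  # pin to the node where the PXC pod is running
      containers:
      - name: xtrabackup
        image: our-own-image              # placeholder image with xtrabackup installed
        command:
        - xtrabackup
        - --backup
        - --host=pxc                      # connect to the running instance over TCP
        - --port=3306
        - --user=backup
        - --password=changeme
        - --datadir=/var/lib/mysql        # the same data files the PXC pod is writing
        - --target-dir=/backup
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
        - name: mysql-backup
          mountPath: /backup
      volumes:
      - name: mysql-data
        persistentVolumeClaim:
          claimName: mysql-data-claim     # the PVC the PXC pod already uses
      - name: mysql-backup
        persistentVolumeClaim:
          claimName: mysql-backup-claim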

Any feedback would be helpful. Thanks.

Agelwarg,
Can you show how you would attach / detach the shared volume to the running pod?

Or do you mean to start a Pod with an always-attached backup volume? In that case XtraBackup may work. XtraBackup needs access to the local files, as behind the scenes it physically reads the files.

We don’t want to modify/complicate the PXC image itself to expose anything else and are considering a design where we have a Pod with 2 containers, sharing the data volume. The 1st container runs a PXC instance, and the 2nd (sidecar) is where xtrabackup is actually executed, streaming the result to another backup volume. For example, the Pod spec might look something like this (not sure if formatting will make this look correct, but hopefully you get the idea):

spec:
  containers:
  - name: pxc
    image: percona/percona-xtradb-cluster:5.7.19
    volumeMounts:
    - name: mysql-data
      mountPath: /var/lib/mysql
  - name: backup
    image: our-own-image
    volumeMounts:
    - name: mysql-data
      mountPath: /var/lib/mysql
    - name: mysql-backup
      mountPath: /backup
  volumes:
  - name: mysql-data
    persistentVolumeClaim:
      claimName: mysql-data-claim…
  - name: mysql-backup
    persistentVolumeClaim:
      claimName: mysql-backup-claim…

The backup container would have the xtrabackup binary and access the pxc instance via --host and --port, making sure that --datadir points to wherever the data is mounted inside the backup container (in this case, the same mountPath as the pxc container).
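For reference, the command we’d run inside the backup container might look something like this (the user, password, and target directory are just placeholders). Since both containers share the pod’s network namespace, it can connect to the pxc container over 127.0.0.1 while reading the data files from the shared mount:

xtrabackup --backup \
  --host=127.0.0.1 --port=3306 \
  --user=backup --password=changeme \
  --datadir=/var/lib/mysql \
  --target-dir=/backup/$(date +%F)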

If the core idea of sharing the data volume is valid, there are a couple of possibilities here. Our current thinking is that we run/expose sshd in the backup container so that a CronJob could trigger the backup on a schedule. We like that a bit better than modifying the pxc image to expose sshd directly from inside that container.
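Very roughly, the CronJob side might look something like this (the schedule, image, DNS name, and script are placeholders, and the apiVersion will depend on the cluster version):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pxc-backup-trigger
spec:
  schedule: "0 2 * * *"               # nightly at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: trigger
            image: our-own-image      # anything with an ssh client
            command:
            - ssh
            - backup@pxc-0.pxc        # placeholder DNS name for the backup sidecar
            - /usr/local/bin/run-backup.sh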

Thanks.

We got this working in our production. We used 2 containers in a Deployment: one running the mysql image and the other running crond with xtrabackup installed.

The following directories are shared within the pod:

/var/lib/mysql - mysql data files
/var/run/mysqld - mysql socket (mounted as an emptyDir in k8s)

The backup directory is mounted on the xtrabackup container only.

xtrabackup must be able to connect to mysql in some fashion (TCP or socket), which is why we share the socket directory. xtrabackup also needs physical read access to the files it’s backing up. This solution fulfills both requirements.

crond in the backup container fires the xtrabackup script on a schedule.
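To give an idea of the wiring, a simplified version of the cron entry and script might look like this (the schedule, script name, and credentials here are placeholders rather than our exact setup). It connects over the shared socket, reads the shared datadir, and writes to the NFS-backed mount:

# /etc/cron.d/xtrabackup
0 2 * * * root /usr/local/bin/backup.sh

# backup.sh
xtrabackup --backup \
  --socket=/var/run/mysqld/mysqld.sock \
  --user=root --password="$MYSQL_ROOT_PASSWORD" \
  --datadir=/var/lib/mysql \
  --target-dir=/xtrabackup/$(date +%Y%m%d)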

The YAML looks something like this:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  generation: 1
  name: mysql
  selfLink: /apis/extensions/v1beta1/namespaces/default/deployments/mysql
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      run: mysql
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: mysql
    spec:
      containers:
      - image: mysql:5.7
        imagePullPolicy: Always
        name: mysql
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/mysql/
          name: mysql
        - mountPath: /var/run/mysqld
          name: socket
      - image: private-registry:5000/xtrabackup
        imagePullPolicy: Always
        name: xtrabackup
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/mysql
          name: mysql
        - mountPath: /xtrabackup
          name: backup
        - mountPath: /var/run/mysqld
          name: socket
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: mysql
        persistentVolumeClaim:
          claimName: mysql
      - emptyDir: {}
        name: socket
      - name: backup
        nfs:
          path: /path/to/backup/
          server: 1.2.3.4

@agelwarg can you share the chart?