Percona XtraDB Cluster Operator: /var/lib/mysql has wrong permissions

Hello,

I’m trying to deploy the Percona XtraDB Cluster Operator in a k8s cluster, following the manual “Install Percona XtraDB Cluster on Kubernetes”: https://www.percona.com/doc/kubernet...ubernetes.html

After performing all the necessary steps, I see that the cluster1-pxc-0 pod is constantly being restarted:

NAME                                               READY   STATUS             RESTARTS   AGE
cluster1-proxysql-0                                3/3     Running            0          37m
cluster1-proxysql-1                                3/3     Running            0          37m
cluster1-proxysql-2                                3/3     Running            0          37m
cluster1-pxc-0                                     0/1     CrashLoopBackOff   10         37m
percona-xtradb-cluster-operator-6bc6889544-4l5fb   1/1     Running            0          40m

The pod’s log shows that it has a problem accessing the /var/lib/mysql directory, which is mounted from GlusterFS storage:

++ id -u
+ USER_ID=1001
+ _MYSQL_ROOT_HOST=%
+ '[' '' = - ']'
+ echo 'Percona XtraDB Cluster: Finding peers'
+ PXC_SERVICE=cluster1-pxc-unready
+ echo 'Using service name: cluster1-pxc-unready'
+ /usr/bin/peer-list -on-start=/usr/bin/configure-pxc.sh -service=cluster1-pxc-unready
Percona XtraDB Cluster: Finding peers
Using service name: cluster1-pxc-unready
2019/06/21 11:38:45 Peer finder enter
2019/06/21 11:38:45 Determined Domain to be pxc.svc.k8s.***.***.***
2019/06/21 11:38:45 Peer list updated
was []
now [10-233-96-46.cluster1-pxc-unready.pxc.svc.k8s.***.***.***]
2019/06/21 11:38:45 execing: /usr/bin/configure-pxc.sh with stdin: 10-233-96-46.cluster1-pxc-unready.pxc.svc.k8s.***.***.***
2019/06/21 11:38:45 read line 10-233-96-46.cluster1-pxc-unready.pxc.svc.k8s.***.***.***
2019/06/21 11:38:46 Peer finder exiting
++ mysqld --verbose --wsrep_provider= --help
++ awk '$1 == "datadir" { print $2; exit }'
+ DATADIR=/var/lib/mysql/
+ '[' -z '' ']'
+ DATADIR=/var/lib/mysql
+ cat /etc/mysql/node.cnf
[mysqld]
pxc-encrypt-cluster-traffic=ON
ssl-ca=/etc/mysql/ssl-internal/ca.crt
ssl-key=/etc/mysql/ssl-internal/tls.key
ssl-cert=/etc/mysql/ssl-internal/tls.crt

ignore-db-dir=lost+found
datadir=/var/lib/mysql
socket=/tmp/mysql.sock

server_id=10
binlog_format=ROW
default_storage_engine=InnoDB

innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1
innodb_autoinc_lock_mode=2

bind_address = 0.0.0.0

wsrep_slave_threads=2
wsrep_cluster_address=gcomm://

wsrep_node_address=10.233.96.46

wsrep_provider=/usr/lib64/galera3/libgalera_smm.so

wsrep_cluster_name=cluster1-pxc

wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth='xtrabackup:rpt7tM9CvXRL3'

[client]
socket=/tmp/mysql.sock
++ grep wsrep_cluster_address /etc/mysql/node.cnf
++ sed -e 's^.*gcomm://^^'
Cluster address set to:
+ WSREP_CLUSTER_ADDRESS=
+ echo 'Cluster address set to: '
Cluster address is empty!
+ '[' -z '' ']'
+ echo 'Cluster address is empty! '
+ '[' '!' -z '' ']'
+ '[' '!' -e /var/lib/mysql/mysql ']'
Running with password ::VMx7s4rHgdThp::
+ echo 'Running with password ::VMx7s4rHgdThp::'
+ '[' -z VMx7s4rHgdThp -a -z '' -a -z '' -a -z '' ']'
+ '[' '!' -z '' -a -z VMx7s4rHgdThp ']'
+ rm -rf '/var/lib/mysql/*'
+ mkdir -p /var/lib/mysql
+ echo 'Running --initialize-insecure on /var/lib/mysql'
+ mysqld --initialize-insecure --skip-ssl
Running --initialize-insecure on /var/lib/mysql
mysqld: Can't create/write to file '/var/lib/mysql/is_writable' (Errcode: 13 - Permission denied)
2019-06-21T11:38:52.200250Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-06-21T11:38:52.200707Z 0 [Warning] WSREP: Node is running in bootstrap/initialize mode. Disabling pxc_strict_mode checks
2019-06-21T11:38:52.223799Z 0 [ERROR] --initialize specified but the data directory exists and is not writable. Aborting.
2019-06-21T11:38:52.223875Z 0 [ERROR] Aborting

+ echo 'Finished --initialize-insecure'
+ pid=39
+ mysql=(mysql --protocol=socket -uroot)
+ mysqld --user=mysql --datadir=/var/lib/mysql --skip-networking
+ for i in '{30..0}'
Finished --initialize-insecure
+ mysql --protocol=socket -uroot
+ echo 'SELECT 1'
+ echo 'MySQL init process in progress...'
+ sleep 1
MySQL init process in progress...
2019-06-21T11:38:52.527091Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-06-21T11:38:52.561605Z 0 [Warning] Can't create test file /var/lib/mysql/cluster1-pxc-0.lower-test
2019-06-21T11:38:52.561840Z 0 [Note] mysqld (mysqld 5.7.25-28-57) starting as process 39 ...
2019-06-21T11:38:52.584493Z 0 [Warning] Can't create test file /var/lib/mysql/cluster1-pxc-0.lower-test
2019-06-21T11:38:52.605591Z 0 [Warning] Can't create test file /var/lib/mysql/cluster1-pxc-0.lower-test
2019-06-21T11:38:52.607954Z 0 [Note] WSREP: Skipping automatic SSL certificate generation (enabled only in bootstrap mode)
2019-06-21T11:38:52.608038Z 0 [Note] WSREP: Setting wsrep_ready to false
2019-06-21T11:38:52.608063Z 0 [Note] WSREP: No pre-stored wsrep-start position found. Skipping position initialization.
2019-06-21T11:38:52.608073Z 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera3/libgalera_smm.so'
2019-06-21T11:38:52.620062Z 0 [Note] WSREP: wsrep_load(): Galera 3.35(rddf9876) by Codership Oy <info@codership.com> loaded successfully.
2019-06-21T11:38:52.620601Z 0 [Note] WSREP: CRC-32C: using hardware acceleration.
2019-06-21T11:38:52.628488Z 0 [Warning] WSREP: Could not open state file for reading: '/var/lib/mysql//grastate.dat'
2019-06-21T11:38:52.628540Z 0 [Warning] WSREP: No persistent state found. Bootstraping with default state
2019-06-21T11:38:52.636787Z 0 [ERROR] WSREP: Could not open state file for writing: '/var/lib/mysql//grastate.dat'. Check permissions and/or disk space.: 13 (Permission denied)
at galera/src/saved_state.cpp:SavedState():57
2019-06-21T11:38:52.636864Z 0 [ERROR] WSREP: Failed to initialize wsrep_provider (reason:7). Must shutdown
2019-06-21T11:38:52.636891Z 0 [ERROR] Aborting

2019-06-21T11:38:52.636932Z 0 [Note] Binlog end
2019-06-21T11:38:52.640434Z 0 [Note] mysqld: Shutdown complete
+ for i in '{30..0}'
+ echo 'SELECT 1'
+ mysql --protocol=socket -uroot
+ echo 'MySQL init process in progress...'
+ sleep 1
MySQL init process in progress...
***SKIPPED***
+ '[' 0 = 0 ']'
+ echo 'MySQL init process failed.'
MySQL init process failed.
+ exit 1

I’m trying to figure out why the process doesn’t have permission to write to /var/lib/mysql.

Here’s what I see when I run ls -al /var/lib/mysql in the container:

total 8
drwxr-xr-x. 2 root root 4096 Jun 21 10:32 .
drwxr-xr-x. 17 root root 4096 May 10 16:57 ..

The directory belongs to root:root, not to mysql (UID=1001), which is why the initialization script couldn’t initialize the database properly.
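
One way to confirm this from inside the container is to compare the directory's numeric owner with the UID the process runs as. The snippet below is only a sketch: check_datadir is a hypothetical helper name (it is not part of the Percona image), and it assumes a stat that supports -c, as GNU coreutils and BusyBox do.

```shell
#!/bin/sh
# Hypothetical helper: report who owns a directory and whether the
# current user can write to it. Not part of the Percona image.
check_datadir() {
    dir="$1"
    # %u and %g print the numeric owner and group, %a the octal mode.
    stat -c 'owner=%u group=%g mode=%a' "$dir"
    if [ -w "$dir" ]; then
        echo "writable by uid $(id -u)"
    else
        echo "NOT writable by uid $(id -u)"
    fi
}
```

Called as check_datadir /var/lib/mysql, it should show an owner and group of 0 for the listing above, together with a "NOT writable" line when run as UID 1001.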

Shouldn’t the directory belong to mysql? Why doesn’t it? Have I done something wrong?

Here’s what I see when I run mount in the container:

***SKIPPED***
10.13.1.16:storage1/pxc/datadir-cluster1-pxc-0 on /var/lib/mysql type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
***SKIPPED***

My PVC looks like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2019-06-21T10:56:46Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app.kubernetes.io/component: pxc
    app.kubernetes.io/instance: cluster1
    app.kubernetes.io/managed-by: percona-xtradb-cluster-operator
    app.kubernetes.io/name: percona-xtradb-cluster
    app.kubernetes.io/part-of: percona-xtradb-cluster
  name: datadir-cluster1-pxc-0
  namespace: pxc
  resourceVersion: "3024217"
  selfLink: /api/v1/namespaces/pxc/persistentvolumeclaims/datadir-cluster1-pxc-0
  uid: 4308b987-9413-11e9-ad60-02001a7f0008
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 6Gi
  volumeMode: Filesystem
  volumeName: datadir-cluster1-pxc-0
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 6Gi
  phase: Bound

The PV looks like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"datadir-cluster1-pxc-0"},"spec":{"accessModes":["ReadWriteOnce"],"capacity":{"storage":"6Gi"},"claimRef":{"name":"datadir-cluster1-pxc-0","namespace":"pxc"},"glusterfs":{"endpoints":"glusterfs-cluster","path":"storage1/pxc/datadir-cluster1-pxc-0","readOnly":false},"persistentVolumeReclaimPolicy":"Retain"}}
  creationTimestamp: "2019-06-21T10:57:30Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: datadir-cluster1-pxc-0
  resourceVersion: "3024215"
  selfLink: /api/v1/persistentvolumes/datadir-cluster1-pxc-0
  uid: 5d97bbd4-9413-11e9-ad60-02001a7f0008
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 6Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: datadir-cluster1-pxc-0
    namespace: pxc
    resourceVersion: "3023964"
    uid: 4308b987-9413-11e9-ad60-02001a7f0008
  glusterfs:
    endpoints: glusterfs-cluster
    path: storage1/pxc/datadir-cluster1-pxc-0
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
status:
  phase: Bound

Thanks in advance for any tips and clues.

Hi Melnik,

I ran into the same issue and realized that in some k8s implementations non-root users cannot modify the PVC mount directory because they do not own it.

A solution would be an init container that runs before the main container and assigns ownership of the directory to user 1001 (the user specified in the Dockerfile):

initContainers:
- name: "permissionsfix"
  image: "busybox:1.25.0"
  imagePullPolicy: IfNotPresent
  command: ["/bin/sh", "-c"]
  args:
  - chown 1001:1001 /var/lib/mysql;
  volumeMounts:
  - name: datadir
    mountPath: /var/lib/mysql
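
The chown step run by such an init container could also be written so that it only touches the directory when the owner is actually wrong, which keeps repeated pod restarts harmless. This is a rough sketch, not something taken from the operator: fix_owner is a hypothetical name, the 1001 values come from this thread, and changing ownership to another user only works if the init container runs as root.

```shell
#!/bin/sh
# Hypothetical helper: chown a directory only if its numeric
# owner and group differ from the wanted ones, then print the result.
fix_owner() {
    dir="$1"
    uid="$2"
    gid="$3"
    current="$(stat -c '%u:%g' "$dir")"
    if [ "$current" != "$uid:$gid" ]; then
        chown "$uid:$gid" "$dir"
    fi
    stat -c '%u:%g' "$dir"
}
```

In the init container this would be invoked as fix_owner /var/lib/mysql 1001 1001 in place of the bare chown.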


The only problem is that the Percona Operator doesn’t support this configuration; at least, I haven’t found a way to do it with the current operator.


Hello,
If something like this happens, please consider the securityContext-related fields in the PXC custom resource. Kubernetes should then adjust the permissions properly.

@IvanPylypenko I am having the same issue trying to deploy Percona XtraDB Cluster to my Kubernetes instances using Helm.

I have used the following:

securityContext:
  runAsUser: 1001
  runAsGroup: 1001
  fsGroup: 1001

Then I ran chmod 777 and chown 1001:1001 on my Persistent Volume hostPath directory, yet I still get this error. I’m scratching my head. Can you shed any light on how to get this to work?