Fresh instance with Percona XtraDB Cluster Operator v1.8.0 not starting completely under OKD

I have the same problem:

  • vanilla Scaleway Kapsule cluster v1.21.4
  • CoreDNS 1.8.4
  • operator and DB installed using Helm charts
  • operator running in the db namespace
❯ k -n db get deployments.apps pxc-operator
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
pxc-operator   1/1     1            1           6m12s
❯ k -n db get deployments.apps pxc-operator -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: pxc-operator
    meta.helm.sh/release-namespace: db
  creationTimestamp: "2021-10-11T13:37:59Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: pxc-operator
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: pxc-operator
    app.kubernetes.io/version: 1.9.0
    helm.sh/chart: pxc-operator-1.9.1
  name: pxc-operator
  namespace: db
  resourceVersion: "46352088"
  uid: b8c837b2-778f-466c-8d50-7967368ec120
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: operator
      app.kubernetes.io/instance: pxc-operator
      app.kubernetes.io/name: pxc-operator
      app.kubernetes.io/part-of: pxc-operator
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: operator
        app.kubernetes.io/instance: pxc-operator
        app.kubernetes.io/name: pxc-operator
        app.kubernetes.io/part-of: pxc-operator
    spec:
      containers:
      - command:
        - percona-xtradb-cluster-operator
        env:
        - name: WATCH_NAMESPACE
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: OPERATOR_NAME
          value: pxc-operator
        image: percona/percona-xtradb-cluster-operator:1.9.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /metrics
            port: metrics
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: pxc-operator
        ports:
        - containerPort: 8080
          name: metrics
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: pxc-operator
      serviceAccountName: pxc-operator
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2021-10-11T13:37:59Z"
    lastUpdateTime: "2021-10-11T13:37:59Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2021-10-11T13:37:59Z"
    lastUpdateTime: "2021-10-11T13:38:10Z"
    message: ReplicaSet "pxc-operator-5998c9b5cb" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
  • PXC in the men namespace
❯ k -n men get pxc -o yaml
apiVersion: v1
items:
- apiVersion: pxc.percona.com/v1
  kind: PerconaXtraDBCluster
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"pxc.percona.com/v1-9-0","kind":"PerconaXtraDBCluster"}
      meta.helm.sh/release-name: db
      meta.helm.sh/release-namespace: men
    creationTimestamp: "2021-10-11T13:38:47Z"
    finalizers:
    - delete-pxc-pods-in-order
    - delete-proxysql-pvc
    - delete-pxc-pvc
    generation: 2
    labels:
      app.kubernetes.io/instance: db
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: pxc-db
      app.kubernetes.io/version: 1.9.0
      helm.sh/chart: pxc-db-1.9.1
    name: db-pxc-db
    namespace: men
    resourceVersion: "46354859"
    uid: 36eae74f-f7c4-4df0-8dff-1ccc2491eb68
  spec:
    backup:
      image: percona/percona-xtradb-cluster-operator:1.9.0-pxc8.0-backup
      imagePullPolicy: Always
      pitr:
        enabled: false
        storageName: ""
      schedule:
      - keep: 5
        name: daily-backup
        schedule: 0 0 * * *
        storageName: fs-pvc
      storages:
        fs-pvc:
          podSecurityContext:
            fsGroup: 1001
            supplementalGroups:
            - 1001
          s3:
            bucket: ""
            credentialsSecret: ""
          type: filesystem
          volume:
            persistentVolumeClaim:
              accessModes:
              - ReadWriteOnce
              resources:
                requests:
                  storage: 6Gi
    crVersion: 1.9.0
    enableCRValidationWebhook: false
    haproxy:
      affinity:
        antiAffinityTopologyKey: kubernetes.io/hostname
      enabled: true
      envVarsSecret: db-pxc-db-env-vars-haproxy
      gracePeriod: 30
      image: percona/percona-xtradb-cluster-operator:1.9.0-haproxy
      imagePullPolicy: Always
      livenessProbes:
        failureThreshold: 4
        initialDelaySeconds: 60
        periodSeconds: 30
        successThreshold: 1
        timeoutSeconds: 5
      podDisruptionBudget:
        maxUnavailable: 1
      readinessProbes:
        failureThreshold: 3
        initialDelaySeconds: 15
        periodSeconds: 5
        successThreshold: 1
        timeoutSeconds: 1
      resources:
        limits: {}
        requests:
          cpu: 600m
          memory: 1G
      serviceAccountName: default
      sidecarResources:
        limits: {}
        requests: {}
      size: 3
      volumeSpec:
        emptyDir: {}
    logCollectorSecretName: db-pxc-db-log-collector
    logcollector:
      enabled: true
      image: percona/percona-xtradb-cluster-operator:1.9.0-logcollector
      imagePullPolicy: Always
      resources:
        limits: {}
        requests: {}
    platform: kubernetes
    pmm:
      resources:
        limits: {}
        requests:
          cpu: 600m
          memory: 1G
    proxysql:
      livenessProbes: {}
      podSecurityContext:
        fsGroup: 1001
        supplementalGroups:
        - 1001
      readinessProbes: {}
    pxc:
      affinity:
        antiAffinityTopologyKey: kubernetes.io/hostname
      autoRecovery: true
      envVarsSecret: db-pxc-db-env-vars-pxc
      expose: {}
      gracePeriod: 600
      image: percona/percona-xtradb-cluster:8.0.23-14.1
      imagePullPolicy: Always
      livenessDelaySec: 300
      livenessProbes:
        failureThreshold: 3
        initialDelaySeconds: 300
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5
      podDisruptionBudget:
        maxUnavailable: 1
      podSecurityContext:
        fsGroup: 1001
        supplementalGroups:
        - 1001
      readinessDelaySec: 15
      readinessProbes:
        failureThreshold: 5
        initialDelaySeconds: 15
        periodSeconds: 30
        successThreshold: 1
        timeoutSeconds: 15
      resources:
        limits: {}
        requests:
          cpu: 600m
          memory: 1G
      serviceAccountName: default
      sidecarResources:
        limits: {}
        requests: {}
      size: 3
      sslInternalSecretName: db-pxc-db-ssl-internal
      sslSecretName: db-pxc-db-ssl
      vaultSecretName: db-pxc-db-vault
      volumeSpec:
        emptyDir: {}
    secretsName: db-pxc-db
    sslInternalSecretName: db-pxc-db-ssl-internal
    sslSecretName: db-pxc-db-ssl
    updateStrategy: SmartUpdate
    upgradeOptions:
      apply: 8.0-recommended
      schedule: 0 4 * * *
      versionServiceEndpoint: https://check.percona.com
    vaultSecretName: db-pxc-db-vault
  status:
    backup:
      version: 8.0.23
    conditions:
    - lastTransitionTime: "2021-10-11T13:38:52Z"
      status: "True"
      type: initializing
    haproxy:
      labelSelectorPath: app.kubernetes.io/component=haproxy,app.kubernetes.io/instance=db-pxc-db,app.kubernetes.io/managed-by=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster,app.kubernetes.io/part-of=percona-xtradb-cluster
      size: 3
      status: initializing
    host: db-pxc-db-haproxy.men
    logcollector:
      version: 1.9.0
    observedGeneration: 2
    pmm:
      version: 2.18.0
    proxysql: {}
    pxc:
      image: percona/percona-xtradb-cluster:8.0.23-14.1
      labelSelectorPath: app.kubernetes.io/component=pxc,app.kubernetes.io/instance=db-pxc-db,app.kubernetes.io/managed-by=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster,app.kubernetes.io/part-of=percona-xtradb-cluster
      size: 3
      status: initializing
      version: 8.0.23-14.1
    ready: 0
    size: 6
    state: initializing
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
  • the cluster is not starting due to DNS resolution failures
❯ k -n men get pods
NAME                  READY   STATUS    RESTARTS   AGE
db-pxc-db-haproxy-0   1/2     Running   13         55m
db-pxc-db-pxc-0       2/3     Running   3          55m
❯ k -n men exec -ti dnsutils -- nslookup db-pxc-db-pxc
Server:         10.32.0.10
Address:        10.32.0.10#53

** server can't find db-pxc-db-pxc: NXDOMAIN

command terminated with exit code 1
❯ k -n men exec -ti dnsutils -- nslookup db-pxc-db-pxc-unready
Server:         10.32.0.10
Address:        10.32.0.10#53

** server can't find db-pxc-db-pxc-unready: NXDOMAIN

command terminated with exit code 1
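
The fully qualified name is also worth querying, to rule out a resolv.conf search-path problem (assuming the default cluster.local domain; dnsutils is the same debug pod as above):

❯ k -n men exec -ti dnsutils -- nslookup db-pxc-db-pxc-unready.men.svc.cluster.local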
❯ k logs db-pxc-db-pxc-0 -c pxc | tail
2021/10/11 14:35:54 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:55 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:56 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:57 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:58 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:59 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:36:00 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:36:01 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:36:02 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:36:03 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
❯ k -n men describe pod db-pxc-db-pxc-0 | tail -20
Events:
  Type     Reason     Age   From     Message
  ----     ------     ----  ----     -------
  Normal   Pulled     60m   kubelet  Successfully pulled image "percona/percona-xtradb-cluster-operator:1.9.0-logcollector" in 24.784727162s
  Normal   Created    60m   kubelet  Created container logs
  Normal   Started    60m   kubelet  Started container logs
  Normal   Pulling    60m   kubelet  Pulling image "percona/percona-xtradb-cluster-operator:1.9.0-logcollector"
  Normal   Pulled     60m   kubelet  Successfully pulled image "percona/percona-xtradb-cluster-operator:1.9.0-logcollector" in 1.942397197s
  Normal   Created    60m   kubelet  Created container logrotate
  Normal   Pulling    60m   kubelet  Pulling image "percona/percona-xtradb-cluster:8.0.23-14.1"
  Normal   Started    60m   kubelet  Started container logrotate
  Normal   Pulled     60m   kubelet  Successfully pulled image "percona/percona-xtradb-cluster:8.0.23-14.1" in 23.794414542s
  Normal   Created    60m   kubelet  Created container pxc
  Normal   Started    60m   kubelet  Started container pxc
  Warning  Unhealthy  55m   kubelet  Liveness probe failed: ERROR 2003 (HY000): Can't connect to MySQL server on 'db-pxc-db-pxc-0' (111)
+ [[ -n '' ]]
+ exit 1
  Warning  Unhealthy  40s (x116 over 59m)  kubelet  Readiness probe failed: ERROR 2003 (HY000): Can't connect to MySQL server on 'db-pxc-db-pxc-0' (111)
+ [[ '' == \P\r\i\m\a\r\y ]]
+ exit 1

Cluster DNS resolution seems fine otherwise, and the endpoints are populated:

❯ k -n men get endpoints db-pxc-db-pxc-unready
NAME                    ENDPOINTS                                                    AGE
db-pxc-db-pxc-unready   100.64.142.70:33062,100.64.142.70:33060,100.64.142.70:3306   64m
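
One more check that might be relevant: as far as I understand headless-service DNS, CoreDNS only serves records for pods that are not ready if the Service sets spec.publishNotReadyAddresses (a standard core/v1 Service field), so its value is worth inspecting:

❯ k -n men get svc db-pxc-db-pxc-unready -o jsonpath='{.spec.publishNotReadyAddresses}'

An empty or false result here would be consistent with the NXDOMAIN answers above, since the PXC pods cannot become ready before they can resolve each other.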

What I think might be the problem is that the PXC StatefulSet’s serviceName field does not match the name of the “unready” service the pods are trying to resolve:

❯ k -n men get statefulset db-pxc-db-pxc -o jsonpath='{.spec.serviceName}'
db-pxc-db-pxc 
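
For context, a StatefulSet’s per-pod DNS records are only created under the headless Service named in its spec.serviceName, so with the value above each pod gets a name like (a sketch, assuming the default cluster.local domain):

db-pxc-db-pxc-0.db-pxc-db-pxc.men.svc.cluster.local

Nothing registers the pods under db-pxc-db-pxc-unready unless that Service itself selects them and publishes their addresses while they are unready.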

A similar issue is described here, and that is how it’s done for the ProxySQL StatefulSet, which in my tests starts just fine.

Any help appreciated.

Regards

@pavloos Please have a look at this commit: K8SPXC-876 use PublishNotReadyAddresses instead of annotation · percona/percona-xtradb-cluster-operator@808ce22 · GitHub

This issue was fixed in 1.10.0. As a workaround, you can edit the service manually (kubectl edit svc/${clustername}-pxc-unready) and add publishNotReadyAddresses: true to the spec.
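
For example, the same change can be applied non-interactively with a merge patch (namespace and service name taken from the cluster above; adjust for your deployment):

❯ kubectl -n men patch svc db-pxc-db-pxc-unready --type merge -p '{"spec":{"publishNotReadyAddresses":true}}'

After that, the Service spec should contain publishNotReadyAddresses: true.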

Thanks @Slava_Sarzhan. That’s great news. When is 1.10.0 planned to be released?

Cheers

@pavloos We plan to release it at the beginning of November.

Hello

I'm using percona-xtradb-cluster-operator:1.11.0, but after a database restoration the cluster freezes in the initializing status.

Only 1 of 3 PXC pods is created; the next one is in CrashLoopBackOff status and can't connect to the pxc-unready svc.

I can fix this issue and start the cluster only if I delete the backup information from the cluster manifest.

Could you please tell me what I am doing wrong?

@Vsevosemnog Can you reproduce this issue using the latest PXC operator, v1.12.0?

@Vsevosemnog, we need to know more about your k8s deployment, and we also need to see your CR.