Fresh instance with Percona XtraDB Cluster Operator v1.8.0 not starting completely under OKD

Thanks for the reply @Slava_Sarzhan. I do not have any communication issues as far as I can see. Flannel and Calico are both up and working, and my Ceph cluster is detecting heartbeats from all nodes. Other pods are also working correctly.

Do I need to configure emptyDir on HAProxy? Could that be it?


I don't believe emptyDir has anything to do with it here. HAProxy pods are stateless, so it should not be an issue.

Is there anything else specific about your k8s cluster or Operator configuration?


Hello, I have the same issue; the instance was installed following this page: Install Percona XtraDB Cluster on Kubernetes

In the YAML files I edited only the storageClass name.

The pods can't all reach a fully ready state:
NAME                                               READY   STATUS    RESTARTS   AGE
cluster1-haproxy-0                                 2/2     Running   1          16m
cluster1-haproxy-1                                 2/2     Running   0          11m
cluster1-haproxy-2                                 2/2     Running   0          11m
cluster1-pxc-0                                     3/3     Running   0          16m
cluster1-pxc-1                                     2/3     Running   0          11m
percona-xtradb-cluster-operator-77bfd8cdc5-5c9xm   1/1     Running   0          17m

12m Warning Unhealthy pod/cluster1-pxc-1 Readiness probe failed: ERROR 2003 (HY000): Can't connect to MySQL server on 'cluster1-pxc-1' (111)

I already tried installing in a different namespace, with and without creating the secrets…

Kubernetes v1.20.6 on Rancher v2.5.7


@MarcoFan anything in the logs of the Operator and Pods?
Is the 3rd PXC pod starting at all?


I have the same error if I install Calico:

minikube start --driver=virtualbox --disable-driver-mounts --cpus=12 --memory=16096  --network-plugin=cni --cni=calico --nodes=3
Readiness probe failed: ERROR 2003 (HY000): Can't connect to MySQL server on 'my-db-pxc-db-pxc-0' (111)
+ [[ '' == \P\r\i\m\a\r\y ]]
+ exit 1

logs-from-logs-in-cluster1-pxc-0.txt (929 Bytes)
logs-from-haproxy-in-cluster1-haproxy-0.txt (647 Bytes)

Readiness probe failed: ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 2
Back-off restarting failed container

Without Calico, percona-xtradb-cluster works.


Hi @andrey

I can reproduce it using the command you provided. The root of the issue is that the cluster1-pxc-0/cluster1-haproxy-0 pods can't resolve services such as cluster1-pxc-unready, which is why the Operator can't configure the cluster properly. It is a Calico issue: minikube ships Calico v3.14.1, which was released more than a year ago. I installed the latest Calico, v3.19.1, following the official documentation (Quickstart for Calico on minikube, using the manifest) and the issue is gone:

> kubectl get pods -l k8s-app=calico-node -n kube-system
NAME                READY   STATUS    RESTARTS   AGE
calico-node-fkwnn   1/1     Running   0          20m
calico-node-mk8dx   1/1     Running   0          19m
calico-node-z29f5   1/1     Running   0          18m

> kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
cluster1-haproxy-0                              2/2     Running   0          5m32s
cluster1-haproxy-1                              2/2     Running   0          3m30s
cluster1-haproxy-2                              2/2     Running   0          3m4s
cluster1-pxc-0                                  3/3     Running   0          5m32s
cluster1-pxc-1                                  3/3     Running   0          3m29s
cluster1-pxc-2                                  3/3     Running   0          117s
percona-xtradb-cluster-operator-d99c748-jhv4x   1/1     Running   0          6m16s

Also, I have tested it on a Scaleway k8s cluster with the Calico CNI and it works there as well. Try the latest version of Calico and let me know the results.
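
For reference, a quick way to check the symptom and apply the upgrade (a sketch: the dnstest pod name is arbitrary, and the manifest URL is the one from the Calico quickstart at the time, so verify it against the current docs):

# quick check that the unready service resolves from inside the cluster
kubectl run -it --rm dnstest --image=busybox --restart=Never -- nslookup cluster1-pxc-unready

# upgrade Calico on minikube by applying the quickstart manifest
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml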


@Slava_Sarzhan so I managed to get this working. The issue originally was that I was trying to define the secrets myself instead of letting the Operator create them. Things worked and I moved on. I came back today to do some maintenance and noticed that the issue had come back.

NAME                                                     READY   STATUS                       RESTARTS   AGE
68e50-daily-backup-1627084800-ps25h                      0/1     Completed                    0          12h
68e50-sat-night-backup-1627084800-q2wx6                  0/1     Completed                    0          12h
cluster1-haproxy-0                                       1/2     Running                      6          19m
cluster1-haproxy-1                                       1/2     CrashLoopBackOff             11903      53d
cluster1-haproxy-2                                       1/2     CrashLoopBackOff             11903      53d
cluster1-pxc-0                                           3/3     Running                      19         53d
cluster1-pxc-1                                           2/3     CrashLoopBackOff             7          20m
percona-xtradb-cluster-operator-77bfd8cdc5-9r6vr         1/1     Running                      1          53d
xb-cron-cluster1-s3-us-west-20210605000008-3d2dv-hxwjd   0/1     CreateContainerConfigError   0          45d

I am using Calico v3.19.1, as shown below:
kubectl calico version
Client Version: v3.19.1
Git commit: 6fc0db96
Unable to retrieve Cluster Version or Type: resource does not exist: ClusterInformation(default) with error: the server could not find the requested resource (get ClusterInformations.crd.projectcalico.org default)
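
For what it is worth, the client version alone does not show what is actually running in the cluster; one way to check the server side (a sketch, assuming the standard calico-node DaemonSet in kube-system) is to read its image tag:

kubectl -n kube-system get daemonset calico-node -o jsonpath='{.spec.template.spec.containers[0].image}'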

I did some more digging in the logs and found the following. It looks like there was an attempt by Galera to open a connection, and it failed.

[0] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130432.413401552, {"log"=>"2021-07-24T12:40:32.412880Z 0 [Warning] [MY-000000] [Galera] last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT3.50417S), skipping check"}]
[0] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130461.921485610, {"log"=>"2021-07-24T12:41:01.920841Z 0 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0"}]
[1] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130461.921909299, {"log"=>"2021-07-24T12:41:01.921460Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node"}]
[2] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130461.921911826, {"log"=>"view ((empty))"}]
[3] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130461.922410405, {"log"=>"2021-07-24T12:41:01.922374Z 0 [ERROR] [MY-000000] [Galera] failed to open gcomm backend connection: 110: failed to reach primary view (pc.wait_prim_timeout): 110 (Connection timed out)"}]
[4] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130461.922412800, {"log"=>" at gcomm/src/pc.cpp:connect():161"}]
[5] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130461.922487606, {"log"=>"2021-07-24T12:41:01.922428Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs_core.cpp:gcs_core_open():220: Failed to open backend connection: -110 (Connection timed out)"}]
[0] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130462.922868257, {"log"=>"2021-07-24T12:41:02.922714Z 0 [Note] [MY-000000] [Galera] gcomm: terminating thread"}]
[1] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130462.922874019, {"log"=>"2021-07-24T12:41:02.922822Z 0 [Note] [MY-000000] [Galera] gcomm: joining thread"}]
[2] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130462.923158562, {"log"=>"2021-07-24T12:41:02.923073Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs.cpp:gcs_open():1754: Failed to open channel 'cluster1-pxc' at 'gcomm://10.1.86.126': -110 (Connection timed out)"}]
[3] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130462.923258685, {"log"=>"2021-07-24T12:41:02.923175Z 0 [ERROR] [MY-000000] [Galera] gcs connect failed: Connection timed out"}]
[4] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130462.923260499, {"log"=>"2021-07-24T12:41:02.923219Z 0 [ERROR] [MY-000000] [WSREP] Provider/Node (gcomm://10.1.86.126) failed to establish connection with cluster (reason: 7)"}]
[5] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130462.923345885, {"log"=>"2021-07-24T12:41:02.923255Z 0 [ERROR] [MY-010119] [Server] Aborting"}]
[6] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130462.923724901, {"log"=>"2021-07-24T12:41:02.923666Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.22-13.1) Percona XtraDB Cluster (GPL), Release rel13, Revision a48e6d5, WSREP version 26.4.3."}]
[7] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130462.924285826, {"log"=>"2021-07-24T12:41:02.924248Z 0 [Note] [MY-000000] [Galera] dtor state: CLOSED"}]
[8] pxcluster.cluster1-pxc-1.mysqld-error.log: [1627130462.924356763, {"log"=>"2021-07-24T12:41:02.924329Z 0 [Note] [MY-000000] [Galera] MemPool(TrxHandleSlave): hit ratio: 0, misses: 0, in use: 0, in pool: 0"}]


Thanks for the quick response. The problem was with CoreDNS on minikube. I created a simple Pod to use as a test environment:

apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  containers:
  - name: dnsutils
    image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
  nodeName: minikube-m02
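
A minimal way to apply and use it (the dnsutils.yaml filename is assumed; the lookups mirror the checks described below):

kubectl apply -f dnsutils.yaml
kubectl exec -i -t dnsutils -- nslookup kubernetes.default
kubectl exec -i -t dnsutils -- nslookup cluster1-pxc-0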

I installed it on minikube-m02 to check DNS and found out that DNS does not work on minikube-m02: neither kubernetes.default nor cluster1-pxc-0 could be resolved.

I just reloaded CoreDNS on minikube

└$► kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME                      READY   STATUS    RESTARTS   AGE
coredns-74ff55c5b-hfklv   1/1     Running   0          38m
└$► kubectl delete pod coredns-74ff55c5b-hfklv -n kube-system

Percona XtraDB Cluster on Minikube (minikube start --driver=virtualbox --disable-driver-mounts --cpus=12 --memory=16096 --network-plugin=cni --cni=calico --nodes=3) is working

Just restart CoreDNS on minikube
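
An equivalent way to restart it instead of deleting the Pod by hand (a sketch; kubectl rollout restart works on any Deployment, and CoreDNS runs as the coredns Deployment in kube-system):

kubectl -n kube-system rollout restart deployment coredns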

It makes no difference whether you start minikube with three nodes at once or add them one at a time: CoreDNS works only on minikube-m00.


@Andrey just to give more insight: I am using Ubuntu with Charmed Kubernetes. I checked and my DNS server is working as expected. The DNS test is as follows:

kubectl exec -i -t dnsutils -- nslookup cluster1-pxc.pxcluster
Server: 10.152.183.14
Address: 10.152.183.14#53

Name: cluster1-pxc.pxcluster.svc.cluster.local
Address: 10.1.86.126


I don’t have much experience, but it seems to me that you still have a problem with DNS.

└$► kubectl exec -i -t dnsutils -- nslookup cluster1-pxc
Server:		10.96.0.10
Address:	10.96.0.10#53

Name:	cluster1-pxc.default.svc.cluster.local
Address: 10.244.205.194
Name:	cluster1-pxc.default.svc.cluster.local
Address: 10.244.151.3
Name:	cluster1-pxc.default.svc.cluster.local
Address: 10.244.120.66

Not sure I see the issue. Normally you have to specify the namespace to resolve a service from another namespace, and in this case dnsutils is running in the default namespace while the Percona cluster is running in the pxcluster namespace.
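
For example, from the default namespace the short name alone does not resolve, while the namespace-qualified name does (the same lookup shown above; names taken from this thread):

kubectl exec -i -t dnsutils -- nslookup cluster1-pxc             # NXDOMAIN from the default namespace
kubectl exec -i -t dnsutils -- nslookup cluster1-pxc.pxcluster   # resolves, because the namespace is given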


Is there any further suggestion on this that I can try?


I have the same problem:

  • vanilla Scaleway Kapsule cluster v1.21.4
  • CoreDNS 1.8.4
  • Operator and DB installed using Helm charts
  • Operator running in the db namespace:
❯ k -n db get deployments.apps pxc-operator
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
pxc-operator   1/1     1            1           6m12s
❯ k -n db get deployments.apps pxc-operator -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: pxc-operator
    meta.helm.sh/release-namespace: db
  creationTimestamp: "2021-10-11T13:37:59Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: pxc-operator
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: pxc-operator
    app.kubernetes.io/version: 1.9.0
    helm.sh/chart: pxc-operator-1.9.1
  name: pxc-operator
  namespace: db
  resourceVersion: "46352088"
  uid: b8c837b2-778f-466c-8d50-7967368ec120
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: operator
      app.kubernetes.io/instance: pxc-operator
      app.kubernetes.io/name: pxc-operator
      app.kubernetes.io/part-of: pxc-operator
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: operator
        app.kubernetes.io/instance: pxc-operator
        app.kubernetes.io/name: pxc-operator
        app.kubernetes.io/part-of: pxc-operator
    spec:
      containers:
      - command:
        - percona-xtradb-cluster-operator
        env:
        - name: WATCH_NAMESPACE
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: OPERATOR_NAME
          value: pxc-operator
        image: percona/percona-xtradb-cluster-operator:1.9.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /metrics
            port: metrics
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: pxc-operator
        ports:
        - containerPort: 8080
          name: metrics
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: pxc-operator
      serviceAccountName: pxc-operator
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2021-10-11T13:37:59Z"
    lastUpdateTime: "2021-10-11T13:37:59Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2021-10-11T13:37:59Z"
    lastUpdateTime: "2021-10-11T13:38:10Z"
    message: ReplicaSet "pxc-operator-5998c9b5cb" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
  • PXC in the men namespace:
❯ k -n men get pxc -o yaml
apiVersion: v1
items:
- apiVersion: pxc.percona.com/v1
  kind: PerconaXtraDBCluster
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"pxc.percona.com/v1-9-0","kind":"PerconaXtraDBCluster"}
      meta.helm.sh/release-name: db
      meta.helm.sh/release-namespace: men
    creationTimestamp: "2021-10-11T13:38:47Z"
    finalizers:
    - delete-pxc-pods-in-order
    - delete-proxysql-pvc
    - delete-pxc-pvc
    generation: 2
    labels:
      app.kubernetes.io/instance: db
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: pxc-db
      app.kubernetes.io/version: 1.9.0
      helm.sh/chart: pxc-db-1.9.1
    name: db-pxc-db
    namespace: men
    resourceVersion: "46354859"
    uid: 36eae74f-f7c4-4df0-8dff-1ccc2491eb68
  spec:
    backup:
      image: percona/percona-xtradb-cluster-operator:1.9.0-pxc8.0-backup
      imagePullPolicy: Always
      pitr:
        enabled: false
        storageName: ""
      schedule:
      - keep: 5
        name: daily-backup
        schedule: 0 0 * * *
        storageName: fs-pvc
      storages:
        fs-pvc:
          podSecurityContext:
            fsGroup: 1001
            supplementalGroups:
            - 1001
          s3:
            bucket: ""
            credentialsSecret: ""
          type: filesystem
          volume:
            persistentVolumeClaim:
              accessModes:
              - ReadWriteOnce
              resources:
                requests:
                  storage: 6Gi
    crVersion: 1.9.0
    enableCRValidationWebhook: false
    haproxy:
      affinity:
        antiAffinityTopologyKey: kubernetes.io/hostname
      enabled: true
      envVarsSecret: db-pxc-db-env-vars-haproxy
      gracePeriod: 30
      image: percona/percona-xtradb-cluster-operator:1.9.0-haproxy
      imagePullPolicy: Always
      livenessProbes:
        failureThreshold: 4
        initialDelaySeconds: 60
        periodSeconds: 30
        successThreshold: 1
        timeoutSeconds: 5
      podDisruptionBudget:
        maxUnavailable: 1
      readinessProbes:
        failureThreshold: 3
        initialDelaySeconds: 15
        periodSeconds: 5
        successThreshold: 1
        timeoutSeconds: 1
      resources:
        limits: {}
        requests:
          cpu: 600m
          memory: 1G
      serviceAccountName: default
      sidecarResources:
        limits: {}
        requests: {}
      size: 3
      volumeSpec:
        emptyDir: {}
    logCollectorSecretName: db-pxc-db-log-collector
    logcollector:
      enabled: true
      image: percona/percona-xtradb-cluster-operator:1.9.0-logcollector
      imagePullPolicy: Always
      resources:
        limits: {}
        requests: {}
    platform: kubernetes
    pmm:
      resources:
        limits: {}
        requests:
          cpu: 600m
          memory: 1G
    proxysql:
      livenessProbes: {}
      podSecurityContext:
        fsGroup: 1001
        supplementalGroups:
        - 1001
      readinessProbes: {}
    pxc:
      affinity:
        antiAffinityTopologyKey: kubernetes.io/hostname
      autoRecovery: true
      envVarsSecret: db-pxc-db-env-vars-pxc
      expose: {}
      gracePeriod: 600
      image: percona/percona-xtradb-cluster:8.0.23-14.1
      imagePullPolicy: Always
      livenessDelaySec: 300
      livenessProbes:
        failureThreshold: 3
        initialDelaySeconds: 300
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5
      podDisruptionBudget:
        maxUnavailable: 1
      podSecurityContext:
        fsGroup: 1001
        supplementalGroups:
        - 1001
      readinessDelaySec: 15
      readinessProbes:
        failureThreshold: 5
        initialDelaySeconds: 15
        periodSeconds: 30
        successThreshold: 1
        timeoutSeconds: 15
      resources:
        limits: {}
        requests:
          cpu: 600m
          memory: 1G
      serviceAccountName: default
      sidecarResources:
        limits: {}
        requests: {}
      size: 3
      sslInternalSecretName: db-pxc-db-ssl-internal
      sslSecretName: db-pxc-db-ssl
      vaultSecretName: db-pxc-db-vault
      volumeSpec:
        emptyDir: {}
    secretsName: db-pxc-db
    sslInternalSecretName: db-pxc-db-ssl-internal
    sslSecretName: db-pxc-db-ssl
    updateStrategy: SmartUpdate
    upgradeOptions:
      apply: 8.0-recommended
      schedule: 0 4 * * *
      versionServiceEndpoint: https://check.percona.com
    vaultSecretName: db-pxc-db-vault
  status:
    backup:
      version: 8.0.23
    conditions:
    - lastTransitionTime: "2021-10-11T13:38:52Z"
      status: "True"
      type: initializing
    haproxy:
      labelSelectorPath: app.kubernetes.io/component=haproxy,app.kubernetes.io/instance=db-pxc-db,app.kubernetes.io/managed-by=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster,app.kubernetes.io/part-of=percona-xtradb-cluster
      size: 3
      status: initializing
    host: db-pxc-db-haproxy.men
    logcollector:
      version: 1.9.0
    observedGeneration: 2
    pmm:
      version: 2.18.0
    proxysql: {}
    pxc:
      image: percona/percona-xtradb-cluster:8.0.23-14.1
      labelSelectorPath: app.kubernetes.io/component=pxc,app.kubernetes.io/instance=db-pxc-db,app.kubernetes.io/managed-by=percona-xtradb-cluster-operator,app.kubernetes.io/name=percona-xtradb-cluster,app.kubernetes.io/part-of=percona-xtradb-cluster
      size: 3
      status: initializing
      version: 8.0.23-14.1
    ready: 0
    size: 6
    state: initializing
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
  • the cluster is not starting due to DNS resolution failures:
❯ k -n men get pods
NAME                  READY   STATUS    RESTARTS   AGE
db-pxc-db-haproxy-0   1/2     Running   13         55m
db-pxc-db-pxc-0       2/3     Running   3          55m
❯ k -n men exec -ti dnsutils -- nslookup db-pxc-db-pxc
Server:         10.32.0.10
Address:        10.32.0.10#53

** server can't find db-pxc-db-pxc: NXDOMAIN

command terminated with exit code 1
❯ k -n men exec -ti dnsutils -- nslookup db-pxc-db-pxc-unready
Server:         10.32.0.10
Address:        10.32.0.10#53

** server can't find db-pxc-db-pxc-unready: NXDOMAIN

command terminated with exit code 1
❯ k logs db-pxc-db-pxc-0 -c pxc | tail
2021/10/11 14:35:54 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:55 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:56 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:57 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:58 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:35:59 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:36:00 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:36:01 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:36:02 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
2021/10/11 14:36:03 lookup db-pxc-db-pxc-unready on 10.32.0.10:53: no such host
❯ k -n men describe pod db-pxc-db-pxc-0 | tail -20
Events:
  Type     Reason     Age   From     Message
  ----     ------     ----  ----     -------
  Normal   Pulled     60m   kubelet  Successfully pulled image "percona/percona-xtradb-cluster-operator:1.9.0-logcollector" in 24.784727162s
  Normal   Created    60m   kubelet  Created container logs
  Normal   Started    60m   kubelet  Started container logs
  Normal   Pulling    60m   kubelet  Pulling image "percona/percona-xtradb-cluster-operator:1.9.0-logcollector"
  Normal   Pulled     60m   kubelet  Successfully pulled image "percona/percona-xtradb-cluster-operator:1.9.0-logcollector" in 1.942397197s
  Normal   Created    60m   kubelet  Created container logrotate
  Normal   Pulling    60m   kubelet  Pulling image "percona/percona-xtradb-cluster:8.0.23-14.1"
  Normal   Started    60m   kubelet  Started container logrotate
  Normal   Pulled     60m   kubelet  Successfully pulled image "percona/percona-xtradb-cluster:8.0.23-14.1" in 23.794414542s
  Normal   Created    60m   kubelet  Created container pxc
  Normal   Started    60m   kubelet  Started container pxc
  Warning  Unhealthy  55m   kubelet  Liveness probe failed: ERROR 2003 (HY000): Can't connect to MySQL server on 'db-pxc-db-pxc-0' (111)
+ [[ -n '' ]]
+ exit 1
  Warning  Unhealthy  40s (x116 over 59m)  kubelet  Readiness probe failed: ERROR 2003 (HY000): Can't connect to MySQL server on 'db-pxc-db-pxc-0' (111)
+ [[ '' == \P\r\i\m\a\r\y ]]
+ exit 1

Cluster DNS resolution otherwise seems fine, and the endpoints are populated correctly:

❯ k -n men get endpoints db-pxc-db-pxc-unready
NAME                    ENDPOINTS                                                    AGE
db-pxc-db-pxc-unready   100.64.142.70:33062,100.64.142.70:33060,100.64.142.70:3306   64m

What I think might be the problem is that the PXC StatefulSet's serviceName field does not match the "unready" service name:

❯ k -n men get statefulset db-pxc-db-pxc -o jsonpath='{.spec.serviceName}'
db-pxc-db-pxc 

A similar issue is described here:

and that is how it is done for the ProxySQL StatefulSet, which in my tests starts just fine.

Any help appreciated.

Regards


@pavloos Please have a look at this commit K8SPXC-876 use PublishNotReadyAddresses instead of annotation · percona/percona-xtradb-cluster-operator@808ce22 · GitHub

This issue was fixed in 1.10.0. As a workaround, you can manually edit the service (kubectl edit svc/${clustername}-pxc-unready) and add publishNotReadyAddresses: true to the spec.
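
For example, the same change as a one-line patch (adjust the service name to your cluster; cluster1-pxc-unready is the default naming):

kubectl patch svc cluster1-pxc-unready --type merge -p '{"spec":{"publishNotReadyAddresses":true}}'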


Thanks @Slava_Sarzhan. That’s great news. When is 1.10.0 planned to be released?

Cheers


@pavloos We have a plan to release it at the beginning of November.


Hello

I'm using percona-xtradb-cluster-operator:1.11.0, but after a database restoration the cluster freezes in the initializing status.

Only 1 of 3 PXC pods is created; the next one is in CrashLoopBackOff status and can't connect to the pxc-unready service.

I can fix the issue and start the cluster only if I delete the backup information from the cluster manifest.

Could you tell me please, what am I doing wrong?

@Vsevosemnog Can you reproduce this issue using the latest PXC operator, v1.12.0?

@Vsevosemnog, we need to know more about your k8s deployment and also need to have your CR.