Percona operator V2.2 for postgresql: first deployment in "initializing" status for 21 hours

[ Context: I would like to evaluate (POC) the Percona Operator for PostgreSQL, version 2.2.0.
I did a fresh installation on our K8s environment,
but I think I missed something in the cr.yaml.

]

Steps to Reproduce:

[1/ Installed the operator and deployed a Postgres cluster following the installation docs
2/ Adjusted the S3 secret and keys: created the secret and applied it
]

Version:

[percona-postgresql-operator 2.2.0]

Logs:

[time=“2023-11-03T16:30:42Z” level=error msg=“unable to create stanza” PostgresCluster=ddba70-ns-postgresql/ddba70-cluster-postgr controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error=“command terminated with exit code 49: ERROR: [049]: unable to get address for ‘ddba70-cluster-postgr-repo-host-0.ddba70-cluster-postgr-pods.ddba70-ns-postgresql.svc.infra.vln.pack.’: [-2] Name or service not known\n” file=“internal/controller/postgrescluster/pgbackrest.go:2614” func=“postgrescluster.(*Reconciler).reconcileStanzaCreate” name=ddba70-cluster-postgr namespace=ddba70-ns-postgresql reconcileID=7ff42a4d-fe54-49cb-920a-86aa010b4406 reconciler=pgBackRest version=
time=“2023-11-03T16:30:42Z” level=debug msg=“command terminated with exit code 49: ERROR: [049]: unable to get address for ‘ddba70-cluster-postgr-repo-host-0.ddba70-cluster-postgr-pods.ddba70-ns-postgresql.svc.infra.vln.pack.’: [-2] Name or service not known\n” object=“{PostgresCluster ddba70-ns-postgresql ddba70-cluster-postgr ddf42c0c-a59b-435f-920a-7b1959909bb6 postgres-operator.crunchydata.com/v1beta1 439832221 }” reason=UnableToCreateStanzas type=Warning version=]

Expected Result:

[ The cluster was expected to be deployed successfully and running ]

Actual Result:

[1/ the cluster has been in initializing status for 21 hours
2/ pod ddba70-cluster-postgr-repo-host has been in “pending” status for 21h
3/ error concerning the stanza in the operator logs ]

Additional Information:

[Getting the following errors in the PostgreSQL operator pod logs:
time=“2023-11-03T15:57:53Z” level=error msg=“unable to create stanza” PostgresCluster=ddba70-ns-postgresql/ddba70-cluster-postgr controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error=“command terminated with exit code 49: ERROR: [049]: unable to get address for ‘ddba70-cluster-postgr-repo-host-0.ddba70-cluster-postgr-pods.ddba70-ns-postgresql.svc.infra.vln.pack.’: [-2] Name or service not known\n” file=“internal/controller/postgrescluster/pgbackrest.go:2614” func=“postgrescluster.(*Reconciler).reconcileStanzaCreate” name=ddba70-cluster-postgr namespace=ddba70-ns-postgresql reconcileID=dd71554d-7b58-4c54-b5de-004c3d89cacc reconciler=pgBackRest version=

Logs (ddba70-ns-postgresql/ddba70-cluster-postgr-repo-host-0):
2023-11-03T16:21:02.64186395Z Stream closed EOF for ddba70-ns-postgresql/ddba70-cluster-postgr-repo-host-0 (nss-wrapper-init)
2023-11-03T16:21:02.642047419Z Stream closed EOF for ddba70-ns-postgresql/ddba70-cluster-postgr-repo-host-0 (pgbackrest-config)
2023-11-03T16:21:02.64208106Z Stream closed EOF for ddba70-ns-postgresql/ddba70-cluster-postgr-repo-host-0 (pgbackrest)
2023-11-03T16:21:02.642168203Z Stream closed EOF for ddba70-ns-postgresql/ddba70-cluster-postgr-repo-host-0 (pgbackrest-log-dir)

Verification 1: k get all
==> the following pod has been “Pending” for 21h

details:
k get all
NAME                                               READY   STATUS    RESTARTS   AGE
pod/ddba70-cluster-postgr-repo-host-0              0/2     Pending   0          21h
pod/ddba70-cluster-postgr-rs0-99lj-0               4/4     Running   0          21h
pod/ddba70-cluster-postgr-rs0-bw47-0               4/4     Running   0          21h
pod/ddba70-cluster-postgr-rs0-f9lx-0               4/4     Running   0          21h
pod/percona-postgresql-operator-6677f7664c-lb7rj   1/1     Running   0          22h

NAME                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/ddba70-cluster-postgr-ha          ClusterIP   192.168.1.196   <none>        5432/TCP   21h
service/ddba70-cluster-postgr-ha-config   ClusterIP   None            <none>        <none>     21h
service/ddba70-cluster-postgr-pgbouncer   ClusterIP   192.168.1.46    <none>        5432/TCP   21h
service/ddba70-cluster-postgr-pods        ClusterIP   None            <none>        <none>     21h
service/ddba70-cluster-postgr-primary     ClusterIP   None            <none>        5432/TCP   21h
service/ddba70-cluster-postgr-replicas    ClusterIP   192.168.1.2     <none>        5432/TCP   21h

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ddba70-cluster-postgr-pgbouncer   0/0     0            0           21h
deployment.apps/percona-postgresql-operator       1/1     1            1           22h

NAME                                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/ddba70-cluster-postgr-pgbouncer-77996f89b   0         0         0       21h
replicaset.apps/percona-postgresql-operator-6677f7664c      1         1         1       22h

NAME                                               READY   AGE
statefulset.apps/ddba70-cluster-postgr-repo-host   0/1     21h
statefulset.apps/ddba70-cluster-postgr-rs0-99lj    1/1     21h
statefulset.apps/ddba70-cluster-postgr-rs0-bw47    1/1     21h
statefulset.apps/ddba70-cluster-postgr-rs0-f9lx    1/1     21h

Verification 2: k get pg
==> status has been “initializing” for 21h

NAME ENDPOINT STATUS POSTGRES PGBOUNCER AGE
ddba70-cluster-postgr ddba70-cluster-postgr-pgbouncer.ddba70-ns-postgresql.svc initializing 3 22h

Has anyone run into this situation? Any advice would be appreciated.

best regards
frmch

Hello @frmch .

Could you please describe the pgbackrest repo pod?
kubectl describe pod ddba70-cluster-postgr-repo-host-0

Hi Sergey,

So far I have tested the Percona MongoDB operator.

I am currently testing the Percona PostgreSQL operator for K8s.

The problem was solved by adding storageClassName in the following section:

repos:
  - name: repo1
    s3:
      bucket:
      endpoint:
      region:
    schedules:
      full: "0 0 * * 6"
      differential: "0 1 * * 1-6"
    volume:
      volumeClaimSpec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        storageClassName:    # <===== that was missing

The point is that I am a beginner with the PostgreSQL operator,
and I got to the solution through several rounds of trial and error.

Do you have a complete example cr.yaml
that is more practical to use and makes the various sections of the cr.yaml easier to understand?

That would help a lot and make it easier to adopt the operator.

It’s just a kind suggestion from me.

Apart from that, it is hard as a beginner to get an overall picture of how the different components work together. I cannot find a document that explains

the role and necessity of:
pgbouncer
patroni
backup pods: each time a backup runs, a new pod appears and ends up in status ‘COMPLETE’
pgBackRest

What are the roles of the different services?
Example:

NAME                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/ddbaXX-cluster-postgres-ha          ClusterIP   192.168.1.181   <none>        5432/TCP   9h
service/ddbaXX-cluster-postgres-ha-config   ClusterIP   None            <none>        <none>     9h
service/ddbaXX-cluster-postgres-pgbouncer   ClusterIP   192.168.1.62    <none>        5432/TCP   9h
service/ddbaXX-cluster-postgres-pods        ClusterIP   None            <none>        <none>     9h
service/ddbaXX-cluster-postgres-primary     ClusterIP   None            <none>        5432/TCP   9h
service/ddbaXX-cluster-postgres-replicas    ClusterIP   192.168.1.208   <none>        5432/TCP   9h

How do I connect remotely to the leader? To the standby?
How do I configure “SYNC” and “ASYNC” standbys?

Multiple “repos:”: why and how?

Are there any documents or notes that can help me get started?

regards

@frmch

Hello @frmch ,

This is valuable feedback. Have you checked our docs here: Percona Operator for PostgreSQL

It is a bit strange that the cluster did not start without storageClassName, as our operator uses the default storage class when none is specified.
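
One possible explanation, which is an assumption and not confirmed in this thread, is that the Kubernetes cluster has no StorageClass marked as default. In that case a PersistentVolumeClaim without storageClassName stays Pending, and so does the repo-host pod that mounts it. A StorageClass is marked as default through the annotation shown in this minimal sketch; the name and provisioner are hypothetical placeholders, not values from this environment:

```yaml
# Hypothetical StorageClass: name and provisioner are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-default
  annotations:
    # This annotation is what makes the class the cluster default.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: example.com/provisioner
```

Running kubectl get storageclass shows "(default)" next to the class name when one is set this way.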

As for services, this doc talks about it: Exposing the cluster - Percona Operator for PostgreSQL

Repos are described in backup and restore docs: About backups - Percona Operator for PostgreSQL
