The psmdb resource is stuck in the initializing state.
I tried pausing and unpausing the cluster; the operator logs the following error and the cluster goes into initializing again.
2023-01-12T10:22:32.954Z ERROR controller_psmdb failed to reconcile cluster {"Request.Namespace": "xcloud-psmdb", "Request.Name": "xcloud-psmdb-db", "replset": "rs", "error": "create system users: failed to get mongo client: ping mongo: connection() error occured during connection handshake: dial tcp: lookup xcloud-psmdb-db-rs-1.xcloud-psmdb-db-rs.xcloud-psmdb.svc.cluster.local on 10.0.0.10:53: no such host", "errorVerbose": "connection() error occured during connection handshake: dial tcp: lookup xcloud-psmdb-db-rs-1.xcloud-psmdb-db-rs.xcloud-psmdb.svc.cluster.local on 10.0.0.10:53: no such host\nping mongo\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo.Dial\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo/mongo.go:64\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb.MongoClient\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/client.go:47\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).mongoClientWithRole\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/connections.go:21\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).createOrUpdateSystemUsers\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:671\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:134\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:499\nsigs.k8s.io/controller-runtime/pkg/internal/cont
roller.(*Controller).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1571\nfailed to get mongo client\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).createOrUpdateSystemUsers\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:673\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:134\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:499\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).re
concileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1571\ncreate system users\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:136\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:499\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/src/github.com/percona/perco
na-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1571"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
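The relevant part of the error above is the DNS failure: the operator could not resolve the per-pod hostname xcloud-psmdb-db-rs-1.xcloud-psmdb-db-rs.xcloud-psmdb.svc.cluster.local. Below is a minimal sketch, assuming Python is available in a pod inside the cluster, that extracts the failing hostnames from a log message like this one and checks whether they resolve now. The function names are illustrative and not part of the operator.

```python
import re
import socket

# Matches the Go resolver message "lookup <host> on <dns-server>: no such host".
LOOKUP_RE = re.compile(r"lookup (\S+) on (\S+): no such host")


def failed_lookups(log_text):
    """Return unique (host, dns_server) pairs that failed, in order of first appearance."""
    seen = []
    for pair in LOOKUP_RE.findall(log_text):
        if pair not in seen:
            seen.append(pair)
    return seen


def resolves(host):
    """Return True if the local resolver can resolve the host right now."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False
```

Calling failed_lookups() on the operator log line and then resolves() on each returned host, from a pod in the same cluster, would show whether the pod's DNS record is still missing or whether the failure was transient.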
Hey @Shreyas_Pandya,
Please share more details about your deployment:
k8s version, operator version, custom resource manifest, what led to this situation, etc.
Hi @Sergey_Pronin,
We are facing this issue in multiple deployments of psmdb. The custom resource runs fine for a few days and then its status suddenly changes to initializing. Unfortunately, we can't provide any operator logs for this, since the issue happened long ago.
k8s version: v1.22.11
These are our Helm chart values:
allowUnsafeConfigurations: false
backup:
  enabled: true
  image:
    repository: percona/percona-backup-mongodb
    tag: 1.8.1
  pitr:
    compressionLevel: 6
    compressionType: gzip
    enabled: true
    oplogSpanMin: 4
  serviceAccountName: percona-server-mongodb-operator
  storages:
    azure-blob:
      azure:
        container: percona
        credentialsSecret: percona-backup-sa-creds
        prefix: scheduled
      type: azure
  tasks:
  - compressionType: gzip
    enabled: true
    keep: 30
    name: daily-backup
    schedule: 30 0 * * *
    storageName: azure-blob
  - compressionType: gzip
    enabled: true
    keep: 13
    name: weekly-backup
    schedule: 30 0 * * 0
    storageName: azure-blob
finalizers:
- delete-psmdb-pods-in-order
image:
  repository: percona/percona-server-mongodb
  tag: 4.4.16-16
pause: false
replsets:
- antiAffinityTopologyKey: kubernetes.io/hostname
  name: rs
  nodeSelector:
    app: xcmongo
  resources:
    limits:
      cpu: 4000m
      memory: 16G
    requests:
      cpu: 2000m
      memory: 12G
  size: 3
  volumeSpec:
    pvc:
      resources:
        requests:
          storage: 30Gi
secrets:
  users: redacted-psmdb-db-secrets
sharding:
  enabled: false
updateStrategy: SmartUpdate
upgradeOptions:
  apply: disabled
  schedule: 0 2 * * *
  setFCV: false
  versionServiceEndpoint: https://check.percona.com
users:
  MONGODB_BACKUP_PASSWORD: redacted
  MONGODB_BACKUP_USER: redacted
  MONGODB_CLUSTER_ADMIN_PASSWORD: redacted
  MONGODB_CLUSTER_ADMIN_USER: redacted
  MONGODB_CLUSTER_MONITOR_PASSWORD: redacted
  MONGODB_CLUSTER_MONITOR_USER: redacted
  MONGODB_DATABASE_ADMIN_PASSWORD: redacted
  MONGODB_DATABASE_ADMIN_USER: redacted
  MONGODB_USER_ADMIN_PASSWORD: redacted
  MONGODB_USER_ADMIN_USER: redacted
  PMM_SERVER_API_KEY: redacted
kubectl describe output for the psmdb custom resource:
Name:         redacted-psmdb-db
Namespace:    redacted-psmdb
Labels:       app.kubernetes.io/instance=redacted
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=psmdb-db
              app.kubernetes.io/version=1.13.0
              helm.sh/chart=psmdb-db-1.13.0
Annotations:  meta.helm.sh/release-name: redacted
              meta.helm.sh/release-namespace: redacted-psmdb
API Version:  psmdb.percona.com/v1
Kind:         PerconaServerMongoDB
Metadata:
  Creation Timestamp: 2023-07-12T09:20:10Z
  Finalizers:
    delete-psmdb-pods-in-order
  Generation: 1
  Managed Fields:
    API Version: psmdb.percona.com/v1
    Fields Type: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
          f:meta.helm.sh/release-name:
          f:meta.helm.sh/release-namespace:
        f:finalizers:
          .:
          v:"delete-psmdb-pods-in-order":
        f:labels:
          .:
          f:app.kubernetes.io/instance:
          f:app.kubernetes.io/managed-by:
          f:app.kubernetes.io/name:
          f:app.kubernetes.io/version:
          f:helm.sh/chart:
      f:spec:
        .:
        f:backup:
          .:
          f:enabled:
          f:image:
          f:pitr:
            .:
            f:compressionLevel:
            f:compressionType:
            f:enabled:
            f:oplogSpanMin:
          f:serviceAccountName:
          f:storages:
            .:
            f:azure-blob:
              .:
              f:azure:
                .:
                f:container:
                f:credentialsSecret:
                f:prefix:
              f:type:
          f:tasks:
        f:crVersion:
        f:image:
        f:imagePullPolicy:
        f:multiCluster:
          .:
          f:enabled:
        f:pause:
        f:pmm:
          .:
          f:enabled:
          f:image:
          f:serverHost:
        f:replsets:
        f:secrets:
          .:
          f:users:
        f:sharding:
          .:
          f:configsvrReplSet:
            .:
            f:affinity:
              .:
              f:antiAffinityTopologyKey:
            f:expose:
              .:
              f:enabled:
              f:exposeType:
            f:podDisruptionBudget:
              .:
              f:maxUnavailable:
            f:resources:
              .:
              f:limits:
                .:
                f:cpu:
                f:memory:
              f:requests:
                .:
                f:cpu:
                f:memory:
            f:size:
            f:volumeSpec:
              .:
              f:persistentVolumeClaim:
                .:
                f:resources:
                  .:
                  f:requests:
                    .:
                    f:storage:
          f:enabled:
          f:mongos:
            .:
            f:affinity:
              .:
              f:antiAffinityTopologyKey:
            f:expose:
              .:
              f:exposeType:
            f:podDisruptionBudget:
              .:
              f:maxUnavailable:
            f:resources:
              .:
              f:limits:
                .:
                f:cpu:
                f:memory:
              f:requests:
                .:
                f:cpu:
                f:memory:
            f:size:
        f:unmanaged:
        f:updateStrategy:
        f:upgradeOptions:
          .:
          f:apply:
          f:schedule:
          f:setFCV:
          f:versionServiceEndpoint:
    Manager: terraform-provider-helm_v2.8.0_x5
    Operation: Update
    Time: 2023-07-12T09:20:10Z
    API Version: psmdb.percona.com/v1
    Fields Type: FieldsV1
    fieldsV1:
      f:status:
        .:
        f:conditions:
        f:host:
        f:mongoImage:
        f:mongoVersion:
        f:observedGeneration:
        f:ready:
        f:replsets:
          .:
          f:rs:
            .:
            f:initialized:
            f:ready:
            f:size:
            f:status:
        f:size:
        f:state:
    Manager: percona-server-mongodb-operator
    Operation: Update
    Subresource: status
    Time: 2023-07-18T15:50:35Z
  Resource Version: 172633694
  UID: 3b8f5c92-e5f5-4b4d-b84a-e7e6bfea7224
Spec:
  Backup:
    Enabled: true
    Image: percona/percona-backup-mongodb:1.8.1
    Pitr:
      Compression Level: 6
      Compression Type: gzip
      Enabled: true
      Oplog Span Min: 4
    Service Account Name: percona-server-mongodb-operator
    Storages:
      Azure - Blob:
        Azure:
          Container: percona
          Credentials Secret: percona-backup-sa-creds
          Prefix: scheduled
        Type: azure
    Tasks:
      Compression Type: gzip
      Enabled: true
      Keep: 30
      Name: daily-backup
      Schedule: 30 0 * * *
      Storage Name: azure-blob
      Compression Type: gzip
      Enabled: true
      Keep: 13
      Name: weekly-backup
      Schedule: 30 0 * * 0
      Storage Name: azure-blob
  Cr Version: 1.13.0
  Image: percona/percona-server-mongodb:4.4.16-16
  Image Pull Policy: Always
  Multi Cluster:
    Enabled: false
  Pause: false
  Pmm:
    Enabled: false
    Image: percona/pmm-client:2.30.0
    Server Host: monitoring-service
  Replsets:
    Affinity:
      Anti Affinity Topology Key: kubernetes.io/hostname
    Name: rs
    Node Selector:
      App: xcmongo
    Resources:
      Limits:
        Cpu: 4000m
        Memory: 16G
      Requests:
        Cpu: 2000m
        Memory: 12G
    Size: 3
    Volume Spec:
      Persistent Volume Claim:
        Resources:
          Requests:
            Storage: 30Gi
  Secrets:
    Users: redacted-psmdb-db-secrets
  Sharding:
    Configsvr Repl Set:
      Affinity:
        Anti Affinity Topology Key: kubernetes.io/hostname
      Expose:
        Enabled: false
        Expose Type: ClusterIP
      Pod Disruption Budget:
        Max Unavailable: 1
      Resources:
        Limits:
          Cpu: 300m
          Memory: 0.5G
        Requests:
          Cpu: 300m
          Memory: 0.5G
      Size: 3
      Volume Spec:
        Persistent Volume Claim:
          Resources:
            Requests:
              Storage: 3Gi
    Enabled: false
    Mongos:
      Affinity:
        Anti Affinity Topology Key: kubernetes.io/hostname
      Expose:
        Expose Type: ClusterIP
      Pod Disruption Budget:
        Max Unavailable: 1
      Resources:
        Limits:
          Cpu: 300m
          Memory: 0.5G
        Requests:
          Cpu: 300m
          Memory: 0.5G
      Size: 2
  Unmanaged: false
  Update Strategy: SmartUpdate
  Upgrade Options:
    Apply: disabled
    Schedule: 0 2 * * *
    Set FCV: false
    Version Service Endpoint: https://check.percona.com
Status:
  Conditions:
    Last Transition Time: 2023-07-27T18:51:16Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-07-27T18:51:16Z
    Status: True
    Type: initializing
    Last Transition Time: 2023-07-28T09:50:44Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-07-28T09:50:44Z
    Status: True
    Type: initializing
    Last Transition Time: 2023-07-28T21:53:07Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-07-28T21:53:07Z
    Status: True
    Type: initializing
    Last Transition Time: 2023-07-29T12:53:26Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-07-29T12:53:26Z
    Status: True
    Type: initializing
    Last Transition Time: 2023-07-30T09:52:41Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-07-30T09:52:41Z
    Status: True
    Type: initializing
    Last Transition Time: 2023-07-30T21:53:07Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-07-30T21:53:07Z
    Status: True
    Type: initializing
    Last Transition Time: 2023-07-31T10:08:43Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-07-31T10:08:43Z
    Status: True
    Type: initializing
    Last Transition Time: 2023-08-01T00:52:06Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-08-01T00:52:06Z
    Status: True
    Type: initializing
    Last Transition Time: 2023-08-01T12:52:36Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-08-01T12:52:36Z
    Status: True
    Type: initializing
    Last Transition Time: 2023-08-02T03:52:26Z
    Message: rs: ready
    Reason: RSReady
    Status: True
    Type: ready
    Last Transition Time: 2023-08-02T03:52:26Z
    Status: True
    Type: initializing
  Host: redacted-psmdb-db-rs.redacted-psmdb.svc.cluster.local
  Mongo Image: percona/percona-server-mongodb:4.4.16-16
  Mongo Version: 4.4.16-16
  Observed Generation: 1
  Ready: 3
  Replsets:
    Rs:
      Initialized: true
      Ready: 3
      Size: 3
      Status: ready
  Size: 3
  State: initializing
Events: <none>
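The conditions list in the output above records a ready entry and an initializing entry with the same timestamp at every transition. Here is a rough sketch, with the timestamps copied by hand from that list, that computes the gap between successive transitions. The helper name gaps_in_hours is made up for this illustration.

```python
from datetime import datetime

# "Last Transition Time" values copied from the Status.Conditions list above.
# Each value appears twice there: once for "ready", once for "initializing".
TRANSITIONS = [
    "2023-07-27T18:51:16Z",
    "2023-07-28T09:50:44Z",
    "2023-07-28T21:53:07Z",
    "2023-07-29T12:53:26Z",
    "2023-07-30T09:52:41Z",
    "2023-07-30T21:53:07Z",
    "2023-07-31T10:08:43Z",
    "2023-08-01T00:52:06Z",
    "2023-08-01T12:52:36Z",
    "2023-08-02T03:52:26Z",
]


def gaps_in_hours(timestamps):
    """Return the time between consecutive timestamps, in hours, rounded to 0.1."""
    times = [datetime.strptime(t, "%Y-%m-%dT%H:%M:%SZ") for t in timestamps]
    return [round((b - a).total_seconds() / 3600, 1) for a, b in zip(times, times[1:])]


print(gaps_in_hours(TRANSITIONS))
```

On this data every gap falls between about 12 and 21 hours, which points to something recurring rather than a single event, although the output alone does not say what triggers it.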
The initializing state usually indicates that some of the components are not ready. For example, if you restart even one node in a replica set, the cluster goes into the initializing state.
Does it go back to the ready state after some time, or is it stuck in initializing?
It has been stuck in the initializing state for 21 days, even though the StatefulSet reports all three replicas as ready.
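The mismatch described here (cluster state initializing while the replica set itself reports ready) can be checked mechanically. Below is a small sketch that works on the status object of the custom resource as a plain dictionary, for example parsed from its JSON representation. The field names follow the status shown in this thread; the function name is hypothetical.

```python
def state_disagrees_with_replsets(status):
    """Return True when the cluster state is 'initializing' although every
    replica set in the status reports 'ready' with all members up."""
    replsets = status.get("replsets", {})
    all_ready = bool(replsets) and all(
        rs.get("status") == "ready" and rs.get("ready") == rs.get("size")
        for rs in replsets.values()
    )
    return status.get("state") == "initializing" and all_ready


# Status values taken from the describe output earlier in this thread.
status = {
    "state": "initializing",
    "ready": 3,
    "size": 3,
    "replsets": {
        "rs": {"initialized": True, "ready": 3, "size": 3, "status": "ready"},
    },
}
print(state_disagrees_with_replsets(status))
```

Running a check like this periodically across several deployments would show when the state and the replica set status start to disagree, which may help capture operator logs from the right moment.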