My cr.yaml:
```yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: my-cluster-name
  finalizers:
    - delete-psmdb-pods-in-order
    # - delete-psmdb-pvc
spec:
  crVersion: 1.13.0
  image: percona/percona-server-mongodb:5.0.11-10
  imagePullPolicy: Always
  allowUnsafeConfigurations: false
  # ...
  backup:
    enabled: true
    image: percona/percona-backup-mongodb:1.8.1
    serviceAccountName: percona-server-mongodb-operator
  # ...
```
When `backup` is enabled, the following errors occurred:
```
Reconciler error {"name": "my-cluster-elvis2", "namespace": "default", "error": "create pbm object: create PBM connection to my-cluster-elvis2-rs0-0.my-cluster-elvis2-rs0.default.svc.cluster.local:27017,my-cluster-elvis2-rs0-1.my-cluster-elvis2-rs0.default.svc.cluster.local:27017,my-cluster-elvis2-rs0-2.my-cluster-elvis2-rs0.default.svc.cluster.local:27017: create mongo connection: mongo ping: server selection error: server selection timeout, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: 10.181.4.179:30862, Type: Unknown, Last error: connection() error occured during connection handshake: x509: cannot validate certificate for 10.181.4.179 because it doesn't contain any IP SANs }, { Addr: 10.181.4.29:31197, Type: Unknown, Last error: connection() error occured during connection handshake: x509: cannot validate certificate for 10.181.4.29 because it doesn't contain any IP SANs }, { Addr: 10.181.4.214:30971, Type: Unknown, Last error: connection() error occured during connection handshake: x509: cannot validate certificate for 10.181.4.214 because it doesn't contain any IP SANs }, ] }", "errorVerbose": "server selection error: server selection timeout, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: 10.181.4.179:30862, Type: Unknown, Last error: connection() error occured during connection handshake: x509: cannot validate certificate for 10.181.4.179 because it doesn't contain any IP SANs }, { Addr: 10.181.4.29:31197, Type: Unknown, Last error: connection() error occured during connection handshake: x509: cannot validate certificate for 10.181.4.29 because it doesn't contain any IP SANs }, { Addr: 10.181.4.214:30971, Type: Unknown, Last error: connection() error occured during connection handshake: x509: cannot validate certificate for 10.181.4.214 because it doesn't contain any IP SANs }
```
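The x509 errors above say the certificate contains no IP SANs: PBM connects to the NodePort addresses by IP, but the certificate only lists DNS names. Whether a given certificate covers an IP can be checked with openssl; a minimal local sketch (hypothetical file paths under /tmp, assumes OpenSSL 1.1.1+ for the `-addext`/`-ext` options):

```shell
# Create a throwaway self-signed certificate that includes an IP SAN,
# mimicking what a certificate needs to contain for IP-based validation.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=my-cluster-name" \
  -addext "subjectAltName=DNS:*.my-cluster-name-rs0.default.svc.cluster.local,IP:10.181.4.179" \
  -keyout /tmp/psmdb-test-key.pem -out /tmp/psmdb-test-cert.pem

# Print the SANs embedded in the certificate; a certificate that lists
# only DNS entries here fails IP validation exactly as in the log above.
openssl x509 -in /tmp/psmdb-test-cert.pem -noout -ext subjectAltName
```

The same `openssl x509 -noout -ext subjectAltName` check can be pointed at the certificate the operator generated to confirm it lacks IP entries.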
Hey @Worlder_Mo ,
how can I reproduce it?
I deployed operator version 1.13 with the default YAML manifest (cr.yaml), but with backups disabled.
Then I enabled them; the Pods were restarted and everything is up now.
I don't see any issues, nor do I see this error in the logs.
I didn't do anything special; I just enabled the backup option when I deployed cr.yaml, and the errors occurred.
This appears to be an error with the TLS certificates.
@Worlder_Mo can you share the cr?
I used the cr.yaml file from the official sample documentation.
@Sergey_Pronin But I enabled the expose option for the replset.
This is the cr I used:
```yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: my-cluster-name
  finalizers:
    - delete-psmdb-pods-in-order
    - delete-psmdb-pvc
spec:
  # platform: openshift
  # clusterServiceDNSSuffix: svc.cluster.local
  # clusterServiceDNSMode: "Internal"
  # pause: true
  # unmanaged: false
  crVersion: 1.13.0
  image: percona/percona-server-mongodb:5.0.11-10
  imagePullPolicy: IfNotPresent
  # tls:
  #   # 90 days in hours
  #   certValidityDuration: 2160h
  # imagePullSecrets:
  #   - name: private-registry-credentials
  allowUnsafeConfigurations: false
  updateStrategy: SmartUpdate
  # multiCluster:
  #   enabled: true
  #   DNSSuffix: svc.clusterset.local
  upgradeOptions:
    versionServiceEndpoint: https://check.percona.com
    apply: disabled
    schedule: "0 2 * * *"
    setFCV: false
  secrets:
    users: my-cluster-name-secrets
    encryptionKey: my-cluster-name-mongodb-encryption-key
    # vault: my-cluster-name-vault
  pmm:
    enabled: false
    image: percona/pmm-client:2.30.0
    serverHost: monitoring-service
    # mongodParams: --environment=ENVIRONMENT
    # mongosParams: --environment=ENVIRONMENT
  replsets:
  - name: rs0
    size: 3
    # externalNodes:
    # - host: 34.124.76.90
    # - host: 34.124.76.91
    #   port: 27017
    #   votes: 0
    #   priority: 0
    # - host: 34.124.76.92
    # # for more configuration fields refer to https://docs.mongodb.com/manual/reference/configuration-options/
    # configuration: |
    #   net:
    #     tls:
    #       mode: preferTLS
    #   operationProfiling:
    #     mode: slowOp
    #   systemLog:
    #     verbosity: 1
    #   storage:
    #     engine: wiredTiger
    #     wiredTiger:
    #       engineConfig:
    #         directoryForIndexes: false
    #         journalCompressor: snappy
    #       collectionConfig:
    #         blockCompressor: snappy
    #       indexConfig:
    #         prefixCompression: true
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
      # advanced:
      #   nodeAffinity:
      #     requiredDuringSchedulingIgnoredDuringExecution:
      #       nodeSelectorTerms:
      #       - matchExpressions:
      #         - key: kubernetes.io/e2e-az-name
      #           operator: In
      #           values:
      #           - e2e-az1
      #           - e2e-az2
    # tolerations:
    # - key: "node.alpha.kubernetes.io/unreachable"
    #   operator: "Exists"
    #   effect: "NoExecute"
    #   tolerationSeconds: 6000
    # priorityClassName: high-priority
    # annotations:
    #   iam.amazonaws.com/role: role-arn
    # labels:
    #   rack: rack-22
    # nodeSelector:
    #   disktype: ssd
    # storage:
    #   engine: wiredTiger
    #   wiredTiger:
    #     engineConfig:
    #       cacheSizeRatio: 0.5
    #       directoryForIndexes: false
    #       journalCompressor: snappy
    #     collectionConfig:
    #       blockCompressor: snappy
    #     indexConfig:
    #       prefixCompression: true
    #   inMemory:
    #     engineConfig:
    #       inMemorySizeRatio: 0.5
    # livenessProbe:
    #   failureThreshold: 4
    #   initialDelaySeconds: 60
    #   periodSeconds: 30
    #   timeoutSeconds: 10
    #   startupDelaySeconds: 7200
    # readinessProbe:
    #   failureThreshold: 8
    #   initialDelaySeconds: 10
    #   periodSeconds: 3
    #   successThreshold: 1
    #   timeoutSeconds: 2
    # runtimeClassName: image-rc
    # sidecars:
    # - image: busybox
    #   command: ["/bin/sh"]
    #   args: ["-c", "while true; do echo echo $(date -u) 'test' >> /dev/null; sleep 5;done"]
    #   name: rs-sidecar-1
    #   volumeMounts:
    #     - mountPath: /volume1
    #       name: sidecar-volume-claim
    #     - mountPath: /secret
    #       name: sidecar-secret
    #     - mountPath: /configmap
    #       name: sidecar-config
    # sidecarVolumes:
    # - name: sidecar-secret
    #   secret:
    #     secretName: mysecret
    # - name: sidecar-config
    #   configMap:
    #     name: myconfigmap
    # sidecarPVCs:
    # - apiVersion: v1
    #   kind: PersistentVolumeClaim
    #   metadata:
    #     name: sidecar-volume-claim
    #   spec:
    #     resources:
    #       requests:
    #         storage: 1Gi
    #     volumeMode: Filesystem
    #     accessModes:
    #       - ReadWriteOnce
    podDisruptionBudget:
      maxUnavailable: 1
      # minAvailable: 0
    expose:
      enabled: true
      exposeType: NodePort
      # loadBalancerSourceRanges:
      #   - 10.0.0.0/8
      # serviceAnnotations:
      #   service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
      # serviceLabels:
      #   rack: rack-22
    resources:
      limits:
        cpu: "300m"
        memory: "0.5G"
      requests:
        cpu: "300m"
        memory: "0.5G"
    volumeSpec:
      # emptyDir: {}
      # hostPath:
      #   path: /data
      #   type: Directory
      persistentVolumeClaim:
        # storageClassName: standard
        # accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 3Gi
    nonvoting:
      enabled: false
      # podSecurityContext: {}
      # containerSecurityContext: {}
      size: 3
      # # for more configuration fields refer to https://docs.mongodb.com/manual/reference/configuration-options/
      # configuration: |
      #   operationProfiling:
      #     mode: slowOp
      #   systemLog:
      #     verbosity: 1
      affinity:
        antiAffinityTopologyKey: "kubernetes.io/hostname"
        # advanced:
        #   nodeAffinity:
        #     requiredDuringSchedulingIgnoredDuringExecution:
        #       nodeSelectorTerms:
        #       - matchExpressions:
        #         - key: kubernetes.io/e2e-az-name
        #           operator: In
        #           values:
        #           - e2e-az1
        #           - e2e-az2
      # tolerations:
      # - key: "node.alpha.kubernetes.io/unreachable"
      #   operator: "Exists"
      #   effect: "NoExecute"
      #   tolerationSeconds: 6000
      # priorityClassName: high-priority
      # annotations:
      #   iam.amazonaws.com/role: role-arn
      # labels:
      #   rack: rack-22
      # nodeSelector:
      #   disktype: ssd
      podDisruptionBudget:
        maxUnavailable: 1
        # minAvailable: 0
      resources:
        limits:
          cpu: "300m"
          memory: "0.5G"
        requests:
          cpu: "300m"
          memory: "0.5G"
      volumeSpec:
        # emptyDir: {}
        # hostPath:
        #   path: /data
        #   type: Directory
        persistentVolumeClaim:
          # storageClassName: standard
          # accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 3Gi
    arbiter:
      enabled: false
      size: 1
      affinity:
        antiAffinityTopologyKey: "kubernetes.io/hostname"
        # advanced:
        #   nodeAffinity:
        #     requiredDuringSchedulingIgnoredDuringExecution:
        #       nodeSelectorTerms:
        #       - matchExpressions:
        #         - key: kubernetes.io/e2e-az-name
        #           operator: In
        #           values:
        #           - e2e-az1
        #           - e2e-az2
      # tolerations:
      # - key: "node.alpha.kubernetes.io/unreachable"
      #   operator: "Exists"
      #   effect: "NoExecute"
      #   tolerationSeconds: 6000
      # priorityClassName: high-priority
      # annotations:
      #   iam.amazonaws.com/role: role-arn
      # labels:
      #   rack: rack-22
      # nodeSelector:
      #   disktype: ssd
  sharding:
    enabled: false
    configsvrReplSet:
      size: 3
      # externalNodes:
      # - host: 34.124.76.93
      # - host: 34.124.76.94
      #   port: 27017
      #   votes: 0
      #   priority: 0
      # - host: 34.124.76.95
      # # for more configuration fields refer to https://docs.mongodb.com/manual/reference/configuration-options/
      # configuration: |
      #   operationProfiling:
      #     mode: slowOp
      #   systemLog:
      #     verbosity: 1
      affinity:
        antiAffinityTopologyKey: "kubernetes.io/hostname"
        # advanced:
        #   nodeAffinity:
        #     requiredDuringSchedulingIgnoredDuringExecution:
        #       nodeSelectorTerms:
        #       - matchExpressions:
        #         - key: kubernetes.io/e2e-az-name
        #           operator: In
        #           values:
        #           - e2e-az1
        #           - e2e-az2
      # tolerations:
      # - key: "node.alpha.kubernetes.io/unreachable"
      #   operator: "Exists"
      #   effect: "NoExecute"
      #   tolerationSeconds: 6000
      # priorityClassName: high-priority
      # annotations:
      #   iam.amazonaws.com/role: role-arn
      # labels:
      #   rack: rack-22
      # nodeSelector:
      #   disktype: ssd
      # livenessProbe:
      #   failureThreshold: 4
      #   initialDelaySeconds: 60
      #   periodSeconds: 30
      #   timeoutSeconds: 10
      #   startupDelaySeconds: 7200
      # readinessProbe:
      #   failureThreshold: 3
      #   initialDelaySeconds: 10
      #   periodSeconds: 3
      #   successThreshold: 1
      #   timeoutSeconds: 2
      # runtimeClassName: image-rc
      # sidecars:
      # - image: busybox
      #   command: ["/bin/sh"]
      #   args: ["-c", "while true; do echo echo $(date -u) 'test' >> /dev/null; sleep 5;done"]
      #   name: rs-sidecar-1
      podDisruptionBudget:
        maxUnavailable: 1
      expose:
        enabled: false
        exposeType: ClusterIP
        # loadBalancerSourceRanges:
        #   - 10.0.0.0/8
        # serviceAnnotations:
        #   service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
        # serviceLabels:
        #   rack: rack-22
      resources:
        limits:
          cpu: "300m"
          memory: "0.5G"
        requests:
          cpu: "300m"
          memory: "0.5G"
      volumeSpec:
        # emptyDir: {}
        # hostPath:
        #   path: /data
        #   type: Directory
        persistentVolumeClaim:
          # storageClassName: standard
          # accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 3Gi
    mongos:
      size: 3
      # # for more configuration fields refer to https://docs.mongodb.com/manual/reference/configuration-options/
      # configuration: |
      #   systemLog:
      #     verbosity: 1
      affinity:
        antiAffinityTopologyKey: "kubernetes.io/hostname"
        # advanced:
        #   nodeAffinity:
        #     requiredDuringSchedulingIgnoredDuringExecution:
        #       nodeSelectorTerms:
        #       - matchExpressions:
        #         - key: kubernetes.io/e2e-az-name
        #           operator: In
        #           values:
        #           - e2e-az1
        #           - e2e-az2
      # tolerations:
      # - key: "node.alpha.kubernetes.io/unreachable"
      #   operator: "Exists"
      #   effect: "NoExecute"
      #   tolerationSeconds: 6000
      # priorityClassName: high-priority
      # annotations:
      #   iam.amazonaws.com/role: role-arn
      # labels:
      #   rack: rack-22
      # nodeSelector:
      #   disktype: ssd
      # livenessProbe:
      #   failureThreshold: 4
      #   initialDelaySeconds: 60
      #   periodSeconds: 30
      #   timeoutSeconds: 10
      #   startupDelaySeconds: 7200
      # readinessProbe:
      #   failureThreshold: 3
      #   initialDelaySeconds: 10
      #   periodSeconds: 3
      #   successThreshold: 1
      #   timeoutSeconds: 2
      # runtimeClassName: image-rc
      # sidecars:
      # - image: busybox
      #   command: ["/bin/sh"]
      #   args: ["-c", "while true; do echo echo $(date -u) 'test' >> /dev/null; sleep 5;done"]
      #   name: rs-sidecar-1
      podDisruptionBudget:
        maxUnavailable: 1
      resources:
        limits:
          cpu: "300m"
          memory: "0.5G"
        requests:
          cpu: "300m"
          memory: "0.5G"
      expose:
        exposeType: ClusterIP
        # servicePerPod: true
        # loadBalancerSourceRanges:
        #   - 10.0.0.0/8
        # serviceAnnotations:
        #   service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
        # serviceLabels:
        #   rack: rack-22
  # mongod:
  #   security:
  #     encryptionKeySecret: "my-cluster-name-mongodb-encryption-key"
  backup:
    enabled: true
    image: percona/percona-backup-mongodb:1.8.1
    serviceAccountName: percona-server-mongodb-operator
    # annotations:
    #   iam.amazonaws.com/role: role-arn
    # resources:
    #   limits:
    #     cpu: "300m"
    #     memory: "0.5G"
    #   requests:
    #     cpu: "300m"
    #     memory: "0.5G"
    # storages:
    #   s3-us-west:
    #     type: s3
    #     s3:
    #       bucket: S3-BACKUP-BUCKET-NAME-HERE
    #       credentialsSecret: my-cluster-name-backup-s3
    #       region: us-west-2
    #       prefix: ""
    #       uploadPartSize: 10485760
    #       maxUploadParts: 10000
    #       storageClass: STANDARD
    #       insecureSkipTLSVerify: false
    #   minio:
    #     type: s3
    #     s3:
    #       bucket: MINIO-BACKUP-BUCKET-NAME-HERE
    #       region: us-east-1
    #       credentialsSecret: my-cluster-name-backup-minio
    #       endpointUrl: http://minio.psmdb.svc.cluster.local:9000/minio/
    #       insecureSkipTLSVerify: false
    #       prefix: ""
    #   azure-blob:
    #     type: azure
    #     azure:
    #       container: CONTAINER-NAME
    #       prefix: PREFIX-NAME
    #       credentialsSecret: SECRET-NAME
    pitr:
      enabled: false
      # oplogSpanMin: 10
      compressionType: gzip
      compressionLevel: 6
    # tasks:
    #   - name: daily-s3-us-west
    #     enabled: true
    #     schedule: "0 0 * * *"
    #     keep: 3
    #     storageName: s3-us-west
    #     compressionType: gzip
    #     compressionLevel: 6
    #   - name: weekly-s3-us-west
    #     enabled: false
    #     schedule: "0 0 * * 0"
    #     keep: 5
    #     storageName: s3-us-west
    #     compressionType: gzip
    #     compressionLevel: 6
```
I solved it by generating the certificates manually, putting the cluster IPs into the certificate's SAN list:
```shell
cat <<EOF | cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=./ca-config.json - | cfssljson -bare server
{
  "hosts": [
    "localhost",
    "10.181.4.14",
    "10.181.5.143",
    "10.181.5.148",
    "10.181.5.149",
    "10.181.4.145",
    "10.181.4.174",
    "10.181.4.181",
    "${CLUSTER_NAME}-rs0",
    "${CLUSTER_NAME}-rs0.${NAMESPACE}",
    "${CLUSTER_NAME}-rs0.${NAMESPACE}.svc.cluster.local",
    "*.${CLUSTER_NAME}-rs0",
    "*.${CLUSTER_NAME}-rs0.${NAMESPACE}",
    "*.${CLUSTER_NAME}-rs0.${NAMESPACE}.svc.cluster.local"
  ],
  "names": [
    {
      "O": "PSMDB"
    }
  ],
  "CN": "${CLUSTER_NAME/-rs0}",
  "key": {
    "algo": "rsa",
    "size": 2048
  }
}
EOF
```
If I enable the replset 'expose' option and set the type to 'NodePort', I have to generate the certificates manually.
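For anyone following along: per the operator's TLS documentation, manually generated certificates are supplied to the cluster as a TLS-typed secret named after the cluster. A sketch of that upload step, assuming the cfssl output file names above (server.pem, server-key.pem, ca.pem) and the cluster name my-cluster-name:

```shell
# Upload the manually generated certificates so the operator uses them
# instead of generating its own (file names assume the cfssl output above).
kubectl create secret generic my-cluster-name-ssl \
  --from-file=tls.crt=server.pem \
  --from-file=tls.key=server-key.pem \
  --from-file=ca.crt=ca.pem \
  --type=kubernetes.io/tls
```

The secret must exist before the cluster is created (or the Pods must be restarted afterwards) for the certificates to take effect.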
@Worlder_Mo it is awesome that you figured it out! Thank you.
I created a bug in our JIRA to track it: [K8SPSMDB-823] ReplicaSet nodeport exposure breaks backups - Percona JIRA
I think handling this TLS validation issue should be automated by the operator.
Hello @Worlder_Mo,
I am also experiencing the same error. May I ask how you created the certificate and applied it?
This is mentioned in the TLS section of the official deployment documentation.