MongoDB cluster with IPv6: replica set initialization is failing

After creating a PSMDB cluster with the Percona Kubernetes operator, the replica set fails to initialize.
I used the following CR:

apiVersion: psmdb.percona.com/v1-12-0
kind: PerconaServerMongoDB
metadata:
  finalizers:
  - delete-psmdb-pods-in-order
  name: rawdb-cluster
  namespace: mongodb-cluster
spec:
  allowUnsafeConfigurations: true
  backup:
    enabled: false
    image: percona/percona-backup-mongodb:1.7.0
    pitr:
      enabled: true
      oplogSpanMin: 10
    resources:
      limits:
        cpu: 300m
        memory: 0.5G
      requests:
        cpu: 300m
        memory: 0.5G
    serviceAccountName: percona-server-mongodb-operator
    storages:
      s3-us-east:
        s3:
          bucket: rawdb-backup-data-bucket-qa
          credentialsSecret: mongodb-cluster-backup-s3
          prefix: data/pbm/backup
          region: us-east-1
        type: s3
    tasks:
    - compressionLevel: 6
      compressionType: gzip
      enabled: true
      keep: 4
      name: s3-us-east
      schedule: "1 1 * * *"
      storageName: s3-us-east
  crVersion: 1.12.0
  image: percona/percona-server-mongodb:4.2.19-19
  imagePullPolicy: Always
  mongod:
    auditLog:
      destination: file
      filter: '{}'
      format: JSON
    net:
      hostPort: 0
      port: 27017
    operationProfiling:
      mode: slowOp
      rateLimit: 100
      slowOpThresholdMs: 100
    replication:
      oplogSizeMB: 204800
    security:
      enableEncryption: false
      encryptionCipherMode: AES256-CBC
      redactClientLogData: false
    setParameter:
      ttlMonitorSleepSecs: 60
      wiredTigerConcurrentReadTransactions: 128
      wiredTigerConcurrentWriteTransactions: 128
    storage:
      engine: wiredTiger
      wiredTiger:
        collectionConfig:
          blockCompressor: snappy
        engineConfig:
          cacheSizeRatio: 0.5
          directoryForIndexes: false
          journalCompressor: snappy
        indexConfig:
          prefixCompression: true
  pmm:
    enabled: true
    image: percona/pmm-client:2.27.0
    serverHost: pmm-monitoring-service.percona-monitoring.svc.cluster.local
    serverUser: admin
  replsets:
  - affinity:
      advanced:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: service
                operator: In
                values:
                - rawdb-mongodb
      antiAffinityTopologyKey: kubernetes.io/hostname
    arbiter:
      affinity:
        antiAffinityTopologyKey: kubernetes.io/hostname
      enabled: false
      size: 2
    configuration: |
      systemLog:
        verbosity: 1
      net:
        ipv6: true
        bindIpAll: true
    expose:
      enabled: true
      exposeType: ClusterIP
    livenessProbe:
      failureThreshold: 8
      initialDelaySeconds: 60
      periodSeconds: 30
      startupDelaySeconds: 7200
      timeoutSeconds: 10
    name: rs0
    nonvoting:
      affinity:
        antiAffinityTopologyKey: kubernetes.io/hostname
      enabled: false
      podDisruptionBudget:
        maxUnavailable: 1
      resources:
        limits:
          cpu: 300m
          memory: 0.5G
        requests:
          cpu: 300m
          memory: 0.5G
      size: 3
      volumeSpec:
        persistentVolumeClaim:
          resources:
            requests:
              storage: 3Gi
    podDisruptionBudget:
      maxUnavailable: 1
    priorityClassName: mongod-priority-class
    readinessProbe:
      failureThreshold: 8
      initialDelaySeconds: 10
      periodSeconds: 5
      successThreshold: 1
      timeoutSeconds: 2
    resources:
      limits:
        cpu: 300m
        memory: 0.5G
      requests:
        cpu: 300m
        memory: 0.5G
    sidecars: null
    size: 3
    tolerations:
    - effect: NoExecute
      key: node.alpha.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 6000
    - effect: NoSchedule
      key: service-type
      operator: Equal
      value: mongod
    - effect: NoSchedule
      key: replset
      operator: Equal
      value: rawdb-rs0
    volumeSpec:
      persistentVolumeClaim:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 3Gi
        storageClassName: gp2
  runUid: 1001
  secrets:
    encryptionKey: mongodb-cluster-encryption-key
    users: my-cluster-name-secrets
  sharding:
    configsvrReplSet:
      affinity:
        advanced:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: service
                  operator: In
                  values:
                  - rawdb-mongodb
        antiAffinityTopologyKey: kubernetes.io/hostname
      configuration: |
        systemLog:
          verbosity: 1
        net:
          ipv6: true
          bindIpAll: true
      expose:
        enabled: false
        exposeType: ClusterIP
      livenessProbe:
        failureThreshold: 4
        initialDelaySeconds: 65
        periodSeconds: 30
        startupDelaySeconds: 7200
        timeoutSeconds: 10
      podDisruptionBudget:
        maxUnavailable: 1
      priorityClassName: mongod-priority-class
      readinessProbe:
        failureThreshold: 8
        initialDelaySeconds: 60
        periodSeconds: 5
        successThreshold: 1
        timeoutSeconds: 2
      resources:
        limits:
          cpu: 2048m
          memory: 2G
        requests:
          cpu: 1024m
          memory: 1G
      size: 3
      storage:
        engine: wiredTiger
        wiredTiger:
          collectionConfig:
            blockCompressor: snappy
          engineConfig:
            cacheSizeRatio: 0.5
            directoryForIndexes: false
            journalCompressor: snappy
          indexConfig:
            prefixCompression: true
      tolerations:
      - effect: NoExecute
        key: node.alpha.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 6000
      - effect: NoSchedule
        key: service-type
        operator: Equal
        value: rawdb-config
      volumeSpec:
        persistentVolumeClaim:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 3Gi
          storageClassName: gp2
    enabled: true
    mongos:
      affinity:
        advanced:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: service
                  operator: In
                  values:
                  - rawdb-mongodb
        antiAffinityTopologyKey: kubernetes.io/hostname
      configuration: |
        systemLog:
          verbosity: 1
        net:
          ipv6: true
          bindIpAll: true
      expose:
        exposeType: LoadBalancer
        serviceAnnotations:
          service.beta.kubernetes.io/aws-load-balancer-ip-address-type: dualstack
          service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
          service.beta.kubernetes.io/aws-load-balancer-scheme: internal
          service.beta.kubernetes.io/aws-load-balancer-type: external
      podDisruptionBudget:
        maxUnavailable: 1
      priorityClassName: mongod-priority-class
      resources:
        limits:
          cpu: 300m
          memory: 0.5G
        requests:
          cpu: 300m
          memory: 0.5G
      size: 3
      tolerations:
      - effect: NoExecute
        key: node.alpha.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 6000
      - effect: NoSchedule
        key: service-type
        operator: Equal
        value: mongos
  updateStrategy: Never
  upgradeOptions:
    apply: 4.4-recommended
    schedule: "0 2 * * *"
    setFCV: false
    versionServiceEndpoint: https://check.percona.com

The mongos and the config server pods all seem to be working OK,
but the replica set is restarting over and over with an initialization error in the operator:

{"level":"info","ts":1656613690.6291065,"logger":"controller_psmdb","msg":"initiating replset","replset":"rs0","pod":"rawdb-cluster-rs0-0"}
{"level":"error","ts":1656613695.9686441,"logger":"controller_psmdb","msg":"failed to reconcile cluster","Request.Namespace":"mongodb-cluster","Request.Name":"rawdb-cluster","replset":"rs0","error":"handleReplsetInit: exec add admin user: command terminated with exit code 1 / Percona Server for MongoDB shell version v4.2.19-19\nconnecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb\nImplicit session: session { \"id\" : UUID(\"3981f9f7-3308-4aee-aad4-e3bc0bd0b689\") }\nPercona Server for MongoDB server version: v4.2.19-19\n2022-06-30T18:28:15.963+0000 E  QUERY    [js] uncaught exception: Error: couldn't add user: not master :\n_getErrorWithCode@src/mongo/shell/utils.js:25:13\nDB.prototype.createUser@src/mongo/shell/db.js:1413:11\n@(shell):1:1\nbye\n2022-06-30T18:28:15.964+0000 E  -        [main] Error saving history file: FileOpenFailed: Unable to open() file /home/mongodb/.dbshell: No such file or directory\n / ","errorVerbose":"exec add admin user: command terminated with exit code 1 / Percona Server for MongoDB shell version v4.2.19-19\nconnecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb\nImplicit session: session { \"id\" : UUID(\"3981f9f7-3308-4aee-aad4-e3bc0bd0b689\") }\nPercona Server for MongoDB server version: v4.2.19-19\n2022-06-30T18:28:15.963+0000 E  QUERY    [js] uncaught exception: Error: couldn't add user: not master :\n_getErrorWithCode@src/mongo/shell/utils.js:25:13\nDB.prototype.createUser@src/mongo/shell/db.js:1413:11\n@(shell):1:1\nbye\n2022-06-30T18:28:15.964+0000 E  -        [main] Error saving history file: FileOpenFailed: Unable to open() file /home/mongodb/.dbshell: No such file or directory\n / \nhandleReplsetInit\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:55\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:452\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/percona/per
cona-server-mongodb-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:454\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

I also see the error below, 'get psmdb connection endpoint: get service: Service "rawdb-cluster-mongos" not found', which seems relevant to IPv6. Might it be that the operator is not configured for IPv6? Any ideas?

{"level":"error","ts":1656613353.4768405,"logger":"controller_psmdb","msg":"get psmdb connection endpoint","error":"get service: Service \"rawdb-cluster-mongos\" not found","errorVerbose":"Service \"rawdb-cluster-mongos\" not found\nget service\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.loadBalancerServiceEndpoint\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/status.go:375\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).connectionEndpoint\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/status.go:333\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).updateStatus\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/status.go:175\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile.func1\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:214\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:500\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).updateStatus\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/status.go:177\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile.func1\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:214\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-m
ongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:500\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Hello @Yossi_Cohn,

frankly, we have not tested our operator with IPv6.
Is your cluster IPv6-only, or do you have dual stack?

@Yossi_Cohn Thanks for your request! It looks like our operators do not support IPv6 yet. It could be fixed in an upcoming release; you can follow this task for updates: [CLOUD-716] Add ipv6 support for operators - Percona JIRA.

@Natalia_Marukovich thanks for the answer.
By the way, I managed to run IPv6 with the MongoDB operator:
the cluster was configured with sharding, and I also created a cluster with only a replica set (no sharding), which worked as well.
I think this is an important feature to support and a low-hanging fruit.
As I see it, the Kubernetes community is moving toward IPv6 because of IPv4 address exhaustion.
I would be glad to see this pushed forward.

Hi,
this is EKS with IPv6, meaning the pods are IPv6-only, and I managed to make it work.

@Yossi_Cohn I'm a bit confused about the cluster configuration. Could you please share more details about the configuration you used?

@Natalia_Marukovich The configuration is almost identical to the IPv4 config.
I used the Percona PSMDB operator; the important part is to add the following to the replica sets, the config server replica set, and the mongos (inside their configuration sections):

      configuration: |
        systemLog:
          verbosity: 1
        net:
          ipv6: true
          bindIpAll: true

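For reference, this is roughly where that block sits in the CR (a trimmed sketch based on my cr.yaml above, with all other fields omitted):

spec:
  replsets:
  - name: rs0
    # same net settings in all three sections
    configuration: |
      net:
        ipv6: true
        bindIpAll: true
  sharding:
    configsvrReplSet:
      configuration: |
        net:
          ipv6: true
          bindIpAll: true
    mongos:
      configuration: |
        net:
          ipv6: true
          bindIpAll: true
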
When creating a non-sharded cluster, it was enough to add this to the replica sets, and it worked quite easily.
For sharding, I also had to set the expose field to false, specifically on the config server replica set,
since it seems the mongos was using the IPv6 address instead of the DNS name:

      expose:
        enabled: false

With this, sharding was working and seems to be OK.
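
To make the placement concrete, the relevant part of spec.sharding for the config server in my CR looks roughly like this (a trimmed sketch, other fields omitted):

  sharding:
    enabled: true
    configsvrReplSet:
      # IPv6 settings, same block as shown above
      configuration: |
        net:
          ipv6: true
          bindIpAll: true
      # keep the config server on internal ClusterIP services so the
      # mongos addresses it by DNS name rather than a raw IPv6 address
      expose:
        enabled: false
        exposeType: ClusterIP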

So I see your CR.yml above. Is it the up-to-date one? As far as I can see from your previous message, you couldn't start the cluster correctly and got errors even when using the IPv6 configuration. Is that still the case?

      configuration: |
        systemLog:
          verbosity: 1
        net:
          ipv6: true
          bindIpAll: true