/opt/percona/ps-entry.sh: line 509: exec: numactl --interleave=all: not found

I have set up a test environment that somewhat resembles what we currently have in production, and I am attempting to perform the upgrade there before doing it for real in production.

I am trying to upgrade the MongoDB server version from 6.0.4-3 to 6.0.15-12; however, I get the following error in the Percona Server for MongoDB container log.

  • exec 'numactl --interleave=all' mongod --bind_ip_all --auth --dbpath=/data/db --port=27017 --replSet=rs0 --storageEngine=wiredTiger --relaxPermChecks --clusterAuthMode=keyFile --keyFile=/etc/mongodb-secrets/mongodb-key --enableEncryption --encryptionKeyFile=/etc/mongodb-encryption/encryption-key --wiredTigerIndexPrefixCompression=true --config=/etc/mongodb-config/mongod.conf --tlsAllowInvalidCertificates
    /opt/percona/ps-entry.sh: line 509: exec: numactl --interleave=all: not found
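For what it's worth, the shell error prints the whole quoted string as the command name ("exec: numactl --interleave=all: not found"), so either numactl is genuinely missing from the new image or the entrypoint is exec'ing the quoted string as a single word. A quick sanity check I can think of (just a sketch, not part of my original steps; it assumes you can pull the image and override the entrypoint command):

# Hypothetical check: see whether numactl exists in the 6.0.15-12 image at all
kubectl -n mongodb2 run numactl-check --rm -i --restart=Never \
  --image=percona/percona-server-mongodb:6.0.15-12 \
  --command -- bash -c 'command -v numactl || echo "numactl not present in this image"'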

Can you see if I'm doing something obviously wrong? These are the steps I follow:

  1. Initially install the operator and server:
helm install psmdb-operator percona/psmdb-operator --namespace mongodb2 -f psmdb-operator.values.yaml --version 1.14.0
helm install psmdb-db-internal percona/psmdb-db --namespace mongodb2 -f psmdb-db-internal.values.yaml --version 1.14.0
  2. Update image.tag to 6.0.15-12 in psmdb-db-internal.values.yaml.
  3. Run the upgrade:
helm upgrade psmdb-db-internal percona/psmdb-db --namespace mongodb2 -f psmdb-db-internal.values.yaml --version 1.14.0
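For reference, this is roughly how I pulled the cluster state and logs after the failed upgrade (a sketch; the operator deployment name and the mongod container name are assumptions based on the chart defaults):

kubectl -n mongodb2 get psmdb psmdb-db-internal
kubectl -n mongodb2 get pods
kubectl -n mongodb2 logs psmdb-db-internal-rs0-0 -c mongod --tail=100
kubectl -n mongodb2 logs deploy/psmdb-operator --tail=100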

The operator logs show:

2024-07-11T15:40:11.722Z ERROR failed to reconcile cluster {"controller": "psmdb-controller", "object": {"name":"psmdb-db-internal","namespace":"mongodb2"}, "namespace": "mongodb2", "name": "psmdb-db-internal", "reconcileID": "31ce433f-c248-4f12-80d9-631cdf395b84", "replset": "rs0", "error": "dial: ping mongo: server selection error: context deadline exceeded, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: psmdb-db-internal-rs0-0.psmdb-db-internal-rs0.mongodb2.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp: lookup psmdb-db-internal-rs0-0.psmdb-db-internal-rs0.mongodb2.svc.cluster.local on [fdd1:fe9:30d1::a]:53: no such host }, ] }", "errorVerbose": "server selection error: context deadline exceeded, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: psmdb-db-internal-rs0-0.psmdb-db-internal-rs0.mongodb2.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp: lookup psmdb-db-internal-rs0-0.psmdb-db-internal-rs0.mongodb2.svc.cluster.local on [fdd1:fe9:30d1::a]:53: no such host }, ] }\nping mongo\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo.Dial\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo/mongo.go:64\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb.MongoClient\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/client.go:47\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).mongoClientWithRole\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/connections.go:21\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:88\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:487\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\ndial\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:94\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:487\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594"}
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile
/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:489
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:122
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:323
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235
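The "no such host" part of the error looks like the per-pod DNS record simply isn't resolvable while the mongod pod is failing. A hedged way to confirm that from inside the cluster (a sketch using a throwaway busybox pod; the hostname is taken verbatim from the log above):

kubectl -n mongodb2 run dns-check --rm -i --restart=Never --image=busybox:1.36 -- \
  nslookup psmdb-db-internal-rs0-0.psmdb-db-internal-rs0.mongodb2.svc.cluster.local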

percona-mongodb-server.yaml

# Default values for psmdb-cluster.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# Platform type: kubernetes, openshift
# platform: kubernetes

# Cluster DNS Suffix
# clusterServiceDNSSuffix: svc.cluster.local
# clusterServiceDNSMode: "Internal"

finalizers:
  ## Set this if you want the operator to delete the primary pod last
  - delete-psmdb-pods-in-order
## Set this if you want to delete database persistent volumes on cluster deletion
#  - delete-psmdb-pvc

nameOverride: ""
fullnameOverride: ""

crVersion: 1.14.0
pause: false
unmanaged: false
allowUnsafeConfigurations: true
# ignoreAnnotations:
#   - service.beta.kubernetes.io/aws-load-balancer-backend-protocol
# ignoreLabels:
#   - rack
multiCluster:
  enabled: false
  # DNSSuffix: svc.clusterset.local
updateStrategy: SmartUpdate
upgradeOptions:
  versionServiceEndpoint: https://check.percona.com
  apply: disabled
  schedule: "0 2 * * *"
  setFCV: false

image:
  repository: percona/percona-server-mongodb
  tag: 6.0.4-3

imagePullPolicy: Always
# imagePullSecrets: []
# initImage:
#   repository: percona/percona-server-mongodb-operator
#   tag: 1.14.0
# initContainerSecurityContext: {}
# tls:
#   # 90 days in hours
#   certValidityDuration: 2160h
secrets:
  {}
  # If you set users secret here the operator will use existing one or generate random values
  # If not set the operator generates the default secret with name <cluster_name>-secrets
  # users: my-cluster-name-secrets
  # encryptionKey: my-cluster-name-mongodb-encryption-key

pmm:
  enabled: false
  image:
    repository: percona/pmm-client
    tag: 2.35.0
  serverHost: monitoring-service

replsets:
  - name: rs0
    size: 1
    configuration: |
      net:
        ipv6: true
    # externalNodes:
    # - host: 34.124.76.90
    # - host: 34.124.76.91
    #   port: 27017
    #   votes: 0
    #   priority: 0
    # - host: 34.124.76.92
    # configuration: |
    #   operationProfiling:
    #     mode: slowOp
    #   systemLog:
    #     verbosity: 1
    antiAffinityTopologyKey: "kubernetes.io/hostname"
    # tolerations: []
    # priorityClass: ""
    # annotations: {}
    # labels: {}
    nodeSelector: {}
    # livenessProbe:
    #   failureThreshold: 4
    #   initialDelaySeconds: 60
    #   periodSeconds: 30
    #   timeoutSeconds: 10
    #   startupDelaySeconds: 7200
    # readinessProbe:
    #   failureThreshold: 8
    #   initialDelaySeconds: 10
    #   periodSeconds: 3
    #   successThreshold: 1
    #   timeoutSeconds: 2
    # runtimeClassName: image-rc
    # storage:
    #   engine: wiredTiger
    #   wiredTiger:
    #     engineConfig:
    #       cacheSizeRatio: 0.5
    #       directoryForIndexes: false
    #       journalCompressor: snappy
    #     collectionConfig:
    #       blockCompressor: snappy
    #     indexConfig:
    #       prefixCompression: true
    #   inMemory:
    #     engineConfig:
    #        inMemorySizeRatio: 0.5
    #sidecars:
    #   volumeMounts:
    #     - mountPath: /volume1
    #       name: sidecar-volume-claim
    #     - mountPath: /secret
    #       name: sidecar-secret
    #     - mountPath: /configmap
    #       name: sidecar-config
    # sidecarVolumes:
    # - name: sidecar-secret
    #   secret:
    #     secretName: mysecret
    # - name: sidecar-config
    #   configMap:
    #     name: myconfigmap
    # sidecarPVCs:
    # - apiVersion: v1
    #   kind: PersistentVolumeClaim
    #   metadata:
    #     name: sidecar-volume-claim
    #   spec:
    #     resources:
    #       requests:
    #         storage: 1Gi
    #     volumeMode: Filesystem
    #     accessModes:
    #       - ReadWriteOnce
    podDisruptionBudget:
      maxUnavailable: 1
    expose:
      enabled: true
      exposeType: LoadBalancer
      # loadBalancerSourceRanges:
      #   - 10.0.0.0/8
      serviceAnnotations:
        service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=false
        service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
        service.beta.kubernetes.io/aws-load-balancer-scheme: internal
        service.beta.kubernetes.io/aws-load-balancer-ip-address-type: dualstack
        # Consider enabling cross-zone load balancing and S3 logs
        # service.beta.kubernetes.io/aws-load-balancer-scheme: internal
        # service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
      # serviceLabels:
      #   some-label: some-key
    nonvoting:
      enabled: false
      # podSecurityContext: {}
      # containerSecurityContext: {}
      size: 3
      # configuration: |
      #   operationProfiling:
      #     mode: slowOp
      #   systemLog:
      #     verbosity: 1
      antiAffinityTopologyKey: "kubernetes.io/hostname"
      # tolerations: []
      # priorityClass: ""
      # annotations: {}
      # labels: {}
      # nodeSelector: {}
      podDisruptionBudget:
        maxUnavailable: 1
      resources:
        limits:
          cpu: "300m"
          memory: "0.5G"
        requests:
          cpu: "300m"
          memory: "0.5G"
      volumeSpec:
        # emptyDir: {}
        # hostPath:
        #   path: /data
        pvc:
          # annotations:
          #   volume.beta.kubernetes.io/storage-class: example-hostpath
          # labels:
          #   rack: rack-22
          # storageClassName: standard
          # accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 3Gi
    arbiter:
      enabled: false
      size: 1
      antiAffinityTopologyKey: "kubernetes.io/hostname"
      # tolerations: []
      # priorityClass: ""
      # annotations: {}
      # labels: {}
      # nodeSelector: {}
    # schedulerName: ""
    # resources:
    #   limits:
    #     cpu: "300m"
    #     memory: "0.5G"
    #   requests:
    #     cpu: "300m"
    #     memory: "0.5G"
    volumeSpec:
      # emptyDir: {}
      # hostPath:
      #   path: /data
      pvc:
        # annotations:
        #   volume.beta.kubernetes.io/storage-class: example-hostpath
        # labels:
        #   rack: rack-22
        storageClassName: mongodb
        # accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 250Gi

sharding:
  enabled: false

backup:
  enabled: false
  image:
    repository: percona/percona-backup-mongodb
    tag: 2.0.5
  serviceAccountName: percona-server-mongodb-operator
  #  annotations:
  #  iam.amazonaws.com/role: arn:aws:iam::<removed>:role/removed-test-default-eks-mongodb
  # resources:
  #   limits:
  #     cpu: "300m"
  #     memory: "0.5G"
  #   requests:
  #     cpu: "300m"
  #     memory: "0.5G"
  #storages:
  # minio:
  #   type: s3
  #   s3:
  #     bucket: MINIO-BACKUP-BUCKET-NAME-HERE
  #     region: us-east-1
  #     credentialsSecret: my-cluster-name-backup-minio
  #     endpointUrl: http://minio.psmdb.svc.cluster.local:9000/minio/
  #     prefix: ""
  #   azure-blob:
  #     type: azure
  #     azure:
  #       container: CONTAINER-NAME
  #       prefix: PREFIX-NAME
  #       credentialsSecret: SECRET-NAME
  pitr:
    enabled: false
    # oplogSpanMin: 10
    # compressionType: gzip
    # compressionLevel: 6
    #tasks:

  # - name: daily-s3-us-west
  #   enabled: true
  #   schedule: "0 0 * * *"
  #   keep: 3
  #   storageName: s3-us-west
  #   compressionType: gzip
  # - name: weekly-s3-us-west
  #   enabled: false
  #   schedule: "0 0 * * 0"
  #   keep: 5
  #   storageName: s3-us-west
  #   compressionType: gzip
  # - name: weekly-s3-us-west-physical
  #   enabled: false
  #   schedule: "0 5 * * 0"
  #   keep: 5
  #   type: physical
  #   storageName: s3-us-west
  #   compressionType: gzip
  #   compressionLevel: 6
# If you set users here the secret will be constructed by helm with these values
# users:
#   MONGODB_BACKUP_USER: backup
#   MONGODB_BACKUP_PASSWORD: backup123456
#   MONGODB_DATABASE_ADMIN_USER: databaseAdmin
#   MONGODB_DATABASE_ADMIN_PASSWORD: databaseAdmin123456
#   MONGODB_CLUSTER_ADMIN_USER: clusterAdmin
#   MONGODB_CLUSTER_ADMIN_PASSWORD: clusterAdmin123456
#   MONGODB_CLUSTER_MONITOR_USER: clusterMonitor
#   MONGODB_CLUSTER_MONITOR_PASSWORD: clusterMonitor123456
#   MONGODB_USER_ADMIN_USER: userAdmin
#   MONGODB_USER_ADMIN_PASSWORD: userAdmin123456
#   PMM_SERVER_API_KEY: apikey
#   # PMM_SERVER_USER: admin
#   # PMM_SERVER_PASSWORD: admin

percona-mongodb-operator.yaml

replicaCount: 1

image:
  repository: percona/percona-server-mongodb-operator
  tag: 1.14.0
  pullPolicy: IfNotPresent

# set if you want to specify a namespace to watch
# defaults to `.Release.namespace` if left blank
# watchNamespace:

# set if operator should be deployed in cluster wide mode. defaults to false
watchAllNamespaces: false

# rbac: settings for deployer RBAC creation
rbac:
  # rbac.create: if false, RBAC resources are expected to already be in place
  create: true

# serviceAccount: settings for Service Accounts used by the deployer
serviceAccount:
  # serviceAccount.create: Whether to create the Service Accounts or not
  create: true

podAnnotations: {}
  # prometheus.io/scrape: "true"
  # prometheus.io/port: "8080"

podSecurityContext: {}
  # runAsNonRoot: true
  # runAsUser: 2
  # runAsGroup: 2
  # fsGroup: 2
  # fsGroupChangePolicy: "OnRootMismatch"

securityContext: {}
  # allowPrivilegeEscalation: false
  # capabilities:
  #   drop:
  #   - ALL
  # seccompProfile:
  #   type: RuntimeDefault

# set if you want to use a different operator name
# defaults to `percona-server-mongodb-operator`
# operatorName:

imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""

env:
  resyncPeriod: 5s
  logVerbose: false

resources: {}
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  # limits:
  #   cpu: 100m
  #   memory: 128Mi
  # requests:
  #   cpu: 100m
  #   memory: 128Mi

nodeSelector: {}


tolerations: []

affinity: {}

Note: I am able to revert the image tag back to 6.0.4-3 and get it up and running again once I experience the error.

Note 2: I also want to mention that I am having difficulty connecting to MongoDB via the generated kubectl command, but I am able to connect to it via mongosh using the load balancer URL that was created.

kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:5.0 --restart=Never \
  -- mongo "mongodb+srv://${ADMIN_USER}:${ADMIN_PASSWORD}@psmdb-db-internal-rs0.mongodb2.svc.cluster.local/admin?replicaSet=rs0&ssl=false"

{"t":{"$date":"2024-07-11T15:50:11.436Z"},"s":"I", "c":"NETWORK", "id":4333208, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM host selection timeout","attr":{"replicaSet":"rs0","error":"FailedToSatisfyReadPreference: Could not find host matching read preference { mode: "nearest" } for set rs0"}}
Error: Could not find host matching read preference { mode: "nearest" } for set rs0, rs0/psmdb-db-internal-rs0-0.psmdb-db-internal-rs0.mongodb2.svc.cluster.local:27017 :
connect@src/mongo/shell/mongo.js:372:17
@(connect):2:6
exception: connect failed
exiting with code 1
pod “percona-client” deleted
pod default/percona-client terminated (Error)
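On Note 2: I have not dug into why the mongodb+srv replica-set discovery fails in-cluster, but as a hedged workaround sketch, connecting straight to the single member with directConnection=true (a standard connection-string option) and mongosh avoids discovery entirely. The host name is taken from the logs above, and the credentials and ssl setting mirror my original command:

# Hypothetical workaround: bypass SRV lookup and replica-set discovery, connect to the member directly
kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:6.0 --restart=Never \
  -- mongosh "mongodb://${ADMIN_USER}:${ADMIN_PASSWORD}@psmdb-db-internal-rs0-0.psmdb-db-internal-rs0.mongodb2.svc.cluster.local:27017/admin?directConnection=true&ssl=false"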

If anyone comes across this: I managed to find a solution to get this working properly. I needed to upgrade the operator to version 1.16.1 first. These are the steps I followed.

  1. Upgrade the operator (with image.tag set to 1.16.1 in psmdb-operator.values.yaml):
helm upgrade psmdb-operator percona/psmdb-operator --version 1.16.1 -f psmdb-operator.values.yaml -n mongodb2
image:
  repository: percona/percona-server-mongodb-operator
  tag: 1.16.1
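Before touching the database CR, I checked that the operator was actually running the new image (a sketch; the deployment name assumes the chart's default naming for the psmdb-operator release):

kubectl -n mongodb2 get deploy psmdb-operator -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
kubectl -n mongodb2 logs deploy/psmdb-operator --tail=20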
  2. Upgrade the server chart (with crVersion set to 1.16.1 in psmdb-db-internal.values.yaml):
helm upgrade psmdb-db-internal percona/psmdb-db --namespace mongodb2 -f psmdb-db-internal.values.yaml --version 1.16.1
crVersion: 1.16.1
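To confirm the CR picked up the new crVersion and to keep an eye on the cluster state, something like this works (a sketch using the psmdb short name for the custom resource):

kubectl -n mongodb2 get psmdb psmdb-db-internal \
  -o custom-columns=NAME:.metadata.name,CRVERSION:.spec.crVersion,STATE:.status.state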
  3. Update the CRDs (I had to add --force-conflicts, as it errored the first time):
kubectl apply --server-side -f https://raw.githubusercontent.com/percona/percona-server-mongodb-operator/v1.16.1/deploy/crd.yaml --force-conflicts
kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mongodb-operator/v1.16.1/deploy/rbac.yaml
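To double-check that the server-side apply actually landed the 1.16.1 schema, a rough sanity check (the unsafeFlags field used in the next step only exists in the newer CRD):

kubectl get crd perconaservermongodbs.psmdb.percona.com
kubectl explain psmdb.spec.unsafeFlags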
  4. Then I updated the server values file psmdb-db-internal.values.yaml to comply with the spec changes:
unsafeFlags:
  replsetSize: true
...

replsets:
  rs0:
    size: 1
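Because the replsets layout changed between chart versions (a list of entries with name: before, a map keyed by the replset name now), I found it useful to render the chart with the edited values and dry-run it against the API server before the real upgrade (a sketch):

helm template psmdb-db-internal percona/psmdb-db --version 1.16.1 \
  --namespace mongodb2 -f psmdb-db-internal.values.yaml | kubectl apply --dry-run=server -f -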
  5. Finally, I updated the server image tag and ran the upgrade again:
image:
  repository: percona/percona-server-mongodb
  tag: 6.0.15-12
helm upgrade psmdb-db-internal percona/psmdb-db --namespace mongodb2 -f psmdb-db-internal.values.yaml --version 1.16.1
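Afterwards I waited until the CR reported ready and the pod was running the new image; roughly (a sketch, assuming the mongod container keeps its default name):

kubectl -n mongodb2 get psmdb psmdb-db-internal -o jsonpath='{.status.state}{"\n"}'
kubectl -n mongodb2 get pod psmdb-db-internal-rs0-0 \
  -o jsonpath='{.spec.containers[?(@.name=="mongod")].image}{"\n"}'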

The context behind this change is that I need to perform a cluster-to-cluster sync between a new cluster running MongoDB version 7 and an existing MongoDB version 6 cluster, but cluster-to-cluster sync requires a minimum patch version of 6.0.13.