Helm upgrade from 1.11 to 1.12: confusing CRDs

Hello Team,
we have adjusted the configuration of our clusters according to the release notes for v1.12.

However, some of the settings that were moved do not work as we expected.

The spec.mongod section is removed from the Custom Resource configuration. Starting from now, mongod options should be passed to Replica Sets using spec.replsets.[].configuration key…

For example, we moved operationProfiling from spec.mongod to spec.replsets[0].configuration without success. The desired operationProfiling settings are not applied.

Using spec.mongod as before, everything works as expected.

If we take a look at the v1.12 CRDs in the Helm chart repo, we see that the spec.mongod section is still there and has not been removed as advertised.

https://github.com/percona/percona-helm-charts/blob/123ca017d2f5133808f4959be19bfe3b25e1a469/charts/psmdb-operator/crds/crd.yaml#L625

What is now the right way to configure these things?

Here is a rendered version of one of our clusters:

apiVersion: psmdb.percona.com/v1-12-0
kind: PerconaServerMongoDB
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"psmdb.percona.com/v1-12-0","kind":"PerconaServerMongoDB"}
  name: test-psmdb-db
  finalizers:
    - delete-psmdb-pods-in-order
    - delete-psmdb-pvc
spec:
  pause: false
  unmanaged: false
  image: "percona/percona-server-mongodb:4.4.8-9"
  imagePullPolicy: "Always"
  multiCluster:
    enabled: false
  secrets:
    users: test-psmdb-db-secrets
    encryptionKey: test-psmdb-db-mongodb-encryption-key
  updateStrategy: SmartUpdate
  upgradeOptions:
    versionServiceEndpoint: https://check.percona.com
    apply: 4.4-recommended
    schedule: 0 2 * * *
    setFCV: false
  pmm:
    enabled: false
    image: "percona/pmm-client:2.28.0"
    serverHost: monitoring-service
  replsets:
  - name: rs0
    size: 3
    configuration: |
      operationProfiling:
        mode: slowOp
        slowOpThresholdMs: 100
    affinity:
      antiAffinityTopologyKey: kubernetes.io/hostname
    nodeSelector:
      dedicated: database
    tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: database
    livenessProbe:
      failureThreshold: 4
      initialDelaySeconds: 60
      periodSeconds: 30
      startupDelaySeconds: 7200
      successThreshold: 1
      timeoutSeconds: 5
    readinessProbe:
      failureThreshold: 8
      initialDelaySeconds: 10
      periodSeconds: 5
      successThreshold: 1
      timeoutSeconds: 5
    storage:
      engine: wiredTiger
      inMemory:
        engineConfig:
          inMemorySizeRatio: 0.5
      wiredTiger:
        collectionConfig:
          blockCompressor: snappy
        engineConfig:
          cacheSizeRatio: 0.5
          directoryForIndexes: false
          journalCompressor: snappy
        indexConfig:
          prefixCompression: true
    podDisruptionBudget:
      maxUnavailable: 1
    expose:
      enabled: false
      exposeType: ClusterIP
    nonvoting:
      enabled: false
      size: 3
      affinity:
        antiAffinityTopologyKey: kubernetes.io/hostname
      podDisruptionBudget:
        maxUnavailable: 1
      resources:
        limits:
          cpu: 300m
          memory: 0.5G
        requests:
          cpu: 300m
          memory: 0.5G
      volumeSpec:
        persistentVolumeClaim:
          resources:
            requests:
              storage: 10Gi
  mongod:
    setParameter:
      ttlMonitorSleepSecs: 60
      wiredTigerConcurrentReadTransactions: 128
      wiredTigerConcurrentWriteTransactions: 128
    storage:
      engine: wiredTiger
      inMemory:
        engineConfig:
          inMemorySizeRatio: 0.9
      wiredTiger:
        engineConfig:
          cacheSizeRatio: 0.5
          directoryForIndexes: false
          journalCompressor: snappy
        collectionConfig:
          blockCompressor: snappy
        indexConfig:
          prefixCompression: true
    operationProfiling:
      mode: slowOp
      slowOpThresholdMs: 100
      rateLimit: 100
  backup:
    enabled: true
    image: "percona/percona-server-mongodb-operator:1.11.0-backup"
    serviceAccountName: percona-server-mongodb-operator
    storages:
      s11-dev-backup:
        s3:
          bucket: psmdb-backup
          credentialsSecret: test-psmdb-db-backup-secret
          endpointUrl: https://xxx
          region: us-east-1
        type: s3
    pitr:
      enabled: true
    tasks:
      - compressionType: gzip
        enabled: true
        keep: 7
        name: daily-dev
        schedule: 0 18 * * *
        storageName: s11-dev-backup

We would be very happy to be enlightened :)

best regards,

Ricardo

After checking the operator code, there seems to be a problem with spec.crVersion.
It seems our PerconaServerMongoDB Resources have the wrong version.

As can be seen here, the spec.crVersion field is currently not managed by the psmdb-db Helm chart.

In our case, spec.crVersion is set to 1.10.0, which is probably what causes the problems with the configuration. How should the field be handled? Does the operator take care of it? Do we have to patch the field ourselves via Helm?
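
For anyone who wants to check this on their own cluster, here is a minimal sketch of how the field could be read. The function name get_cr_version is made up for illustration, and it assumes the short resource name psmdb that the upgrade documentation uses in its kubectl commands.

```shell
# Print spec.crVersion of a PerconaServerMongoDB resource.
# Prints an empty line when the field is not set.
# $1: resource name (for example test-psmdb-db from the manifest above)
get_cr_version() {
  kubectl get psmdb "$1" -o jsonpath='{.spec.crVersion}{"\n"}'
}
```

Calling get_cr_version test-psmdb-db in the namespace of the cluster should show which CR version the operator sees for that resource.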

OK, I think there is a misunderstanding here…

As I can read here, there are two upgrade variants.

  • Semi-automatic upgrade
  • Manual upgrade

We thought Helm would be a third option that would make the other steps unnecessary.

  • Full-automatic upgrade

I think that part of the documentation only refers to a deployment with deploy/cr.yaml.

For example, the psmdb-operator Helm Chart takes care of the RBAC update from the documentation but doesn’t take care of the other parts of the update.

These manual apply steps are not needed with the psmdb-operator Helm chart:

$ kubectl apply --server-side -f https://raw.githubusercontent.com/percona/percona-server-mongodb-operator/v1.12.0/deploy/crd.yaml
$ kubectl apply -f https://raw.githubusercontent.com/percona/percona-server-mongodb-operator/v1.12.0/deploy/rbac.yaml

Patching the operator yourself is also unnecessary.

$ kubectl patch deployment percona-server-mongodb-operator \
   -p'{"spec":{"template":{"spec":{"containers":[{"name":"percona-server-mongodb-operator","image":"percona/percona-server-mongodb-operator:1.12.0"}]}}}}'

But it seems that the last part of the documentation is absolutely necessary:

$ kubectl patch psmdb my-cluster-name --type=merge --patch '{
   "spec": {
      "crVersion":"1.12.0",
      "image": "percona/percona-server-mongodb:4.4.13-13",
      "backup": { "image": "percona/percona-server-mongodb-operator:1.12.0-backup" },
      "pmm": { "image": "percona/pmm-client:2.27.0" }
   }}'

However, the fields image, backup and pmm are also managed via the HelmChart.

Only crVersion was omitted in the psmdb-db Chart.
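
Since image, backup and pmm already come from the chart, the patch from the documentation could be reduced to the one missing field. This is a minimal sketch under that assumption, not a documented procedure. Both function names are hypothetical; the kubectl patch call has the same form as the documentation command above.

```shell
# Build a merge patch that touches only spec.crVersion.
# $1: target CR version, for example 1.12.0
crversion_patch() {
  printf '{"spec":{"crVersion":"%s"}}' "$1"
}

# Apply that patch to one cluster resource.
# $1: resource name, $2: target CR version
patch_crversion() {
  kubectl patch psmdb "$1" --type=merge --patch "$(crversion_patch "$2")"
}
```

For the cluster from the first post this would be called as patch_crversion test-psmdb-db 1.12.0, after the operator itself has been upgraded.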

It is a two-phase update anyway: first the psmdb-operator chart, then the psmdb-db chart.
Why doesn’t psmdb-db set the appropriate spec.crVersion?

I think at least the documentation should be expanded…

Hi Ricardo! I am also a bit confused about upgrading MongoDB using Helm charts.

  1. You found out that the Helm charts did everything necessary except patching crVersion. Why do we need to patch crVersion at all? When I edit my psmdb resource (kubectl edit perconaservermongodb.psmdb.percona.com/databasemgmt-psmdb-db -n mongodb), there is no crVersion.

  2. With Helm charts, do upgrades probably also have to be done incrementally (e.g. 1.11 to 1.13 must go via 1.12)?

Best regards, Anton

Hey Anton,
it is probably the case that the operator includes the code paths for all versions.
The PerconaServerMongoDB resource then decides in which “mode” the operator runs for that single database.

If you use a 1.12 operator and your database uses spec.crVersion=1.10.0, the operator will use the 1.10.0 code internally. So you can simply upgrade the operator and upgrade the databases later.

So I think there is no need for incremental upgrades for minor / patch versions of the operator.

There is a note in the code regarding the missing spec.crVersion.
The operator always uses spec.crVersion internally to compare versions.

If spec.crVersion is empty like in your case, then a fallback is used which fills the version internally.

As you can see, setVersion() uses kubectl.kubernetes.io/last-applied-configuration as a fallback.

It could be that spec.crVersion is from an earlier version and is no longer used for new databases.
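
To make the fallback idea concrete: the annotation contains the apiVersion of the last applied manifest (psmdb.percona.com/v1-12-0 in the manifest from the first post), and a version string can be derived from it. The following is only a sketch of that derivation in shell, not the operator's actual code, and both function names are made up.

```shell
# Turn an apiVersion such as "psmdb.percona.com/v1-12-0" into "1.12.0".
api_version_to_cr_version() {
  version="${1##*/v}"                       # keep what follows the last "/v"
  printf '%s\n' "$version" | sed 's/-/./g'  # dashes become dots
}

# Print the raw last-applied-configuration annotation of a resource.
# $1: resource name
last_applied_configuration() {
  kubectl get psmdb "$1" \
    -o jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}'
}
```

The second function prints the JSON in which the apiVersion can be looked up by hand; the first one then shows which version that apiVersion corresponds to.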

best regards,

Ricardo

Hi Ricardo!

Thank you for the explanation.

I ran a test using the Helm charts:

  • installed operator and 2 databases, all version 1.11.0
  • upgraded operator to 1.12.0
  • upgraded one database to 1.12.0, other database stayed 1.11.0

After the upgrade, the value of kubectl.kubernetes.io/last-applied-configuration was updated as expected: it shows 1.12.0 for the upgraded database and still 1.11.0 for the other one.
So I guess there is no need to patch spec.crVersion when using Helm.
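
The test sequence above could be scripted roughly like this. It is a sketch only: the function name is made up, and the release names and chart references in the usage note assume that the Percona Helm repository was added under the name percona. Adjust them to your installation.

```shell
# Upgrade one Helm release to a given chart version.
# $1: release name, $2: chart reference, $3: chart version
# Any further arguments (for example -f values.yaml) are passed on to helm.
upgrade_release() {
  release=$1
  chart=$2
  version=$3
  shift 3
  helm upgrade "$release" "$chart" --version "$version" "$@"
}
```

The operator would go first, for example upgrade_release psmdb-operator percona/psmdb-operator 1.12.0, followed by one call per database release that should move to the new version, for example upgrade_release db-one percona/psmdb-db 1.12.0 -f values.yaml.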

Do you maybe know whether I need to upgrade the databases incrementally or not? Can I just run Helm to upgrade from, e.g., 1.10 to 1.12? As far as I understand, it replaces the binaries and switches the operator to the new mode with the new settings, so I guess it can be done?

Best regards, Anton
