Upgrade from 1.6.0 to 1.7.0 fails

I’m upgrading the MongoDB operator from 1.6.0 to 1.7.0. Once I deploy the new operator with the 1.7.0 image, my psmdb goes into the initializing state and stays there indefinitely, while the 1.7.0 operator logs the following error:

{"level":"error","ts":1643043912.3269353,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"psmdb-controller","request":"datalayer/mongodb","error":"reconcile StatefulSet for rs0: update StatefulSet mongodb-rs0: StatefulSet.apps \"mongodb-rs0\" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden","errorVerbose":"reconcile StatefulSet for rs0: update StatefulSet mongodb-rs0: StatefulSet.apps \"mongodb-rs0\" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:350\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1373","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/percona/percona-server-mongodb-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
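
The operator writes each log entry as a single JSON object, so the short `error` field is easier to read once the stack-trace fields are dropped. A minimal Python sketch of that, assuming one JSON entry per line; the function name `summarize_log_line` and the shortened `sample` entry are illustrative and not part of the operator:

```python
import json

def summarize_log_line(line: str) -> dict:
    """Return the key fields of one JSON log line, leaving out the stack traces."""
    entry = json.loads(line)
    return {key: entry.get(key) for key in ("level", "controller", "request", "error")}

# Shortened stand-in for the log entry above (error text abbreviated,
# stack-trace content replaced by a placeholder).
sample = json.dumps({
    "level": "error",
    "msg": "Reconciler error",
    "controller": "psmdb-controller",
    "request": "datalayer/mongodb",
    "error": "reconcile StatefulSet for rs0: update StatefulSet mongodb-rs0: ...",
    "stacktrace": "github.com/go-logr/zapr.(*zapLogger).Error ...",
})

summary = summarize_log_line(sample)
print(summary["request"], "->", summary["error"])
```

Feeding the real log line to the same function should give the "updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden" message without the surrounding trace.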

If I roll the operator back to the 1.6.0 image, my psmdb goes into the ready state and the errors stop appearing in the log.

What am I missing?

Thanks,
Andrey

Hi @andreyk,

Could you please share how you tried to upgrade? Did you use Helm, or did you install/upgrade manually?

We use ArgoCD to deploy applications into our k8s cluster.
Changes to templates:

Then I sync it into the cluster.

I’ve created a new PSMDB with the 1.7.0 operator and compared the generated StatefulSet with the one created by 1.6.0. There is a difference in the label selectors (left side is 1.6.0, right side is 1.7.0):


So with two PSMDBs (one created by 1.6.0, the other by 1.7.0), when I deploy the 1.6.0 operator, the 1.6.0 PSMDB goes into the “ready” state, while the 1.7.0 PSMDB goes into the “initializing” state.
When I deploy the 1.7.0 operator, the 1.6.0 PSMDB goes into the “initializing” state, while the 1.7.0 PSMDB goes into the “ready” state.
Is there any way to align the StatefulSet without recreating it?
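
The error in the operator log says that only 'replicas', 'template', and 'updateStrategy' may be updated on an existing StatefulSet, so a changed selector would be rejected as a forbidden update. A rough Python sketch for checking which top-level spec fields differ outside that set. The helper name `forbidden_changes` is made up for illustration, and the label keys and values in the two specs are placeholders, not the labels the operator actually generates:

```python
# Fields that may be updated in place, taken from the error message in the log.
MUTABLE_FIELDS = {"replicas", "template", "updateStrategy"}

def forbidden_changes(old_spec: dict, new_spec: dict) -> list:
    """Return sorted top-level spec fields that differ and cannot be updated in place."""
    keys = set(old_spec) | set(new_spec)
    return sorted(
        key for key in keys
        if key not in MUTABLE_FIELDS and old_spec.get(key) != new_spec.get(key)
    )

# Hypothetical specs: label keys and values are placeholders only.
spec_160 = {
    "replicas": 3,
    "serviceName": "mongodb-rs0",
    "selector": {"matchLabels": {"app": "example", "replset": "rs0"}},
}
spec_170 = {
    "replicas": 3,
    "serviceName": "mongodb-rs0",
    "selector": {"matchLabels": {"app": "example", "instance": "mongodb", "replset": "rs0"}},
}

print(forbidden_changes(spec_160, spec_170))  # → ['selector']
```

The real specs could be taken from the `spec` field of `kubectl get statefulset mongodb-rs0 -n datalayer -o json` for each StatefulSet and passed to the same function.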

@andreyk I’m not sure what state your cluster is in now or what is going on.

Could you please share the pods, psmdb objects, and StatefulSets?

@Sergey_Pronin Currently I have the 1.7.0 operator deployed and two psmdb objects:

NAME           ENDPOINT                                            STATUS         AGE
mongodb        mongodb-rs0.datalayer.svc.cluster.local             initializing   3d
mongodb-test   mongodb-test-rs0-test.datalayer.svc.cluster.local   ready          96m

“mongodb” was created with the 1.6.0 operator and “mongodb-test” was created with the 1.7.0 operator.
If I deploy the 1.6.0 operator, the states switch: mongodb-test goes to initializing and mongodb goes to ready.
Unfortunately I can’t attach files here, so I’ve uploaded all the files to the drive:
https://drive.google.com/drive/folders/1LGV6AFXJxKQa5aPRNGTNhJOpyLfjsB8I?usp=sharing
