Percona MongoDB operator restarts with error "fatal error: concurrent map read and map write"

Description:

Today I noticed that the Percona Operator for MongoDB has restarted ~200 times on my Kubernetes cluster.
The database cluster runs one backup task and is configured as a three-member replica set with sharding disabled.

I could not find anything resembling this error on the internet. The logs below were collected with kubectl logs mongodb-operator-psmdb-operator-XYZ --previous

Steps to Reproduce:

I installed the operator using Helm.
My percona/psmdb-operator values are empty.
My percona/psmdb-db values look like this:

USER-SUPPLIED VALUES:
backup:
  enabled: true
  storages:
    s3-storage:
      s3:
        bucket: <my-bucket-name>
        credentialsSecret: <my-secret-name>
        prefix: <my-prefix>
        region: <my-region>
      type: s3
  tasks:
  - enabled: true
    keep: 56
    name: mongodb-backup
    schedule: 0 */3 * * *
    storageName: s3-storage
    type: logical
replsets:
  rs0:
    resources:
      limits:
        cpu: null
        memory: 1Gi
      requests:
        cpu: 300m
        memory: 500M
    volumeSpec:
      pvc:
        resources:
          requests:
            storage: 20Gi
sharding:
  enabled: false

Version:

percona/percona-server-mongodb-operator:1.16.2

Logs:

2024-12-23T07:30:53.137Z	INFO	setup	Manager starting up	{"gitCommit": "13627d423321257e18b77d270af922c6cd17c8f0", "gitBranch": "release-1-16-2", "goVersion": "go1.22.5", "os": "linux", "arch": "amd64"}
2024-12-23T07:30:53.178Z	INFO	server version	{"platform": "kubernetes", "version": "v1.30.2"}
2024-12-23T07:30:53.189Z	INFO	controller-runtime.metrics	Starting metrics server
2024-12-23T07:30:53.190Z	INFO	controller-runtime.metrics	Serving metrics server	{"bindAddress": ":8080", "secure": false}
2024-12-23T07:30:53.190Z	INFO	starting server	{"name": "health probe", "addr": "[::]:8081"}
I1223 07:30:53.190962       1 leaderelection.go:250] attempting to acquire leader lease default/08db0feb.percona.com...
I1223 07:31:10.436850       1 leaderelection.go:260] successfully acquired lease default/08db0feb.percona.com
2024-12-23T07:31:10.441Z	INFO	Starting EventSource	{"controller": "psmdb-controller", "source": "kind source: *v1.PerconaServerMongoDB"}
2024-12-23T07:31:10.441Z	INFO	Starting Controller	{"controller": "psmdb-controller"}
2024-12-23T07:31:10.444Z	INFO	Starting EventSource	{"controller": "psmdbbackup-controller", "source": "kind source: *v1.PerconaServerMongoDBBackup"}
2024-12-23T07:31:10.448Z	INFO	Starting EventSource	{"controller": "psmdbbackup-controller", "source": "kind source: *v1.Pod"}
2024-12-23T07:31:10.448Z	INFO	Starting Controller	{"controller": "psmdbbackup-controller"}
2024-12-23T07:31:10.441Z	INFO	Starting EventSource	{"controller": "psmdbrestore-controller", "source": "kind source: *v1.PerconaServerMongoDBRestore"}
2024-12-23T07:31:10.453Z	INFO	Starting EventSource	{"controller": "psmdbrestore-controller", "source": "kind source: *v1.Pod"}
2024-12-23T07:31:10.453Z	INFO	Starting Controller	{"controller": "psmdbrestore-controller"}
2024-12-23T07:31:10.795Z	INFO	Starting workers	{"controller": "psmdbbackup-controller", "worker count": 1}
2024-12-23T07:31:10.840Z	INFO	Starting workers	{"controller": "psmdb-controller", "worker count": 1}
2024-12-23T07:31:10.841Z	INFO	Starting workers	{"controller": "psmdbrestore-controller", "worker count": 1}
2024-12-23T07:31:11.061Z	INFO	Creating or updating backup job	{"controller": "psmdb-controller", "object": {"name":"mongodb-psmdb-db","namespace":"default"}, "namespace": "default", "name": "mongodb-psmdb-db", "reconcileID": "175772d4-1478-4b51-b99f-04ec4b3d179e", "name": "mongodb-backup", "namespace": "default", "schedule": "0 */3 * * *"}
2024-12-23T07:31:11.884Z	INFO	add new job	{"controller": "psmdb-controller", "object": {"name":"mongodb-psmdb-db","namespace":"default"}, "namespace": "default", "name": "mongodb-psmdb-db", "reconcileID": "175772d4-1478-4b51-b99f-04ec4b3d179e", "name": "ensure-version/default/mongodb-psmdb-db", "schedule": "0 2 * * *"}
fatal error: concurrent map read and map write

goroutine 200 [running]:
k8s.io/apimachinery/pkg/runtime.(*Scheme).ObjectKinds(0xc000525dc0, {0x299e138?, 0xc000daab48?})
	/go/pkg/mod/k8s.io/apimachinery@v0.30.0/pkg/runtime/scheme.go:267 +0x1cb
sigs.k8s.io/controller-runtime/pkg/client/apiutil.GVKForObject({0x299e138, 0xc000daab48}, 0x0?)
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/client/apiutil/apimachinery.go:115 +0x21f
sigs.k8s.io/controller-runtime/pkg/cache.(*informerCache).Get(0xc0005c08e8, {0x29b1d20, 0xc001b2ddd0}, {{0xc0012552a0?, 0x1a?}, {0xc00124d290?, 0x11?}}, {0x29cd280, 0xc000daab48}, {0x0, ...})
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/cache/informer_cache.go:75 +0x97
sigs.k8s.io/controller-runtime/pkg/cache.(*multiNamespaceCache).Get(0xc0008425d0, {0x29b1d20, 0xc001b2ddd0}, {{0xc0012552a0, 0x7}, {0xc00124d290, 0x2a}}, {0x29cd280, 0xc000daab48}, {0x0, ...})
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/cache/multi_namespace_cache.go:244 +0x22d
sigs.k8s.io/controller-runtime/pkg/client.(*client).Get(0xc0006e2b40, {0x29b1d20, 0xc001b2ddd0}, {{0xc0012552a0?, 0xc0014e1300?}, {0xc00124d290?, 0xc000505940?}}, {0x29cd280, 0xc000daab48}, {0x0, ...})
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/client/client.go:356 +0x230
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodbbackup.(*ReconcilePerconaServerMongoDBBackup).Reconcile(0xc00095c8a0, {0x29b1d20, 0xc001b2ddd0}, {{{0xc0012552a0?, 0x5?}, {0xc00124d290?, 0xc000505d10?}}})
	/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodbbackup/perconaservermongodbbackup_controller.go:120 +0x124
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x29b6b08?, {0x29b1d20?, 0xc001b2ddd0?}, {{{0xc0012552a0?, 0xb?}, {0xc00124d290?, 0x0?}}})
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:114 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000962160, {0x29b1d58, 0xc000844690}, {0x1fe0200, 0xc0017659c0})
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:311 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000962160, {0x29b1d58, 0xc000844690})
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:261 +0x1be
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:222 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 100
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:218 +0x486

goroutine 1 [select, 86 minutes]:
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).Start(0xc0004b6000, {0x29b1d58, 0xc000844eb0})
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/manager/internal.go:435 +0x996
main.main()
	/go/src/github.com/percona/percona-server-mongodb-operator/cmd/manager/main.go:160 +0xc7d

goroutine 52 [syscall, 86 minutes]:
os/signal.signal_recv()
	/usr/local/go/src/runtime/sigqueue.go:152 +0x29
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:23 +0x13
created by os/signal.Notify.func1.1 in goroutine 1
	/usr/local/go/src/os/signal/signal.go:151 +0x1f

goroutine 54 [chan receive, 86 minutes]:
sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile(0xc0006e3320)
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/manager/runnable_group.go:186 +0x45
created by sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).Start.func1 in goroutine 1
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/manager/runnable_group.go:139 +0xc8

goroutine 53 [chan receive, 86 minutes]:
sigs.k8s.io/controller-runtime/pkg/manager/signals.SetupSignalHandler.func1()
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/manager/signals/signal.go:38 +0x27
created by sigs.k8s.io/controller-runtime/pkg/manager/signals.SetupSignalHandler in goroutine 1
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/manager/signals/signal.go:37 +0xc5

goroutine 44 [IO wait]:
internal/poll.runtime_pollWait(0x7fc9bce4fde8, 0x72)
	/usr/local/go/src/runtime/netpoll.go:345 +0x85
internal/poll.(*pollDesc).wait(0xc000838b00?, 0xc000a10000?, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000838b00, {0xc000a10000, 0xa000, 0xa000})
	/usr/local/go/src/internal/poll/fd_unix.go:164 +0x27a
net.(*netFD).Read(0xc000838b00, {0xc000a10000?, 0x7fc9bcc164e0?, 0xc000ff83a8?})
	/usr/local/go/src/net/fd_posix.go:55 +0x25
net.(*conn).Read(0xc000683990, {0xc000a10000?, 0xc000d39938?, 0x411bbb?})
	/usr/local/go/src/net/net.go:185 +0x45
crypto/tls.(*atLeastReader).Read(0xc000ff83a8, {0xc000a10000?, 0x0?, 0xc000ff83a8?})
	/usr/local/go/src/crypto/tls/conn.go:806 +0x3b
bytes.(*Buffer).ReadFrom(0xc0007dad30, {0x2994160, 0xc000ff83a8})
	/usr/local/go/src/bytes/buffer.go:211 +0x98
crypto/tls.(*Conn).readFromUntil(0xc0007daa88, {0x29941e0, 0xc000683990}, 0xc000d39980?)
	/usr/local/go/src/crypto/tls/conn.go:828 +0xde
crypto/tls.(*Conn).readRecordOrCCS(0xc0007daa88, 0x0)
	/usr/local/go/src/crypto/tls/conn.go:626 +0x3cf
crypto/tls.(*Conn).readRecord(...)
	/usr/local/go/src/crypto/tls/conn.go:588
crypto/tls.(*Conn).Read(0xc0007daa88, {0xc0008f9000, 0x1000, 0xc001585180?})
	/usr/local/go/src/crypto/tls/conn.go:1370 +0x156
bufio.(*Reader).Read(0xc0008f4840, {0xc0006f7000, 0x9, 0x3aeb8b0?})
	/usr/local/go/src/bufio/bufio.go:241 +0x197
io.ReadAtLeast({0x2992940, 0xc0008f4840}, {0xc0006f7000, 0x9, 0x9}, 0x9)
	/usr/local/go/src/io/io.go:335 +0x90
io.ReadFull(...)
	/usr/local/go/src/io/io.go:354
golang.org/x/net/http2.readFrameHeader({0xc0006f7000, 0x9, 0xc000d39dc0?}, {0x2992940?, 0xc0008f4840?})
	/go/pkg/mod/golang.org/x/net@v0.24.0/http2/frame.go:237 +0x65
golang.org/x/net/http2.(*Framer).ReadFrame(0xc0006f6fc0)
	/go/pkg/mod/golang.org/x/net@v0.24.0/http2/frame.go:498 +0x85
golang.org/x/net/http2.(*clientConnReadLoop).run(0xc000d39fa8)
	/go/pkg/mod/golang.org/x/net@v0.24.0/http2/transport.go:2429 +0xd8
golang.org/x/net/http2.(*ClientConn).readLoop(0xc000836780)
	/go/pkg/mod/golang.org/x/net@v0.24.0/http2/transport.go:2325 +0x65
created by golang.org/x/net/http2.(*ClientConn).goRun in goroutine 43
	/go/pkg/mod/golang.org/x/net@v0.24.0/http2/transport.go:369 +0x2d
(...)

Full log: posted on Pastebin.com

Expected Result:

I would like to understand what the issue is and why it happened.

Actual Result:

The error appears and the operator pod is restarted.

Additional Information:

Calling on @Inel_Pandzic or @Ege_Gunes for help.

Hello @Michal_Bartnicki, thanks for reporting this!

I found the cause of the issue and created a task so we can fix it in our next release.

https://perconadev.atlassian.net/browse/K8SPSMDB-1239
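
For readers who land here from a search: the Go runtime stops the whole process with this fatal error when it detects one goroutine reading a plain map while another writes to it, and the error cannot be caught with recover, which is why the pod restarts. In the trace above the read happens inside Scheme.ObjectKinds; the writing goroutine is not visible in the excerpt. The sketch below is not the operator's code and not the fix tracked in K8SPSMDB-1239. It only illustrates the general pattern of guarding a shared map with a sync.RWMutex, and the kindRegistry type and its methods are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// kindRegistry is a hypothetical stand-in for any shared lookup table
// that several goroutines read and write. It is NOT the Kubernetes
// runtime.Scheme and NOT code from the operator.
type kindRegistry struct {
	mu    sync.RWMutex
	kinds map[string]string
}

func newKindRegistry() *kindRegistry {
	return &kindRegistry{kinds: make(map[string]string)}
}

// Register takes the write lock, so no reader can observe the map
// while it is being modified.
func (r *kindRegistry) Register(typeName, kind string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.kinds[typeName] = kind
}

// Lookup takes the read lock; many lookups may run in parallel,
// but never at the same time as a Register call.
func (r *kindRegistry) Lookup(typeName string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	kind, ok := r.kinds[typeName]
	return kind, ok
}

func main() {
	reg := newKindRegistry()
	var wg sync.WaitGroup

	// Readers and writers run concurrently. Without the mutex, this
	// access pattern is a data race and can abort the program with
	// "fatal error: concurrent map read and map write".
	for i := 0; i < 8; i++ {
		wg.Add(2)
		go func(n int) {
			defer wg.Done()
			reg.Register(fmt.Sprintf("type-%d", n), "Kind")
		}(i)
		go func(n int) {
			defer wg.Done()
			reg.Lookup(fmt.Sprintf("type-%d", n))
		}(i)
	}
	wg.Wait()

	_, ok := reg.Lookup("type-0")
	fmt.Println("type-0 registered:", ok)
}
```

Another common way to avoid this class of crash is to finish all writes to the shared map before any concurrent readers start, so that no locking is needed afterwards. Which approach applies to the operator is up to the maintainers.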