Description:
One mysqld process was killed by the OOM (out of memory) killer, then the whole cluster crashed, and now it cannot recover.
Steps to Reproduce:
I have a Kubernetes cluster:
- Harvester 1.3.1
- 2 bare-metal nodes (32 cores Xeon Gold 5218, 128 GB RAM, NVMe disks, 10 Gbps Ethernet) acting as both management and worker nodes, plus 1 witness node
- Installed “Percona Operator for MySQL based on Percona Server” 0.8.0:
helm install percona-op percona/ps-operator --namespace percona --create-namespace
helm install db3 percona/ps-db --namespace percona \
--set unsafeFlags.mysqlSize=true \
--set mysql.size=2 \
--set mysql.volumeSpec.pvc.storageClassName=ssd-1local \
--set mysql.volumeSpec.pvc.resources.requests.storage=20Gi \
--set proxy.haproxy.size=2
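After installation I can verify the pods and the cluster custom resource with plain kubectl (ps is the short name the operator registers for PerconaServerMySQL):
kubectl get pods -n percona
kubectl get ps -n percona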
I know that 2 nodes is not a safe configuration, but everything worked fine for a few days. Today, however, I found the Percona cluster in the error state:
kubectl get ps db3-ps-db -n percona
NAME        REPLICATION         ENDPOINT                    STATE   MYSQL   ORCHESTRATOR   HAPROXY   ROUTER   AGE
db3-ps-db   group-replication   db3-ps-db-haproxy.percona   error   2                      2                  17d
# looking at events:
Events:
  Type     Reason                    Age                    From            Message
  ----     ------                    ----                   ----            -------
  Warning  FullClusterCrashDetected  3m9s (x5271 over 14h)  ps-controller   Full cluster crash detected
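To see the members from MySQL's own point of view, performance_schema can be queried on each pod. A sketch, assuming the root password is stored in a Secret named db3-ps-db-secrets under the root key (the Secret name may differ in your deployment):
ROOT_PW="$(kubectl get secret db3-ps-db-secrets -n percona -o jsonpath='{.data.root}' | base64 -d)"
kubectl exec -n percona db3-ps-db-mysql-0 -- mysql -uroot -p"$ROOT_PW" \
  -e "SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members;"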
Logs:
While investigating, I found that the mysqld process had been OOM-killed by the kernel (because of the container memory limit) on one, and only one, of the hosts:
t7920:/home/rancher # dmesg -T
[Sat Jul 27 01:10:03 2024] gcs_xcom invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=997
[Sat Jul 27 01:10:03 2024] CPU: 0 PID: 15334 Comm: gcs_xcom Tainted: G X 5.14.21-150400.24.119-default #1 SLE15-SP4 72b78b4bcd575b4ef23e194de811a4f54afb0381
[Sat Jul 27 01:10:03 2024] Hardware name: Dell Inc. Precision 7920 Tower/060K5C, BIOS 2.1.4 07/29/2019
[Sat Jul 27 01:10:03 2024] Call Trace:
[Sat Jul 27 01:10:03 2024] <TASK>
[Sat Jul 27 01:10:03 2024] dump_stack_lvl+0x45/0x5b
[Sat Jul 27 01:10:03 2024] dump_header+0x4a/0x220
[Sat Jul 27 01:10:03 2024] oom_kill_process+0xe8/0x140
[Sat Jul 27 01:10:03 2024] out_of_memory+0x113/0x580
[Sat Jul 27 01:10:03 2024] mem_cgroup_out_of_memory+0xe3/0x100
[Sat Jul 27 01:10:03 2024] try_charge_memcg+0x6bb/0x700
[Sat Jul 27 01:10:03 2024] ? __alloc_pages+0x180/0x320
[Sat Jul 27 01:10:03 2024] charge_memcg+0x40/0xa0
[Sat Jul 27 01:10:03 2024] __mem_cgroup_charge+0x2c/0xa0
[Sat Jul 27 01:10:03 2024] __handle_mm_fault+0xa37/0x1220
[Sat Jul 27 01:10:03 2024] handle_mm_fault+0xd5/0x290
[Sat Jul 27 01:10:03 2024] do_user_addr_fault+0x1eb/0x730
[Sat Jul 27 01:10:03 2024] ? do_syscall_64+0x67/0x80
[Sat Jul 27 01:10:03 2024] exc_page_fault+0x67/0x150
[Sat Jul 27 01:10:03 2024] asm_exc_page_fault+0x59/0x60
[Sat Jul 27 01:10:03 2024] RIP: 0033:0x7f02b055970a
[Sat Jul 27 01:10:03 2024] Code: 31 c9 48 8d 34 2a 48 39 fb 48 89 73 60 4c 8d 42 10 0f 95 c1 48 29 e8 48 c1 e1 02 48 83 c8 01 48 09 e9 48 83 c9 01 48 89 4a 08 <48> 89 46 08 48 83 c4 48 4c 89 c0 5b 5d 41 5c 41 5d 41 5e 41 5f c3
[Sat Jul 27 01:10:03 2024] RSP: 002b:00007f01fe7fae00 EFLAGS: 00010202
[Sat Jul 27 01:10:03 2024] RAX: 0000000000000fc1 RBX: 00007f01bc000020 RCX: 0000000000000115
[Sat Jul 27 01:10:03 2024] RDX: 00007f014c234f30 RSI: 00007f014c235040 RDI: 00007f02b06a6c80
[Sat Jul 27 01:10:03 2024] RBP: 0000000000000110 R08: 00007f014c234f40 R09: 0000000000236000
[Sat Jul 27 01:10:03 2024] R10: 0000000000000130 R11: 0000000000000206 R12: 00000000000000d0
[Sat Jul 27 01:10:03 2024] R13: 0000000000001000 R14: 00007f014c234f30 R15: 0000000000000130
[Sat Jul 27 01:10:03 2024] </TASK>
[Sat Jul 27 01:10:03 2024] memory: usage 1500000kB, limit 1500000kB, failcnt 0
[Sat Jul 27 01:10:03 2024] memory+swap: usage 1500000kB, limit 1500000kB, failcnt 20703
[Sat Jul 27 01:10:03 2024] kmem: usage 7192kB, limit 9007199254740988kB, failcnt 0
[Sat Jul 27 01:10:03 2024] Memory cgroup stats for /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf366b2e3_b354_4eab_af12_bdd6dee78c36.slice/cri-containerd-ec228c92ac8549ead3673ff4bb83ef44c8b41fe812942e9b72a38cb09918bd7d.scope:
[Sat Jul 27 01:10:03 2024] anon 1528438784
file 196608
kernel_stack 1064960
pagetables 3567616
percpu 0
sock 0
shmem 0
file_mapped 0
file_dirty 0
file_writeback 0
swapcached 0
anon_thp 1436549120
file_thp 0
shmem_thp 0
inactive_anon 1528434688
active_anon 4096
inactive_file 196608
active_file 0
unevictable 0
slab_reclaimable 1659808
slab_unreclaimable 1039216
slab 2699024
workingset_refault_anon 0
workingset_refault_file 29562
workingset_activate_anon 0
workingset_activate_file 2884
workingset_restore_anon 0
workingset_restore_file 1905
workingset_nodereclaim 0
pgfault 268222950
pgmajfault 426
pgrefill 13597
pgscan 104748
pgsteal 101384
pgactivate 40159
pgdeactivate 13520
pglazyfree 0
pglazyfreed 0
thp_fault_alloc 314424
thp_collapse_alloc 529
[Sat Jul 27 01:10:03 2024] Tasks state (memory values in pages):
[Sat Jul 27 01:10:03 2024] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[Sat Jul 27 01:10:03 2024] [ 15059] 1001 15059 1495894 379550 3579904 0 997 mysqld
[Sat Jul 27 01:10:03 2024] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=cri-containerd-ec228c92ac8549ead3673ff4bb83ef44c8b41fe812942e9b72a38cb09918bd7d.scope,mems_allowed=0,oom_memcg=/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf366b2e3_b354_4eab_af12_bdd6dee78c36.slice/cri-containerd-ec228c92ac8549ead3673ff4bb83ef44c8b41fe812942e9b72a38cb09918bd7d.scope,task_memcg=/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf366b2e3_b354_4eab_af12_bdd6dee78c36.slice/cri-containerd-ec228c92ac8549ead3673ff4bb83ef44c8b41fe812942e9b72a38cb09918bd7d.scope,task=mysqld,pid=15059,uid=1001
[Sat Jul 27 01:10:03 2024] Memory cgroup out of memory: Killed process 15059 (mysqld) total-vm:5983576kB, anon-rss:1490300kB, file-rss:27900kB, shmem-rss:0kB, UID:1001 pgtables:3496kB oom_score_adj:997
[Sat Jul 27 01:10:03 2024] oom_reaper: reaped process 15059 (mysqld), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
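The numbers line up with the container limit: 1536M in the pod spec is 1,536,000,000 bytes = 1,500,000 KiB, exactly the "limit 1500000kB" shown above, and mysqld's RSS had grown to fill it. Which containers were OOM-killed can also be read back from the pod status with standard kubectl (nothing operator-specific here):
kubectl get pods -n percona -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.containerStatuses[*]}{.name}={.lastState.terminated.reason}{" "}{end}{"\n"}{end}'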
After the OOM killer terminated the process, here is what I see in the MySQL pod logs:
kubectl logs -n percona db3-ps-db-mysql-0
2024-07-27T01:11:56.758865Z 14 [Note] [MY-010581] [Repl] Replica SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'INVALID' at position 0, relay log './db3-ps-db-mysql-0-relay-bin-group_replication_applier.000002' position: 4
2024-07-27T01:11:56.882234Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Using MySQL as Communication Stack for XCom'
2024-07-27T01:11:56.882446Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Successfully connected to the local XCom via anonymous pipe'
2024-07-27T01:11:56.962547Z 0 [ERROR] [MY-013780] [Repl] Plugin group_replication reported: 'Failed to establish MySQL client connection in Group Replication. Error establishing connection. Please refer to the manual to make sure that you configured Group Replication properly to work with MySQL Protocol connections.'
2024-07-27T01:11:56.962594Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error on opening a connection to peer node db3-ps-db-mysql-1.db3-ps-db-mysql.percona:3306 when joining a group. My local port is: 3306.'
...
2024-07-27T01:11:56.999063Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 3306'
2024-07-27T01:11:57.074696Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 3306'
2024-07-27T01:11:57.074753Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sleeping for 5 seconds before retrying to join the group. There are 9 more attempt(s) before giving up.'
...
2024-07-27T01:12:49.678061Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Request failed: maximum number of retries (10) has been exhausted.'
2024-07-27T01:12:49.678122Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Failed to send add_node request to a peer XCom node.'
2024-07-27T01:12:49.701341Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] This node received a Configuration change request, but it not yet started. This could happen if one starts several nodes simultaneously. This request will be retried by whoever sent it.'
2024-07-27T01:12:49.738710Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] TCP_NODELAY already set'
2024-07-27T01:12:49.738739Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sucessfully connected to peer db3-ps-db-mysql-1.db3-ps-db-mysql.percona:3306. Sending a request to be added to the group'
2024-07-27T01:12:49.738752Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sending add_node request to a peer XCom node'
2024-07-27T01:12:49.739435Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Retrying a request to a remote XCom. Please check the remote node log for more details.'
2024-07-27T01:12:50.702047Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] This node received a Configuration change request, but it not yet started. This could happen if one starts several nodes simultaneously. This request will be retried by whoever sent it.'
...
2024-07-27T01:12:56.745411Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Retrying a request to a remote XCom. Please check the remote node log for more details.'
2024-07-27T01:12:56.759277Z 4 [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'
2024-07-27T01:12:56.759348Z 4 [Note] [MY-011649] [Repl] Plugin group_replication reported: 'Requesting to leave the group despite of not being a member'
2024-07-27T01:12:56.759387Z 4 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.'
2024-07-27T01:12:57.708716Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] This node received a Configuration change request, but it not yet started. This could happen if one starts several nodes simultaneously. This request will be retried by whoever sent it.'
2024-07-27T01:12:57.746136Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Retrying a request to a remote XCom. Please check the remote node log for more details.'
2024-07-27T01:12:58.709674Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] This node received a Configuration change request, but it not yet started. This could happen if one starts several nodes simultaneously. This request will be retried by whoever sent it.'
2024-07-27T01:12:58.747064Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Retrying a request to a remote XCom. Please check the remote node log for more details.'
2024-07-27T01:12:59.747244Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Request failed: maximum number of retries (10) has been exhausted.'
2024-07-27T01:12:59.747321Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Failed to send add_node request to a peer XCom node.'
2024-07-27T01:12:59.751200Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 3306'
2024-07-27T01:12:59.831740Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 3306'
2024-07-27T01:12:59.831800Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Sleeping for 5 seconds before retrying to join the group. There are 3 more attempt(s) before giving up.'
2024-07-27T01:13:04.833041Z 14 [Note] [MY-010596] [Repl] Error reading relay log event for channel 'group_replication_applier': replica SQL thread was killed
2024-07-27T01:13:04.833449Z 14 [Note] [MY-010587] [Repl] Replica SQL thread for channel 'group_replication_applier' exiting, replication stopped in log 'FIRST' at position 0
2024-07-27T01:13:04.833626Z 12 [Note] [MY-011444] [Repl] Plugin group_replication reported: 'The group replication applier thread was killed.'
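Because of the repeated "Error on opening a connection to peer node" messages, I also checked basic reachability of pod 1 from pod 0. A quick probe, assuming getent and the mysqladmin client are present in the image (they normally are in the Percona Server images):
kubectl exec -n percona db3-ps-db-mysql-0 -- getent hosts db3-ps-db-mysql-1.db3-ps-db-mysql.percona
kubectl exec -n percona db3-ps-db-mysql-0 -- mysqladmin ping -h db3-ps-db-mysql-1.db3-ps-db-mysql.percona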
The Percona operator logs show:
kubectl logs -n percona percona-op-ps-operator-7b4d5d6d8c-cmnbk
2024-07-27T01:12:50.677Z INFO groupReplicationStatus.db3-ps-db-mysql-0.db3-ps-db-mysql.percona Member is not ONLINE {"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"db3-ps-db","namespace":"percona"}, "namespace": "percona", "name": "db3-ps-db", "reconcileID": "54e6e0b9-517b-45b0-aa59-454f545098df", "state": "OFFLINE"}
2024-07-27T01:12:50.997Z INFO Crash recovery Pod is waiting for recovery {"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"db3-ps-db","namespace":"percona"}, "namespace": "percona", "name": "db3-ps-db", "reconcileID": "21b59b9d-b37d-4a79-9d29-f852a38ade8c", "pod": "db3-ps-db-mysql-0", "gtidExecuted": "c23bdefa-3e77-11ef-ac3d-32605eb3ef85:1-4,cd68adda-3e77-11ef-a089-32605eb3ef85:1-5239:1000386-1000395,cd68b3ae-3e77-11ef-a089-32605eb3ef85:1-34"}
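Since the operator is waiting on crash recovery, it can help to compare GTID_EXECUTED across the members to see which one holds the most advanced transaction set. A sketch, again assuming the root password lives in the db3-ps-db-secrets Secret:
ROOT_PW="$(kubectl get secret db3-ps-db-secrets -n percona -o jsonpath='{.data.root}' | base64 -d)"
for i in 0 1; do
  kubectl exec -n percona db3-ps-db-mysql-$i -- \
    mysql -uroot -p"$ROOT_PW" -Nse "SELECT @@HOSTNAME, @@GLOBAL.gtid_executed;"
done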
The resource settings I had for the mysql containers:
resources:
  limits:
    memory: 1536M
  requests:
    memory: 512M
So I increased the RAM limit to 3G with this command:
kubectl edit ps -n percona db3-ps-db
resources:
  limits:
    memory: 3G
  requests:
    memory: 1G
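Besides raising the limit, it may be worth capping mysqld's own memory consumers so the process fits inside the container limit; MySQL 8 group replication keeps an XCom message cache that defaults to 1 GiB, which alone nearly fills a 1.5 GB container. A sketch of a my.cnf snippet in the CR, assuming this operator version exposes spec.mysql.configuration (values are illustrative, not tuned):
spec:
  mysql:
    configuration: |
      [mysqld]
      # illustrative values only - size these for your workload
      innodb_buffer_pool_size = 1G
      group_replication_message_cache_size = 256M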
But the limit change alone didn’t restart the cluster, so I restarted it by setting spec.pause to true, waiting one minute, and then setting it back to false:
kubectl patch ps db3-ps-db -n percona --type='merge' -p '{"spec":{"pause":true}}'
kubectl patch ps db3-ps-db -n percona --type='merge' -p '{"spec":{"pause":false}}'
The cluster restarted, but it went into the error state again.
What else can I do to try to fix it?