I’ve just upgraded PXC from 5.7 to 8.0. The installed Galera Arbitrator package is
percona-xtradb-cluster-garbd/unknown,now 1:8.0.33-25-1.focal amd64
and garbd is using 100% CPU.
There’s nothing suspicious in the log, apart from 0-nan% (0/0 events):
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.171 INFO: Flow-control interval: [1048575, 1048575]
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.171 INFO: Shifting OPEN -> PRIMARY (TO: 257099)
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.171 INFO: Sending state transfer request: 'trivial', size: 7
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.203 INFO: Member 0.0 (garb) requested state transfer from '*any*'. Selected 1.0 (nyc1)(SYNCED) as donor.
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.203 INFO: Shifting PRIMARY -> JOINER (TO: 257099)
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.235 INFO: 0.0 (garb): State transfer from 1.0 (nyc1) complete.
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.236 INFO: SST leaving flow control
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.236 INFO: Shifting JOINER -> JOINED (TO: 257099)
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.237 INFO: Processing event queue:...0-nan% (0/0 events) complete.
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.237 INFO: 1.0 (nyc1): State transfer to 0.0 (garb) complete.
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.264 INFO: Member 0.0 (garb) synced with group.
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.265 INFO: Processing event queue:...100.0% (1/1 events) complete.
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.265 INFO: Shifting JOINED -> SYNCED (TO: 257099)
Aug 13 12:38:52 aux garb-systemd[2284]: 2023-08-13 12:38:52.265 INFO: Member 1.0 (nyc1) synced with group.
Aug 13 12:38:54 aux garb-systemd[2284]: 2023-08-13 12:38:54.612 INFO: (5b34fe14-b138, 'ssl://0.0.0.0:4567') turning message relay requesting off
Here is the head of the perf report:
Samples: 1K of event 'cpu-clock:pppH', Event count (approx.): 9770000000
Children Self Command Shared Object Symbol
+ 100.00% 0.00% garbd libc-2.31.so [.] __clone
+ 100.00% 0.00% garbd libpthread-2.31.so [.] start_thread
+ 99.69% 0.00% garbd garbd [.] gcs_recv_thread
+ 99.69% 0.00% garbd garbd [.] gcs_core_recv
+ 99.69% 78.76% garbd garbd [.] gcomm_recv
+ 20.01% 19.96% garbd garbd [.] pfs_noop
+ 0.92% 0.00% garbd [kernel.kallsyms] [k] irq_exit_rcu
+ 0.92% 0.20% garbd [kernel.kallsyms] [k] __softirqentry_text_start
+ 0.51% 0.00% garbd [kernel.kallsyms] [k] asm_common_interrupt
+ 0.51% 0.00% garbd [kernel.kallsyms] [k] common_interrupt
0.46% 0.46% garbd [kernel.kallsyms] [k] __lock_text_start
0.46% 0.00% garbd [kernel.kallsyms] [k] asm_sysvec_apic_timer_interrupt
0.46% 0.00% garbd [kernel.kallsyms] [k] sysvec_apic_timer_interrupt
0.46% 0.00% garbd [kernel.kallsyms] [k] run_timer_softirq
0.41% 0.00% garbd [kernel.kallsyms] [k] call_timer_fn
0.41% 0.00% garbd [kernel.kallsyms] [k] rh_timer_func
0.41% 0.00% garbd [kernel.kallsyms] [k] usb_hcd_poll_rh_status
0.41% 0.00% garbd [kernel.kallsyms] [k] uhci_hub_status_data
0.31% 0.00% garbd garbd [.] run_fn
0.31% 0.00% garbd garbd [.] GCommConn::run
0.31% 0.00% garbd garbd [.] gcomm::AsioProtonet::event_loop
0.31% 0.00% garbd garbd [.] gu::AsioIoService::run
0.31% 0.00% garbd garbd [.] asio::detail::scheduler::run
0.26% 0.00% garbd [kernel.kallsyms] [k] net_rx_action
0.26% 0.00% garbd [kernel.kallsyms] [k] __napi_poll
0.26% 0.00% garbd [kernel.kallsyms] [k] virtnet_poll
0.15% 0.05% garbd garbd [.] asio::detail::epoll_reactor::descriptor_state::do_complete
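To make the report easier to read, here is a minimal Python sketch that ranks rows by the Self column instead of Children. The helper name top_self and the regular expression are my own assumptions about the text layout pasted above. They are not part of perf, and the sample holds only a few rows copied from the report.

```python
import re

# Assumed row layout, taken from the pasted report:
#   [+] Children% Self% Command SharedObject [.|k] Symbol
ROW = re.compile(
    r"^\+?\s*(?P<children>\d+\.\d+)%\s+(?P<self>\d+\.\d+)%\s+"
    r"(?P<command>\S+)\s+(?P<dso>\S+)\s+\[(?P<kind>.)\]\s+(?P<symbol>.+)$"
)


def top_self(report, limit=3):
    """Return up to `limit` (self_percent, symbol) pairs, highest self time first."""
    rows = []
    for line in report.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and summary lines do not match and are skipped
            rows.append((float(match["self"]), match["symbol"].strip()))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


# A few rows copied verbatim from the report above.
SAMPLE = """\
+ 100.00% 0.00% garbd libc-2.31.so [.] __clone
+ 99.69% 0.00% garbd garbd [.] gcs_recv_thread
+ 99.69% 78.76% garbd garbd [.] gcomm_recv
+ 20.01% 19.96% garbd garbd [.] pfs_noop
0.46% 0.46% garbd [kernel.kallsyms] [k] __lock_text_start
"""

if __name__ == "__main__":
    for percent, symbol in top_self(SAMPLE):
        print(f"{percent:6.2f}%  {symbol}")
```

On these rows gcomm_recv (78.76% self) and pfs_noop (19.96% self) come out on top, so nearly all samples land in the receive thread's own code rather than in the kernel or the asio event loop.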
This seems like a bug in garbd.
Where should I report it?