We have two PostgreSQL clusters in the same Kubernetes cluster, and both started to fail and restart every minute due to a segmentation fault in pg_stat_monitor. I have not been able to find the cause and am looking for some assistance. Disabling the pmm-client for both clusters resolves the issue. I have also deleted and reinstalled the PMM Helm chart and then added the pmm-client back to each cluster, but the error returns. See the log below for the error.
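For reference, this is roughly how I disabled the pmm-client on each cluster. We run the Percona Operator for PostgreSQL, so the CR kind, cluster name, and namespace below are just examples from our setup; adjust them for your own version and names:

```
# Turn off the PMM sidecar for one cluster (repeat per cluster).
# CR kind and field path assume the Percona Operator for PostgreSQL v2;
# "cluster1" and "pgo" are placeholders for our cluster name and namespace.
kubectl patch perconapgcluster cluster1 -n pgo \
  --type merge -p '{"spec":{"pmm":{"enabled":false}}}'
```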
2024-12-17 15:17:57.092 UTC [174] LOG: database system is ready to accept connections
2024-12-17 15:18:25.998 UTC [3041] LOG: could not receive data from client: Connection reset by peer
2024-12-17 15:18:25.998 UTC [3041] STATEMENT: START_REPLICATION 1AA/DD000000 TIMELINE 34
2024-12-17 15:18:25.998 UTC [3041] LOG: unexpected EOF on standby connection
2024-12-17 15:18:25.998 UTC [3041] STATEMENT: START_REPLICATION 1AA/DD000000 TIMELINE 34
2024-12-17 15:18:58.601 UTC [174] LOG: server process (PID 3257) was terminated by signal 11: Segmentation fault
2024-12-17 15:18:58.601 UTC [174] DETAIL: Failed process was running: SELECT /* agent='pgstatmonitor' */ "pg_stat_monitor"."bucket", "pg_stat_monitor"."client_ip", "pg_stat_monitor"."query", "pg_stat_monitor"."calls", "pg_stat_monitor"."shared_blks_hit", "pg_stat_monitor"."shared_blks_read", "pg_stat_monitor"."shared_blks_dirtied", "pg_stat_monitor"."shared_blks_written", "pg_stat_monitor"."local_blks_hit", "pg_stat_monitor"."local_blks_read", "pg_stat_monitor"."local_blks_dirtied", "pg_stat_monitor"."local_blks_written", "pg_stat_monitor"."temp_blks_read", "pg_stat_monitor"."temp_blks_written", "pg_stat_monitor"."blk_read_time", "pg_stat_monitor"."blk_write_time", "pg_stat_monitor"."resp_calls", "pg_stat_monitor"."cpu_user_time", "pg_stat_monitor"."cpu_sys_time", "pg_stat_monitor"."rows", "pg_stat_monitor"."relations", "pg_stat_monitor"."datname", "pg_stat_monitor"."userid", "pg_stat_monitor"."top_queryid", "pg_stat_monitor"."planid", "pg_stat_monitor"."query_plan", "pg_stat_monitor"."top_query", "pg_stat_monitor"."application_name", "pg_stat_monitor"."cmd_type", "pg_stat_mon
2024-12-17 15:18:58.601 UTC [174] LOG: terminating any other active server processes
2024-12-17 15:18:58.603 UTC [3266] FATAL: the database system is in recovery mode
2024-12-17 15:18:58.604 UTC [174] LOG: all server processes terminated; reinitializing
2024-12-17 15:18:58.605 UTC [174] LOG: [pg_stat_monitor] pgsm_shmem_shutdown: Shutdown initiated.
2024-12-17 15:18:58.859 UTC [3267] LOG: database system was interrupted; last known up at 2024-12-17 15:17:57 UTC
2024-12-17 15:18:58.864 UTC [3270] FATAL: the database system is in recovery mode
2024-12-17 15:18:58.871 UTC [3267] LOG: database system was not properly shut down; automatic recovery in progress
2024-12-17 15:18:58.874 UTC [3267] LOG: redo starts at 1AA/DD013648
2024-12-17 15:18:58.874 UTC [3267] LOG: invalid record length at 1AA/DD013BA0: wanted 24, got 0
2024-12-17 15:18:58.874 UTC [3267] LOG: redo done at 1AA/DD013B68 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2024-12-17 15:18:58.878 UTC [3268] LOG: checkpoint starting: end-of-recovery immediate wait
2024-12-17 15:18:58.889 UTC [3268] LOG: checkpoint complete: wrote 5 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.003 s, total=0.013 s; sync files=4, longest=0.001 s, average=0.001 s; distance=1 kB, estimate=1 kB
2024-12-17 15:18:58.893 UTC [174] LOG: database system is ready to accept connections
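If anyone wants to try reproducing this, the agent's query can be approximated by selecting the same columns shown in the DETAIL line above. The version here is trimmed, not the exact query, and be warned it may well segfault the backend again, so don't run it against anything important:

```
# Approximation of the pmm-agent query from the DETAIL line above;
# the column list is shortened, so this is a sketch, not the exact statement.
psql -c "SELECT bucket, client_ip, query, calls, cpu_user_time,
                cpu_sys_time, rows, relations, datname, userid,
                application_name, cmd_type
         FROM pg_stat_monitor;"
```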