Postgres crashes and goes to the recovering mode after patching to 16.10

Database reported core dump messages and kept going to recovery mode and was not starting up.

postgres=# select * from pg_extension ;

WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
postgres=?#

21,18All::wq
“/var/lib/pgsql/16/data/postgresql.conf” 55L, 1492B written

Redirecting to /bin/systemctl status postgresql-16.service

postgresql-16.service - PostgreSQL 16 database server
Loaded: loaded (/usr/lib/systemd/system/postgresql-16.service; disabled; preset: disabled)
Active: active (running) since Mon 2025-10-06 14:40:51 IST; 18h ago
Docs:
PostgreSQL: Documentation: 16: PostgreSQL 16.10 Documentation*
Main PID: 657515 (postgres)
Tasks: 2 (limit: 48390)
Memory: 437.1M
CPU: 4h 8min 55.871s
CGroup: /system.slice/postgresql-16.service
657515 /usr/pgsql-16/bin/postgres -D /var/lib/pgsql/16/data/
657516 "postgres: logger "*

Oct 07 09:14:32 systemd-coredump[1272580]: [ ] Process 1272573 (postgres) of user 26 dumped core.
Oct 07 09:14:34 systemd-coredump[1272639]: [ ] Process 1272629 (postgres) of user 26 dumped core.
Oct 07 09:14:35 systemd-coredump[1272708]: [ ] Process 1272700 (postgres) of user 26 dumped core.
Oct 07 09:14:37 systemd-coredump[1272771]: [ ] Process 1272752 (postgres) of user 26 dumped core.
Oct 07 09:14:39 systemd-coredump[1272832]: [ ] Process 1272825 (postgres) of user 26 dumped core.
Oct 07 09:14:41 systemd-coredump[1272871]: [ ] Process 1272866 (postgres) of user 26 dumped core.
Oct 07 09:14:42 systemd-coredump[1272924]: [ ] Process 1272919 (postgres) of user 26 dumped core.
Oct 07 09:14:43 systemd-coredump[1272966]: [ ] Process 1272961 (postgres) of user 26 dumped core.
Oct 07 09:14:45 systemd-coredump[1273012]: [ ] Process 1273007 (postgres) of user 26 dumped core.
Oct 07 09:14:47 systemd-coredump[1273052]: [ ] Process 1273047 (postgres) of user 26 dumped core.

Logs Below :



2025-10-07 07:20:19.558 IST,,,657515,,68e3879a.a086b,8,,2025-10-06 14:40:50 IST,,0,LOG,00000,"background worker ""Telemetry Reporter [1]"" (PID 1099380) was terminated by signal 6: Aborted",,,,,,,,,"","postmaster",,0
2025-10-07 07:20:19.558 IST,,,657515,,68e3879a.a086b,9,,2025-10-06 14:40:50 IST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,"","postmaster",,0
2025-10-07 07:20:20.179 IST,,,657515,,68e3879a.a086b,10,,2025-10-06 14:40:50 IST,,0,LOG,00000,"all server processes terminated; reinitializing",,,,,,,,,"","postmaster",,0
2025-10-07 07:20:21.080 IST,,,657515,,68e3879a.a086b,11,,2025-10-06 14:40:50 IST,,0,LOG,00000,"pgauditlogtofile extension initialized",,,,,,,,,"","postmaster",,0
2025-10-07 07:20:21.085 IST,,,1099427,,68e471dd.10c6a3,1,,2025-10-07 07:20:21 IST,,0,LOG,00000,"database system was interrupted; last known up at 2025-10-07 07:00:33 IST",,,,,,,,,"","startup",,0
2025-10-07 07:20:21.090 IST,"m2pappusr","airflow",1099430,"100.0.1.131:59258",68e471dd.10c6a6,2,"",2025-10-07 07:20:21 IST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,,"","client backend",,0
2025-10-07 07:20:21.111 IST,"replusr","",1099431,"172.26.9.44:58878",68e471dd.10c6a7,2,"",2025-10-07 07:20:21 IST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,,"","walsender",,0
2025-10-07 07:20:21.323 IST,,,1099427,,68e471dd.10c6a3,2,,2025-10-07 07:20:21 IST,,0,LOG,00000,"database system was not properly shut down; automatic recovery in progress",,,,,,,,,"","startup",,0
2025-10-07 07:20:21.331 IST,,,1099427,,68e471dd.10c6a3,3,,2025-10-07 07:20:21 IST,,0,LOG,00000,"redo starts at E/F57E3478",,,,,,,,,"","startup",,0
2025-10-07 07:20:21.339 IST,,,1099427,,68e471dd.10c6a3,4,,2025-10-07 07:20:21 IST,,0,LOG,00000,"invalid record length at E/F58B6370: expected at least 24, got 0",,,,,,,,,"","startup",,0
2025-10-07 07:20:21.339 IST,,,1099427,,68e471dd.10c6a3,5,,2025-10-07 07:20:21 IST,,0,LOG,00000,"redo done at E/F58B6348 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s",,,,,,,,,"","startup",,0
2025-10-07 07:20:21.349 IST,,,1099428,,68e471dd.10c6a4,1,,2025-10-07 07:20:21 IST,,0,LOG,00000,"checkpoint starting: end-of-recovery immediate wait",,,,,,,,,"","checkpointer",,0
2025-10-07 07:20:21.382 IST,,,1099428,,68e471dd.10c6a4,2,,2025-10-07 07:20:21 IST,,0,LOG,00000,"checkpoint complete: wrote 77 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.019 s, sync=0.008 s, total=0.035 s;
sync files=21, longest=0.005 s, average=0.001 s; distance=843 kB, estimate=843 kB; lsn=E/F58B6370, redo lsn=E/F58B6370",,,,,,,,,"","checkpointer",,0
2025-10-07 07:20:21.401 IST,,,1099438,,68e471dd.10c6ae,1,,2025-10-07 07:20:21 IST,,0,LOG,00000,"pgauditlogtofile worker started",,,,,,,,,"","pgauditlogtofile launcher",,0
2025-10-07 07:20:21.412 IST,,,1099436,,68e471dd.10c6ac,1,,2025-10-07 07:20:21 IST,4/0,0,LOG,00000,"TimescaleDB background worker launcher connected to shared catalogs",,,,,,,,,"","TimescaleDB Background Worker Launcher",,0
2025-10-07 07:20:21.422 IST,,,1099436,,68e471dd.10c6ac,2,,2025-10-07 07:20:21 IST,4/0,0,LOG,53000,"no available background worker slots",,"Consider increasing max_worker_processes in tandem with timescaledb.max_background_workers.
",,,,,,,"","TimescaleDB Background Worker Launcher",,0
2025-10-07 07:20:21.423 IST,,,1099436,,68e471dd.10c6ac,3,,2025-10-07 07:20:21 IST,4/0,0,LOG,53000,"no available background worker slots",,"Consider increasing max_worker_processes in tandem with timescaledb.max_background_workers.
",,,,,,,"","TimescaleDB Background Worker Launcher",,0
2025-10-07 07:20:21.423 IST,,,1099436,,68e471dd.10c6ac,4,,2025-10-07 07:20:21 IST,4/0,0,LOG,53000,"no available background worker slots",,"Consider increasing max_worker_processes in tandem with timescaledb.max_background_workers.
",,,,,,,"","TimescaleDB Background Worker Launcher",,0
2025-10-07 07:20:21.423 IST,,,1099436,,68e471dd.10c6ac,5,,2025-10-07 07:20:21 IST,4/0,0,LOG,53000,"no available background worker slots",,"Consider increasing max_worker_processes in tandem with timescaledb.max_background_workers.
",,,,,,,"","TimescaleDB Background Worker Launcher",,0
2025-10-07 07:20:21.423 IST,,,1099436,,68e471dd.10c6ac,6,,2025-10-07 07:20:21 IST,4/0,0,LOG,53000,"no available background worker slots",,"Consider increasing max_worker_processes in tandem with timescaledb.max_background_workers.",,,,,,,"","TimescaleDB Background Worker Launcher",,0
2025-10-07 07:20:21.423 IST,,,1099436,,68e471dd.10c6ac,7,,2025-10-07 07:20:21 IST,4/0,0,LOG,53000,"no available background worker slots",,"Consider increasing max_worker_processes in tandem with timescaledb.max_background_workers.",,,,,,,"","TimescaleDB Background Worker Launcher",,0
2025-10-07 07:20:21.557 IST,,,1099437,,68e471dd.10c6ad,1,,2025-10-07 07:20:21 IST,3/0,0,LOG,00000,"pg_cron scheduler started",,,,,,,,,"","pg_cron launcher",,-3939546517400587147
2025-10-07 07:20:21.724 IST,,,657515,,68e3879a.a086b,13,,2025-10-06 14:40:50 IST,,0,LOG,00000,"background worker ""TimescaleDB Background Worker Scheduler"" (PID 1099446) was terminated by signal 6: Aborted",,,,,,,,,"","postmaster",,0
2025-10-07 07:20:21.724 IST,,,657515,,68e3879a.a086b,14,,2025-10-06 14:40:50 IST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,"","postmaster",,0
2025-10-07 07:20:21.748 IST,,,657515,,68e3879a.a086b,15,,2025-10-06 14:40:50 IST,,0,LOG,00000,"all server processes terminated; reinitializing",,,,,,,,,"","postmaster",,0
2025-10-07 07:20:22.515 IST,,,657515,,68e3879a.a086b,16,,2025-10-06 14:40:50 IST,,0,LOG,00000,"pgauditlogtofile extension initialized",,,,,,,,,"","postmaster",,0
2025-10-07 07:20:22.520 IST,,,1099458,,68e471de.10c6c2,1,,2025-10-07 07:20:22 IST,,0,LOG,00000,"database system was interrupted; last known up at 2025-10-07 07:20:21 IST",,,,,,,,,"","startup",,0
2025-10-07 07:20:22.525 IST,"pmmusr","postgres",1099463,"10.160.9.41:48440",68e471de.10c6c7,2,"",2025-10-07 07:20:22 IST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,,"","client backend",,0
2025-10-07 07:20:22.525 IST,"pmmusr","postgres",1099462,"10.160.9.41:48426",68e471de.10c6c6,2,"",2025-10-07 07:20:22 IST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,,"","client backend",,0
2025-10-07 07:20:22.530 IST,"pmmusr","postgres",1099461,"10.160.9.41:48418",68e471de.10c6c5,2,"",2025-10-07 07:20:22 IST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,,"","client backend",,0
2025-10-07 07:20:22.738 IST,,,1099458,,68e471de.10c6c2,2,,2025-10-07 07:20:22 IST,,0,LOG,00000,"database system was not properly shut down; automatic recovery in progress",,,,,,,,,"","startup",,0
2025-10-07 07:20:22.745 IST,,,1099458,,68e471de.10c6c2,3,,2025-10-07 07:20:22 IST,,0,LOG,00000,"invalid record length at E/F58B63E8: expected at least 24, got 0",,,,,,,,,"","startup",,0
2025-10-07 07:20:22.745 IST,,,1099458,,68e471de.10c6c2,4,,2025-10-07 07:20:22 IST,,0,LOG,00000,"redo is not required",,,,,,,,,"","startup",,0
2025-10-07 07:20:22.755 IST,,,1099459,,68e471de.10c6c3,1,,2025-10-07 07:20:22 IST,,0,LOG,00000,"checkpoint starting: end-of-recovery immediate wait",,,,,,,,,"","checkpointer",,0
2025-10-07 07:20:22.771 IST,,,1099459,,68e471de.10c6c3,2,,2025-10-07 07:20:22 IST,,0,LOG,00000,"checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.006 s, sync=0.005 s, total=0.019 s; sync files=3, longest=0.003 s, average=0.002 s; distance=0 kB, estimate=0 kB; lsn=E/F58B63E8, redo lsn=E/F58B63E8",,,,,,,,,"","checkpointer",,0
2025-10-07 07:20:22.784 IST,,,1099466,,68e471de.10c6ca,1,,2025-10-07 07:20:22 IST,3/0,0,LOG,00000,"TimescaleDB background worker launcher connected to shared catalogs",,,,,,,,,"","TimescaleDB Background Worker Launcher",,0
2025-10-07 07:20:22.796 IST,,,1099468,,68e471de.10c6cc,1,,2025-10-07 07:20:22 IST,,0,LOG,00000,"pgauditlogtofile worker started",,,,,,,,,"","pgauditlogtofile launcher",,0
2025-10-07 07:20:22.804 IST,,,1099467,,68e471de.10c6cb,1,,2025-10-07 07:20:22 IST,2/0,0,LOG,00000,"pg_cron scheduler started",,,,,,,,,"","pg_cron launcher",,-3939546517400587147

@harish.jaipu

I see some extensions (TimescaleDB/pg_audit) resemblance around the crash time frame.

2025-10-07 07:20:21.412 IST,,,1099436,,68e471dd.10c6ac,1,,2025-10-07 07:20:21 IST,4/0,0,LOG,00000,"TimescaleDB background worker launcher connected to shared catalogs",,,,,,,,,"","TimescaleDB Background Worker Launcher",,0

It appears the background worker slot is not enough. You need to tune your timescale db max_background_workers limit:-

2025-10-07 07:20:21.422 IST,,,1099436,,68e471dd.10c6ac,2,,2025-10-07 07:20:21 IST,4/0,0,LOG,53000,"no available background worker slots",,"Consider increasing max_worker_processes in tandem with timescaledb.max_background_workers.
",,,,,,,"","TimescaleDB Background Worker Launcher",,0
2025-10-07 07:20:21.423 IST,,,1099436,,68e471dd.10c6ac,3,,2025-10-07 07:20:21 IST,4/0,0,LOG,53000,"no available background worker slots",,"Consider increasing max_worker_processes in tandem with timescaledb.max_background_workers.
",,,,,,,"","TimescaleDB Background Worker Launcher",,0

2025-10-07 07:20:21.080 IST,,,657515,,68e3879a.a086b,11,,2025-10-06 14:40:50 IST,,0,LOG,00000,"pgauditlogtofile extension initialized",,,,,,,,,"","postmaster",,0

Were you doing any changes or adding any extension post that patch activity ? What about the OS resources , are they fine and not saturated ?

Did you find any insight in os/kernel logs ?

/var/log/messages/
/var/log/syslog
dmesg -T

Is this behaviour reproducible ? Did you tried tweaking max_background_workers and then restart the service again ?