Encountering Toast Table Corruption and Missing Chunk Number Error During PostgreSQL Data Migration

Root_Cause · September 5, 2023, 8:25pm

Version - PostgreSQL 10.21, compiled by Visual C++ build 1800, 64-bit

Platform - Windows

When migrating data from PostgreSQL 10.21 to 14.7, we encountered an issue during the dump process related to table corruption, resulting in the error message ‘could not open file “base/16510/37857”: No such file or directory.’

ERROR: could not open file “base/16510/37857”: No such file or directory

Upon checking the ‘relkind’ for ‘relfilenode’, it was determined that the type is toast table. In an attempt to recover the data, we initially tried truncating the toast table, but encountered an error preventing successful truncation. As a workaround, we created a new empty file named “37857” under the directory “base/16510.” However, this led to a new error: ‘missing chunk number 0 for toast value 15977662 in pg_toast_37722.’

ERROR: missing chunk number 0 for toast value 15977662 in pg_toast_37722

ProdDB=# select count(*) from prrhtab;
 count
--------
 232966
(1 row)

ProdDB=# select * from prrhtab;
ERROR:  could not open file "base/16510/37857": No such file or directory
ProdDB=# select relname, relkind from pg_class where relfilenode=37857;
    relname     | relkind
----------------+---------
 pg_toast_37722 | t
(1 row)

ProdDB=# select * from pg_toast.pg_toast_37722;
ERROR:  could not open file "base/16510/37857": No such file or directory
ProdDB=# select count(*) from pg_toast.pg_toast_37722;
ERROR:  could not open file "base/16510/37857": No such file or directory

ProdDB=# truncate table pg_toast.pg_toast_37722;
ERROR:  "pg_toast_37722" is not a table

ProdDB=# select count(*) from pg_toast.pg_toast_37722;
 count
-------
     0
(1 row)

ProdDB=# select * from prrhtab;
ERROR:  could not read block 2635 in file "base/16510/37857": read only 0 of 8192 bytes
ProdDB=# reindex table prrhtab;
REINDEX
ProdDB=# select * from prrhtab;
ERROR:  missing chunk number 0 for toast value 15977662 in pg_toast_37722

ProdDB=# select count(*) from prrhtab;
 count
--------
 232966
(1 row)


ProdDB=# select * from prrhtab order by id desc limit 1;
    id   
---------
 1177027 
(1 row)

To resolve this issue, we proceeded to remove the corrupted rows one by one, and as a result, the ‘missing chunk number’ error no longer occurred.

Unfortunately, we do not have a backup from before the table corruption occurred. While we were able to proceed with the migration using the aforementioned approach, we were uncertain if there might be an alternative method to resolve this issue without any data loss. Furthermore, the process of checking rows one by one to resolve the ‘missing chunk number’ error does not appear to be efficient. Therefore, we would appreciate any suggestions for an optimal solution and what could have been the cause so that similar issues can be avoided in future.

Referred to the following thread, but it appears to be time-consuming when dealing with a higher row count.

Robert_Bernier · September 6, 2023, 12:19pm

Hi,

Sorry to hear about your migration issues.

we were uncertain if there might be an alternative method to resolve this issue without any data loss.

Short answer: Replication

Therefore, we would appreciate any suggestions for an optimal solution and what could have been the cause so that similar issues can be avoided in future.

Use RAID 10

One final suggestion: take a look at the postgres documentation and read about the developer option zero_damaged_pages as it “might” be useful in the future.

Hope this helps

Topic		Replies	Views
Addressing Table Corruption Issues Arising from Abrupt Server Shutdown During PostgreSQL 10.20 to 14.7 Migration PostgreSQL	1	756	September 6, 2023
Cp: cannot stat '/db/wallog/test_logs/00000002.history': No such file or directory PostgreSQL	7	2671	August 29, 2022
Count Mismatch Challenges During PostgreSQL Database Migration: Causes and Solutions PostgreSQL	1	713	September 6, 2023
Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows PostgreSQL closed-no-reply	0	533	October 3, 2023
Addressing Orphaned Rows During PostgreSQL Data Migration PostgreSQL closed-no-reply	0	782	September 5, 2023

Encountering Toast Table Corruption and Missing Chunk Number Error During PostgreSQL Data Migration

Related topics