All data deleted when restoring an external snapshot on PSMDB

After the restore status becomes StatusCopyDone, the pbm agent calls cleanupDatadir with needFiles. filepath.Walk does not descend into a symlinked directory.

As a result, /data/db/journal may be wrongly deleted:

[mongodb@dev-rs-mdb-rs0-2 /]$ ls /data/db/ -l | grep journal

lrwxrwxrwx 1 mongodb 1001      8 Dec  5 07:05 journal -> /journal

Hi @xiaobao_wen,

I was able to reproduce this with standalone Go experiments that extract the exact cleanupDatadir logic from PBM v2.12.0 (physical.go L1344). You pinpointed the right spot.

The root cause is that Go’s filepath.Walk does not follow symbolic links. Walk inspects each entry with os.Lstat, so when it encounters your journal -> /journal symlink it describes the link itself: info.IsDir() is false, because the entry is a symlink and not a directory. The guard on L1376 only protects real directories. Since the bare path "journal" is not an entry in the backup manifest (files are listed as journal/WiredTigerLog.xxx), the cleanup treats the symlink as an orphaned file and calls os.Remove on it. The journal files survive on your separate volume but become unreachable from the dbpath.

There is actually a second failure point: the removeAll function (L2856) called during flush() also destroys symlinks via os.RemoveAll with no type checking.

The fix for cleanupDatadir is a small guard placed before the IsDir check:

if info.Mode()&os.ModeSymlink != 0 {
    return nil
}

removeAll needs a similar guard using os.Lstat to detect and skip symlinks in its loop.

Workaround for now: after each physical restore, recreate the symlink before starting mongod:

rm -rf /data/db/journal
ln -s /journal /data/db/journal

This bug affects all PBM 2.x versions. I filed PBM-1706 to track the fix.