PCSM 0.8.1 auto clone segmentation passes byte count to find().skip(), failing on large collections

Yeruchom · May 17, 2026, 5:04pm

Summary

In auto mode (the default), PCSM computes the clone segment size in bytes but passes it directly to MongoDB’s find().skip(), which expects a document count. On large collections this produces a nonsense skip value, the boundary query times out, and the clone fails with clonedSizeBytes: 0.

Confirmed on percona/percona-clustersync-mongodb:0.8.1, still present on main.

Environment

Source: MongoDB Atlas M50, collection ~686 GB / ~670M docs / ~1 KB avg
Target: Percona Server for MongoDB on-prem (K8s)

Symptoms

pcsm status → state: failed, clonedSizeBytes: 0, no progress
Source primary IOWAIT ~70%, but disk IOPS/throughput/latency healthy
Segmentation find times out (MaxTimeMSExpired / ClientDisconnect)
Atlas Query Insights shows the smoking gun:
```
{ "find": "activities", "sort": {"_id": 1}, "limit": 1, "skip": 85861746676 }
```
skip(~85.86B) on a ~670M-doc collection → full _id index walk → timeout. The IOWAIT was millions of yields during that scan, not real I/O pressure.

Root cause

pcsm/clone/copy.go:657-664:

```
if options.SegmentSizeBytes == config.AutoCloneSegmentSize {
    // AUTO: bytes, passed as if it were a doc count
    segmentSize = max(stats.Size/int64(options.AutoNumSegment), config.MinCloneSegmentSizeBytes)
} else {
    // EXPLICIT: correctly divides by AvgObjSize
    segmentSize = options.SegmentSizeBytes / stats.AvgObjSize
}
```

Then passed verbatim to SetSkip(seg.segmentSize) at copy.go:805. The explicit branch converts bytes → docs; the auto branch forgets to.

Why smaller collections look fine

The bug always fires. With skip >> collection size, Mongo walks the full _id index and returns nothing → segmenter falls back to one big segment → clone succeeds on small collections (single-threaded, silently) and fails on large ones (full-index walk exceeds timeout).

Tell: totalSegments: 1 on a large collection means this bug, not “auto being conservative.”

Workaround (confirmed working)

The hidden --clone-segment-size flag (main.go:294-295, MarkHidden) routes through the explicit branch:

```
pcsm start --clone-segment-size=1GiB --clone-num-read-workers=8 ...
```

After switching: skip values dropped from ~85B to ~1M, segmentation queries finish in seconds, clonedSizeBytes advances steadily, source IOWAIT back to normal.

Suggested fix

Mirror the explicit branch in auto mode — divide by stats.AvgObjSize to produce a doc count. Also worth un-hiding --clone-segment-size; it’s the only escape hatch for edge cases. Happy to open a PR.

Thanks — PCSM has been great to work with otherwise.

Inel_Pandzic · May 18, 2026, 11:11am

Hey @Yeruchom, thank you very much for reporting this, and in such a clear way. We have checked it out, and it does look like a bug that needs fixing ASAP. I’ll talk to the team about adding it to our next release (not 0.9.0, which we have already wrapped up, but the one after).

Inel_Pandzic · May 18, 2026, 1:24pm

Hello @Yeruchom, I have created a task for us based on this report: https://perconadev.atlassian.net/browse/PCSM-323

We think it is an urgent issue that cannot wait, and we will add it to our next release (I’m already on it).

Thanks for trying PCSM, and thank you for the report!

Yeruchom · May 18, 2026, 3:30pm

Hey @Inel_Pandzic

Actually, Claude investigated this and wrote up the report for me (including the “Happy to open a PR”). I’m glad to see it’s helpful.

Thanks, guys! It’s been great using Percona.

Inel_Pandzic · May 19, 2026, 10:05am

I somehow knew a human wouldn’t just write “Happy to open a PR”
