PCSM 0.8.1 auto clone segmentation passes byte count to find().skip(), failing on large collections

Summary

In auto mode (the default), PCSM computes the clone segment size in bytes but passes it directly to MongoDB’s find().skip(), which expects a document count. On large collections this produces a skip value far larger than the number of documents in the collection, the boundary query times out, and the clone fails with clonedSizeBytes: 0.

Confirmed on percona/percona-clustersync-mongodb:0.8.1, still present on main.

Environment

  • Source: MongoDB Atlas M50, collection ~686 GB / ~670M docs / ~1 KB avg
  • Target: Percona Server for MongoDB on-prem (K8s)

Symptoms

  • pcsm status reports state: failed, clonedSizeBytes: 0, no progress

  • Source primary IOWAIT ~70%, but disk IOPS/throughput/latency healthy

  • Segmentation find times out (MaxTimeMSExpired / ClientDisconnect)

  • Atlas Query Insights shows the smoking gun:

    { "find": "activities", "sort": {"_id": 1}, "limit": 1, "skip": 85861746676 }
    

    skip(~85.86B) on a ~670M-doc collection → full _id index walk → timeout. The IOWAIT was millions of yields during that scan, not real I/O pressure.

Root cause

pcsm/clone/copy.go:657-664:

if options.SegmentSizeBytes == config.AutoCloneSegmentSize {
    // AUTO — bytes, passed as if it were a doc count
    segmentSize = max(stats.Size/int64(options.AutoNumSegment), config.MinCloneSegmentSizeBytes)
} else {
    // EXPLICIT — correctly divides by AvgObjSize
    segmentSize = options.SegmentSizeBytes / stats.AvgObjSize
}

Then passed verbatim to SetSkip(seg.segmentSize) at copy.go:805. The explicit branch converts bytes → docs; the auto branch forgets to.
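
To make the unit mismatch concrete, here is a minimal, self-contained sketch that models the two branches with numbers consistent with this report. It is not PCSM code: collStats, autoSegmentSize and explicitSegmentSize are hypothetical names, the 8 segments and the 1 MiB minimum are placeholder assumptions, and Size is back-computed as 8 × the observed skip value (which lands on ~686 GB).

```go
package main

import "fmt"

// collStats is a hypothetical stand-in for the collection stats PCSM reads.
type collStats struct {
	Size       int64 // total data size in bytes
	AvgObjSize int64 // average document size in bytes
}

// autoSegmentSize models the auto branch quoted above: the result is in BYTES.
func autoSegmentSize(s collStats, numSegments, minBytes int64) int64 {
	size := s.Size / numSegments
	if size < minBytes {
		size = minBytes
	}
	return size
}

// explicitSegmentSize models the explicit branch: bytes / avg doc size = DOCS.
func explicitSegmentSize(segmentBytes int64, s collStats) int64 {
	return segmentBytes / s.AvgObjSize
}

func main() {
	// Assumed inputs: ~686 GB of data, ~1 KiB average document size.
	stats := collStats{Size: 686_893_973_408, AvgObjSize: 1024}

	// Auto branch with an assumed 8 segments and a placeholder 1 MiB minimum.
	fmt.Println(autoSegmentSize(stats, 8, 1<<20)) // 85861746676 (bytes, used as a skip count)

	// Explicit branch with --clone-segment-size=1GiB.
	fmt.Println(explicitSegmentSize(1<<30, stats)) // 1048576 (documents)
}
```

Under these assumptions the auto branch reproduces the skip value seen in Atlas Query Insights, while the explicit branch yields roughly one million documents per segment, matching the skip values observed after the workaround.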

Why smaller collections look fine

The bug always fires. With skip >> collection size, Mongo walks the full _id index and returns nothing → segmenter falls back to one big segment → clone succeeds on small collections (single-threaded, silently) and fails on large
ones (full-index walk exceeds timeout).

Tell: totalSegments: 1 on a large collection means this bug, not “auto being conservative.”

Workaround (confirmed working)

The hidden --clone-segment-size flag (main.go:294-295, MarkHidden) routes through the explicit branch:

pcsm start --clone-segment-size=1GiB --clone-num-read-workers=8 ...

After switching: skip values dropped from ~85B to ~1M, segmentation queries finish in seconds, clonedSizeBytes advances steadily, source IOWAIT back to normal.

Suggested fix

Mirror the explicit branch in auto mode — divide by stats.AvgObjSize to produce a doc count. Also worth un-hiding --clone-segment-size; it’s the only escape hatch for edge cases. Happy to open a PR.
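
One possible shape for that fix, as a hedged sketch rather than a patch against the real code: resolve the segment size in bytes first, then convert to documents in one place so both modes share the same unit. The function name, its parameters and the autoSentinel value are all hypothetical (autoSentinel stands in for config.AutoCloneSegmentSize, whose actual value is not shown in this report), and the guards for a zero AvgObjSize and a zero result are my own additions.

```go
package main

import "fmt"

// segmentSizeDocs returns how many documents each clone segment should span.
// Hypothetical helper: every parameter name here is illustrative only.
func segmentSizeDocs(segmentBytes, totalBytes, avgObjSize, numSegments, minBytes, autoSentinel int64) int64 {
	if avgObjSize <= 0 || numSegments <= 0 {
		return 0 // empty collection or bad input: caller would fall back to one segment
	}
	if segmentBytes == autoSentinel {
		// Auto mode: derive a byte size first, exactly as the current code does.
		segmentBytes = totalBytes / numSegments
		if segmentBytes < minBytes {
			segmentBytes = minBytes
		}
	}
	// Single conversion point: bytes -> documents, for both modes.
	docs := segmentBytes / avgObjSize
	if docs < 1 {
		docs = 1 // never emit skip(0) as a segment boundary
	}
	return docs
}

func main() {
	const auto = 0 // assumed sentinel value, for illustration only

	// Auto mode, same assumed inputs as the report (~686 GB, ~1 KiB docs, 8 segments).
	fmt.Println(segmentSizeDocs(auto, 686_893_973_408, 1024, 8, 1<<20, auto)) // 83849361

	// Explicit 1 GiB segments.
	fmt.Println(segmentSizeDocs(1<<30, 686_893_973_408, 1024, 8, 1<<20, auto)) // 1048576
}
```

With the conversion applied, auto mode on this collection would skip about 84 million documents per boundary instead of about 86 billion, which stays inside the ~670M-document range.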

Thanks — PCSM has been great to work with otherwise.

Hey @Yeruchom, thank you very much for reporting this, and in such a clear way. We have checked it out and it does look like a bug that needs fixing ASAP. I’ll talk to the team about adding the fix to our next release (not 0.9.0, which we have already wrapped up, but the one after).

Hello @Yeruchom , I have created a task for us based on this report: https://perconadev.atlassian.net/browse/PCSM-323

And we think it is an urgent issue that cannot wait, so we will add it to our next release (I’m already on it).

Thanks for trying PCSM and thank you for the report!

Hey @Inel_Pandzic

Actually Claude investigated this and wrote up the report for me (including the “Happy to open a PR” :sweat_smile: ), I’m glad to see it’s helpful.

Thanks guys! been great using Percona

I somehow knew a human wouldn’t just write “Happy to open a PR” :slight_smile: