I read the release and announcement for ClusterSync, and the tool sounds brilliant, huge possibilities.
I have not been able to test this yet but 2 initial questions:
Does the target need the same replica set name, and/or does the metadata (users and roles) get copied over as part of the sync?
I read that various index changes only get applied at the “finalize” stage. I can see potential for using this tool to maintain a safe, walled replica of a production cluster, particularly if users/roles are not synced, but if the TTL indexes never apply, I can see that being an issue with keeping the sync running for an extended period. Interested to know whether this use case was considered.
Regardless, sounds like a brilliant addition from Percona.
We’re very glad you read our announcement and are planning to evaluate it! We’re definitely looking forward to your feedback.
To answer your questions:
1) Does the target need the same Replica Set name?
No, it doesn’t. When you configure the connection, you can list all replica set members of the source and target clusters in the respective MongoDB connection string URIs to ensure PCSM can reach each member.
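To illustrate, a connection string can simply enumerate every member of the replica set. This is a minimal sketch (the hostnames, credentials, and replica set names are placeholders, not anything from PCSM itself) with a tiny helper that pulls the host list back out of such a URI:

```python
# Hypothetical source URI listing every replica set member, so the tool can
# reach each node directly. All names here are made-up placeholders.
SOURCE_URI = "mongodb://appuser:secret@rs0-a:27017,rs0-b:27017,rs0-c:27017/?replicaSet=rs0"

def member_hosts(uri: str) -> list[str]:
    """Extract the host:port list from a mongodb:// connection string."""
    rest = uri.split("://", 1)[1]      # drop the scheme
    rest = rest.rsplit("@", 1)[-1]     # drop credentials, if present
    rest = rest.split("/", 1)[0]       # drop the database name and options
    return rest.split(",")

print(member_hosts(SOURCE_URI))  # → ['rs0-a:27017', 'rs0-b:27017', 'rs0-c:27017']
```

Note that the target cluster’s replica set name in its own URI can differ from the source’s; the tool connects to each cluster independently.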
2) Do the metadata/users and roles get copied over as part of the sync?
No, unfortunately, users and roles are not synchronized. Additionally, PCSM doesn’t sync system.* collections.
3) Can I use this tool to maintain a safe, walled replica of a production cluster?
Currently, we focus on a one-time migration use case. However, it’s already possible to prepare a safe, walled replica of a production cluster for some cases. Unfortunately, TTL indexes on collections might be challenging. TTL indexes are created at the time of clone and data replication, but PCSM temporarily disables TTL expiration and saves the expireAfterSeconds property for later restoration (during finalize). This way, PCSM ensures documents won’t expire while being copied.
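For reference, MongoDB’s standard mechanism for changing a TTL after the index exists is the `collMod` command with an `index` option. This is only a hedged sketch of that mechanism using pymongo, not PCSM’s actual code; the connection string, database, collection, index name, and TTL value are all placeholders:

```python
# Hypothetical illustration: re-applying a previously saved expireAfterSeconds
# to a TTL index after finalize, via MongoDB's collMod command.
# Requires a running MongoDB deployment; all names below are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://rs0-a:27017,rs0-b:27017,rs0-c:27017/?replicaSet=rs0")
db = client["appdb"]

# Restore the saved TTL (here, 3600 seconds) on the "createdAt_1" index.
db.command({
    "collMod": "events",
    "index": {"name": "createdAt_1", "expireAfterSeconds": 3600},
})
```

A long-running sync would need this restoration step to happen for expiration to resume on the target, which is why TTL behavior is tied to the finalize stage.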
We’ll definitely try to support this case better in the future. Until then, maybe you can achieve your goals using Percona Backup for MongoDB?
Currently, the target cluster needs to be empty before you start the synchronization. But if you schedule a job that cleans the cluster and then synchronizes, maybe that’s a way to go.
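The cleanup half of such a job could look roughly like this. Again a hedged sketch, not a supported procedure: it uses pymongo against a placeholder target URI, drops every non-system database, and leaves starting the actual synchronization to the PCSM documentation:

```python
# Hypothetical cleanup step before re-running a sync. Requires a running
# MongoDB deployment; the connection string is a placeholder.
from pymongo import MongoClient

client = MongoClient("mongodb://target-a:27017,target-b:27017/?replicaSet=rs1")

# Drop everything except MongoDB's own system databases.
for name in client.list_database_names():
    if name not in ("admin", "local", "config"):
        client.drop_database(name)

# ...then launch the PCSM synchronization (see the PCSM docs for the command).
```

Dropping databases is destructive, so a job like this should only ever point at the disposable target cluster, never at the source.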