We set up two clusters. One is a staging cluster, which we set up initially empty and then replicated to. The other holds a larger data set, so we restored it from a backup. In the cluster that we restored from a backup, the “horizons” feature on the members is not working: we can only connect to the replica set from machines where the internal addresses are reachable. The staging cluster has no such issue.
My thought is that there might be some state that came from the backup that is preventing the horizons feature from working, but I can’t figure out what that might be or where it would live.
Is there some way in which restoring the database from a backup and restarting the replication could have disabled this horizons feature?
Here’s the replica set configuration for reference:
{
  _id: 'rs0',
  version: 6,
  term: 5,
  members: [
    {
      _id: 0,
      host: 'c.internal.productiondb.databases.xxxx.com:27017',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 1,
      tags: {},
      horizons: {
        external: 'c.external.productiondb.databases.xxxx.com:27017'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 1,
      host: 'a.internal.productiondb.databases.xxxx.com:27017',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {},
      horizons: {
        external: 'a.external.productiondb.databases.xxxx.com:27017'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    },
    {
      _id: 2,
      host: 'b.internal.productiondb.databases.xxxx.com:27017',
      arbiterOnly: false,
      buildIndexes: true,
      hidden: false,
      priority: 2,
      tags: {},
      horizons: {
        external: 'b.external.productiondb.databases.xxxx.com:27017'
      },
      secondaryDelaySecs: Long('0'),
      votes: 1
    }
  ],
  protocolVersion: Long('1'),
  writeConcernMajorityJournalDefault: true,
  settings: {
    chainingAllowed: true,
    heartbeatIntervalMillis: 2000,
    heartbeatTimeoutSecs: 10,
    electionTimeoutMillis: 10000,
    catchUpTimeoutMillis: -1,
    catchUpTakeoverDelayMillis: 30000,
    getLastErrorModes: {},
    getLastErrorDefaults: { w: 1, wtimeout: 0 },
    replicaSetId: ObjectId('68700b11129f540835e578d2')
  }
}
I’ve compared this many times against the config of the cluster where everything works, and I can’t find a meaningful difference. We’re using MONGODB-X509 auth with TLS in both clusters, with almost identical configuration and connection strings.
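In case it helps anyone compare the two clusters, here is a minimal sketch of the sanity check I would run on each config. It is plain JavaScript that can be pasted into mongosh. `summarizeHorizons` and `missingHorizons` are hypothetical helper names of my own, not MongoDB functions, and the check only inspects the `horizons` documents in the config; it says nothing about how the server chooses a horizon when a client connects.

```javascript
// Hypothetical helpers (not part of MongoDB or mongosh).
// They only read the `members[n].horizons` documents of a replica set config.

// One row per member: its _id, internal host, and horizon mapping.
function summarizeHorizons(config) {
  return config.members.map((m) => ({
    _id: m._id,
    host: m.host,
    horizons: m.horizons || {},
  }));
}

// Lists every (member, horizon name) pair where a horizon that is defined
// on at least one member is absent on another member.
function missingHorizons(config) {
  const names = new Set();
  for (const m of config.members) {
    for (const name of Object.keys(m.horizons || {})) {
      names.add(name);
    }
  }

  const gaps = [];
  for (const m of config.members) {
    const own = m.horizons || {};
    for (const name of names) {
      if (!(name in own)) {
        gaps.push({ member: m._id, horizon: name });
      }
    }
  }
  return gaps;
}
```

In mongosh, `summarizeHorizons(rs.conf())` gives a compact view to diff between the two clusters, and `missingHorizons(rs.conf())` returns an empty array for the config above, since all three members define an `external` horizon.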