Exposing cluster outside of K8S without sharding -and- without enforcing TLS connections?

Hi,

We’ve been working on a PoC setup for migrating our large MongoDB environment to Kubernetes, using the Percona MongoDB Operator.

However, one issue we face is how to expose our clusters to applications outside of Kubernetes in a reliable and cost-effective way.

We’ve gone through a few different iterations here:

Direct connections

In this model we simply have the applications use direct connections rather than connecting to the entire replica set.

Pros: Very simple, works (kind of)

Cons: Multitudes. Not only do we lose all redundancy when one of the replicas (pods) goes down, we also run into major issues if the primary switches. I.e. if pod 0 is the primary (all clients able to read and write) but pod 1 then becomes the primary, all writes will fail - unless we build explicit logic into the clients to reconnect to a different DNS name on write failures.
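To illustrate what that "explicit logic" would mean in practice, here is a minimal, purely illustrative Python sketch of client-side failover over a list of per-pod DNS names (all hostnames and the simulated cluster are made up; a real client would catch the driver's "not primary" error instead of `ConnectionError`):

```python
# Hypothetical sketch of the client-side failover a direct-connection
# model forces on us: on a write failure, try the next DNS name.
# Hostnames below are illustrative, not from any real setup.

CANDIDATES = [
    "mongo-rs0-0.example.com",
    "mongo-rs0-1.example.com",
    "mongo-rs0-2.example.com",
]

def write_with_failover(write_fn, candidates=CANDIDATES):
    """Try each candidate host in turn until a write succeeds."""
    last_err = None
    for host in candidates:
        try:
            return write_fn(host)
        except ConnectionError as err:  # stand-in for a "not primary" error
            last_err = err
    raise last_err

# Simulated cluster where only pod 1 is currently the primary:
def fake_write(host):
    if host.endswith("-1.example.com"):
        return f"written via {host}"
    raise ConnectionError(f"{host} is not primary")

# write_with_failover(fake_write) → "written via mongo-rs0-1.example.com"
```

This is exactly the kind of per-application boilerplate we want the platform to make unnecessary - which is why a proper replica set connection (where the driver does this for us) is the goal.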

Split Horizon

Here, we make use of the splitHorizon functionality to ensure that replica set connections work properly, since clients are now presented with externally resolvable hostnames rather than the K8S pod names.

Pros: Works great! And relatively easy to understand / configure

Cons: Clients must use TLS when connecting. While this is obviously a good thing overall, it significantly hampers our ability to roll out the new platform at scale. We have a large system with multiple teams and types of applications, and getting everyone to prioritise enabling and using TLS in their clients will be a lot of work. Our goal is to make our new platform as “plug-and-play” as possible, requiring minimal time from the clients / users.
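For context, this is roughly what the split-horizon configuration looks like in the PSMDB custom resource (cluster and host names here are illustrative, and field names like `exposeType` vary between operator versions - check the Custom Resource options reference for your version):

```yaml
# Sketch of a PSMDB CR fragment enabling split horizon.
# All names are examples; verify the exact schema against the
# Percona Operator docs for your operator version.
spec:
  replsets:
    - name: rs0
      size: 3
      expose:
        enabled: true
        exposeType: LoadBalancer   # may be "type" in newer CR versions
      splitHorizons:
        my-cluster-rs0-0:
          external: rs0-0.mongo.example.com
        my-cluster-rs0-1:
          external: rs0-1.mongo.example.com
        my-cluster-rs0-2:
          external: rs0-2.mongo.example.com
```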

Sharding

Here, we make every cluster a sharded cluster (a single shard unless explicitly configured) and then use the mongos router as the entrypoint.

Pros: Works pretty well - this gives us redundant / failover-capable client connections without requiring the use of TLS. It’s also nice to be able to use “actual” sharding should the use case require it, and the operator takes care of a lot of the legwork of setting up sharding from the infrastructure PoV.

Cons: We currently don’t use sharding (we prefer to do “application-level” sharding, where we simply split up databases / collections instead), so it adds a lot of unnecessary complexity. But more pressingly, it significantly increases the cost of the solution - even if we run very small clusters there is still a minimum resource requirement for the config servers and mongos routers, so the “overhead” for a new cluster is much larger than in the non-sharded case.

One of the goals for our new solution is to be as cost-effective as possible, so this is a big concern for us.

So - we’re wondering if there is a better way of doing this? That is, can we achieve replica set connections (=> failover-capable connections) without requiring the use of TLS for splitHorizon -and- without using sharding/mongos?

One possible workaround we theorized about was overriding the hostnames of the replicas - using the replSetOverrides option (see Custom Resource options - Percona Operator for MongoDB) - to match the externally available DNS records we create. That is, we basically try to make the “internal” names of the replicas externally resolvable.

However, in our initial testing at least, this didn’t work at all. If that field is set to any value, the cluster fails to come up / the replicas don’t get added. But even if it did work, it feels pretty ugly / like a non-intended way of using that parameter.

Any guidance here would be greatly appreciated!

Hi Jesper, in your case splitHorizon seems to be the best fit. Even though clients have to use TLS, it doesn’t mean that every client needs an x.509 certificate. You should be able to set, server-side:

allowConnectionsWithoutCertificates: true

and the connection will still be using TLS.
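If it helps, this mongod option would presumably be passed through the per-replset `configuration` block of the CR - the exact placement is from memory, so verify it against the operator docs (the `net.tls.allowConnectionsWithoutCertificates` option itself is a standard mongod setting):

```yaml
# Sketch: keep TLS on the wire but don't require client certificates.
# CR field placement may differ by operator version.
spec:
  replsets:
    - name: rs0
      configuration: |
        net:
          tls:
            mode: requireTLS
            allowConnectionsWithoutCertificates: true
```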

Another possibility would be to provision the sharded cluster using config shards to save resources. This is unfortunately not yet supported by the Operator, but will be in the future.

Hi Ivan,

Thanks for your reply and thanks for the pointer - that does indeed seem like a reasonable way forward, but ideally we’d want to avoid -any- type of client update. In this case, teams / developers would still need to update their driver config etc. to use TLS.

However!

I took a stab at using the member name override again, and this time it seems to work perfectly. Very strange - perhaps the behaviour we saw earlier was a bug that has been fixed in a later release of the operator?

What we do is create (semi-manually, through our Helm chart) a ClusterIP Service for each mongod pod, assign it a DNS name using external-dns, and then use that name as the (overridden) replica set member name.

And this works just as we’d expect - the cluster comes up, and when clients connect and run e.g. db.hello(), the hostnames they get back are externally resolvable, without using splitHorizon.
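For anyone wanting to replicate this, the setup sketches out roughly as below. All names and the DNS zone are illustrative; the `statefulset.kubernetes.io/pod-name` selector label and the external-dns hostname annotation are standard, but this assumes your external-dns deployment is configured to publish records for ClusterIP Services and that clients can route to those IPs:

```yaml
# Per-pod ClusterIP Service with an external-dns-managed DNS name.
# Names/zone are examples from a hypothetical setup.
apiVersion: v1
kind: Service
metadata:
  name: my-cluster-rs0-0-ext
  annotations:
    external-dns.alpha.kubernetes.io/hostname: rs0-0.mongo.example.com
spec:
  type: ClusterIP
  selector:
    statefulset.kubernetes.io/pod-name: my-cluster-rs0-0
  ports:
    - port: 27017
      targetPort: 27017
---
# PSMDB CR fragment: point the member name at that DNS record
# (field layout per the replSetOverrides docs; verify for your version).
spec:
  replsets:
    - name: rs0
      replsetOverrides:
        my-cluster-rs0-0:
          host: rs0-0.mongo.example.com:27017
```

Repeat the Service (and override entry) once per replica set member.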

We need to do some more testing to see if this has any bearing on performance etc., but it seems very promising.


Glad to hear it worked now! Thanks for letting us know how you accomplished it.