Hello, I’m about to set up a 3-node PXC and wanted to ask 3 quick questions.
Does PXC have any built-in balancer? I read about HAProxy or ProxySQL everywhere, but I’m wondering if there is simple balancing integrated into it. I don’t need advanced options.
If not, which of the two is currently the preferred balancer? Less overhead, faster connections, etc., considering 99% of my queries are SELECTs.
Considering I have 2 identical NVMe drives on each node, does it make sense to configure them in RAID 0 for speed and more storage? They are in RAID 1 at the moment, but since I will have 3 replicas thanks to PXC, does it still make sense to keep RAID 1 for data redundancy on each node? Is there any problem with running PXC on RAID 0?
Does PXC have any built-in balancer? I read about HAProxy or ProxySQL everywhere, but I’m wondering if there is simple balancing integrated into it. I don’t need advanced options.
If not, which of the two is currently the preferred balancer? Less overhead, faster connections, etc., considering 99% of my queries are SELECTs.
Unfortunately, PXC doesn’t come with any built-in load balancer; however, you can opt for one of the popular choices (ProxySQL, HAProxy).
ProxySQL operates at layer 7, so it understands the SQL conversation better, while HAProxy is a simple layer 4 proxy. However, given that 99% of your workload is reads, HAProxy might be a good, lightweight solution.
That said, ProxySQL has the additional advantage of transparently handling reads and writes over a single TCP port, while with HAProxy you have to handle reads and writes over separate TCP ports.
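To make the separate-ports point concrete, here is a minimal sketch of an HAProxy layer 4 setup with one write listener and one read listener; the node IPs, ports, and config path are placeholders, not values from this thread.

```
# Append two TCP listeners to haproxy.cfg: writes pinned to a single node
# (others as backups), reads balanced across all three. IPs/ports are placeholders.
cat >> /etc/haproxy/haproxy.cfg <<'EOF'
listen pxc-writes
    bind *:3306
    mode tcp
    option tcpka
    server node1 10.0.0.1:3306 check
    server node2 10.0.0.2:3306 check backup
    server node3 10.0.0.3:3306 check backup

listen pxc-reads
    bind *:3307
    mode tcp
    option tcpka
    balance leastconn
    server node1 10.0.0.1:3306 check
    server node2 10.0.0.2:3306 check
    server node3 10.0.0.3:3306 check
EOF
systemctl reload haproxy
```

Pinning writes to one node is a common way to avoid certification conflicts; in practice the health checks usually go through a clustercheck-style script rather than a plain TCP check.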
Both proxies also provide auto-failover, based on internal/external health-check scripts. Further, you can read more about both of them in the docs linked below.
Moreover, for simple read scaling, an async replication setup would also be a good fit if you can accept some replication lag.
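If you go that route, the replica is set up with the standard MySQL replication procedure against one PXC node. A rough sketch, assuming PXC 8.0 (MySQL 8.0.23+ syntax), GTIDs enabled, binary logging on the donor node, and placeholder host/user names:

```
# On the standalone replica: point it at one PXC node and start replicating.
# Host, user, and password are placeholders; the PXC node must have log_bin
# enabled, and GTIDs are assumed for SOURCE_AUTO_POSITION.
mysql -e "
CHANGE REPLICATION SOURCE TO
  SOURCE_HOST='pxc-node1',
  SOURCE_USER='repl',
  SOURCE_PASSWORD='***',
  SOURCE_AUTO_POSITION=1;
START REPLICA;"
```

On older builds the equivalent statements are CHANGE MASTER TO / START SLAVE.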
Let me also add some setup/config docs.
PXC/ProxySQL:
PXC/HAProxy:
Considering I have 2 identical NVMe drives on each node, does it make sense to configure them in RAID 0 for speed and more storage? They are in RAID 1 at the moment, but since I will have 3 replicas thanks to PXC, does it still make sense to keep RAID 1 for data redundancy on each node? Is there any problem with running PXC on RAID 0?
Well, it’s a choice between performance and redundancy.
PXC has the advantage that writes are replicated synchronously to all nodes, so the possibility of data loss is low. RAID 1 still provides local redundancy at the node level, at the cost of some write latency/performance compared to RAID 0.
Since you are not doing many writes, RAID 0 should be fine from a performance perspective, as long as the other nodes are available and in sync.
Beyond that, weigh your business requirements and cost when making the decision.
Thanks for your reply. I have already read the docs in full, those links included.
Not many writes compared to reads, but still a lot of writes in absolute numbers.
In addition to PXC, I’m also going to set up regular full backups of all DBs using XtraBackup (1 full per day and an incremental every 1 or 2 hours) on an external network disk.
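For reference, a minimal sketch of that schedule with xtrabackup; the target directories are placeholders, and --galera-info is optional but handy on PXC nodes.

```
# Daily full backup (target path is a placeholder).
xtrabackup --backup --galera-info --target-dir=/backups/full-$(date +%F)

# Incremental every couple of hours, taken against the latest full.
xtrabackup --backup --galera-info \
    --target-dir=/backups/inc-$(date +%F-%H) \
    --incremental-basedir=/backups/full-$(date +%F)
```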
Considering all this, I think using RAID 1 is probably overkill. I know you can NEVER be 100% sure, but PXC + manually replicated backups should already cover most disasters. Well, except meteorites.
So do you think I can safely go for RAID0 then?
Also, my question implied another one: do I have to pay attention to any specific my.cnf or PXC config values when running PXC, or any DB server, on RAID 0? I’m using only InnoDB tables. Is there any value I should strictly set for this setup, other than the ones already suggested in the PXC docs?
The daily fulls I agree with. Incrementals every 2 hours is a bit much; consider how annoying this will be when you need to restore. The process is “restore the full”, then “for each incremental: apply the incremental”. That could be up to 11 incrementals you need to apply.
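To illustrate the chain, a rough sketch of the xtrabackup prepare/restore sequence for a full plus two incrementals; all paths are placeholders.

```
# Prepare the full backup, stopping short of the final rollback phase.
xtrabackup --prepare --apply-log-only --target-dir=/backups/full

# Apply each incremental in order; keep --apply-log-only on all but the last.
xtrabackup --prepare --apply-log-only --target-dir=/backups/full --incremental-dir=/backups/inc1
xtrabackup --prepare --target-dir=/backups/full --incremental-dir=/backups/inc2

# With mysqld stopped and the datadir empty, copy the prepared data back.
xtrabackup --copy-back --target-dir=/backups/full
chown -R mysql:mysql /var/lib/mysql
```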
If, as you say, you don’t have lots of writes, consider using the built-in incremental system known as binary logging. Simply rsync your binary logs somewhere safe every 5 minutes. When recovery time comes, restore the full, then replay the binlogs.
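A minimal sketch of that cron job; the binlog directory, file prefix, and destination host are assumptions.

```
# Ship binary logs off-box every 5 minutes (paths, prefix, and host are placeholders).
cat > /etc/cron.d/binlog-ship <<'EOF'
*/5 * * * * root rsync -a --include='binlog.*' --exclude='*' /var/lib/mysql/ backuphost:/backups/binlogs/
EOF
```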
If you don’t have many writes, why is drive speed such a concern? Stick to standard RAID 1 or RAID 5 and run things like normal. The whole point of a cluster is that if one server fails, there are at least 2 others ready to take over.
No, there isn’t anything specific for RAID 0. You can set innodb_flush_log_at_trx_commit=2 to reduce disk fsyncs from ‘every commit’ (=1, the default) to ‘every 1 second’ (=2).
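A quick sketch of applying that both at runtime and persistently; the config file path is an assumption.

```
# Apply at runtime (innodb_flush_log_at_trx_commit is dynamic).
mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 2;"

# Persist it across restarts (file path is a placeholder).
cat >> /etc/my.cnf.d/tuning.cnf <<'EOF'
[mysqld]
innodb_flush_log_at_trx_commit = 2
EOF
```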
Because I have heavy read traffic from multiple websites. I need speed for SELECTs, not for writes. I’m actually using PXC with a balancer for this in the first place, not for redundancy. Is this the wrong way to obtain faster reads? Should I just split the reads (websites) across separate, non-clustered single DB servers?
I also need to be able to restore a single DB or table. Should I just switch to mydumper for that? AFAIK there is no logical backup with an incremental system though, and having to back up the whole thing every X hours needs a lot of space.
Speed for SELECTs does not come from disks, it comes from RAM/CPU. 99% of your reads should be coming from InnoDB’s buffer pool; that’s why the pool should be set to 80-90% of your system memory. Fast disks only get the data into RAM faster; once it is in memory, the disks are never touched for reads. Optimizing your storage for SELECTs won’t gain you anything.
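As a concrete example, a sketch of sizing the pool to roughly 80% of RAM; the 200 GB figure assumes a 256 GB machine and is purely illustrative.

```
# ~200 GB buffer pool on a 256 GB box; the variable is dynamic in MySQL 8.0,
# so it can be resized online and then persisted in the config.
mysql -e "SET GLOBAL innodb_buffer_pool_size = 200 * 1024 * 1024 * 1024;"

cat >> /etc/my.cnf.d/tuning.cnf <<'EOF'
[mysqld]
innodb_buffer_pool_size = 200G
EOF
```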
What you are doing is correct: load balancing your SELECT traffic across multiple PXC nodes is the right way to achieve higher read capacity. If you implement ProxySQL as your load balancer, you can have ProxySQL cache certain queries to speed things up even more. Or implement your own query cache using Valkey or memcached; it only takes a few lines of code to implement such a cache in any language.
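A rough sketch of the ProxySQL side of that, assuming the default admin interface on port 6032 with default credentials; the rule_id, digest pattern, and TTL are placeholders.

```
# Cache result sets of matching SELECTs in ProxySQL for 5 seconds (cache_ttl is in ms).
mysql -u admin -padmin -h 127.0.0.1 -P 6032 -e "
INSERT INTO mysql_query_rules (rule_id, active, match_digest, cache_ttl, apply)
VALUES (50, 1, '^SELECT .* FROM hot_table', 5000, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;"
```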
The benefit of clustering is having instant high-availability. If you switch to single servers, what’s your HA plan?
Absolutely. Single-DB or single-table restore from physical backups is an absolute nightmare. mydumper gives you the ability to do a single-table restore with just a couple of commands.
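For example, a sketch of a dump and a single-table restore with mydumper; database, table, and directory names are placeholders, and exact output file names depend on the mydumper version and compression settings.

```
# Logical backup of one database; mydumper writes one schema file and one
# data file per table into the output directory.
mydumper --database shop --outputdir /backups/dump --threads 4

# A single-table restore can then be as simple as replaying that table's files.
mysql shop < /backups/dump/shop.orders-schema.sql
mysql shop < /backups/dump/shop.orders.sql
```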
Not true. Do some research on the binary logs, and understand what they record. Take a logical mydumper backup at 2am. A simple cron job saves the binary logs every 5 minutes. Now, it’s 1pm and you need to restore database FOO. Use the mydumper backup from 2am and restore database FOO. Then use mysqlbinlog to replay all changes from 2am to 1pm for database FOO. The binary logs don’t let you filter per-table, only per-database.
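A sketch of that replay step; the binlog file names, datetimes, and database name are placeholders.

```
# Replay only database FOO's changes from the 2am backup point up to 1pm.
mysqlbinlog --database=FOO \
    --start-datetime="2024-05-01 02:00:00" \
    --stop-datetime="2024-05-01 13:00:00" \
    /backups/binlogs/binlog.000042 /backups/binlogs/binlog.000043 \
  | mysql
```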
That’s not a problem. Every single server is mirrored both in RAID and, in a different location, as a full logical + XtraBackup backup. If a disaster happens I can bring a new DB server up within minutes.
But since my main concern here is speed, not reliability, I wonder whether having (for example) 12 websites on a 3-node PXC would be faster than having 4 websites on each of 3 standalone DB servers.
Well, my experience is different. My buffer pool is bigger than the database’s footprint, yet for some big, badly written queries the query time is still high. Sometimes there are bad queries from CMSs that can’t be changed or that can’t use indexes because they nest SELECT … WHERE IN (SELECT …). So I’m looking for a way to reduce the CPU stress on the DB server (a single one at the moment) when those happen: either splitting across 3 standalone instances or using a cluster. I’m wondering which is the best way to go right now. The specs of the DB servers are high; we are talking about EPYC 9454P 96-core/256 GB/NVMe boxes.
That’s my problem. Most of my restores are table-based, like a customer asking to restore a single table after they do something stupid. So my only option is to stick with a daily or twice-daily mydumper cron, I guess.
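For completeness, a sketch of what such a twice-daily cron could look like; the schedule, paths, and options are placeholders to adapt.

```
# Logical dump at 02:00 and 14:00; --trx-consistency-only keeps locking light
# when all tables are InnoDB. Paths and schedule are placeholders.
cat > /etc/cron.d/mydumper-dump <<'EOF'
0 2,14 * * * root mydumper --outputdir /backups/dump-$(date +\%F-\%H) --threads 4 --trx-consistency-only
EOF
```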