I can't seem to find a good reference for choosing the best thread_pool_size.
I have a very read-heavy DB server (barely any writes): Percona Server 5.7 on 16 vCPUs + 32 GB RAM.
I was using thread_pool_size = 80. When I drop it to 16-32, latency gets worse. But latency improves when I lower thread_pool_stall_limit from 500 to 100.
I am trying to benchmark different thread_pool_size values using sysbench, but they mostly show similar results (4K clients) across 16, 32, 64, 96, and 128 (default 500 ms stall limit).
What would be the best way to find the right thread_pool_size? The rule of thumb (NCPU <= thread_pool_size <= NCPU + NCPU/2) doesn't seem to work best here.
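For reference, here is a minimal sketch of the sweep I'm running; the host name, user, and sysbench options are illustrative placeholders, not my actual setup (it also assumes thread_pool_size can be set dynamically, which Percona Server allows):

```shell
# Print one "set pool size, then benchmark" pair per candidate value;
# review the output, then pipe it to sh to actually run it.
# db-host / bench are placeholder connection details.
gen_sweep() {
  for size in 16 32 64 96 128; do
    echo "mysql -h db-host -u bench -e 'SET GLOBAL thread_pool_size = ${size};'"
    echo "sysbench oltp_read_only --mysql-host=db-host --mysql-user=bench --threads=4000 --time=300 run"
  done
}
gen_sweep
```

I then compare p99 latency across the runs, not just the TPS numbers.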
@elisia,
How many connections are there to MySQL? Using the thread pool for fewer than 20,000 connections won’t gain you much. If you have fewer connections than that, you are best off leaving thread pooling disabled and using the default methodology. I would suggest that you focus on setting up a replica, or a pool of replicas, and load-balancing your reads across multiple servers. You can scale much faster and much more easily that way.
This particular DB has 5K connections. Yes, we have replicas, load balancing to the replicas, etc.
I have another DB with roughly 9K connections on a lower CPU count (8 vCPUs), and I'm exploring the thread pool there as well because I want to double the number of connections.
@elisia,
Look at your Threads_running metric. What’s that like compared to connections? Threads_running is the true “load” metric of your server. You can have 30,000 connections to MySQL, but if your average Threads_running is < 100, the thread pool will make no difference. I have many clients that run 10K-30K connections, and all are using the default pooling without issue. I feel like you are jumping the gun by going down this path.
What issue/problem are you experiencing that makes you believe thread pooling is the solution?
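If you want to put a number on it, sample the counter over time. A rough sketch (avg_col2 is a made-up helper name for this post; the live loop assumes the mysql client can reach your server):

```shell
# avg_col2 averages the second column of "name<TAB>value" lines, which is
# the format `mysql -N -e "SHOW GLOBAL STATUS LIKE 'Threads_running'"` emits.
avg_col2() { awk '{sum += $2} END {printf "%.1f\n", sum / NR}'; }

# Live sampling, once a second for a minute (needs a reachable server;
# credentials omitted):
#   for i in $(seq 1 60); do
#     mysql -N -e "SHOW GLOBAL STATUS LIKE 'Threads_running'"
#     sleep 1
#   done | avg_col2
```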
My Threads_running is around 30 under normal load. There was one incident where a load spike pushed it to 10K, but that is gone now and it averages around 30-50.
@matthewb What do you mean by default pooling? Isn’t the default thread handling one thread per connection? Or are you referring to something else here?
Yes, that’s what I was referring to. 10K threads running seems implausible. Most likely you had a bad query that locked a critical table and your application started to “dog-pile” connections.
Right, so instead of chasing the wrong thing, focus on what caused the original issue and fix that. If a query caused it, let’s make sure that query is well optimized: use EXPLAIN, check indexes, the storage engine, etc.
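For example (the table and column names here are invented for illustration, and the command assumes a reachable server):

```shell
# Hypothetical query; substitute whatever query caused the pile-up.
mysql -e "EXPLAIN SELECT id, status FROM orders WHERE customer_id = 42\G"
# In the output, `type: ALL` with a large `rows` estimate means a full
# table scan, which usually points to a missing or unusable index.
```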
@matthewb That issue was already fixed; it wasn’t caused by a query but by the configuration itself.
I'm exploring the thread pool to account for future growth and to increase stability against huge load spikes on the DBs. Threads_running is hovering around 30; some DBs have spiked to 80-90 running threads.
I would also suggest that you look at innodb_thread_concurrency, make sure it is set to NCPU, and test how well that works in your setup before messing with the thread pool. As I said earlier, the thread pool is an extremely rare thing to enable, especially with such a low NCPU/RAM configuration as yours.
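Something like this in my.cnf, with 16 matching your vCPU count (the default of 0 means unlimited concurrency):

```
[mysqld]
# Cap the number of threads running inside InnoDB at the CPU count.
innodb_thread_concurrency = 16
```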
If you do decide to go the thread-pool route, then to answer your original question, all I can say is to test the different configuration values against your workload; the best tuning will be specific to your queries and their volume.
Nothing new, @matthewb. As you suggested, I switched to one thread per connection, and p99 query times are better now.
However, I occasionally see Threads_running spiking into the 40-80 range. If load on the DBs doubles this might be a different story, but then I'd probably have something to worry about upstream rather than at the DBs.