Sounds like two issues: query tuning, and warming up the buffer pool. For query tuning, pull a slow query log and go over it with pt-query-digest to see what you can improve, whether that means rewriting queries or adding, removing, or changing indexes.
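As a rough sketch (the one-second threshold and log path here are just placeholders to adjust, and you would also want the same settings in my.cnf so they survive a restart), you can turn the slow query log on at runtime and then feed the file to pt-query-digest:

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;   -- log anything slower than 1 second
SET GLOBAL slow_query_log_file = '/var/lib/mysql/slow.log';

Then from the shell:

pt-query-digest /var/lib/mysql/slow.log > slow-digest.txt

The digest ranks queries by total time, which tells you where tuning effort will pay off first.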
As for warming up the buffer pool, preloading it after a restart could potentially help get things going quicker, but that is highly situational, so you would need to test it out with your workload.
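One option, if you are on MySQL 5.6 or later, is to have the server dump the buffer pool contents at shutdown and reload them at startup. A minimal my.cnf sketch:

[mysqld]
# save the buffer pool contents at shutdown and reload them at startup
innodb_buffer_pool_dump_at_shutdown = ON
innodb_buffer_pool_load_at_startup  = ON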
As for MySQL spawning its own connections, what is likely happening is that a query (or several) is getting locked up, which causes a backlog of connections to form. That is when you get a large number of connections with the potential to bring the server down, which appears to be what is happening to you. This could also be caused by a “cache stampede”, which happens when one or more large caches (e.g. from an ORM like Hibernate or a caching layer like memcached) get refreshed at the same time, overloading the database.
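A quick way to see whether connections are piling up toward the limit (just a sketch; the numbers that matter depend on your own config):

SHOW GLOBAL STATUS LIKE 'Threads_connected';
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
SHOW GLOBAL VARIABLES LIKE 'max_connections';
SHOW FULL PROCESSLIST;

If Threads_connected keeps climbing toward max_connections and the processlist is full of queries stuck in the same state, a backlog is forming.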
You’ll want to look for threads that are waiting on locks, and try to track down what is blocking them. If a single thread is locking everything up, you will likely see a bunch of different threads all waiting on that one thread. Below is a simple example, taken from the TRANSACTIONS section of SHOW ENGINE INNODB STATUS, showing a thread waiting on a lock held by another thread:
------------
TRANSACTIONS
------------
Trx id counter 9498
Purge done for trx's n:o < 9494 undo n:o < 0 state: running but idle
History list length 4
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 9494, not started
MySQL thread id 1, OS thread handle 0x12ff05000, query id 33 localhost root init
show engine innodb status
---TRANSACTION 9497, ACTIVE 5 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 3, OS thread handle 0x12ffab000, query id 32 localhost root updating
update mytable set a = 6 where a = 1
------- TRX HAS BEEN WAITING 5 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6 page no 3 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`mytable` trx id 9497 lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 6; hex 000000000200; asc ;;
1: len 6; hex 000000002517; asc % ;;
2: len 7; hex 13000001410110; asc A ;;
3: len 4; hex 80000005; asc ;;
------------------
---TRANSACTION 9495, ACTIVE 67 sec
2 lock struct(s), heap size 376, 4 row lock(s), undo log entries 1
MySQL thread id 2, OS thread handle 0x12ff68000, query id 22 localhost root cleaning up
You can see three threads here:
Transaction 9494: this thread ran the “show engine innodb status \G” command that produced the output above.
Transaction 9497: this thread is attempting to update a record in “mytable” and is blocked. Note the “TRX HAS BEEN WAITING 5 SEC FOR THIS LOCK TO BE GRANTED” line, and the RECORD LOCKS line under it, which tell you that it is waiting and exactly which index and table the lock is on.
Transaction 9495: this thread updated the same “mytable” and is the one blocking the thread above. You can tell because it has been “ACTIVE 67 sec”, it is still holding row locks (“2 lock struct(s) … 4 row lock(s), undo log entries 1”), and its statement has already finished (“cleaning up”), meaning the transaction is still open and holding the lock that transaction 9497 is waiting for.
So basically you need to weed through your output and try to determine what (if anything) is blocking the other queries and causing the backlog on your server.
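If you would rather not read the raw status output, a sketch that works on MySQL 5.5 through 5.7 (in 8.0 the same information moved to performance_schema.data_lock_waits and the sys.innodb_lock_waits view) is to join the InnoDB tables in information_schema and list who is waiting on whom:

SELECT r.trx_mysql_thread_id AS waiting_thread,
       r.trx_query           AS waiting_query,
       b.trx_mysql_thread_id AS blocking_thread,
       b.trx_query           AS blocking_query
FROM information_schema.innodb_lock_waits w
JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id
JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id;

The blocking_thread value maps to the “MySQL thread id” in the status output, so you can KILL it if you decide that is the right call.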
As for the cache stampede, you would go through the same output and look for queries that you know populate a cache. If you see a ton of cache-populating queries running at once, that could mean a stampede is occurring. This is subjective, since caches will usually be refreshing throughout the day anyway; but there should not normally be many of them at the same time, because a cache-populating query tends to pull a lot of data, so a bunch of them at once can cause heavy load.
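One rough way to spot this (a sketch only; it just groups whatever is currently running, so you still have to recognize your own cache queries in the result):

SELECT info, COUNT(*) AS copies, MAX(time) AS longest_seconds
FROM information_schema.processlist
WHERE command <> 'Sleep' AND info IS NOT NULL
GROUP BY info
ORDER BY copies DESC
LIMIT 20;

A large number of identical, long-running copies of the same heavy query is the classic stampede pattern.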