100% Connections Usage Problem

Hi board!
i have a big problem with my mysql db server!
i have a site with perm 1000 logged in users, and a lot of selects.
All Servers are quad core amd, with 8GB RAM.

i got 2 apache2 servers with this config:

StartServers 10 MinSpareServers 5 MaxSpareServers 10# ServerLimit 1000 MaxClients 256 MaxRequestsPerChild 10000

here is my my.cnf

tmp_table_size=128Mmax_connections=500wait_timeout=20connect_timeout=10interactive_timeout=120join_buffer_size=16Mquery_cache_size=128Mquery_cache_limit=16Mmax_allowed_packet=16Mtable_cache=2048read_buffer_size=16Mread_rnd_buffer_size=8Msort_buffer_size=16Mkey_buffer=128Mkey_buffer_size=64Minnodb_buffer_pool_size=6Gthread_cache_size=60

show status;

| Handler_update | 0 || Handler_write | 131 || Innodb_buffer_pool_pages_data | 161876 || Innodb_buffer_pool_pages_dirty | 772 || Innodb_buffer_pool_pages_flushed | 2343443 || Innodb_buffer_pool_pages_free | 214922 || Innodb_buffer_pool_pages_latched | 0 || Innodb_buffer_pool_pages_misc | 16418 || Innodb_buffer_pool_pages_total | 393216 || Innodb_buffer_pool_read_ahead_rnd | 11 || Innodb_buffer_pool_read_ahead_seq | 1042 || Innodb_buffer_pool_read_requests | 29087218162 || Innodb_buffer_pool_reads | 118486 || Innodb_buffer_pool_wait_free | 0 || Innodb_buffer_pool_write_requests | 37933475 || Innodb_data_fsyncs | 2246159 || Innodb_data_pending_fsyncs | 0 || Innodb_data_pending_reads | 0 || Innodb_data_pending_writes | 0 || Innodb_data_read | 2603061248 || Innodb_data_reads | 125581 || Innodb_data_writes | 3889764 || Innodb_data_written | 90657998848 || Innodb_dblwr_pages_written | 2343443 || Innodb_dblwr_writes | 37545 || Innodb_log_waits | 0 || Innodb_log_write_requests | 27518465 || Innodb_log_writes | 2157491 || Innodb_os_log_fsyncs | 2180480 || Innodb_os_log_pending_fsyncs | 0 || Innodb_os_log_pending_writes | 0 || Innodb_os_log_written | 13859563008 || Innodb_page_size | 16384 || Innodb_pages_created | 3131 || Innodb_pages_read | 158745 || Innodb_pages_written | 2343443 || Innodb_row_lock_current_waits | 0 || Innodb_row_lock_time | 24120 || Innodb_row_lock_time_avg | 3 || Innodb_row_lock_time_max | 177 || Innodb_row_lock_waits | 7448 || Innodb_rows_deleted | 101632 || Innodb_rows_inserted | 277513 || Innodb_rows_read | 14181145169 || Innodb_rows_updated | 2102680 || Key_blocks_not_flushed | 0 || Key_blocks_unused | 90713 || Key_blocks_used | 16459 || Key_read_requests | 5830242 || Key_reads | 16854 || Key_write_requests | 91577 || Key_writes | 89004 || Last_query_cost | 0.000000 || Max_used_connections | 401 || Ndb_cluster_node_id | 0 || Ndb_config_from_host | || Ndb_config_from_port | 0 || Ndb_number_of_data_nodes | 0 || Not_flushed_delayed_rows | 0 || Open_files | 81 || Open_streams | 0 || Open_tables | 1425 || Opened_tables | 0 || Prepared_stmt_count | 0 || Qcache_free_blocks | 16190 || Qcache_free_memory | 93146896 || Qcache_hits | 9162852 || Qcache_inserts | 9730832 || Qcache_lowmem_prunes | 9640 || Qcache_not_cached | 2749121 || Qcache_queries_in_cache | 20430 || Qcache_total_blocks | 57626 || Questions | 28299337 || Rpl_status | NULL || Select_full_join | 0 || Select_full_range_join | 0 || Select_range | 0 || Select_range_check | 0 || Select_scan | 1 || Slave_open_temp_tables | 0 || Slave_retried_transactions | 0 || Slave_running | OFF || Slow_launch_threads | 0 || Slow_queries | 0 || Sort_merge_passes | 0 || Sort_range | 0 || Sort_rows | 0 || Sort_scan | 0 || Ssl_accept_renegotiates | 0 || Ssl_accepts | 0 || Ssl_callback_cache_hits | 0 || Ssl_cipher | || Ssl_cipher_list | || Ssl_client_connects | 0 || Ssl_connect_renegotiates | 0 || Ssl_ctx_verify_depth | 0 || Ssl_ctx_verify_mode | 0 || Ssl_default_timeout | 0 || Ssl_finished_accepts | 0 || Ssl_finished_connects | 0 || Ssl_session_cache_hits | 0 || Ssl_session_cache_misses | 0 || Ssl_session_cache_mode | NONE || Ssl_session_cache_overflows | 0 || Ssl_session_cache_size | 0 || Ssl_session_cache_timeouts | 0 || Ssl_sessions_reused | 0 || Ssl_used_session_cache_entries | 0 || Ssl_verify_depth | 0 || Ssl_verify_mode | 0 || Ssl_version | || Table_locks_immediate | 22830484 || Table_locks_waited | 1209 || Tc_log_max_pages_used | 0 || Tc_log_page_size | 0 || Tc_log_page_waits | 0 || Threads_cached | 43 || Threads_connected | 22 || Threads_created | 5738 || Threads_running | 8 || Uptime | 92523 |±----------------------------------±------------+

my problem is, i get every 30-50mins a 100% connections usage, with more then 400 used connections! all Clients get a Service unnavailabla error @ this time.
if all is normal, i get like 5-8% connection usage.

maybe someone can help me out here!
we can talk about ssh access too.

regards!
S.

What does it look like if you run SHOW PROCESSLIST at the moment when this happens?
What does Command, Time and State on the most processes say?

Hi, thanks for reply!

When i list processes at the moment it happens, i got around 200 slow queryies in the list.

but i think that is because the server cant handle all the requests.because the queries are standard queries wich take 0,2sec usually.

Now, when there are only 300-400 users logged in, all is very smooth!
im not sure if it is mysql or apache problem, or code :\

So what are the command and state for the threads?

Do you have any that have state Locked?

Some different theories:
1.
You have one query that is seldom used but that takes a lot of time and locks tables for the other threads which results in a rapid incline of new threads since the old once doesn’t finnish due to the lock.

Something wrong with your applications connection pooling (if you are using this) so that it sometimes doesn’t reuse old connections.

Or just test with a higher setting:
max_connections=1000
And see if that solves it.

hi and thanks again!

i cant remember exactly but i think the state was “copying to tmp table” with a select command.

to 1: all tables are InnoDB, as far as i remember, does MyISAM lock tables, but InnoDB handles that in a different way? pls correct my if i am wrong!

to 2: i have to talk to our coder, if it uses pconnect or not.
what are the best settigs for that?
timeout etc?

thank you very much!

oh and btw, i dont set max_connections up to 1000, because i think it will rapidly go up to that max and crashes the box with 100% cpu usage!

Sascha

sorry for the late reply!
we dont use pconnect in our php application.

the state on the most threads is “sending data” or “copying to tmp table”

now i have this problem like 5-10 times a day.

currently all is fine with like 20 avg connections. and servers idle.

i dont think its a slow query. i tested most of them and they only take max 4 sec.

you can checkout my MUNIN graphs if you want. ill send you the url

regards,