simple query occasionally stalls

I’ve been trying to resolve this problem off and on for years. I have a table with about 17 million records. A simple query like select count(*) from Users where UserID=‘XXX’; usually takes less then 0.01 seconds. Sometimes it’ll take 2-5 seconds. I’ve turned on log-slow-queries and it has logged 92 longer then 1 second in the last few hours:

Count: 92 Time=2.99s (275s) Lock=0.00s (0s) Rows=1.0 (92)
select count(*) from Users where UserID=‘S’

My server is a dual Quad Core Xeon E5345, with 8 gigs ram, sas 15k drives. It is CentOS5 running under xen virtualization, with 4gigs of ram and 4 cores assigned to it. The other xen guests on this machine shouldn’t be doing anything to cause these slow downs. We had similar stalls before moving to xen.

The question I have is how to I go about trying to determine the cause of these stalls?

Users table create query:
CREATE TABLE Users (
ID int(11) NOT NULL auto_increment,
UserID char(64) NOT NULL default ‘0’,
DateAdded datetime default NULL,
DateLastEvent datetime default NULL,
FoundFirst tinyint(3) unsigned default ‘0’,
PRIMARY KEY (ID),
KEY UserID (UserID)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

my.cnf:
[client]
port= 3306
socket= /var/run/mysqld/mysqld.sock

[mysqld_safe]
socket= /var/run/mysqld/mysqld.sock
nice= 0

[mysqld]
server-id=32
user= mysql
pid-file= /var/run/mysqld/mysqld.pid
socket= /var/run/mysqld/mysqld.sock
port= 3306
log-error= /var/log/mysql/mysql.err
basedir= /usr
datadir= /var/lib/mysql
tmpdir= /tmp
language= /usr/share/mysql/english
skip-external-locking
key_buffer= 16M
max_allowed_packet= 16M
thread_stack= 128K
query_cache_limit= 1048576
query_cache_size = 26214400
query_cache_type = 1
sort_buffer_size = 256M
key_buffer_size = 1024M
table_cache = 256
thread_cache_size = 32
log-slow-queries= /var/lib/mysql/db-slow.log
long_query_time = 1
log-bin= /var/lib/mysql/mysql-bin.log

[mysqldump]
quick
quote-names
max_allowed_packet= 16M

[isamchk]
key_buffer= 16M

[myisamchk]
key_buffer_size = 256M
sort_buffer_size = 256M

Thanks for any help!

The first question that comes to my mind is: why is your key_buffer set to 16MB when your box has 8GB? Set it to 1GB and see what happens.

The other values also don’t look reasonable - they seem to be default values.

It looks like key_buffer is a left over from older versions of mysql. I set key_buffer_size to 1024M a little lower. I tried setting key_buffer with this line:
set global key_buffer=1024M;

and got this error:
ERROR 1193 (HY000): Unknown system variable ‘key_buffer’

Thanks for spotting that though!

Sorry - I just had a brief look…

I think you probably can raise the key_buffer_size.

And your sort_buffer_size is to big - this is a per connection setting.

But we cannot give you real hints without more details:

  • show global status; (or mysqlreport)
  • vmstat or top
    when the server is under load.

Cheers

I changed key_buffer_size to 2048M and sort_buffer_size to 64M. But I’m still see the stalls.

[root@db spades]# vmstatprocs -----------memory---------- —swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 8 0 40 287432 153536 3115944 0 0 4 58 7 2 3 4 92 1 0

[root@db spades]# toptop - 10:40:40 up 17 days, 19:32, 2 users, load average: 1.78, 0.76, 0.64 Tasks: 107 total, 6 running, 101 sleeping, 0 stopped, 0 zombie Cpu0 : 8.1%us, 12.0%sy, 0.0%ni, 77.1%id, 1.4%wa, 0.1%hi, 1.1%si, 0.1%st Cpu1 : 1.0%us, 1.2%sy, 0.0%ni, 96.7%id, 0.9%wa, 0.0%hi, 0.2%si, 0.0%st Cpu2 : 0.9%us, 1.0%sy, 0.0%ni, 97.3%id, 0.7%wa, 0.0%hi, 0.1%si, 0.0%st Cpu3 : 0.9%us, 1.0%sy, 0.0%ni, 97.3%id, 0.7%wa, 0.0%hi, 0.1%si, 0.0%st Mem: 4194472k total, 3912168k used, 282304k free, 153640k buffers Swap: 1044184k total, 40k used, 1044144k free, 3119000k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 18257 lobby 15 0 58848 40m 3048 S 12 1.0 3729:40 spades1 10897 mysql 15 0 2210m 290m 4732 S 8 7.1 1:35.29 mysqld 20558 lobby 15 0 32864 15m 2984 R 3 0.4 723:30.54 hearts1 20617 lobby 15 0 36932 19m 2976 R 2 0.5 585:48.75 euchre1 14829 lobby 15 0 29228 9356 4000 R 0 0.2 2:03.77 solitaire1 27292 lobby 15 0 125m 107m 2996 R 0 2.6 233:15.73 backgammon1 1 root 15 0 2060 644 544 S 0 0.0 0:13.23 init 2 root RT 0 0 0 0 S 0 0.0 0:04.28 migration/0 3 root 34 19 0 0 0 S 0 0.0 0:00.42 ksoftirqd/0

mysql> show global status;±----------------------------------±---------+| Variable_name | Value |±----------------------------------±---------+| Aborted_clients | 6 || Aborted_connects | 49 || Binlog_cache_disk_use | 0 || Binlog_cache_use | 0 || Bytes_received | 12255911 || Bytes_sent | 42885115 || Com_admin_commands | 102 || Com_alter_db | 0 || Com_alter_table | 0 || Com_analyze | 0 || Com_backup_table | 0 || Com_begin | 0 || Com_change_db | 15409 || Com_change_master | 0 || Com_check | 0 || Com_checksum | 0 || Com_commit | 24 || Com_create_db | 0 || Com_create_function | 0 || Com_create_index | 0 || Com_create_table | 0 || Com_dealloc_sql | 0 || Com_delete | 35 || Com_delete_multi | 0 || Com_do | 0 || Com_drop_db | 0 || Com_drop_function | 0 || Com_drop_index | 0 || Com_drop_table | 0 || Com_drop_user | 0 || Com_execute_sql | 0 || Com_flush | 0 || Com_grant | 0 || Com_ha_close | 0 || Com_ha_open | 0 || Com_ha_read | 0 || Com_help | 0 || Com_insert | 11750 || Com_insert_select | 20 || Com_kill | 0 || Com_load | 0 || Com_load_master_data | 0 || Com_load_master_table | 0 || Com_lock_tables | 0 || Com_optimize | 0 || Com_preload_keys | 0 || Com_prepare_sql | 0 || Com_purge | 0 || Com_purge_before_date | 0 || Com_rename_table | 0 || Com_repair | 0 || Com_replace | 0 || Com_replace_select | 0 || Com_reset | 0 || Com_restore_table | 0 || Com_revoke | 0 || Com_revoke_all | 0 || Com_rollback | 19 || Com_savepoint | 0 || Com_select | 24480 || Com_set_option | 687 || Com_show_binlog_events | 0 || Com_show_binlogs | 0 || Com_show_charsets | 0 || Com_show_collations | 2 || Com_show_column_types | 0 || Com_show_create_db | 0 || Com_show_create_table | 0 || Com_show_databases | 1 || Com_show_errors | 0 || Com_show_fields | 40 || Com_show_grants | 0 || Com_show_innodb_status | 0 || Com_show_keys | 10 || Com_show_logs | 0 || Com_show_master_status | 0 || Com_show_ndb_status | 0 || Com_show_new_master | 0 || Com_show_open_tables | 0 || Com_show_privileges | 0 || Com_show_processlist | 7 || Com_show_slave_hosts | 3 || Com_show_slave_status | 0 || Com_show_status | 13 || Com_show_storage_engines | 0 || Com_show_tables | 31 || Com_show_triggers | 0 || Com_show_variables | 13 || Com_show_warnings | 1 || Com_slave_start | 0 || Com_slave_stop | 0 || Com_stmt_close | 0 || Com_stmt_execute | 0 || Com_stmt_fetch | 0 || Com_stmt_prepare | 0 || Com_stmt_reset | 0 || Com_stmt_send_long_data | 0 || Com_truncate | 0 || Com_unlock_tables | 0 || Com_update | 25069 || Com_update_multi | 0 || Com_xa_commit | 0 || Com_xa_end | 0 || Com_xa_prepare | 0 || Com_xa_recover | 0 || Com_xa_rollback | 0 || Com_xa_start | 0 || Compression | OFF || Connections | 6593 || Created_tmp_disk_tables | 40 || Created_tmp_files | 5 || Created_tmp_tables | 137 || Delayed_errors | 0 || Delayed_insert_threads | 4 || Delayed_writes | 7998 || Flush_commands | 1 || Handler_commit | 0 || Handler_delete | 12 || Handler_discover | 0 || Handler_prepare | 0 || Handler_read_first | 24 || Handler_read_key | 47251 || Handler_read_next | 3870660 || Handler_read_prev | 0 || Handler_read_rnd | 450 || Handler_read_rnd_next | 39869350 || Handler_rollback | 0 || Handler_savepoint | 0 || Handler_savepoint_rollback | 0 || Handler_update | 18420 || Handler_write | 17598 || Innodb_buffer_pool_pages_data | 19 || Innodb_buffer_pool_pages_dirty | 0 || Innodb_buffer_pool_pages_flushed | 0 || Innodb_buffer_pool_pages_free | 493 || Innodb_buffer_pool_pages_latched | 0 || Innodb_buffer_pool_pages_misc | 0 || Innodb_buffer_pool_pages_total | 512 || Innodb_buffer_pool_read_ahead_rnd | 1 || Innodb_buffer_pool_read_ahead_seq | 0 || Innodb_buffer_pool_read_requests | 77 || Innodb_buffer_pool_reads | 12 || Innodb_buffer_pool_wait_free | 0 || Innodb_buffer_pool_write_requests | 0 || Innodb_data_fsyncs | 3 || Innodb_data_pending_fsyncs | 0 || Innodb_data_pending_reads | 0 || Innodb_data_pending_writes | 0 || Innodb_data_read | 2494464 || Innodb_data_reads | 25 || Innodb_data_writes | 3 || Innodb_data_written | 1536 || Innodb_dblwr_pages_written | 0 || Innodb_dblwr_writes | 0 || Innodb_log_waits | 0 || Innodb_log_write_requests | 0 || Innodb_log_writes | 1 || Innodb_os_log_fsyncs | 3 || Innodb_os_log_pending_fsyncs | 0 || Innodb_os_log_pending_writes | 0 || Innodb_os_log_written | 512 || Innodb_page_size | 16384 || Innodb_pages_created | 0 || Innodb_pages_read | 19 || Innodb_pages_written | 0 || Innodb_row_lock_current_waits | 0 || Innodb_row_lock_time | 0 || Innodb_row_lock_time_avg | 0 || Innodb_row_lock_time_max | 0 || Innodb_row_lock_waits | 0 || Innodb_rows_deleted | 0 || Innodb_rows_inserted | 0 || Innodb_rows_read | 0 || Innodb_rows_updated | 0 || Key_blocks_not_flushed | 0 || Key_blocks_unused | 1833280 || Key_blocks_used | 22402 || Key_read_requests | 925022 || Key_reads | 22402 || Key_write_requests | 57383 || Key_writes | 57113 || Last_query_cost | 0.000000 || Max_used_connections | 40 || Not_flushed_delayed_rows | 0 || Open_files | 218 || Open_streams | 0 || Open_tables | 103 || Opened_tables | 109 || Qcache_free_blocks | 403 || Qcache_free_memory | 20899056 || Qcache_hits | 3540 || Qcache_inserts | 21207 || Qcache_lowmem_prunes | 0 || Qcache_not_cached | 3390 || Qcache_queries_in_cache | 4207 || Qcache_total_blocks | 8845 || Questions | 87683 || Rpl_status | NULL || Select_full_join | 0 || Select_full_range_join | 0 || Select_range | 377 || Select_range_check | 0 || Select_scan | 3910 || Slave_open_temp_tables | 0 || Slave_retried_transactions | 0 || Slave_running | OFF || Slow_launch_threads | 0 || Slow_queries | 41 || Sort_merge_passes | 0 || Sort_range | 4 || Sort_rows | 859 || Sort_scan | 93 || Ssl_accept_renegotiates | 0 || Ssl_accepts | 0 || Ssl_callback_cache_hits | 0 || Ssl_cipher | || Ssl_cipher_list | || Ssl_client_connects | 0 || Ssl_connect_renegotiates | 0 || Ssl_ctx_verify_depth | 0 || Ssl_ctx_verify_mode | 0 || Ssl_default_timeout | 0 || Ssl_finished_accepts | 0 || Ssl_finished_connects | 0 || Ssl_session_cache_hits | 0 || Ssl_session_cache_misses | 0 || Ssl_session_cache_mode | NONE || Ssl_session_cache_overflows | 0 || Ssl_session_cache_size | 0 || Ssl_session_cache_timeouts | 0 || Ssl_sessions_reused | 0 || Ssl_used_session_cache_entries | 0 || Ssl_verify_depth | 0 || Ssl_verify_mode | 0 || Ssl_version | || Table_locks_immediate | 81801 || Table_locks_waited | 226 || Tc_log_max_pages_used | 0 || Tc_log_page_size | 0 || Tc_log_page_waits | 0 || Threads_cached | 21 || Threads_connected | 23 || Threads_created | 40 || Threads_running | 4 || Uptime | 1692 |±----------------------------------±---------+

One idea I first had is that these stalls could be a locking problems - but you have very few Table_locks_waited. Still it could be…

Your applications seem to be write intensive - the query cache is not really useful. You could try to disable it, but it should not be responsible for the stalls.

How much memory do your databases take in total?

Ya I have a slave database server that is for longer reads to avoid locking the master. Also it seems if other queries were locking the table they would show up in the slow queries log.

I’ll try turning off the query cache.

doing a du -h /var/lib/mysql is 19gigs. Most of the data is archived logs. The Users table that the query stalls on is 1.5 gigs MYD + 729M MYI.