InnoDB performance is nasty - please help!

Hi. I get about 100,000 users per minute on my site.

We use InnoDB. The DB server is a 32GB server with 2x quad core servers on RAID 10. Containing 5 disks of SCSI 15k RPM. Top quality hardware!

We have about 150 tables in 11 databases. The table with the highest number of records is 500,000. Not rocket science really. From my judgment this is a fairly small amount of data?!

And still the website crawls under this heavy load!

We’re optimizing our slow SQLs from the log (only a few) but I also want to make sure our config is geared to take this kind of traffic.

Would love to hear from the gurus on this forum! My conf is below, as is my SHOW INNODB STATUS. Thanks!

safe-show-databasebind-address = 10.16.30.70port = 3306datadir = /var/lib/mysqlset-variable=max_allowed_packet=512Mlog-error = /var/log/mysqld.log # Logging enabledinnodb_buffer_pool_size = 11Ginnodb_flush_method=O_DIRECT # Remove double bufferinginnodb_support_xa = 0innodb_log_buffer_size=8Minnodb_commit_concurrency = 0innodb_concurrency_tickets = 500innodb_flush_log_at_trx_commit = 0innodb_lock_wait_timeout = 8innodb_thread_concurrency = 16innodb_log_group_home_dir = /var/lib/mysql/innodb_log_arch_dir = /var/lib/mysql/innodb_additional_mem_pool_size = 20Mskip-bdbskip-lockingskip-name-resolvequery_cache_limit=2Gquery_cache_size=200M # Was 2G, limit and tryquery_cache_type=1table_cache=10Mthread_cache_size=256thread_concurrency=16 max_connections=5000 #max_user_connections=1500wait_timeout=10interactive_timeout=15key_buffer_size=9Gjoin_buffer=512M #read_rnd_buffer_size=32M #max_heap_table_size=2Gtmp_table_size=512sort_buffer_size=1G #max_connect_errors=10character-set-server=utf8default-collation=utf8_unicode_cilog-queries-not-using-indexes=1log_slow_queries=/var/log/mysql-slow.loglong_query_time=3#[myisamchk]key_buffer = 9Gsort_buffer = 1Gread_buffer = 32M

For show status I’ll update this thread shortly. We have just restarted our DB. But my concern is that the mysql-tuning.pl script is showing me that my maximum allocated memory is about 2000% more than what I have available!! Why?

The show status command after about 20 mins of running. Not the very peak time, but hope this shows some direction:

±----------------------------------±----------+| Variable_name | Value |±----------------------------------±----------+| Aborted_clients | 23 | | Aborted_connects | 4 | | Bytes_received | 94 | | Bytes_sent | 81 | | Com_show_status | 1 | | Connections | 49416 | | Created_tmp_disk_tables | 1 | | Created_tmp_files | 5 | | Created_tmp_tables | 1 | | Flush_commands | 1 | | Handler_read_rnd_next | 11 | | Handler_write | 143 | | Innodb_buffer_pool_pages_data | 7879 | | Innodb_buffer_pool_pages_dirty | 83 | | Innodb_buffer_pool_pages_flushed | 3913 | | Innodb_buffer_pool_pages_free | 711726 | | Innodb_buffer_pool_pages_misc | 1291 | | Innodb_buffer_pool_pages_total | 720896 | | Innodb_buffer_pool_read_ahead_rnd | 6 | | Innodb_buffer_pool_read_ahead_seq | 65 | | Innodb_buffer_pool_read_requests | 69517673 | | Innodb_buffer_pool_reads | 3479 | | Innodb_buffer_pool_write_requests | 69280 | | Innodb_data_fsyncs | 694 | | Innodb_data_read | 130486272 | | Innodb_data_reads | 3695 | | Innodb_data_writes | 2873 | | Innodb_data_written | 131098624 | | Innodb_dblwr_pages_written | 3913 | | Innodb_dblwr_writes | 69 | | Innodb_log_write_requests | 5512 | | Innodb_log_writes | 501 | | Innodb_os_log_fsyncs | 556 | | Innodb_os_log_written | 2849280 | | Innodb_page_size | 16384 | | Innodb_pages_created | 48 | | Innodb_pages_read | 7831 | | Innodb_pages_written | 3913 | | Innodb_rows_deleted | 1696 | | Innodb_rows_inserted | 6255 | | Innodb_rows_read | 145166462 | | Innodb_rows_updated | 3706 | | Key_blocks_unused | 7730956 | | Key_blocks_used | 282 | | Key_read_requests | 1506223 | | Key_reads | 312 | | Key_write_requests | 22042 | | Key_writes | 70 | | Max_used_connections | 24 | | Open_files | 117 | | Open_tables | 181 | | Qcache_free_blocks | 450 | | Qcache_free_memory | 194586848 | | Qcache_hits | 387867 | | Qcache_inserts | 28846 | | Qcache_not_cached | 4581 | | Qcache_queries_in_cache | 9532 | | Qcache_total_blocks | 19650 | | Questions | 765259 | | Select_scan | 1 | | Table_locks_immediate | 74474 | | Table_locks_waited | 31 | | Threads_cached | 18 | | Threads_connected | 6 | | Threads_created | 24 | | Threads_running | 1 | | Uptime | 542 | | Uptime_since_flush_status | 542 | ±----------------------------------±----------+

[B]pkiula[/B]

For show status I’ll update this thread shortly. We have just restarted our DB. But my concern is that the mysql-tuning.pl script is showing me that my maximum allocated memory is about 2000% more than what I have available!! Why?

It says that because some of the variables like sort_buffer_size etc are limitations _per_ _connection_ which means that:

sort_buffer_size * max_connections = a very high number

Generally you don’t have all 5000 connections performing a sort in the magnitude of 1GB at the same time. But if it should happen even one time then the server would probably grind to a halt due to all RAM being consumed and swapped out.

Looking at your status variables it looks like your actual database size is pretty small.
BTW, have you stripped out a few of the status variables or which version of MySQL are you running?

I’m missing some of them and the Com_xxx and the Select_xxx ones can also tell a lot about the queries that are issued against the database.

What does the OS status look like during operation?
Is it 100% cpu or?
Because it doesn’t look like IO should be the bottleneck in your case.

BTW:
The sections in the my.cnf:
[myisamchk]
should be there because they serve a purpose.

And in your case the:

#[myisamchk]key_buffer = 9Gsort_buffer = 1Gread_buffer = 32M

Is very confusing since the myiasmchk section is commented out.
And the variables key_buffer, sort_buffer and read_buffer has nothing to do with the myisamchk and in fact they are redundant since you have already set it previously since key_buffer_size is the same thing.

Hi. Thanks for the thoughts!

MYSQL is 5.0.67.

Should I reduce the sort_buffer_size? Isn’t this where the sorting of tables is stored? It’s currently 1GB, but that’s a high value and this blog suggests this high value is not good: http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can- you-sort-data-with-mysql/

Any thoughts?

The variables below are from my command “SHOW STATUS WHERE VALUE > 0”. No point seeing stuff where the val is zero?

±----------------------------------±------------+| Variable_name | Value |±----------------------------------±------------+| Aborted_clients | 17241 | | Aborted_connects | 6 | | Bytes_received | 94 | | Bytes_sent | 81 | | Com_show_status | 1 | | Connections | 3637766 | | Created_tmp_disk_tables | 1 | | Created_tmp_files | 5 | | Created_tmp_tables | 1 | | Flush_commands | 1 | | Handler_read_rnd_next | 11 | | Handler_write | 143 | | Innodb_buffer_pool_pages_data | 18577 | | Innodb_buffer_pool_pages_dirty | 40 | | Innodb_buffer_pool_pages_flushed | 295096 | | Innodb_buffer_pool_pages_free | 699783 | | Innodb_buffer_pool_pages_misc | 2536 | | Innodb_buffer_pool_pages_total | 720896 | | Innodb_buffer_pool_read_ahead_rnd | 8 | | Innodb_buffer_pool_read_ahead_seq | 116 | | Innodb_buffer_pool_read_requests | 10200101445 | | Innodb_buffer_pool_reads | 8073 | | Innodb_buffer_pool_write_requests | 7289966 | | Innodb_data_fsyncs | 48920 | | Innodb_data_read | 243126272 | | Innodb_data_reads | 8654 | | Innodb_data_writes | 207449 | | Innodb_data_written | 9947728896 | | Innodb_dblwr_pages_written | 295096 | | Innodb_dblwr_writes | 5026 | | Innodb_log_write_requests | 553439 | | Innodb_log_writes | 34713 | | Innodb_os_log_fsyncs | 38868 | | Innodb_os_log_written | 275868672 | | Innodb_page_size | 16384 | | Innodb_pages_created | 3871 | | Innodb_pages_read | 14706 | | Innodb_pages_written | 295096 | | Innodb_row_lock_time | 20728646 | | Innodb_row_lock_time_avg | 1754 | | Innodb_row_lock_time_max | 9998 | | Innodb_row_lock_waits | 11816 | | Innodb_rows_deleted | 168032 | | Innodb_rows_inserted | 688969 | | Innodb_rows_read | 48130827884 | | Innodb_rows_updated | 308445 | | Key_blocks_unused | 7730787 | | Key_blocks_used | 857 | | Key_read_requests | 67700779 | | Key_reads | 20213 | | Key_write_requests | 2103719 | | Key_writes | 2621 | | Max_used_connections | 275 | | Open_files | 238 | | Open_tables | 509 | | Qcache_free_blocks | 32982 | | Qcache_free_memory | 114633544 | | Qcache_hits | 34299325 | | Qcache_inserts | 2109536 | | Qcache_lowmem_prunes | 227832 | | Qcache_not_cached | 1559417 | | Qcache_queries_in_cache | 83155 | | Qcache_total_blocks | 199717 | | Questions | 63865453 | | Select_scan | 1 | | Table_locks_immediate | 8397936 | | Table_locks_waited | 4704 | | Threads_cached | 249 | | Threads_connected | 7 | | Threads_created | 275 | | Threads_running | 1 | | Uptime | 42114 | | Uptime_since_flush_status | 42114 | ±----------------------------------±------------+73 rows in set (0.02 sec)

Btw the phpmyadmin has a page called “Status”. I check this often too. It shows me this stuff in red, which means attention items:

Innodb_buffer_pool_reads 2,684 The number of logical reads that InnoDB could not satisfy from buffer pool and had to do a single-page read. Handler_read_rnd 11 k The number of requests to read a row based on a fixed position. This is high if you are doing a lot of queries that require sorting of the result. You probably have a lot of queries that require MySQL to scan whole tables or you have joins that don’t use keys properly.Handler_read_rnd_next 48 M The number of requests to read the next row in the data file. This is high if you are doing a lot of table scans. Generally this suggests that your tables are not properly indexed or that your queries are not written to take advantage of the indexes you have. Select_full_join 429 The number of joins that do not use indexes. If this value is not 0, you should carefully check the indexes of your tables.Select_full_range_join 2 The number of joins that used a range search on a reference table.Select_range 1,119 The number of joins that used ranges on the first table. (It’s normally not critical even if this is big.)Select_range_check 0 The number of joins without keys that check for key usage after each row. (If this is not 0, you should carefully check the indexes of your tables.)Select_scan 2,732 The number of joins that did a full scan of the first table. Open_tables 156 The number of tables that are open.Opened_tables 162 The number of tables that have been opened. If opened tables is big, your table cache value is probably too small.Table_locks_immediate 21 k The number of times that a table lock was acquired immediately.Table_locks_waited 7 The number of times that a table lock could not be acquired immediately and a wait was needed. If this is high, and you have performance problems, you should first optimize your queries, and then either split your table or tables or use replication.

Any hints here?

[B]Quote:[/B]

The variables below are from my command “SHOW STATUS WHERE VALUE > 0”. No point seeing stuff where the val is zero?

No not really but I didn't know that you had filtered them like that hence why I asked.

But no this does not reveal anything more.

I’m repeating the question:
Do you have a high CPU load during these problems?

Yes according to me you should definitely reduce the sort_buffer_size, see below.

The thing is that looking at these:

| Innodb_buffer_pool_pages_free | 699783 | | Innodb_buffer_pool_pages_total | 720896 |

I figure that your database is 720896-699783 = 21113 * 16k = ~330MB.

And this for MyISAM:

| Key_blocks_unused | 7730787 | | Key_blocks_used | 857 |

Basically means that you have never used more than 1MB of your 9GB key_buffer_size.

So all in all I figure your database is pretty small and especially compared to your amount of RAM.

And according to this:

| Qcache_free_memory | 114633544 |

You are using less than half of your 200MB QCache buffer.

That is why when looking at a lot of your server variables they just look ridiculously huge.

Your variables and some comments:

query_cache_limit=2G

This is very strange since it says that the largest result set that should be stored in the qcache is 10 times the size of the qcache size.
But the 200MB qcache_size is a good value.

table_cache=10M

This is also ridiculously huge since

  1. it is about 20 times larger than the highest possible value and
  2. According to your status value:

| Open_tables | 509 |

You don’t have that many tables open.

max_connections=5000 #max_user_connections=1500

Looking at your connections/uptime ratio which amounts to about 86 per second. And even if we should say for a sake of argument that your average query takes as long time as 3 seconds you would never have more than 258 concurrent connections.

key_buffer_size=9G

See above

join_buffer=512M

Remove this and let it be default.
This is only used when a join can’t use an index, which basically means that you are lacking proper indexes in the database.
Much better to fix the indexes than increasing this buffer.

tmp_table_size=512

Strange values since you are saying the max size is 512 bytes.
I don’t know if you need a bigger value here, but you should not need more than about 32-64M.

sort_buffer_size=1G

Also ridiculously large.
This buffer is used to store the result set of one query while it is sorted. Now if your entire database is about 350MB, do you have any query that returns all rows from all tables and needs to sort them all?
Usually a sensible value according to me is somewhere around 10-20MB but Peter seems to prove me wrong (the link you provided). So you have to try it out.

So my suggestion is that you reduce a lot of the settings above that I have pointed out.
Focus mostly on finding the slow queries and definitely reduce and set:

long_query_time=1

Then you start by filtering through these queries.
Somewhere here you should have one or more queries that you can speed up by writing it differently or adding indexes to your table.

As a comparison I can say that I work with a DB server that runs at about 1500 qps average with a 1.5 GB database at 5% cpu usage on dual quad core xeon: Intel® Xeon® Processor E5345 (8M Cache, 2.33 GHz, 1333 MHz FSB).

So it is definitely not impossible.

Good luck!