Analyzing InnoDB Statistics from Benchmark Tests

I have been conducting a number of benchmark tests on a MySQL database. I am in the process of analyzing the results, and have a question about a big increase that I am seeing in the ‘Pages written’ value from SHOW INNODB STATUS between tests of two different sizes.

BACKGROUND

We have a database with a single InnoDB table that is expected to grow to between 100 and 250 million records. The table design is quite simple, with three fields: shortHash bigint, fullHash binary(20), and dataHash binary(20). A primary key is defined on (shortHash, fullHash).
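For reference, a minimal sketch of what that definition might look like (the table name is hypothetical; the column types and primary key follow the description above):

  CREATE TABLE hash_store (
      shortHash BIGINT     NOT NULL,   -- 8 bytes
      fullHash  BINARY(20) NOT NULL,   -- 20 bytes
      dataHash  BINARY(20) NOT NULL,   -- 20 bytes
      PRIMARY KEY (shortHash, fullHash)
  ) ENGINE=InnoDB;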

All calls to the database will be made through a single stored procedure. The stored procedure will accept shortHash, fullHash, and dataHash values as input parameters. If a record matching the three input parameters already exists, the stored procedure will take no action. If a record matching the shortHash and fullHash parameters exists but with a different dataHash value, the stored procedure will update the record. If no record exists for the shortHash and fullHash values, the procedure will insert a record.
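A rough sketch of what such a procedure could look like (procedure, table, and variable names are hypothetical; the logic simply follows the description above):

  DELIMITER //

  CREATE PROCEDURE upsert_hash(IN p_shortHash BIGINT,
                               IN p_fullHash  BINARY(20),
                               IN p_dataHash  BINARY(20))
  BEGIN
      DECLARE v_dataHash BINARY(20);

      -- Read: look for an existing record for this (shortHash, fullHash).
      SELECT dataHash INTO v_dataHash
        FROM hash_store
       WHERE shortHash = p_shortHash AND fullHash = p_fullHash;

      IF v_dataHash IS NULL THEN
          -- No matching record: insert one.
          INSERT INTO hash_store (shortHash, fullHash, dataHash)
          VALUES (p_shortHash, p_fullHash, p_dataHash);
      ELSEIF v_dataHash <> p_dataHash THEN
          -- Record exists with a different dataHash: update it.
          UPDATE hash_store
             SET dataHash = p_dataHash
           WHERE shortHash = p_shortHash AND fullHash = p_fullHash;
      END IF;
      -- Otherwise the record already matches and nothing is done.
  END //

  DELIMITER ;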

The application and database will be run on a dedicated machine. The application will be run constantly, which will result in a consistent stream of stored procedure calls to the database.

TEST CONFIGURATION

I have been running benchmark tests on a machine with the following hardware:

  • CPU: Quadcore Intel Xeon CPU 3.00GHz
  • Memory: 8GB RAM
  • Hard Disk: Six 15k-rpm 73-GB Drives in a RAID 5 Array
  • OS: Fedora Release 9 64-bit
  • MySQL: 5.1.35-log MySQL Community Server (GPL)

The my.cnf file contains the following settings:

  innodb_buffer_pool_size = 5500M
  innodb_log_file_size = 256M
  innodb_log_buffer_size = 4M
  innodb_flush_log_at_trx_commit = 2
  innodb_thread_concurrency = 0
  innodb_additional_mem_pool_size = 20M
  transaction-isolation = READ-COMMITTED

For each test, I pre-populate the InnoDB table with a certain number of records. I then use a test harness to make 1 million stored procedure calls; the data used for the calls results in 333,334 inserts, 333,333 updates, and 1 million reads. During the test, I execute SHOW INNODB STATUS every 15 seconds and send the results to a file.

TEST RESULTS

When running the test on a database with 100 million records and with 16 harness threads making concurrent calls to the database, a rate of 800 stored procedure calls per second is achieved. When running the test on a database with 170 million records and with 16 threads, a rate of only 475 stored procedure calls per second is achieved. I have analyzed the SHOW INNODB STATUS logs to try to understand what is causing the decrease in performance.

Contention – The semaphore data looks good in both cases, with OS Waits/sec in the hundreds and Spin Rounds/sec under ten thousand.

File I/O – For the 100 million test, the Pending Async I/O Reads value was non-zero for 60% of the test, with an average non-zero value of 65. The other I/O values looked good. For the 170 million test, the Pending Async I/O Reads value was non-zero for 65% of the test, with an average non-zero value of 107. In addition, the Pending Flushes (fsync) Buffer Pool value became one 50 times over the course of the test. So, the 170 million test seems to have been more I/O bound than the 100 million case.

CPU – The 100 million test had an average CPU percentage of 75%, while the 170 million test had an average CPU percentage of 40%.

Buffer Pool – In the 100 million test, the percentage of dirty pages in the buffer pool climbed to and then leveled out at 72%. From the point that the dirty page percentage reached 72% until the end of the test, the average page writes per second was 200. In the 170 million test, the percentage of dirty pages in the buffer pool climbed to and then leveled out at 62%. From the point that the dirty page percentage reached 62% until the end of the test, the average page writes per second was 200. So, although the dirty page percentage was higher in the 100 million test, the average page writes per second was consistent across both tests.

Pages Written – The Pages Written value (from the Buffer Pool and Memory section) at the end of the 100 million test was ~130,000, while the Pages Written value at the end of the 170 million test was ~330,000.
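For reference, the same dirty-page and pages-written figures can also be sampled without parsing the SHOW INNODB STATUS text, assuming the standard InnoDB status counters available in 5.1:

  SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
  SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';
  SHOW GLOBAL STATUS LIKE 'Innodb_pages_written';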

MY QUESTION

Given that the average page writes per second was the same across both tests, it seems that the performance discrepancy is due to the fact that 2.5 times as many pages were written during the 170 million test as during the 100 million test. My question is WHY? Given that the same number of records were updated and inserted, I am surprised to see such a big jump in the number of pages written. I know that there is a logical explanation for this, but I am missing it. Any insight into my question would be greatly appreciated.

Thank you!

5.1.35 is old! There are many newer versions, as well as Percona builds, with lots of performance improvements.

Because your PK is semi-random, data will be distributed uniformly across pages. There will be no distinction between ‘hot’ and ‘cold’ data pages. With so little memory, it could very well be that you would get much higher performance by adding a surrogate auto_increment primary key, but that depends on how your data will be queried.
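For example (all names hypothetical), a variant along these lines, where new rows are appended to the end of the clustered index instead of landing on random pages:

  CREATE TABLE hash_store (
      id        BIGINT     NOT NULL AUTO_INCREMENT,
      shortHash BIGINT     NOT NULL,
      fullHash  BINARY(20) NOT NULL,
      dataHash  BINARY(20) NOT NULL,
      PRIMARY KEY (id),
      UNIQUE KEY uk_hash (shortHash, fullHash)   -- lookups still go through this index
  ) ENGINE=InnoDB;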

Do you use INSERT … ON DUPLICATE KEY UPDATE? It could save some CPU.
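Something along these lines (same hypothetical table as above; the placeholders stand for the procedure's input parameters):

  -- The duplicate-key check replaces the explicit read, and writing an
  -- identical dataHash results in no change to the row.
  INSERT INTO hash_store (shortHash, fullHash, dataHash)
  VALUES (?, ?, ?)
  ON DUPLICATE KEY UPDATE dataHash = VALUES(dataHash);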

Consider innodb_flush_method = O_DIRECT.

You seem to be limited to 200 page flushes per second. Make sure your RAID has a BBU and consider switching to SSDs.

As you can read on http://www.mysqlperformanceblog.com/2008/11/13/adaptive-checkpointing/ there are several reasons why dirty pages are flushed. You seem to hit the ‘amount’ limit.

1 record takes 48 bytes (data) + 18 bytes (InnoDB overhead). That means that your buffer pool can hold at most 100M records, but with pages 70% filled and your buffer pool 70% filled, it only has space for 50M records. So with 200M records your cache miss rate is almost three times higher than with 100M records. Cache miss means ‘read page from disk and make sure there is space available in the buffer pool’. I think this explains the 2.5x difference. Since memory is cheap, and since you get much higher performance with more memory, get more memory.
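A back-of-the-envelope version of that arithmetic, runnable as a plain SELECT (the 66 bytes per row and the two 70% factors are the assumptions from above):

  SELECT
      FLOOR(5500 * 1024 * 1024 / 66)             AS rows_if_fully_packed,      -- roughly 87M
      FLOOR(5500 * 1024 * 1024 / 66 * 0.7 * 0.7) AS rows_realistically_cached; -- roughly 43M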

Thank you so much for the feedback! It was very helpful.

Since the data will always be queried by shortHash and fullHash, I did not see a benefit to adding a surrogate key. Was I wrong in my thinking on that?

I do use INSERT… ON DUPLICATE KEY UPDATE.

Your explanation about the cache miss rate makes sense.

I will consider your other suggestions as well.

Thank you again!

Since the data will always be queried by shortHash and fullHash, I did not see a benefit to adding a surrogate key. Was I wrong in my thinking on that?

Yes, you could very well benefit if it limits the number of actively queried pages.