Caching for benchmarking

Hi all.

I’m running some benchmarks at the moment, on a smaller subset of (what will likely be) a “real” production data set, for a couple of reasons:

1. It doesn’t take days to load.
2. My benchmark queries can complete in an overnight run (as opposed to taking weeks).

The problem is that, although I scale down the MySQL caches so that only a realistic proportion of my data fits in them, the OS disk cache (the page cache) just keeps growing and growing (the machine has 32GB of RAM, so it ends up using most of that). This makes my benchmark tests unrealistic, because an unrealistic amount of data fits in memory.

Does anyone know of a way to limit the disk cache? I’ve tried dropping it entirely (`echo 1 > /proc/sys/vm/drop_caches`), but that really kills performance and is unrealistic in a different way.

Alternatively does anyone know of a way around this problem? What do people typically do when they want realistic benchmarks, but they don’t want to use huge data sets?

At this point I’m considering buying a new test server and removing RAM from it.




You could try writing a small C program that allocates a chunk of memory and locks it into RAM so it’s not paged out?

```c
ptr = malloc( lots_of_memory );
mlock( ptr, lots_of_memory );  /* malloc alone won't pin it; mlock stops it being paged out */

sleep( a_very_long_time );
```

Not actually tried this so buyer beware :wink:
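If anyone wants to try it, here’s a fuller sketch of that idea. Same caveat as above: untested for this purpose, `pin_ram` and the sizes are made up for illustration, and `mlock()` will fail for large amounts unless you run as root or raise `RLIMIT_MEMLOCK` (the sketch just warns and carries on unpinned in that case).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Allocate len bytes, touch every page so the allocation is actually
   backed by physical RAM, and try to pin it so it can't be swapped out.
   Returns the pointer, or NULL if the allocation itself failed. */
void *pin_ram(size_t len)
{
    void *p = malloc(len);
    if (p == NULL)
        return NULL;

    memset(p, 1, len);         /* fault every page in */

    if (mlock(p, len) != 0)    /* needs root or a raised RLIMIT_MEMLOCK */
        perror("mlock");       /* carry on unpinned rather than give up */

    return p;
}
```

Then a tiny `main()` that calls, say, `pin_ram((size_t)24 << 30)` and sleeps forever would leave MySQL and the disk cache only ~8GB of the 32GB to fight over.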


Why do that? Just use a virtual machine with a limited amount of RAM.

I really like Sun’s VirtualBox, and it’s free.

Thanks for the replies.

My suspicion is that a VM won’t perform exactly the same as a real machine (I haven’t tested this; it’s based on anecdotal evidence), so it’s probably best not to use one for benchmarking purposes.

My solution in the end was to use InnoDB (which, luckily, we were moving to anyway) and set innodb_flush_method=O_DIRECT, which makes InnoDB read and write its data files directly, bypassing the OS disk cache.
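For anyone who lands here later, the relevant bit of my.cnf looked something like this (the buffer pool size is a placeholder; set it to whatever fraction of the data set you’re trying to model):

```ini
[mysqld]
# Scaled-down InnoDB cache for the benchmark (placeholder value)
innodb_buffer_pool_size = 2G
# Read/write data files with O_DIRECT, bypassing the OS disk cache
innodb_flush_method = O_DIRECT
```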