storage engine & hardware to use with memcached

Background:
The goals of the system (online community) to be designed are being able to deal with

  • 1.000.000 hits per hour (via https)
  • 250MB/day of incoming data (text comments…)
  1. I am trying to find the optimal storage engine combining MySQL and memcached. All reading operations would be done from memcached. MySQL would cover writing operations only (hoping that I get to manage to cache all the necessary data). Would it make sense in this case to choose MyIsam as storage engine for all the tables?

  2. Calculating the hardware requirements, I would go for

  • a quadcore, 16GB RAM for the Apache
  • x times dual cores with 4-8GB each for memcached
  • a quadcore, 4-8GB RAM, RAID-1, 15.000rpm HD, for MySQL
    Especially the DB-Server is of concern. On one hand it is clear that after a couple of weeks the data wont fit into the RAM anymore. On the other hand, a memcached failure should not lead to a complete deasaster.

Does anyone have any experience to share regarding these questions?

For a community site, I’d go with InnoDB over MyISAM. First, when your cache does fail over, your database will need to deal with high read-write concurrency, which InnoDB will be better at. My second reason is that you’ll get transactions, which if you need to update more than one table at once, will be a good idea. The only thing I could see using MyISAM for is full text search, but you might be better off going outside of MySQL to solve that problem.

As a side note, I’ve seen a lot of people over-build systems based on theoretical scaling numbers that their sites might never actually see. It’s pretty hard to grow a community site from scratch to 24M hits/day. In most cases, that kind of usage doesn’t sneak up on you, but grows gradually over time. This will allow you to grow your solution over time as the usage patterns and growth rates emerge. If you’re small, stay flexible and get a good solution out quicker, rather than building for a goal that might be years out. Obviously, I don’t know your situation, but that’s my two cents.