Query Cache and Splitting Data Across More Tables

If I understand the MySQL query cache correctly, it functions at the table level. So if I have Tables X, Y, and Z, and MySQL has cached queries on all three, then as soon as Table X is updated in any way (insert, update, or delete), the cached queries for Table X are all invalidated, but Tables Y and Z still have their valid cached queries.
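Just to make sure I’m describing the right behavior, here is a rough sketch of what I mean (the table and column names are made up, and it assumes the query cache is actually enabled via query_cache_type and query_cache_size):

    -- Both results get stored in the query cache.
    SELECT * FROM X WHERE post_id = 42;
    SELECT * FROM Y WHERE post_id = 42;

    -- Any write to X invalidates every cached result that references X...
    INSERT INTO X (post_id, body) VALUES (42, 'new comment');

    -- ...but the cached result for Y is still served from the cache.
    SELECT * FROM Y WHERE post_id = 42;

    -- The effect shows up in these counters.
    SHOW STATUS LIKE 'Qcache_hits';
    SHOW STATUS LIKE 'Qcache_inserts';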

If that’s the case, then would it be better to split one big table into multiple, smaller tables? For example, let’s assume I have a blog with 1,000,000 comments spread over 1,000 blog posts. All of those comments are in the same Table X, but every time someone adds a comment, all of the cached query results for that table are invalidated.

So if I divided the system so that Table X is split into 50 tables (each containing the comments for 20 blog posts), would the query cache then be more effective, since spreading the comments across multiple tables would let more of the cached results stay valid for longer?
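Roughly, the kind of split I’m imagining looks like this (a hypothetical schema; the application would pick the table from the post id, e.g. post_id MOD 50):

    -- One of 50 identical comment tables, comments_00 ... comments_49.
    CREATE TABLE comments_00 (
        comment_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        post_id    INT UNSIGNED NOT NULL,
        body       TEXT NOT NULL,
        created_at DATETIME NOT NULL,
        KEY idx_post (post_id)
    ) ENGINE=InnoDB;

    -- A new comment only touches one of the 50 tables, so only the
    -- cached queries on that table are invalidated; cached results on
    -- the other 49 tables stay valid.
    INSERT INTO comments_17 (post_id, body, created_at)
    VALUES (867, 'new comment', NOW());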

Yes, you are correct about how the query cache functions, and the approach you suggest could lead to a higher cache hit ratio, but it all depends on your access patterns. In your blog example, the “hot” posts that are accessed most frequently are also the ones being invalidated most often. For those, you’re not buying very much, and those are the ones you should care about most.

Also, extra tables are not free, and if you were talking about thousands of tables instead of tens, you might start running into memory issues with InnoDB.
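If you want to get a feel for that overhead yourself, these are the usual knobs and counters to look at (just a sketch; the exact limits and per-table costs vary by MySQL version):

    -- How many table handles are open vs. how many the server will cache.
    SHOW GLOBAL STATUS LIKE 'Open_tables';
    SHOW GLOBAL STATUS LIKE 'Opened_tables';
    SHOW VARIABLES LIKE 'table_open_cache';
    SHOW VARIABLES LIKE 'table_definition_cache';

    -- InnoDB also keeps per-table data-dictionary structures in memory,
    -- so buffer pool sizing matters more once you have thousands of tables.
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';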

IMHO, the query cache is a nifty bonus feature that certainly helps, but I wouldn’t invest much in it. If you’re at the point where your application load is taxing your well-tuned database, it’s time to look into application-level caching.

The question is more theoretical at this point - no specific application in mind right now. I like to plan ahead instead of reacting abruptly later. :)

That’s a good point about table quantities. I suppose it depends on the table structures and how much memory each table takes up. I’m unfamiliar with the memory requirements/usage at the individual table level. I had assumed that the table listings were very lightweight (a basic list in memory with some pointers to the physical data file) - is there an explanation somewhere of the memory usage associated with thousands of tables?

Out of curiosity, do different engines (e.g. MyISAM) face the same problem with large numbers of tables?