Stuck With Bad Category Filter Performance

Hey folks,

I’m a little stuck. I got the following tables:

CREATE TABLE items ( id int(10) unsigned NOT NULL auto_increment, unlocked tinyint(1) NOT NULL default ‘0’, name varchar(200) NOT NULL, meta_cat_type enum(‘blah’,‘foo’,‘bar’) default NULL, post_time int(10) unsigned NOT NULL default ‘0’, unlock_time int(10) unsigned NOT NULL, allow_comments tinyint(1) NOT NULL default ‘1’, posted_by int(10) unsigned NOT NULL, PRIMARY KEY (id), UNIQUE KEY name (name), KEY post_time (post_time), KEY meta_cat_type (meta_cat_type)) ENGINE=InnoDB DEFAULT CHARSET=utf8;-- --------------------------------------------------------CREATE TABLE item_cats ( item_id int(10) unsigned NOT NULL, cat_id smallint(5) unsigned NOT NULL, KEY item_id (item_id), KEY cat_id (cat_id)) ENGINE=InnoDB DEFAULT CHARSET=utf8;-- --------------------------------------------------------CREATE TABLE item_filters ( user_id int(10) unsigned NOT NULL, cat_id smallint(5) unsigned NOT NULL, meta_cats set(‘blah’,‘foo’,‘bar’) NOT NULL, filter_type enum(‘overview’,‘global’) NOT NULL, UNIQUE KEY id (user_id,cat_id,filter_type)) ENGINE=MyISAM DEFAULT CHARSET=utf8;

So each item can belong to many categories. Pretty much standard stuff.

Now I have a typical view-items-with-pagination-and-stuff thing set up, so I need to get the total number of items and the items on the current page. My current query to get these looks like this, but it’s rather slow as the sub query has to be executed once for every row in the items table (which currently has about 50,000 rows and is growing continuously).

SELECT COUNT( * )FROM itemsWHERE unlocked =1AND ( NOT EXISTS ( SELECT * FROM item_filters f, item_cats c WHERE f.user_id = “823” AND f.cat_id = c.cat_id AND c.item_id = AND f.filter_type = “overview” AND FIND_IN_SET( items.meta_cat_type, meta_cats ))

Any help on how to speed this up is greatly appreciated! Thanks in advance. )

Just a quick observation is that you have no primary key for item_cats.

Try creating a PK:

PRIMARY KEY (item_id, cat_id)

Possible storage reductions:

INT + SMALLINT = 6 bytes

INDEX(cat_id) (for tag clouds I assume?): 2 bytes

Total: 6 + 2 = 8


6 bytes (InnoDB auto PK) + 4 bytes (INT) + 2 bytes (SMALLINT) = 12 bytes

Lookups on a “clustered” primary key are also more efficient.

Other than that, I don’t know how you can really improve the performance of the query without making some design changes.

Hey, thanks for PK hints, that made sense and eliminated a “using where” in the EXPLAIN output. )

And other than that…? I’m not averse to table/database design changes as long as they’ll deliver better performance. Unfortunately, I’m a pretty clueless as to how these could look like. )