Hi,
I’ve a really simple database, with only one table, which contains 2.24 million rows of image information, and who took the picture etc.
keyData:
±---------±-----------------±-----±----±--------±------+| Field | Type | Null | Key | Default | Extra |±---------±-----------------±-----±----±--------±------+| picID | int(10) unsigned | NO | PRI | | || keywords | varchar(4300) | YES | MUL | NULL | || phCode | varchar(100) | YES | MUL | NULL | || phName | varchar(100) | YES | MUL | NULL | || trCode | varchar(1000) | YES | MUL | NULL | |±---------±-----------------±-----±----±--------±------+
The indexes are setup as follows; with a btree index on the primary key (picID) and fulltext indexes on all the others.
±--------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |±--------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| keyData | 0 | PRIMARY | 1 | picID | A | 2246297 | NULL | NULL | | BTREE | || keyData | 1 | kWord | 1 | keywords | NULL | 1 | NULL | NULL | YES | FULLTEXT | || keyData | 1 | pCode | 1 | phCode | NULL | 1 | NULL | NULL | YES | FULLTEXT | || keyData | 1 | pName | 1 | phName | NULL | 1 | NULL | NULL | YES | FULLTEXT | || keyData | 1 | tCode | 1 | trCode | NULL | 1 | NULL | NULL | YES | FULLTEXT | |±--------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+
Now the select I’m doing is as follows:
select picID from keyData where match(keywords) against(‘+new +york’ in boolean mode) order by picID desc limit 14;
Now this can take anything from 6 seconds to 17, depending on whether it’s cached etc.
I’ve tried increasing the myisam_sort_buffer_size and the key cache and query cache, before realising the bottle neck appears to be the sorting (yeah; slow on the up-take)
Now an explain of this query offers the following
±—±------------±--------±---------±--------------±------±--------±-----±-----±----------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±--------±---------±--------------±------±--------±-----±-----±----------------------------+| 1 | SIMPLE | keyData | fulltext | kWord | kWord | 0 | | 1 | Using where; Using filesort |±—±------------±--------±---------±--------------±------±--------±-----±-----±----------------------------+
Having read what the filesort algorithm actually does, I’m not surprised it’s slow.
Is there a way of forcing this to not use this approach? I’ve done some searching on this site of similar problems, and it seems combining the primary key with another key would help, but I don’t have another numeric key to combine with.
Any help would be fantastic, thanks in advance.