Query takes 14 seconds, filesort seems to be the problem

gatsby · March 6, 2007, 6:16am

Hi,

I’ve a really simple database, with only one table, which contains 2.24 million rows of image information, and who took the picture etc.

keyData:
±---------±-----------------±-----±----±--------±------+| Field | Type | Null | Key | Default | Extra |±---------±-----------------±-----±----±--------±------+| picID | int(10) unsigned | NO | PRI | | || keywords | varchar(4300) | YES | MUL | NULL | || phCode | varchar(100) | YES | MUL | NULL | || phName | varchar(100) | YES | MUL | NULL | || trCode | varchar(1000) | YES | MUL | NULL | |±---------±-----------------±-----±----±--------±------+

The indexes are setup as follows; with a btree index on the primary key (picID) and fulltext indexes on all the others.

±--------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |±--------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| keyData | 0 | PRIMARY | 1 | picID | A | 2246297 | NULL | NULL | | BTREE | || keyData | 1 | kWord | 1 | keywords | NULL | 1 | NULL | NULL | YES | FULLTEXT | || keyData | 1 | pCode | 1 | phCode | NULL | 1 | NULL | NULL | YES | FULLTEXT | || keyData | 1 | pName | 1 | phName | NULL | 1 | NULL | NULL | YES | FULLTEXT | || keyData | 1 | tCode | 1 | trCode | NULL | 1 | NULL | NULL | YES | FULLTEXT | |±--------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+

Now the select I’m doing is as follows:

select picID from keyData where match(keywords) against(‘+new +york’ in boolean mode) order by picID desc limit 14;

Now this can take anything from 6 seconds to 17, depending on whether it’s cached etc.

I’ve tried increasing the myisam_sort_buffer_size and the key cache and query cache, before realising the bottle neck appears to be the sorting (yeah; slow on the up-take)

Now an explain of this query offers the following

±—±------------±--------±---------±--------------±------±--------±-----±-----±----------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±--------±---------±--------------±------±--------±-----±-----±----------------------------+| 1 | SIMPLE | keyData | fulltext | kWord | kWord | 0 | | 1 | Using where; Using filesort |±—±------------±--------±---------±--------------±------±--------±-----±-----±----------------------------+

Having read what the filesort algorithm actually does, I’m not surprised it’s slow.

Is there a way of forcing this to not use this approach? I’ve done some searching on this site of similar problems, and it seems combining the primary key with another key would help, but I don’t have another numeric key to combine with.

Any help would be fantastic, thanks in advance.

Peter · March 6, 2007, 9:15am

How many rows it returns ?

How many rows match to each of the keywords you’re using ?

In this case as you’re using order by LIMIT will not help as all result set will need to be generated from full text search index which is rather slow on bigger data volumes.

I’ve give Sphinx a try it usually works 10-100 times faster for queries like this (because it internally optimizes sorting and limit among other things)

gatsby · March 6, 2007, 9:30am

[B]Peter wrote on Tue, 06 March 2007 14:45[/B]

How many rows it returns ?

How many rows match to each of the keywords you’re using ?

In this case as you’re using order by LIMIT will not help as all result set will need to be generated from full text search index which is rather slow on bigger data volumes.

I’ve give Sphinx a try it usually works 10-100 times faster for queries like this (because it internally optimizes sorting and limit among other things)

It returns 274396 rows.

If I understand your question correctly, 331678 matches ‘new’ and 278870 matches ‘york’.

What is this Sphinx you speak of? Does it work with MySQL?

Thanks for your help and speedy response.

Topic		Replies	Views
Order of results from WHERE LIKE Other MySQL® Questions	6	617	June 18, 2007
MySQL: Avoid filesort when using ORDER BY LIMIT Other MySQL® Questions	3	3984	April 15, 2015
Query resorting to filesort? Other MySQL® Questions	6	565	May 1, 2010
Using Indexes to increase sort performance Other MySQL® Questions	2	415	August 13, 2006
using filesort, bad thing here? Other MySQL® Questions	0	381	January 16, 2008

Query takes 14 seconds, filesort seems to be the problem

Related topics