Fulltext index not being used

I have a Product table with a fulltext index on the name field and the description field. I want to do a select for products that match text with the name being more important than the description - hence the order by clause. They also need to fit into certain categories. I also only want to show Products that have an offer in the Offers table.

The Product table has over 500 000 rows and the ProductCategory has the same number of rows which just maps the Products to a Category.

Here is what I’ve tried, but the number of rows scanned is far too much, and I can’t think of any other info to make an index on.

SELECT DISTINCT p.id, MATCH (p.name) AGAINST (‘the colour purple’) AS name_score, MATCH (p.description) AGAINST (‘the colour purple’) AS description_score
FROM Product as p
INNER JOIN ProductCategory as pc on p.id=pc.product_id
INNER JOIN Category as c on c.id=pc.category_id
WHERE ( pc.category_id=‘1’
OR pc.category_id IN ( ‘18’ , ‘20’ , ‘21’ , ‘22’ , ‘23’ , ‘24’ , ‘25’ , ‘26’ , ‘27’ , ‘28’ , ‘29’ , ‘30’ , ‘31’ , ‘32’ , ‘33’ , ‘34’ , ‘35’ , ‘36’ , ‘37’ , ‘38’ , ‘39’ , ‘40’ , ‘41’ , ‘42’ , ‘43’ , ‘44’ , ‘45’ , ‘46’ , ‘47’ , ‘48’ , ‘49’ , ‘50’ , ‘51’ , ‘52’ , ‘53’ , ‘54’ , ‘55’ , ‘56’ , ‘57’ , ‘58’ , ‘59’ , ‘60’ , ‘61’ , ‘62’ , ‘63’ , ‘64’ , ‘65’ , ‘66’ , ‘67’ , ‘19’ , ‘68’ , ‘69’ , ‘70’ , ‘71’ , ‘72’ , ‘73’ , ‘74’ , ‘821’ , ‘822’ )
)
AND ( (
(MATCH (p.name) AGAINST (‘the colour purple’))
OR (MATCH (p.description) AGAINST (‘the colour purple’))
)
)
AND p.id IN (
SELECT product_id
FROM Offer AS o
WHERE o.price > 0
AND o.deleted = 0
)
AND p.needs_verification=0
ORDER BY name_score DESC, description_score DESC
LIMIT 0, 20;

±—±-------------------±------±---------------±------- --------------------±-------------------±--------±------- ----------------------------±-------±--------------------- ------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
±—±-------------------±------±---------------±------- --------------------±-------------------±--------±------- ----------------------------±-------±--------------------- ------------------------+
| 1 | PRIMARY | p | ref | PRIMARY,needs_verification | needs_verification | 1 | const | 463119 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | pc | eq_ref | product_id,category_id | product_id | 3 | priceche_pricecheck.p.id | 1 | Using where; Distinct |
| 1 | PRIMARY | c | eq_ref | PRIMARY | PRIMARY | 3 | priceche_pricecheck.pc.category_id | 1 | Using index; Distinct |
| 2 | DEPENDENT SUBQUERY | o | index_subquery | product_id,price,deleted | product_id | 3 | func | 2 | Using index; Using where |
±—±-------------------±------±---------------±------- --------------------±-------------------±--------±------- ----------------------------±-------±--------------------- ------------------------+

Hello there,

Here are a couple of observations that I can make, without knowing your tables…

1 - You probably want the number of times a pattern is found in either name or description instead of its relevance percentage, so I’ve added the COUNT.

2 - MySQL Query Optimizer uses only one index for query for table. And you want to have a separated count of occurrences, by name or description. So I’d rather have one query for each MATCH and explicitly call the specific index for that column. That will also spare you that OR condition. OR are famous to confuse the optimizer for the number of rows needed. I assume you have created the indexes for name and description separately.

3 - I included the category_id 1 inside your IN list, and took the ’ from the number, assuming they are integers.

SELECT DISTINCT id, name_score, description_score FROM(SELECT p.id, COUNT(IF(MATCH (p.name) AGAINST (‘the colour purple’),1,NULL) AS name_scoreFROM Product as p USE INDEX (name), ProductCategory as pc, Category as cWHERE p.id=pc.product_id AND on c.id=pc.category_id AND pc.category_id IN (1, 18 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 19 , 68 , 69 , 70 , 71’ , 72 , 73 , 74 , 821 , 822 ) AND MATCH (p.name) AGAINST (‘the colour purple’) AND p.needs_verification = 0)UNION(SELECT p.id, COUNT(IF(MATCH (p.description) AGAINST (‘the colour purple’),1,NULL) AS description_scoreFROM Product as p USE INDEX (description), ProductCategory as pc, Category as cWHERE p.id=pc.product_id AND on c.id=pc.category_id AND pc.category_id IN (1, 18 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 19 , 68 , 69 , 70 , 71’ , 72 , 73 , 74 , 821 , 822 ) AND MATCH (p.name) AGAINST (‘the colour purple’) AND p.needs_verification = 0)WHEREid IN ( SELECT product_id FROM Offer AS o WHERE o.price > 0 AND o.deleted = 0 )ORDER BY name_score DESC, description_score DESCLIMIT 0, 20;

Best wishes,


Marcello

Hi Marcello,

Thnaks so much for this… I had to make some modifications, because I am in fact looking for the relevance. But, the query you gave me is so much more efficient!!! like 600000X more efficient )

I’ve modified it like this:

SELECT DISTINCT t1.id, t1.name_score, t1.description_score FROM((SELECT p.id, MATCH (p.name) AGAINST (‘the color purple’) AS name_score, 0 as description_scoreFROM Product as p USE INDEX (title_search), ProductCategory as pc, Category as cWHERE p.id=pc.product_id AND c.id=pc.category_id AND pc.category_id IN (1, 18 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 19 , 68 , 69 , 70 , 71 , 72 , 73 , 74 , 821 , 822 ) AND MATCH (p.name) AGAINST (‘the color purple’) AND p.needs_verification = 0 GROUP BY p.id)UNION(SELECT p.id, 0 as name_score, MATCH (p.description) AGAINST (‘the color purple’) AS description_scoreFROM Product as p USE INDEX (description), ProductCategory as pc, Category as cWHERE p.id=pc.product_id AND c.id=pc.category_id AND pc.category_id IN (1, 18 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 19 , 68 , 69 , 70 , 71 , 72 , 73 , 74 , 821 , 822 ) AND MATCH (p.description) AGAINST (‘the color purple’) AND p.needs_verification = 0 GROUP BY p.id))as t1WHEREid IN ( SELECT product_id FROM Offer AS o WHERE o.price > 0 AND o.deleted = 0 )ORDER BY t1.name_score DESC, t1.description_score DESCLIMIT 0, 20;

However, I can’t figure out how to get the name_score and the description_score for each product_id. As it is, it returns results as follows: first the products with name_scores, then products with description_scores, but I would like them combined

±-------±-----------------±------------------+| id | name_score | description_score |±-------±-----------------±------------------+| 85913 | 13.5209255218506 | 0 || 278269 | 13.5209255218506 | 0 || 562550 | 13.2236194610596 | 0 || 563006 | 10.31312084198 | 0 || 585079 | 9.44657039642334 | 0 || 563359 | 9.04449462890625 | 0 || 564192 | 9.04449462890625 | 0 || 563912 | 9.03781032562256 | 0 || 561789 | 8.9131498336792 | 0 || 563430 | 8.9131498336792 | 0 || 563538 | 8.9131498336792 | 0 || 564222 | 8.9131498336792 | 0 || 585593 | 8.80795288085938 | 0 || 564747 | 8.74496555328369 | 0 || 577924 | 8.74496555328369 | 0 || 574275 | 8.42759227752686 | 0 || 574276 | 8.42759227752686 | 0 || 564467 | 8.29081344604492 | 0 || 563892 | 8.15531826019287 | 0 || 561622 | 7.94574594497681 | 0 |±-------±-----------------±------------------+20 rows in set (1.80 sec)

Thanks,
Kevin