Cannot get MySQL to use indexes for query

Hello!

My problem has been driving me crazy already quite long and till this time I have tried to add memory or ssd disks to server to get over the problem by force. But it does not take me far :).

Basically I have four tables:

customer_keywords (customer_id,keyword_id)
articles_keywords (article_id,keyword_id)
articles (article_id,feed_id,created,title,body,…)
customer_feeds (customer_id, feed_id)

one article can have multiple keywords and one customer can have multiple feeds

These tables have all the necessary indexes as much as I can think out (i’m not of course creating indexes without thinking but only those are used).

Basically I have a customers who have some active keywords (basically these are news topics) and customers also have some active feeds (basically these are newsfeeds like we all know them).

Now I want to query newest articles that are from feeds that are active in this customer and also have one or more keywords active.

I would do so :

select * from articles a join articles_keywords ak on ak.article_id=a.article_id join customer_feeds cf on cf.feed_id=a.feed_id join customer_keywords ck on ck.keyword_id=ak.keyword_id where cf.customer_id=123 and ck.customer_id=123 group by a.article_id order by a.created desc

Now the problem is that whatever I do I cannot get MySQL to use indexes fully. At the same time I feel that this kind of thing is quite standard query in some application.

I have tried to leave out customer_keywords and firsthand query the keyword_id-s and then give them to query using in list “keyword_id in (1,2,3,4)”. But it does not make things better.

It’s really stupid to think about the situation where I have SDD disc, 16GB memory, and only about 60000 articles in this database. Already now it takes 1+ second for queries. But database will grow to about 2million articles in one year and so on.

I have sometimes tried with de-normalization. eg put some fields to multiple tables to try to leave out some tables from joining. But this time I would like to do it like a pro ;).

I really appreciate all the answers.

Greetings
Toomas
Estonia

Add one multicolumn index on (created,article_id) and change the GROUP BY to GROUP BY created,article_id.

If this does not speed up your query, provide me the output of EXPLAIN.

Thank you very much for reply! I do not know how good or bad this explain result feels:

explain SELECT * FROM articles a
JOIN customer_feeds cf on cf.feed_id = a.feed_id
JOIN articles_keywords ak ON ak.article_id = a.id
JOIN customer_keywords ck on ck.keyword_id=ak.keyword_id
WHERE cf.customer_id=2 AND ck.customer_id=2
group by pubdatetime,a.id order by pubdatetime limit 0,1000

±—±------------±------±-------±---------------------- ----------------±-------------------±--------±----------- ---------------------±-----±------------------------------ ---------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
±—±------------±------±-------±---------------------- ----------------±-------------------±--------±----------- ---------------------±-----±------------------------------ ---------------+
| 1 | SIMPLE | ck | ref | PRIMARY | PRIMARY | 4 | const | 2 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | ak | ref | IDX_articles_keywords,Index_2,Index_3 | Index_3 | 4 | mediamonitoring.ck.keyword_id | 63 | Using where |
| 1 | SIMPLE | a | eq_ref | PRIMARY,IDX_Articles_feed_id,Index_6 | PRIMARY | 4 | mediamonitoring.ak.article_id | 1 | |
| 1 | SIMPLE | cf | eq_ref | IDX_customer_feeds,Index_2 | IDX_customer_feeds | 8 | const,mediamonitoring.a.feed_id | 1 | Using index |
±—±------------±------±-------±---------------------- ----------------±-------------------±--------±----------- ---------------------±-----±------------------------------ ---------------+

I’m worried about “using temprary” and “using filesort” even if the “rows” field does not show large numbers.

BTW. indexes for articles table are here (see index_8):

mysql> show indexes from articles;
±---------±-----------±---------------------±----------- --±--------------±----------±------------±---------±— ----±-----±-----------±--------±--------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
±---------±-----------±---------------------±----------- --±--------------±----------±------------±---------±— ----±-----±-----------±--------±--------------+
| articles | 0 | PRIMARY | 1 | id | A | 71881 | NULL | NULL | | BTREE | | |
| articles | 1 | IDX_Articles_link | 1 | link | A | 71881 | 255 | NULL | | BTREE | | |
| articles | 1 | IDX_Articles_feed_id | 1 | feed_id | A | 368 | NULL | NULL | | BTREE | | |
| articles | 1 | IDX_Articles_created | 1 | created | A | 71881 | NULL | NULL | YES | BTREE | | |
| articles | 1 | Index_6 | 1 | feed_id | A | 154 | NULL | NULL | | BTREE | | |
| articles | 1 | Index_6 | 2 | pubdatetime | A | 71881 | NULL | NULL | | BTREE | | |
| articles | 1 | IDX_commenturl | 1 | commenturl_id | A | 35940 | NULL | NULL | | BTREE | | |
| articles | 1 | IDX_pubdatetime | 1 | pubdatetime | A | 71881 | NULL | NULL | | BTREE | | |
| articles | 1 | Index_8 | 1 | pubdatetime | A | 71881 | NULL | NULL | | BTREE | | |
| articles | 1 | Index_8 | 2 | id | A | 71881 | NULL | NULL | | BTREE | | |
±---------±-----------±---------------------±----------- --±--------------±----------±------------±---------±— ----±-----±-----------±--------±--------------+
1