My problem has been driving me crazy already quite long and till this time I have tried to add memory or ssd disks to server to get over the problem by force. But it does not take me far :).
Basically I have four tables:
customer_feeds (customer_id, feed_id)
one article can have multiple keywords and one customer can have multiple feeds
These tables have all the necessary indexes as much as I can think out (i’m not of course creating indexes without thinking but only those are used).
Basically I have a customers who have some active keywords (basically these are news topics) and customers also have some active feeds (basically these are newsfeeds like we all know them).
Now I want to query newest articles that are from feeds that are active in this customer and also have one or more keywords active.
I would do so :
select * from articles a join articles_keywords ak on ak.article_id=a.article_id join customer_feeds cf on cf.feed_id=a.feed_id join customer_keywords ck on ck.keyword_id=ak.keyword_id where cf.customer_id=123 and ck.customer_id=123 group by a.article_id order by a.created desc
Now the problem is that whatever I do I cannot get MySQL to use indexes fully. At the same time I feel that this kind of thing is quite standard query in some application.
I have tried to leave out customer_keywords and firsthand query the keyword_id-s and then give them to query using in list “keyword_id in (1,2,3,4)”. But it does not make things better.
It’s really stupid to think about the situation where I have SDD disc, 16GB memory, and only about 60000 articles in this database. Already now it takes 1+ second for queries. But database will grow to about 2million articles in one year and so on.
I have sometimes tried with de-normalization. eg put some fields to multiple tables to try to leave out some tables from joining. But this time I would like to do it like a pro ;).
I really appreciate all the answers.