So - I have 3 tables: feed_datastore, feed_index_site, and site.
feed_datastore has uuid | ts | data (which is a text column with serialized data)
feed_index_site is site_ID | uuid
site has site_ID | xxxx (bunch of other columns - eg one being country_ID).
feed_datastore has data from each site and feed_index_site connects site_ID to uuid
What I’m trying to do is extract uuid, data from feed_datastore based on the country_ID in site. I get something like this:
SELECT a.ts,a.uuid,a.data FROM feed_index_site b, feed_datastore a, site c WHERE a.uuid=b.uuid AND b.site_ID=c.site_ID AND c.country_ID=4 ORDER BY ts DESC LIMIT 6
The resulting EXPLAIN has “Using index; Using temporary; Using filesort” on table c (site) - which is killing performance. If I remove the ORDER BY ts the temporary/filesort disappear.
I’ve tried a bunch of indexes which I thought would work, and even altered the feed_datastore table so that it is ordered by ts DESC but nothing works - anyone have a clue how I can make this run smoother?
I went a key,value approach - so feed_datastore’s key is ‘uuid’, and the feed_index_site relates/connects feed_datastore and site
The reason I cannot add the country_ID value to feed_datastore is because the country value could change in the site table, requiring the system to then go back and update that value in feed_datastore anytime country_ID was changed in site
That sucks. You might try this, which will only be fast for countries appearing often.
Add an index on table c on (site_ID,country_ID)
Add an index on table a on (ts)
SELECT a.ts,a.uuid,a.data
FROM feed_datastore a FORCE INDEX(name of index on the ts_field)
STRAIGHT_JOIN feed_index_site b ON(uuid)
STRAIGHT_JOIN site c ON(site_ID)
WHERE c.country_ID=4 ORDER BY a.ts DESC LIMIT 6
If this is not fast, provide the output of EXPLAIN.
Wanted to quickly thank you for trying to help here gmouse.
I had both indexes already created. Unfortunately when I try to run your query, I get an ambiguous error on both straight_joins (something I’ve never used before). I read up on it and modified your query to this:
EXPLAIN SELECT a.ts, a.uuid, a.data
FROM feed_datastore a FORCE INDEX (ts)
STRAIGHT_JOIN feed_index_site b ON ( a.uuid = b.uuid )
STRAIGHT_JOIN site c ON (b.site_ID=c.site_ID)
WHERE c.country_ID =51
ORDER BY a.ts DESC
LIMIT 6
id select_type table type possible_keys key key_len ref rows Extra1 SIMPLE a index NULL ts 4 NULL 113512 1 SIMPLE b ref PRIMARY,site_ID PRIMARY 42 data.a.uuid 1 Using index1 SIMPLE c eq_ref PRIMARY,country_ID,country_ID_2,site_ID,site_ID_2,… PRIMARY 3 data.b.site_ID 1 Using where
The problem is its basically going through the entire feed_datastore table - not so bad at 115k rows, but that will bog down really quickly (at peak it will be growing 100k a day).
(I have some extra indexes as I’ve been trying to figure out what would work).