Slow mysql query when using subquery

Hey there!

I’m really having a strange problem here.

I have the following 2 tables:

CREATE TABLE rainbowcrack_parts (
partid int(11) NOT NULL auto_increment,
tableid int(11) NOT NULL,
userid int(11) default NULL,
chainStart bigint(22) unsigned NOT NULL,
chainCount int(11) NOT NULL,
ip varchar(255) default NULL,
status int(11) default NULL,
starttime datetime default ‘0000-00-00 00:00:00’,
finishedtime datetime default ‘0000-00-00 00:00:00’,
clientid int(10) unsigned NOT NULL default ‘0’,
creditvalue float unsigned NOT NULL default ‘0’,
aggregated tinyint(1) unsigned NOT NULL default ‘0’,
PRIMARY KEY (partid),
KEY tableid (tableid),
KEY userid (userid),
KEY creditvalue (creditvalue),
KEY finishedtime (finishedtime),
KEY status (status),
KEY aggregated (aggregated),
KEY clientid (clientid)
) ENGINE=MyISAM AUTO_INCREMENT=1985956 DEFAULT CHARSET=latin1

CREATE TABLE rainbowcrack_clients (
clientid int(10) unsigned NOT NULL auto_increment,
lastseen datetime NOT NULL,
userid int(10) unsigned NOT NULL,
version varchar(255) NOT NULL,
name varchar(255) NOT NULL,
last7days float NOT NULL,
PRIMARY KEY (clientid),
KEY lastseen (lastseen),
KEY last7days (last7days)
) ENGINE=MyISAM AUTO_INCREMENT=16491 DEFAULT CHARSET=latin1

What i want is a list of how much each client has done, belonging to a specific user.

So i made the query
SELECT p.clientid, SUM(chainCount) AS numchains FROM rainbowcrack_parts p WHERE p.clientid IN (SELECT clientid
FROM rainbowcrack_clients c
WHERE userid = 2 AND DATE_SUB(NOW(), INTERVAL 14 DAY) < lastseen)
GROUP BY p.clientid
;

It takes almost 2 seconds to run. Using a EXPLAIN statement, i get

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY p ALL NULL NULL NULL NULL 497468 Using where; Using temporary; Using filesort
2 DEPENDENT SUBQUERY c unique_subquery PRIMARY,lastseen PRIMARY 4 func 1 Using where

The subquery returns 2 rows which is

clientid
14483
14644

If i try to run the query
SELECT p.clientid, SUM(chainCount) AS numchains FROM rainbowcrack_parts p WHERE p.clientid IN (14483, 14644)
GROUP BY p.clientid

it takes 15 ms. Using a EXPLAIN it returns
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p range clientid clientid 4 NULL 1113 Using where

So its a much better query. But isn’t it the same query in the end?

Edit:

I also tried out
EXPLAIN SELECT p.clientid, SUM(chainCount) AS numchains FROM rainbowcrack_parts p
INNER JOIN rainbowcrack_clients c ON p.clientid = c.clientid
WHERE c.userid = 2
GROUP BY p.clientid
;

which returns (slow query)
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p ALL clientid NULL NULL NULL 497468 Using temporary; Using filesort
1 SIMPLE c eq_ref PRIMARY PRIMARY 4 rainbowtables-distrrtgen.p.clientid 1 Using where

Whats the performance like for…

SELECT clientid
FROM rainbowcrack_clients c
WHERE userid = 2 AND DATE_SUB(NOW(), INTERVAL 14 DAY) < lastseen

Have you tried using an inner join instead of subquery ?

SELECT
p.clientid, SUM(chainCount) AS numchains
FROM
rainbowcrack_parts p
INNER JOIN rainbowcrack_clients c ON (p.clientid = c.clientid)
WHERE
p.userid = 2 AND
DATE_SUB(NOW(), INTERVAL 14 DAY) < c.lastseen
GROUP BY p.clientid

Actually, im guessing you need a composite index on (c.userid, c.lastseen)

You have two seperate indices at the moment, but no compound index. I think MySQL 5.0 is able to do an index merge, but its best not to rely on it where possible

I found that adding a index to rainbowcrack_clients.userid fixed my problem. (I must have missed it somehow)
Thanks for your help