JOIN & GROUP BY Optimization problem

We have three tables(datacount, resourcetable, daterange) with one of them(count) having million+ rows in it. Resourcetable will have around 100 rows and daterange arnd 10.

We want to get the sum(cnt) from count table grouped by values in resource and/or daterange;

The structure is as follows :

create table resourcetable(resourceid char(36), resourcename varchar(255)) engine=InnoDB;
create table datacount( cnt integer, resourceid char(36), cntdate timestamp) engine=InnoDB;
create table daterange(datemin date, datemax date) engine=InnoDB;

We tried to optimize the performance by creating index on resourceid, cntdate and both of them but it doesn’t work.

All the queries are taking more than 30 seconds :

  1. select sum(cnt), datemin, datemax from datacount, resourcetable, daterange where date(cntdate) between datemin and datemax and resourcename like ‘% 1%’ group by datemin, datemax;

  2. select sum(cnt), datemin, datemax, resourcetable.resourceid from datacount, resourcetable, daterange where date(cntdate) between datemin and datemax and resourcename like ‘% 1%’
    group by datemin, datemax, resourcetable.resourceid;

  3. select sum(cnt), resourcetable.resourceid from datacount, resourcetable, daterange where date(cntdate) between datemin and datemax and resourcename like ‘% 1%’
    group by resourcetable.resourceid;

Scripts to insert data in tables :
(You can change the rowfactor in the INSERTCOUNT procedure to insert more rows. As of now it will insert around 200K rows)

insert into daterange values(‘2008-04-01’,‘2008-04-05’);
insert into daterange values(‘2008-04-06’,‘2008-04-10’);
insert into daterange values(‘2008-04-11’,‘2008-04-15’);
insert into daterange values(‘2008-04-16’,‘2008-04-20’);
insert into daterange values(‘2008-04-21’,‘2008-04-25’);
insert into daterange values(‘2008-04-26’,‘2008-04-30’);

delimiter //
CREATE PROCEDURE INSERTRESOURCE()
BEGIN
DECLARE itr INT DEFAULT 10;

WHILE itr < 99 DO
INSERT INTO resourcetable VALUES(concat(“00000000-0000-0000-0000-0000000000”,itr),concat( "Resource ",itr) );
SET itr = itr + 1;
END WHILE;
END;
//

CREATE PROCEDURE INSERTCOUNT()
BEGIN
DECLARE itr INT DEFAULT 10;
DECLARE citr INT DEFAULT 0;
DECLARE rowfactor INT DEFAULT 2000;
DECLARE intvl INT DEFAULT 2592000/rowfactor;
DECLARE stts TIMESTAMP DEFAULT ‘2008-04-01 00:00:00’;

SET AUTOCOMMIT=0;

OL : WHILE citr < rowfactor DO
SET itr = 10;
IL: WHILE itr < 99 DO
INSERT INTO datacount VALUES(3, concat(“00000000-0000-0000-0000-0000000000”,itr), stts );
SET itr = itr + 1;
END WHILE IL;
COMMIT;
SET citr = citr + 1;
SET stts = TIMESTAMPADD(SECOND, intvl, stts);
END WHILE OL;
END;
//

delimiter ;

call INSERTRESOURCE();

call INSERTCOUNT();

create index resource_index on datacount(resourceid(36));
create index time_index on datacount(cntdate);

Any help would be really appreciated.

Can you please post EXPLAIN for each of queries which are slow?

Here are the results for the EXPLAIN stmt.

mysql> explain select sum(cnt), datemin, datemax from datacount, resourcetable,
daterange where date(cntdate) between datemin and datemax and resourcename like
‘% 1%’
-> group by datemin, datemax;
±—±------------±--------------±-----±--------------± -----±--------±—
–±-------±--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref
| rows | Extra |
±—±------------±--------------±-----±--------------± -----±--------±—
–±-------±--------------------------------+
| 1 | SIMPLE | daterange | ALL | NULL | NULL | NULL | NUL
L | 6 | Using temporary; Using filesort |
| 1 | SIMPLE | resourcetable | ALL | NULL | NULL | NULL | NUL
L | 89 | Using where |
| 1 | SIMPLE | datacount | ALL | NULL | NULL | NULL | NUL
L | 178275 | Using where |
±—±------------±--------------±-----±--------------± -----±--------±—
–±-------±--------------------------------+
3 rows in set (0.70 sec)

mysql> explain select sum(cnt), datemin, datemax, resourcetable.resourceid from
datacount, resourcetable, daterange where date(cntdate) between datemin and date
max and resourcename like ‘% 1%’
-> group by datemin, datemax, resourcetable.resourceid;
±—±------------±--------------±-----±--------------± -----±--------±—
–±-------±--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref
| rows | Extra |
±—±------------±--------------±-----±--------------± -----±--------±—
–±-------±--------------------------------+
| 1 | SIMPLE | daterange | ALL | NULL | NULL | NULL | NUL
L | 6 | Using temporary; Using filesort |
| 1 | SIMPLE | resourcetable | ALL | NULL | NULL | NULL | NUL
L | 89 | Using where |
| 1 | SIMPLE | datacount | ALL | NULL | NULL | NULL | NUL
L | 178275 | Using where |
±—±------------±--------------±-----±--------------± -----±--------±—
–±-------±--------------------------------+
3 rows in set (0.00 sec)

mysql> explain select sum(cnt), resourcetable.resourceid from datacount, resourc
etable, daterange where date(cntdate) between datemin and datemax and resourcena
me like ‘% 1%’
-> group by resourcetable.resourceid;
±—±------------±--------------±-----±--------------± -----±--------±—
–±-------±--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref
| rows | Extra |
±—±------------±--------------±-----±--------------± -----±--------±—
–±-------±--------------------------------+
| 1 | SIMPLE | daterange | ALL | NULL | NULL | NULL | NUL
L | 6 | Using temporary; Using filesort |
| 1 | SIMPLE | resourcetable | ALL | NULL | NULL | NULL | NUL
L | 89 | Using where |
| 1 | SIMPLE | datacount | ALL | NULL | NULL | NULL | NUL
L | 178275 | Using where |
±—±------------±--------------±-----±--------------± -----±--------±—
–±-------±--------------------------------+
3 rows in set (0.00 sec)