yet another query performance issue (maybe related to subselect?)

Hello everyone, I’m having a lot of trouble with a query that runs much too slowly.

The table contains about 120 000 records and the query returns 150 results.
Removing the DISTINCT does not really improve the speed, and without it the query returns 1500 results.

The query takes about 10 seconds to execute.
The strange thing is that after increasing the available RAM from 1 GB to 3 GB (the MySQL server is running on a VMware server), the query got even slower (about 25 seconds).

I have tried many different indexes without good results.

At this point I’m not sure if the problem is with the query or with the MySQL server configuration. The MySQL version is 4.1.20.

Any help is greatly appreciated,
Regards
Carol


select distinct q1.timestamp as timestamp,
       q1.agent_channel as agent_channel,
       q1.event as event,
       q2.event_parameter1 as agent
from queue_logs q1, queue_logs q2
where (q1.timestamp >= '1184413563' and
       q1.timestamp < '1184499963' and
       q2.timestamp >= '1184413563' and
       q2.timestamp < '1184499963') and
      (
        (
          (q1.event = 'PAUSE' or
           q1.event = 'UNPAUSE' or
           q1.event = 'AGENTLOGOFF') and
          q1.agent_channel = q2.agent_channel and
          q2.event = 'AGENTLOGIN' and
          q2.timestamp =
          (
            select max(q3.timestamp)
            from queue_logs q3
            where q3.timestamp >= '1184413563' and
                  q3.timestamp < '1184499963' and
                  q3.timestamp < q1.timestamp and
                  q3.event = 'AGENTLOGIN' and
                  q3.agent_channel = q1.agent_channel
          )
        )
        or (q1.event = 'AGENTLOGIN' and q1.id = q2.id)
      )


EXPLAIN

id 1
select_type PRIMARY
table q1
type range
possible_keys PRIMARY,ql_t_index,ql_eat_index,ql_e_index,ql_ea_index
key ql_t_index
key_len 4
ref NULL
rows 1996
Extra Using where; Using temporary

id 1
select_type PRIMARY
table q2
type range
possible_keys PRIMARY,ql_t_index,ql_eat_index,ql_e_index,ql_ea_index
key ql_t_index
key_len 4
ref NULL
rows 1996
Extra Using where

id 2
select_type DEPENDENT SUBQUERY
table q3
type range
possible_keys ql_t_index,ql_eat_index,ql_e_index,ql_ea_index
key ql_e_index
key_len 56
ref NULL
rows 810
Extra Using where


CREATE TABLE queue_logs (
id int(11) NOT NULL auto_increment,
timestamp int(10) unsigned NOT NULL default '0',
call_identifier tinytext,
queue_name tinytext,
agent_channel tinytext,
event text NOT NULL,
event_parameter1 tinytext,
event_parameter2 tinytext,
event_parameter3 tinytext,
PRIMARY KEY (id),
KEY ql_t_index (timestamp),
KEY ql_eat_index (event(18),agent_channel(12),timestamp),
KEY ql_e_index (event(18)),
KEY ql_ea_index (event(18),agent_channel(12))
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Your problem is definitely the query.

I’m not exactly sure what your query is supposed to return, but whenever a query uses DISTINCT and returns ten times as many rows without it, that is a bad sign and it can usually be rewritten in a much better way.

Right now it looks like you are effectively performing a cross join between q1 and q2, since the only conditions that always tie the two together are the two timestamp ranges.

My suggestions are:

1. Rewrite the query using explicit JOIN syntax:

   … FROM tableA [INNER|LEFT] JOIN tableB … -- whichever join type you need

   It makes the query much easier to read (see the sketch below).

2. Rethink what the purpose of this query is and rewrite it,
   because I don’t think you will be able to get any decent speed out of this one.
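
Just to illustrate the JOIN syntax, the PAUSE/UNPAUSE/AGENTLOGOFF branch of your query could look roughly like this. This is only a sketch of the same logic, not a tuned query, and it leaves out the q1.event = 'AGENTLOGIN' branch, which you would have to add back with a UNION or similar:

select distinct q1.timestamp as timestamp,
       q1.agent_channel as agent_channel,
       q1.event as event,
       q2.event_parameter1 as agent
from queue_logs q1
inner join queue_logs q2
        on q2.agent_channel = q1.agent_channel   -- join the matching AGENTLOGIN row
       and q2.event = 'AGENTLOGIN'
where q1.timestamp >= 1184413563 and q1.timestamp < 1184499963
  and q2.timestamp >= 1184413563 and q2.timestamp < 1184499963
  and q1.event in ('PAUSE', 'UNPAUSE', 'AGENTLOGOFF')
  and q2.timestamp = (select max(q3.timestamp)            -- latest login before this event
                      from queue_logs q3
                      where q3.timestamp >= 1184413563
                        and q3.timestamp < 1184499963
                        and q3.timestamp < q1.timestamp
                        and q3.event = 'AGENTLOGIN'
                        and q3.agent_channel = q1.agent_channel)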

Thanks for the hint,

I’ll see if I can rewrite the query using joins and go from there. The query is pretty easy to understand, just hard to get done with my current SQL knowledge :)

The idea behind this is that the table contains this kind of data:

id",“timestamp”,“agent_channel”,“event”,“event_parameter1 "
107,1195650303,“SIP/1100”,“AGENTLOGIN”,“1100”
108,1195651063,“SIP/3201”,“AGENTLOGIN”,“3201”
125,1195657255,“SIP/3201”,“PAUSE”,NULL
130,1195657436,“SIP/3201”,“UNPAUSE”,”"

and I’d like the query to replace all NULL (or empty) event_parameter1 values with the event_parameter1 value of the first AGENTLOGIN event found for each agent_channel:

"id","timestamp","agent_channel","event","event_parameter1"
107,1195650303,"SIP/1100","AGENTLOGIN","1100"
108,1195651063,"SIP/3201","AGENTLOGIN","3201"
125,1195657255,"SIP/3201","PAUSE","3201"
130,1195657436,"SIP/3201","UNPAUSE","3201"

Regards,
Carol

OK, here’s my suggestion:

I changed your table layout and indexes a bit (they should probably be changed a bit more; read below):

CREATE TABLE queue_logs (
  id int(11) NOT NULL auto_increment,
  timestamp int(10) unsigned NOT NULL default '0',
  call_identifier tinytext,
  queue_name tinytext,
  agent_channel VARCHAR(20),
  event VARCHAR(20) NOT NULL,
  event_parameter1 tinytext,
  event_parameter2 tinytext,
  event_parameter3 tinytext,
  PRIMARY KEY (id),
  KEY ql_ix_timestamp (timestamp),
  KEY ql_ix_ev_ac_ts_id (event,agent_channel,timestamp,id),
  KEY ql_e_index (id,event)
) ENGINE=MyISAM;

The reason is that TEXT and BLOB columns are treated differently internally than CHAR or VARCHAR columns.

Then my suggestion for your query:

SELECT q1.timestamp AS timestamp,
       q1.agent_channel AS agent_channel,
       q1.event AS event,
       IF(q1.event_parameter1 IS NULL OR q1.event_parameter1 = '',
          (SELECT q2.event_parameter1
           FROM queue_logs q2
           WHERE q2.id = (SELECT MAX(q3.id)
                          FROM queue_logs q3
                          WHERE q3.agent_channel = q1.agent_channel
                            AND q3.event = 'AGENTLOGIN'
                            AND q3.timestamp < q1.timestamp)),
          q1.event_parameter1) AS event_parameter1
FROM queue_logs q1
WHERE q1.timestamp >= 1195650303
  AND q1.timestamp < 1195657500
ORDER BY timestamp

This query should solve your speed problem.

BTW 1: You should avoid quotes around INT values such as the timestamp column; this avoids unnecessary implicit type conversions. A numeric value with quotes around it is actually a string.
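
For example, using the timestamps from your first query:

-- quoted: the string literal has to be implicitly converted for the comparison
WHERE q1.timestamp >= '1184413563' AND q1.timestamp < '1184499963'

-- unquoted: plain integer literals against the INT column, no conversion needed
WHERE q1.timestamp >= 1184413563 AND q1.timestamp < 1184499963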

BTW 2: You should avoid using column names like timestamp, since TIMESTAMP is a keyword (a data type name) in MySQL.
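
If you do keep the name for now, quoting it with backticks in your queries avoids any ambiguity, e.g.:

SELECT `timestamp`, agent_channel FROM queue_logs ORDER BY `timestamp`;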

Hello,

thank you very much for the suggestions, that is much appreciated. I’ll try that ASAP. I was able to come up with a more elegant and faster query using joins (with some help), but yours might be even better. The one I have right now could still be faster when querying data for date intervals greater than two weeks.

As for having a column named timestamp, I know it’s a bad move; I’ll get around to changing it when the application is ready, I just couldn’t think of a better name for it yet :wink:

Thank you very much,
Carol