Left Join Not using index (or how to index this query)?

I have the following query that is executing at a VERY slow rate. Usually takes about 6 seconds to return. I have tried several index strategies (to the best of my ability - which I admit is probably lacking) to no avail. I have copied the query below and the explain - any help in optimizing via an index(es) would be greatly appreciated.

Please be aware that this is a query produced by a boxed application ( SugarCRM ) and I have little control over the way its written (its kind of ugly) unless I dig though the PHP code. I wanted to try an index optimization first if possible.

SELECT cases.id, cases_cstm.*, cases.case_number, cases.name, accounts.name account_name1, cases.account_id, cases.priority, cases.status, cases.date_entered , cases.modified_user_id, assigned_user0.user_name modified_user_id, assigned_user1.user_name assigned_user_name, accounts.assigned_user_id account_name1_owner, ‘Accounts’ account_name1_mod, cases.assigned_user_id FROM cases left JOIN cases_cstm ON cases.id = cases_cstm.id_c left JOIN accounts accounts ON accounts.id= cases.account_id AND accounts.deleted=0 AND accounts.deleted=0 left JOIN users assigned_user0 ON assigned_user0.id=cases.modified_user_id left JOIN users assigned_user1 ON assigned_user1.id=cases.assigned_user_id where (1) AND cases.deleted=0 ORDER BY cases.case_number ASC LIMIT 0,21;

The EXPLAIN:

±—±------------±---------------±-------±------------------------------------------------±--------±--------±----------------------------------±-----±---------------------------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±---------------±-------±------------------------------------------------±--------±--------±----------------------------------±-----±---------------------------------------------+| 1 | SIMPLE | cases | ALL | NULL | NULL | NULL | NULL | 1495 | Using where; Using temporary; Using filesort || 1 | SIMPLE | cases_cstm | ALL | NULL | NULL | NULL | NULL | 1537 | || 1 | SIMPLE | accounts | eq_ref | PRIMARY,idx_accnt_id_del,idx_accnt_assigned_del | PRIMARY | 108 | infoathand.cases.account_id | 1 | || 1 | SIMPLE | assigned_user0 | eq_ref | PRIMARY | PRIMARY | 108 | infoathand.cases.modified_user_id | 1 | || 1 | SIMPLE | assigned_user1 | eq_ref | PRIMARY | PRIMARY | 108 | infoathand.cases.assigned_user_id | 1 | |±—±------------±---------------±-------±------------------------------------------------±--------±--------±----------------------------------±-----±---------------------------------------------+

I cannot figure out how to get cases and case_cstm to use an index I setup. Strangely (or maybe not), if I change this query to use an INNER JOIN instead of a LEFT JOIN it executes in .5 sec instead of 6 secs with no change to the current indexes.

Anyhow, any help is appreciated. I can post show index statements if that helps to see the keys of each table. I appreciate any help - I have been banging my head against a wall for the last day to figure out the MySQL optimizer.

EDIT: jsut wanted to mention the MySQL version is 5.0.22 running on Ubuntu Dapper Server - using MyISAM

Another interesting twist - if I trim the above query down to one LEFT JOIN where the join is on 2 primary keys in the 2 tables cases and case_cstm, MySQL will NOT use the keys. Why?

SELECT cases.id , cases_cstm.*, cases.case_number , cases.name , cases.priority , cases.status , cases.date_entered , cases.modified_user_id,cases.assigned_user_id FROM cases left JOIN cases_cstm ON cases.id = cases_cstm.id_c where cases.deleted=0

EXPLAIN:

±—±------------±-----------±-----±--------------±-----±--------±-----±-----±------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±-----------±-----±--------------±-----±--------±-----±-----±------------+| 1 | SIMPLE | cases | ALL | NULL | NULL | NULL | NULL | 1495 | Using where || 1 | SIMPLE | cases_cstm | ALL | NULL | NULL | NULL | NULL | 1537 | |±—±------------±-----------±-----±--------------±-----±--------±-----±-----±------------+

This query takes 5-6 seconds to complete. Change it to an INNER join (instead of LEFT) and its done in .07 sec. The INNER join EXPLAIN is below:

±—±------------±-----------±-------±--------------±--------±--------±-----±-----±------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±-----------±-------±--------------±--------±--------±-----±-----±------------+| 1 | SIMPLE | cases_cstm | ALL | NULL | NULL | NULL | NULL | 1537 | || 1 | SIMPLE | cases | eq_ref | PRIMARY | PRIMARY | 108 | func | 1 | Using where |±—±------------±-----------±-------±--------------±--------±--------±-----±-----±------------+

Uses the key/index in this one?!

I have a handful of queries like the one in the first post that are really beating the server into the ground. I would like to optimize using indexes (if possible). Problem is I can’t guess what the optimizer will do. :wink: Any pointers are welcome… thanks.

Go back to your first query and try adding an index on cases(id, case_number). Maybe that helps?

Make sure any fields which are never null, are marked as NOT NULL.

You could also try adding STRAIGHT_JOIN to the query to influence the execution planner. Sadly I dont recall the exact theory behind what it does, but I do know Ive had some very good results with it.
Its often trial and error with MySQL

SELECT STRAIGHT_JOIN field1, field2 FROM etc etc

Thanks for the tips. Its really weird.

Currently the id field has a primary key index on it and the number field is a regular index. I’ll try the multi-column index you suggest and see what happens. The MySQL optimizer is hard to figure out! I’ll see if a straight join helps - in a way I hope it doesn’t because this query is generated by software so its going to be hard to force the STRAIGHT JOIN into the query (have to dig through PHP code ( ) I was hoping a few well placed indexes would get this sucker to speed up without rewriting the query.

BTW SHOW INDEX gives me the below (cases table):

±------±-----------±--------------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |±------±-----------±--------------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| cases | 0 | PRIMARY | 1 | id | A | 3061 | NULL | NULL | | BTREE | NULL || cases | 1 | case_number | 1 | case_number | A | NULL | NULL | NULL | | BTREE | NULL || cases | 1 | idx_case_name | 1 | name | A | NULL | NULL | NULL | YES | BTREE | NULL |±------±-----------±--------------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+

[B]carpii wrote on Wed, 22 August 2007 17:40[/B]
Go back to your first query and try adding an index on cases(id, case_number). Maybe that helps?

Make sure any fields which are never null, are marked as NOT NULL.

You could also try adding STRAIGHT_JOIN to the query to influence the execution planner. Sadly I dont recall the exact theory behind what it does, but I do know Ive had some very good results with it.
Its often trial and error with MySQL

SELECT STRAIGHT_JOIN field1, field2 FROM etc etc

No luck (

I tried adding the cases(id, case_number) index and using the STRAIGHT_JOIN syntax - same results. Slow, slow, slow.

what about an index just on cases.case_number ?
Does the explain change at all?

Whats the primary key on cases. Is it really 108 bytes?

According to the SHOW INDEX above, I already have an index on cases.case_number. So it doesn’t appear to do anything. How do I find the index length? SHOW INDEX doesn’t tell me key length, I guess from the EXPLAIN in a previous post it states that the “key_len” is 108. IS there another command I can run to display this?

Are you sure there is a primary key / index on cases_cstm.id_c?

how about a simple

EXPLAIN SELECT * FROM cases_cstm WHERE id_c = number

to figure this out.

[B]chriswest wrote on Thu, 23 August 2007 11:31[/B]
Are you sure there is a primary key / index on [B]cases_cstm.id_c[/B]?

how about a simple

EXPLAIN SELECT * FROM cases_cstm WHERE id_c = number

to figure this out.

Here is the EXPLAIN:

mysql> explain select * from cases_cstm where id_c = ‘f211ee71-2d3f-9db0-99d1-45e448a63c99’;±—±------------±-----------±------±--------------±--------±--------±------±-----±------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±-----------±------±--------------±--------±--------±------±-----±------+| 1 | SIMPLE | cases_cstm | const | PRIMARY | PRIMARY | 36 | const | 1 | |±—±------------±-----------±------±--------------±--------±--------±------±-----±------+1 row in set (0.03 sec)

Below are the table schema for cases and cases_cstm (SHOW CREATE TABLE results):

Cases table:

CREATE TABLE cases ( id char(36) NOT NULL, case_number int(11) NOT NULL auto_increment, date_entered datetime NOT NULL, date_modified datetime NOT NULL, modified_user_id char(36) NOT NULL, assigned_user_id char(36) default NULL, created_by char(36) default NULL, effort_actual double default NULL, effort_actual_unit varchar(20) default NULL, travel_time double default NULL, travel_time_unit varchar(20) default NULL, arrival_time varchar(30) default NULL, cust_req_no varchar(30) default NULL, cust_contact_id char(36) default NULL, cust_phone_no varchar(30) default NULL, date_closed date default NULL, date_billed date default NULL, vendor_rma_no varchar(30) default NULL, vendor_svcreq_no varchar(30) default NULL, contract_id char(36) default NULL, asset_id char(36) default NULL, asset_serial_no varchar(100) default NULL, category varchar(40) default NULL, type varchar(40) default NULL, deleted tinyint(1) NOT NULL default ‘0’, name varchar(255) default NULL, account_name varchar(100) default NULL, account_id char(36) default NULL, status varchar(25) default NULL, priority varchar(25) default NULL, description text, resolution text, PRIMARY KEY (id), KEY case_number (case_number), KEY idx_case_name (name)) ENGINE=MyISAM DEFAULT CHARSET=utf8 |

Cases_cstm:

CREATE TABLE cases_cstm ( id_c char(36) NOT NULL, mcs_steps_to_reproduce_c text, mcs_applications_multi_c text NOT NULL, mcs_supportcase_source_c varchar(150) default NULL, mcs_legacy_tt_number_c int(11) default NULL, PRIMARY KEY (id_c)) ENGINE=MyISAM DEFAULT CHARSET=latin1 |

Also the SHOW INDEX from cases_cstm for completeness (the SHOW INDEX for cases is in the previous post):

±-----------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |±-----------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| cases_cstm | 0 | PRIMARY | 1 | id_c | A | 3136 | NULL | NULL | | BTREE | NULL |±-----------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+1 row in set (0.00 sec)

When you are using a LEFT JOIN you are forcing the DBMS to perform the join in order left to right.

When you are changing to use an INNER JOIN the optimizer can choose the join order freely and that is why your query is fast with an INNER JOIN since it chooses the right to left order instead since you have a condition on the right table.

The STRAIGHT JOIN syntax is just an INNER JOIN where you are forcing the join order to left to right.

But my question is if you have an index on cases_cstm.id_c?
Since the join order is cases->cases_cstm that is the index that you need.

Well I’m no expert on the optimizer’s plans, but I suppose your key is just too long in order to be taken into account for the optimzer. Have you tried to force the use of an index?

left JOIN cases_cstm ON cases.id = cases_cstm.id_c FORCE INDEX (id_c)

[B]sterin wrote on Thu, 23 August 2007 12:07[/B]
When you are using a LEFT JOIN you are forcing the DBMS to perform the join in order left to right.

When you are changing to use an INNER JOIN the optimizer can choose the join order freely and that is why your query is fast with an INNER JOIN since it chooses the right to left order instead since you have a condition on the right table.

The STRAIGHT JOIN syntax is just an INNER JOIN where you are forcing the join order to left to right.

But my question is if you have an index on cases_cstm.id_c?
Since the join order is cases->cases_cstm that is the index that you need.

OK - thanks for the great explanation. That makes sense. To answer your question, there is a primary key index on cases_cstm.id_c as shown in the post above yours. I might have posted it at the same time you posted your response…

[B]chriswest wrote on Thu, 23 August 2007 12:13[/B]
Well I'm no expert on the optimizer's plans, but I suppose your key is just too long in order to be taken into account for the optimzer. Have you tried to force the use of an index?

left JOIN cases_cstm ON cases.id = cases_cstm.id_c FORCE INDEX (id_c)

I did the following (added the FORCE INDEX for the PRIMARY key index in cases.cases_cstm):

SELECT cases.id, cases_cstm.*, cases.case_number , cases.name , accounts.name account_name1, cases.account_id , cases.priority , cases.status , cases.date_entered , cases.modified_user_id , assigned_user0.user_name modified_user_id , assigned_user1.user_name assigned_user_name , accounts.assigned_user_id account_name1_owner , ‘Accounts’ account_name1_mod , cases.assigned_user_id FROM cases left JOIN cases_cstm FORCE INDEX (PRIMARY) ON cases.id = cases_cstm.id_c left JOIN accounts accounts ON accounts.id= cases.account_id AND accounts.deleted=0 AND accounts.deleted=0 left JOIN users assigned_user0 ON assigned_user0.id=cases.modified_user_id left JOIN users assigned_user1 ON assigned_user1.id=cases.assigned_user_id where (1) AND cases.deleted=0 ORDER BY cases.case_number ASC LIMIT 0,21;

The EXPLAIN:

±—±------------±---------------±-------±------------------------------------------------±--------±--------±----------------------------------±-----±---------------------------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±---------------±-------±------------------------------------------------±--------±--------±----------------------------------±-----±---------------------------------------------+| 1 | SIMPLE | cases | ALL | NULL | NULL | NULL | NULL | 3087 | Using where; Using temporary; Using filesort || 1 | SIMPLE | cases_cstm | ALL | NULL | NULL | NULL | NULL | 3139 | || 1 | SIMPLE | accounts | eq_ref | PRIMARY,idx_accnt_id_del,idx_accnt_assigned_del | PRIMARY | 108 | infoathand.cases.account_id | 1 | || 1 | SIMPLE | assigned_user0 | eq_ref | PRIMARY | PRIMARY | 108 | infoathand.cases.modified_user_id | 1 | || 1 | SIMPLE | assigned_user1 | eq_ref | PRIMARY | PRIMARY | 108 | infoathand.cases.assigned_user_id | 1 | |±—±------------±---------------±-------±------------------------------------------------±--------±--------±----------------------------------±-----±---------------------------------------------+

As you can see no change… (

I just noticed something:

CREATE TABLE cases ( …) ENGINE=MyISAM DEFAULT CHARSET=utf8 |

and

CREATE TABLE cases_cstm ( … ) ENGINE=MyISAM DEFAULT CHARSET=latin1

you have different charsets for both tables - and you are joining on char columns: make those two charsets identical )

both utf-8 or both latin1

[B]chriswest wrote on Thu, 23 August 2007 12:33[/B]
I just noticed something:

CREATE TABLE cases ( …) ENGINE=MyISAM DEFAULT CHARSET=utf8 |

and

CREATE TABLE cases_cstm ( … ) ENGINE=MyISAM DEFAULT CHARSET=latin1

you have different charsets for both tables - and you are joining on char columns: make those two charsets identical )

both utf-8 or both latin1

LOL - I think that may be the solution! Didnt notice that at all!

Is it OK to just change the charset/collation on that column only? Just to be safe since for the rest of the table char columns. I was thinking of issuing this:

ALTER TABLE cases_cstm MODIFY COLUMN id_c CHAR(36) COLLATE utf8_general_ci NOT NULL

no that won’t do it:

  • collations tell MySQl how to order / sort strings in a specific charset

  • the charset specifies which character set is used to encode / decode a stored string properly

so you need to change the charset of the whole table - I do not know if you can change the charset for a specific column )

[B]chriswest wrote on Thu, 23 August 2007 12:58[/B]
no that won't do it:
  • collations tell MySQl how to order / sort strings in a specific charset

  • the charset specifies which character set is used to encode / decode a stored string properly

so you need to change the charset of the whole table - I do not know if you can change the charset for a specific column )

Was looking here:
http://dev.mysql.com/doc/refman/5.0/en/charset-conversion.ht ml

Looks like its possible. I guess I can do the following to change the charset on the column and the collation as well. I guess I should change the collation while I change charset(?) It looks like “cases.id” is set to charset ‘utf8’ and collation of ‘utf_general_ci’.

ALTER TABLE cases_cstm MODIFY id_c CHAR(36) CHARACTER SET utf8;ALTER TABLE cases_cstm MODIFY COLUMN id_c CHAR(36) COLLATE utf8_general_ci NOT NULL

Don’t want to hose anything up changing charsets for the other char columns. Not sure if this is a unnecessary fear of mine… I’ll try this in dev when I get a chance and see what comes of it.

BTW thanks a MILLION for spotting this! :smiley: