performance of query for normalized DB

I’m trying to build a webapplication where users can search for a person having a particular preference for color and material. To store this information I use the following structure (a MySQL dump can be found at the end of this post):
*table person with fields:
-persid: autoincrement id
-name: name of the person
*table material with fields:
-materialid: autoincrement id
-material: name of the material eg “wood”
*table color with fields:
-colorid: autoincrement id
-color: name of the color eg “green”
*table persmaterial with fields:
-persmatid: autoincrement id
-persid: link to table person
-materialid: link to table material
*table perscolor with fields:
-perscolorid: autoincrement id
-persid: link to table person
-colorid: link to table color

In the webapplication the search can be entered by the users as a kind of pseudo query:
(color=red OR color=blue) AND color=green AND material=iron

My question is: how can I automatically transform this pseudo query into an efficient MySQL query?
I have tried out some different options:

Option 1:
(SELECT p.persid FROM person p, perscolor pc, persmaterial pm WHERE p.persid=pc.persid AND (pc.colorid=1 OR pc.colorid=2) AND p.persid=pm.persid AND pm.materialid=2 GROUP BY p.persid HAVING (count(DISTINCT pc.colorid)=2 AND count(DISTINCT pm.materialid)=1)) UNION
(SELECT p.persid FROM person p, perscolor pc, persmaterial pm WHERE p.persid=pc.persid AND (pc.colorid=2 OR pc.colorid=3) AND p.persid=pm.persid AND pm.materialid=2 GROUP BY p.persid HAVING (count(DISTINCT pc.colorid)=2 AND count(DISTINCT pm.materialid)=1))
Remarks:
*I do not see how to turn a general pseudo query into a query like the one in option 1, except for turning the pseudo query into a sum of products form where the sulms would correspond to the UNIONs. IS there a clever way to obtain such a sum of products form from an arbitrary pseudo query?

Option 2:
SELECT persid FROM person p WHERE
(EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=1 AND p.persid=pc.persid)
OR
EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=3 AND p.persid=pc.persid))
AND
EXISTS(SELECT * FROM perscolor pc WHERE pc.colorid=2 AND p.persid=pc.persid)
AND
EXISTS(SELECT * FROM persmaterial pm WHERE pm.materialid=2 AND p.persid=pm.persid)
Remarks:
*very easy to get from pseudo query to MySQL query but what about performance?

Option 3:
SELECT p.persid FROM person p, perscolor pc, persmaterial pm WHERE
p.persid=pc.persid
AND
(pc.colorid=1 OR pc.colorid=2 OR pc.colorid=3)
AND p.persid=pm.persid
AND pm.materialid=2
GROUP BY p.persid HAVING
sum(case when pc.colorid in (‘1’,‘3’) then 1 else 0 end) >= 1
AND
sum(case when pc.colorid=‘2’ then 1 else 0 end)>=1
AND
sum(case when pm.materialid=‘2’ then 1 else 0 end)>=1
Remarks:
*this option requires the pseudo query to be turned into a product of sums form; again is their a clever way to obtain such a form;

Option 4
SELECT DISTINCT pc1.persid FROM perscolor pc1
INNER JOIN perscolor pc2
ON pc1.persid=pc2.persid AND pc2.colorid=2
INNER JOIN persmaterial pm1
ON pc1.persid=pm1.persid AND pm1.materialid=2
LEFT OUTER JOIN perscolor pc3
ON pc1.persid=pc3.persid AND pc3.colorid=1
LEFT OUTER JOIN perscolor pc4
ON pc1.persid=pc4.persid AND pc4.colorid=3
WHERE COALESCE(pc3.persid,pc4.persid) IS NOT NULL
Remarks:
*this option requires the pseudo query to be turned into a product of sums form

Option 5:
SELECT p.persid FROM person p, persmaterial pm,perscolor pc1,perscolor pc2,perscolor pc3 WHERE p.persid=pm.persid AND p.persid=pc1.persid AND p.persid=pc2.persid AND p.persid=pc3.persid AND (pc1.colorid=1 OR pc2.colorid=3) AND pc3.colorid=2 AND pm.materialid=2 GROUP BY p.persid
Remarks:
*very easy to get from pseudo query to MySQL query but what about performance?

– phpMyAdmin SQL Dump
– version 2.6.1
http://www.phpmyadmin.net

– Host: localhost
– Generation Time: Oct 19, 2006 at 01:13 PM
– Server version: 4.1.9
– PHP Version: 4.3.10

– Database: aston



– Table structure for table color

CREATE TABLE color (
colorid int(11) NOT NULL auto_increment,
color varchar(30) NOT NULL default ‘’,
PRIMARY KEY (colorid)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=5 ;


– Dumping data for table color

INSERT INTO color VALUES (1, ‘red’);
INSERT INTO color VALUES (2, ‘green’);
INSERT INTO color VALUES (3, ‘blue’);
INSERT INTO color VALUES (4, ‘yellow’);



– Table structure for table material

CREATE TABLE material (
materialid int(11) NOT NULL auto_increment,
material varchar(30) NOT NULL default ‘’,
PRIMARY KEY (materialid)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=3 ;


– Dumping data for table material

INSERT INTO material VALUES (1, ‘wood’);
INSERT INTO material VALUES (2, ‘iron’);



– Table structure for table perscolor

CREATE TABLE perscolor (
perscolorid int(11) NOT NULL auto_increment,
persid int(11) NOT NULL default ‘0’,
colorid int(11) NOT NULL default ‘0’,
PRIMARY KEY (perscolorid)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=7 ;


– Dumping data for table perscolor

INSERT INTO perscolor VALUES (1, 1, 1);
INSERT INTO perscolor VALUES (2, 1, 2);
INSERT INTO perscolor VALUES (3, 2, 1);
INSERT INTO perscolor VALUES (5, 3, 3);
INSERT INTO perscolor VALUES (6, 3, 2);



– Table structure for table persmaterial

CREATE TABLE persmaterial (
persmatid int(11) NOT NULL auto_increment,
persid int(11) NOT NULL default ‘0’,
materialid int(11) NOT NULL default ‘0’,
PRIMARY KEY (persmatid)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=6 ;


– Dumping data for table persmaterial

INSERT INTO persmaterial VALUES (1, 1, 1);
INSERT INTO persmaterial VALUES (2, 1, 2);
INSERT INTO persmaterial VALUES (3, 2, 1);
INSERT INTO persmaterial VALUES (5, 3, 2);



– Table structure for table person

CREATE TABLE person (
persid int(11) NOT NULL auto_increment,
name varchar(30) NOT NULL default ‘’,
PRIMARY KEY (persid)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=4 ;


– Dumping data for table person

INSERT INTO person VALUES (1, ‘john’);
INSERT INTO person VALUES (2, ‘emily’);
INSERT INTO person VALUES (3, ‘liz’);

I added some more persons and preferences to the different tables so that there where about 10000 persons with random color and material preferences in the database; the example query resulted in 1900 persons; here are the timing results in seconds:
option 1: 0.2
option 2: 0.005
option 3: 0.003
option 4: 0.007
option 5: 0.006
option 6: 0.002

where option 6 is a newly added option:
SELECT pm.persid FROM persmaterial pm WHERE pm.materialid=2 AND pm.persid IN (SELECT pc.persid FROM perscolor pc WHERE pc.colorid=2 AND pc.persid IN (SELECT pc.persid FROM perscolor pc WHERE pc.colorid=1 OR pc.colorid=3))

I was a bit surprised by the good timing results for option 2; is there any one who can explain the way MySQL tackles the query in option 2 (I suppose MySQL is not going over all records of table “person” launching a subquery for each record of table “person”?);

I tend to conclude that option 2 is the way to go since it is so easy to translate the pseudo query into the MySQL query and on top of that the resulting MySQL query is rather fast!

I added some more persons and preferences to the different tables so that there where about 10000 persons with random color and material preferences in the database; the example query resulted in 1900 persons; here are the timing results in seconds:
option 1: 0.2
option 2: 0.005
option 3: 0.003
option 4: 0.007
option 5: 0.006
option 6: 0.002

where option 6 is a newly added option:
SELECT pm.persid FROM persmaterial pm WHERE pm.materialid=2 AND pm.persid IN (SELECT pc.persid FROM perscolor pc WHERE pc.colorid=2 AND pc.persid IN (SELECT pc.persid FROM perscolor pc WHERE pc.colorid=1 OR pc.colorid=3))

I was a bit surprised by the good timing results for option 2; is there any one who can explain the way MySQL tackles the query in option 2 (I suppose MySQL is not going over all records of table “person” launching a subquery for each record of table “person”?);

I tend to conclude that option 2 is the way to go since it is so easy to translate the pseudo query into the MySQL query and on top of that the resulting MySQL query is rather fast!