Hi everyone. First let me say I’m a long time visitor of the site, I thoroughly appreciate the information provided on it.
I posted this post on the mysql forums, and in an attempt to gain better understanding, I’m asking the mysql high performance gurus.
I’ve been working with large databases lately and I am working on optimizing them to the max. All of my production databases are in 3NF or 2NF. My main problem is with the number of rows, ie 5 million +, on 3-4 column tables.
In my ventures, I tested using an index on an innodb table versus not using on on a myisam table. For small tables, an index was preferred. However, I have noticed some pecularities that have left me flabbergasted. Below is a snippet of a test case I ran.
The following was run on a zone in solaris 10. Nothing else (other than basic services) are running on this box. The table named emailmonitor.mailstats was used to quickly get me date strings (it already had 1.3 million dates in it). the Value column is nothing special. Please note the table names when looking at the time stamps & other commands, as they are similar.
==BEGIN MYSQL==mysql> create table ino_index(id int primary key auto_increment, date char(14), value int) engine=InnoDB;mysql> create table ino_no_index(id int primary key auto_increment, date char(14), value int) engine=InnoDB;mysql> insert into ino_index select null,date,23 from emailmonitor.mailstats;mysql> insert into ino_no_index select null,date,23 from emailmonitor.mailstats;mysql> select count() from ino_no_index;±---------+| count() |±---------+| 139877 |±---------+mysql> create index date_index on ino_index(date);mysql> show index from ino_no_index;±-------------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |±-------------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| ino_no_index | 0 | PRIMARY | 1 | id | A | 140183 | NULL | NULL | | BTREE | |±-------------±-----------±---------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+1 row in set (0.01 sec)mysql> show index from ino_index;±----------±-----------±-----------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |±----------±-----------±-----------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+| ino_index | 0 | PRIMARY | 1 | id | A | 140183 | NULL | NULL | | BTREE | || ino_index | 1 | date_index | 1 | date | A | 46727 | NULL | NULL | YES | BTREE | |±----------±-----------±-----------±-------------±------------±----------±------------±---------±-------±-----±-----------±--------+mysql> explain select AVG(value),date from ino_no_index group by date order by date;±—±------------±-------------±-----±--------------±-----±--------±-----±-------±--------------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±-------------±-----±--------------±-----±--------±-----±-------±--------------------------------+| 1 | SIMPLE | ino_no_index | ALL | NULL | NULL | NULL | NULL | 140429 | Using temporary; Using filesort |±—±------------±-------------±-----±--------------±-----±--------±-----±-------±--------------------------------+1 row in set (0.00 sec)mysql> explain select AVG(value),date from ino_index group by date order by date;±—±------------±----------±------±--------------±-----------±--------±-----±-------±------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±----------±------±--------------±-----------±--------±-----±-------±------+| 1 | SIMPLE | ino_index | index | NULL | date_index | 15 | NULL | 140258 | |±—±------------±----------±------±--------------±-----------±--------±-----±-------±------+mysql> select AVG(value),date from ino_index group by date order by date;[…]19652 rows in set (2.58 sec)[[and then i repeated 4 times:]]19652 rows in set (2.53 sec)19652 rows in set (2.66 sec)19652 rows in set (2.68 sec)19652 rows in set (2.71 sec)mysql> select AVG(value),date from ino_no_index group by date order by date;19652 rows in set (2.05 sec)[[and then i repeated 4 times:]]19652 rows in set (2.05 sec)19652 rows in set (1.92 sec)19652 rows in set (1.92 sec)19652 rows in set (1.92 sec)==END MYSQL==
SO, as you can see, these are certainly not desirable results. It seems that using an index for the date column has slowed things down. These results scale to the much larger tables i’m working with.
Can anyone provide some insight as to why this is happening?