CREATE TABLE WikiParagraphs2 ( ID int(9) unsigned NOT NULL auto_increment, Paragraph text collate utf8_swedish_ci NOT NULL, PRIMARY KEY (ID)) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_swedish_ci AUTO_INCREMENT=73035277 ;mysql> select count(ID) from WikiParagraphs2 where ID < 9999999;±----------+| count(ID) |±----------+| 7375964 | ±----------+1 row in set (44.79 sec)mysql> explain select count(ID) from WikiParagraphs2 where ID < 9999999;±—±------------±----------------±------±--------------±--------±--------±-----±--------±-------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |±—±------------±----------------±------±--------------±--------±--------±-----±--------±-------------------------+| 1 | SIMPLE | WikiParagraphs2 | range | PRIMARY | PRIMARY | 4 | NULL | 7997361 | Using where; Using index | ±—±------------±----------------±------±--------------±--------±--------±-----±--------±-------------------------+1 row in set (0.05 sec)
It says it’s using an index so I don’t understand why it would take 45 seconds to get the count…any ideas?
But you should also consider that even though it is using an index it has to examine 7,997,361 rows.
Which means that each row is handled in about 5 microseconds or 177,719 rows per second, which isn’t a half bad figure.
The important thing and content of this is that you shouldn’t expect miracles even though it is using an index.
when you run a COUNT(col1) it means that is has to check the value of col1 and determine if it is zero or NULL before we are addint it to the count as normal.
The key_buffer_size is depending a bit on how much memory other applications (apache in our case) needs.
But the general rule of thumb is 25% of the available RAM should be used as key_cache.
The rest should be left unused so that the OS can use it for file cache which is what MyISAM tables are relying on since they don’t have an internal data cache.
Like for example InnoDB where indexes and data is cached internally in the same memory pool.
BTW: I should have noticed it earlier, your query should be written with SELECT count(*) FROM … instead of SELECT count(ID) FROM … because you are after the number of rows and not the nr of ID’s.
It makes it easier to read the query and understand what you are after.
And possibly it could have an effect on performance as well, but I’m not sure.