slow SELECT COUNT(*) on VPS server

Hi !

This is my first port here, so I hope I’ll do it right :roll:

Actually, I have:

  • a VPS server with 256mb of ram only.
  • a site with about 30 concurent users
  • a datebase with one table that have 300,000 rows, and is 500mb big.

This table contain user messages, so I need to generate a pagination. Pagination require to know the total of message for a user.

From my log files, I can see this:

Reading mysql slow query log from /var/log/mysql/mysql-slow.logCount: 109 Time=10.02s (1092s) Lock=0.98s (107s) Rows=1.0 (109), user[user]@localhost SELECT count(id) FROM cc3_he WHERE display=N AND IDPerso=N

(N mean Number)
So this is the problem: all my pages are taking 10sec+ to load !

The fields:

  • id int(12)
  • IDPerso smallint(5)
  • display smallint(1)
  • date int(10)
  • from varchar(25)
  • from_id int(5)
  • from_id2 int(5)
  • from_description text
  • from_description2 text
  • type varchar(20)
  • msg longtext
  • msg_size int(10)

My indexes on the table are:

  • PRIMARY BTREE Oui Non id 312679 A
  • IDPerso BTREE Non Non IDPerso 379 A
  • date BTREE Non Non date 312679 A
  • display BTREE Non Non display 2 A

Someone have an idea ?
Thanks for reading )

Extend your index in IDPerso to include display also. And if you use MyISAM, change count(id) to count(*).

But it is better to keep this count in the usertable and update it when a user makes a post.

Thanks for the reply, I’m trying this right now.

But can you explain me why MyISAM is faster with COUNT(*) , and what it the difference in creating 1 index with 2 fields instead of 2 indexes ?

Thanks !

Edit:

[B]Quote:[/B]
But it is better to keep this count in the usertable and update it when a user makes a post.
Really ? This is what GMail or Hotmail are doing ? They *never* COUNT the messages ?

But can you explain me why MyISAM is faster with COUNT(*)
data for the id field does not need to be retrieved

and what it the difference in creating 1 index with 2 fields instead of 2 indexes ?

http://www.mysqlperformanceblog.com/2009/09/12/3-ways-mysql- uses-indexes/

I might be a little dumb, but I still don’t understand what is different between 2 indexes and 1 index of 2 fields.

The whole article talk about using 1 index with 2 fiels… confused:

Ahhh ok !

So if I have a table with a PRIMARY key that I use to control 3 keys as UNIQUE:

[page_id, zone_id, user_id], content, foo, bar, etc

And I want to find all the zone_id of a user page:

SELECT zone_id FROM table WHERE page_id=? AND user_id=?;

I should add another index:
[page_id, user_id]

… because my primary key can’t be used, and the query will do a full table scan ?

Edit:
By the way, is it true that MySQL read the WHERE statement from Right to Left ? (The right statement must be the one that discard the most result)

Nope. Since you are supplying page_id, MySQL will use the primary key, because you have specified left part of the key.

An index is kinda like a table that’s already sorted. So your primary key would look a lot like the result of this query:

SELECT page_id, zone_id, user_id FROM mytable ORDER BY page_id, zone_id, user_id;

If you look at the results of that query, it should be easy for you to visually find all the rows where page_id = 1, right? They’re all clumped together nicely.

But if you’re just looking for user_id = 1 without a page_id, well, all of the sudden its not very easy to do visually, because they seem to be all over the place, because they are sorted first by page_id, which you aren’t asking about, and then again by zone_id.

So if you asked for page_id = 1 and zone_id = 2, visually, you’d be able to find the rows easily, because you’d first find where the rows for page_id = 1 start in your results, and then start looking for zone_id = 2. The same goes if you add in user_id.

Does that make sense?

Yes, that make sense. Just to be sure, I’ve modified the senario by swapping fields, could you tell me if #1 and #2 are OK or not ?

#1:

[B]Quote:[/B]

So if I have a table with a PRIMARY key that I use to control 3 keys as UNIQUE:

[zone_id, page_id, user_id], content, foo, bar, etc

And I want to find all the zone_id of a user page:

SELECT zone_id FROM table WHERE page_id=? AND user_id=?;

#2:

[B]Quote:[/B]

So if I have a table with a PRIMARY key that I use to control 3 keys as UNIQUE:

[page_id,zone_id, user_id], content, foo, bar, etc

And I want to find all the zone_id of a user page:

SELECT zone_id FROM table WHERE user_id=? AND page_id=?;

Thanks !

Scenario #1 doesn’t use the primary key* and scenario #2 will use the page_id part of the key.

*Not to confuse you more, but technically this query in scenario #1 will be evaluated only using the primary key, since the key contains all the columns required to produce the result set without any table lookups. If you selected out a column not in the index, then a table scan would result instead. You can actually see this in the output of EXPLAIN when “using index” shows up.

Ok, now I get it (well I think) )

#1
It would use the primary key, but as it’s all scrambled, it would be slower.

#2
The order of the WHERE fields are not usefull.

Just to be completely clear let’s hit a few scenarios. Say we have

[zone_id, page_id, user_id], content, foo, bar, etc

as you mentioned.

Here are queries, going from fastest to slowest:

SELECT * FROM mytable WHERE zone_id=? AND page_id=? AND user_id=?;

The order the columns appear in the where clause is irrelevant, so this is just as fast:

SELECT * FROM mytable WHERE user_id=? AND zone_id=? AND page_id=?;

Now slower:

SELECT * FROM mytable WHERE zone_id=? AND page_id=?

Slower still:

SELECT * FROM mytable WHERE zone_id=? AND user_id=?

just as slow as the above

SELECT * FROM mytable WHERE zone_id=?

slower still

SELECT * FROM mytable WHERE user_id=?

just as slow as the above

SELECT * FROM mytable WHERE page_id=?

etc.

Wow, really, I’ve learned alot. So in a way, I’m really greatfull to you. But in another way, I kind of hate you for all the work I must now do to fix my databases )