How much RAM do you have on the server, and how much have you reserved for the InnoDB buffer pool (innodb_buffer_pool_size)?
That is very important for getting good speed out of InnoDB.
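If you are not sure what it is currently set to, a quick way to check from the MySQL prompt (just the standard variable and status names, nothing server-specific) is:

mysql> SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
mysql> SHOW STATUS LIKE 'Innodb_buffer_pool_pages%';

The first shows how much memory the pool has been given, the second how many of its pages are actually holding data versus sitting free.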
Apart from that I have a suggestion:
select Paragraph from WikiParagraphs where id = 1
UNION
select Paragraph from WikiParagraphs where id = 4
UNION
select Paragraph from WikiParagraphs where id = 6
That will give a const lookup for each SELECT instead of a range scan, which might be faster if the smallest and biggest values in the IN() list are far apart and you don't want to retrieve that many rows.
mysql> explain select Paragraph from WikiParagraphs where id = 3
    -> UNION
    -> select Paragraph from WikiParagraphs where id = 4
    -> UNION
    -> select Paragraph from WikiParagraphs where id = 6
    -> ;
+------+--------------+----------------+-------+---------------+---------+---------+-------+------+-------+
| id   | select_type  | table          | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+------+--------------+----------------+-------+---------------+---------+---------+-------+------+-------+
|    1 | PRIMARY      | WikiParagraphs | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
|    2 | UNION        | WikiParagraphs | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
|    3 | UNION        | WikiParagraphs | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
| NULL | UNION RESULT | <union1,2,3>   | ALL   | NULL          | NULL    | NULL    | NULL  | NULL |       |
+------+--------------+----------------+-------+---------------+---------+---------+-------+------+-------+
4 rows in set (0.00 sec)
It’s not faster if you break it into 50 or 100.
The only advantage is that you could stop once you have the data you need instead of reading in all 1000 rows.
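Just to show the shape of that, a minimal sketch with tiny chunks of three IDs each (in practice each chunk would hold 50 or 100 of the IDs you actually want):

-- first chunk
select Paragraph from WikiParagraphs where ID in (1,2,3);
-- only run the next chunk if you still need more rows
select Paragraph from WikiParagraphs where ID in (4,5,6);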
But you are only using 134MB for innodb_buffer_pool_size.
How much memory do you have available on the server?
And are you using the server for something else like webserver or something?
Because I would suggest that you increase the innodb_buffer_pool_size to 80% of available memory.
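As a rough sketch, on a dedicated database server with 4GB of RAM that could look something like this in my.cnf (the 3G figure is just an example; use whatever roughly 80% of your actual free memory is, and restart mysqld afterwards):

[mysqld]
innodb_buffer_pool_size = 3G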
I’m actually not using transactions at all right now nor do I expect to be with this data.
The only reason I added InnoDB was because I figured it would help to not lock the entire table when creating these 13-25GB tables. Now that they're created the point is kinda moot, but converting it back to MyISAM (reimporting the data) is likely to take a week or more, unless you know a quick way to do it?
Fulltext is done via the Sphinx engine, so I'm not using that part of MyISAM either.
Are you performing a lot of updates against the table?
Or is it basically only selects?
I’m guessing selects since it looks like a wiki, but you never know. :)
This is the quickest way (but you don’t have any control over it):
ALTER TABLE yourTable ENGINE=MyISAM;
The only question is how long your server will take to convert it.
So that’s why I suggest that you create a smaller table with maybe 500MB data and test on that one instead so you can get a feel for how long it takes.
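A minimal sketch of how you could build such a test table (the table name and the 100000-row cutoff are just assumptions; pick whatever gives you roughly 500MB):

CREATE TABLE WikiParagraphs_test LIKE WikiParagraphs;
INSERT INTO WikiParagraphs_test SELECT * FROM WikiParagraphs WHERE id <= 100000;
-- then time the conversion on the copy
ALTER TABLE WikiParagraphs_test ENGINE=MyISAM;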
And depending on the time estimate from your test it could be better to perform the move yourself:
1. Create a MyISAM copy of your table, without indexes. Those will be added later.
2. INSERT INTO yourMyISAMCopy SELECT * FROM yourInnoDBoriginal
3. Create the indexes. And make sure that you give MySQL a lot of memory for this step, because creating an index can work in two ways and one is very slow when it has to work against disk all the time.
Step 2 could be split into portions by adding
… WHERE id BETWEEN 1001 AND 2000
… WHERE id BETWEEN 2001 AND 3000
or whatever range size is appropriate for how many rows you want to move at a time, if you can’t take it all in one go (see the sketch below).
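Putting the whole thing together, a rough sketch (table name, column definitions, the BETWEEN ranges and the sort buffer size are all just example values; adjust them to your real schema and memory):

-- 1. MyISAM copy without indexes (assumed schema: an ID plus the paragraph text)
CREATE TABLE WikiParagraphs_myisam (
  ID INT NOT NULL,
  Paragraph TEXT
) ENGINE=MyISAM;

-- 2. move the data across in portions
INSERT INTO WikiParagraphs_myisam SELECT * FROM WikiParagraphs WHERE ID BETWEEN 1 AND 1000;
INSERT INTO WikiParagraphs_myisam SELECT * FROM WikiParagraphs WHERE ID BETWEEN 1001 AND 2000;
-- ... repeat for the remaining ranges

-- 3. give the index build plenty of memory for this session, then add the indexes
SET SESSION myisam_sort_buffer_size = 256*1024*1024;
ALTER TABLE WikiParagraphs_myisam ADD PRIMARY KEY (ID);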
I really wouldn’t recommend the ALTER TABLE; talking from experience (50 gig table), it will be slow and crummy!
I do queries similar to:
select Paragraph from WikiParagraphs where ID in (1,2,3,4 … 1000)
and I’m looking at subsecond queries!
I think you should increase the buffers you’re applying to these tables sufficiently!
One thing that will help you is to do all the matching with Sphinx, do any PHP-side ordering of the data, and then only pull the data you actually need! This should help greatly, in terms of speed at least.
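For example, if Sphinx hands you the matching IDs already ranked, a minimal sketch of pulling just the page of rows you will actually display while keeping that ranking (the ID values are placeholders for whatever Sphinx returned):

-- fetch only the 10 rows to be shown, in the order Sphinx ranked them
select Paragraph
from WikiParagraphs
where ID in (42, 17, 93, 5, 61, 8, 70, 12, 33, 54)
order by FIELD(ID, 42, 17, 93, 5, 61, 8, 70, 12, 33, 54);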
ALSO, I might add I tested with MyISAM and InnoDB separately, and saw pretty much no difference between load times selecting 1000 rows using SELECT … FROM … WHERE … IN (1,2 …1000)
Just giving you the heads up, I’d say it’s HIGHLY related to your buffer pool.
Yeah, I figure you’re right, but having the memory eaten by MyISAM as well as InnoDB is kind of a waste. So I’m gonna switch to MyISAM across the board to make sure that I can optimize for one engine instead of two! :)
select Paragraph from WikiParagraphs where id = 1
UNION
select Paragraph from WikiParagraphs where id = 4
UNION
select Paragraph from WikiParagraphs where id = 6
It’s better to use UNION ALL when you know the result set will not contain any duplicates.
By default UNION implies DISTINCT (i.e., it attempts to remove all duplicate rows from the result set).
I doubt this would make much of a difference in this case, but I thought it was worth mentioning :)
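For reference, the same query with UNION ALL, which skips the duplicate-removal step (same placeholder IDs as above):

select Paragraph from WikiParagraphs where id = 1
UNION ALL
select Paragraph from WikiParagraphs where id = 4
UNION ALL
select Paragraph from WikiParagraphs where id = 6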
What did you actually mean by “It drops about 40 million rows…”?
The INSERT INTO … SELECT * FROM …;
is very straightforward and I have never experienced it failing.
It might take a long time to perform sometimes, and that can be a problem. But that is why I suggested that you move the data in portions, where you select a range of primary keys to transfer each time you perform the query.