13GB DB - Performance Question

Ooh. wait. Just noticed that UNION there.

That’s rather interesting. I would have never thought of doing that.

I’ll give that a shot and see how it performs.

1000 rows was a bit more than I thought but I hope it works for you. )

To get the info about server settings you can use this query:

show variables like 'innodb%';

Well, if it’s faster, I could break it down into chunks of 50 or 100 until I have enough data I guess…

+---------------------------------+-------------------------------+
| Variable_name                   | Value                         |
+---------------------------------+-------------------------------+
| have_innodb                     | YES                           |
| innodb_additional_mem_pool_size | 12582912                      |
| innodb_autoextend_increment     | 8                             |
| innodb_buffer_pool_awe_mem_mb   | 0                             |
| innodb_buffer_pool_size         | 134217728                     |
| innodb_data_file_path           | innodb/ibdata1:10M:autoextend |
| innodb_data_home_dir            |                               |
| innodb_fast_shutdown            | ON                            |
| innodb_file_io_threads          | 4                             |
| innodb_file_per_table           | OFF                           |
| innodb_flush_log_at_trx_commit  | 1                             |
| innodb_flush_method             |                               |
| innodb_force_recovery           | 0                             |
| innodb_lock_wait_timeout        | 30                            |
| innodb_locks_unsafe_for_binlog  | OFF                           |
| innodb_log_arch_dir             |                               |
| innodb_log_archive              | OFF                           |
| innodb_log_buffer_size          | 8388608                       |
| innodb_log_file_size            | 33554432                      |
| innodb_log_files_in_group       | 2                             |
| innodb_log_group_home_dir       | ./                            |
| innodb_max_dirty_pages_pct      | 90                            |
| innodb_max_purge_lag            | 0                             |
| innodb_mirrored_log_groups      | 1                             |
| innodb_open_files               | 300                           |
| innodb_table_locks              | ON                            |
| innodb_thread_concurrency       | 8                             |
+---------------------------------+-------------------------------+

The server has 4GB of memory.

It’s not faster if you break it into chunks of 50 or 100.
The only advantage is that you could stop as soon as you have all the data you need instead of reading in all 1000 rows.

But you are only using 134MB for innodb_buffer_pool_size.

How much memory do you have available on the server?

And are you using the server for something else like webserver or something?

Because I would suggest that you increase the innodb_buffer_pool_size to 80% of available memory.
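As a sketch, that change goes in my.cnf and needs a restart (the figure below is only an assumption for a 4GB box; scale it to whatever is genuinely free after the web server and MyISAM take their share):

[mysqld]
# assumption: roughly 80% of a server dedicated to InnoDB; use far less if the
# same box also runs the web server or needs large MyISAM key buffers
innodb_buffer_pool_size = 3000M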

Hmmm…

This server has 4GB of memory. I’m using it as the MyISAM, InnoDB and web server all in one - LOL. Too much for one server to do?

G-Man

Do you have a reason for using both MyISAM and InnoDB?

That is, are you using both FULLTEXT and transactions?
Or have you run into the update/select performance problem with MyISAM?

Otherwise I would suggest that you choose to use only one storage engine and tune the server for that one.

I’m actually not using transactions at all right now nor do I expect to be with this data.

The only reason I added InnoDB was because I figured it would help to not lock the entire table when creating these 13-25GB tables. Now that they’re created the point is kinda moot, but converting it back to MyISAM (reimporting the data) is likely to take a week or more unless you know a quick way to do it?

Fulltext is done via the sphinx engine so I’m not using that part of myisam either.

Are you performing a lot of updates against the table?
Or is it basically only selects?
I’m guessing selects since it looks like a wiki but you never know. )

This is the quickest way (but you don’t have any control over it):

ALTER TABLE yourTable ENGINE=MyISAM;

The only question is how long your server will take to convert it.

So that’s why I suggest that you create a smaller table with maybe 500MB of data and test on that one instead, so you can get a feel for how long it takes.
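A rough sketch of such a test (the test-table name and the ID range are assumptions; pick whatever range gives you roughly 500MB):

-- copy a slice of the real table, then time the engine conversion on it
CREATE TABLE WikiParagraphsTest LIKE WikiParagraphs;
INSERT INTO WikiParagraphsTest SELECT * FROM WikiParagraphs WHERE ID BETWEEN 1 AND 500000;
ALTER TABLE WikiParagraphsTest ENGINE=MyISAM;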

And depending on the time estimate from your test it could be better to perform the move yourself:
1. Create a MyISAM copy of your table, without indexes. Those will be added later.
2. INSERT INTO yourMyISAMCopy SELECT * FROM yourInnoDBoriginal
3. Create the indexes. And make sure that you give MySQL a lot of memory for this step, because CREATE INDEX can work in two ways and one of them is very slow when it has to work against disk all the time.

Step 2 could be split into portions by
… WHERE id BETWEEN 1001 AND 2000.
… WHERE id BETWEEN 2001 AND 3000.

Or some appropriate value for how many you want to move at a time if you can’t take all in one go.
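A rough end-to-end sketch of those three steps (the table structure and ranges are assumptions based on the queries earlier in the thread):

-- 1. MyISAM copy of the table, with no indexes yet
CREATE TABLE WikiParagraphsMyISAM (
  ID INT UNSIGNED NOT NULL,
  Paragraph TEXT
) ENGINE=MyISAM;

-- 2. Copy the rows, in primary-key ranges if one pass is too much
INSERT INTO WikiParagraphsMyISAM SELECT * FROM WikiParagraphs WHERE ID BETWEEN 1 AND 1000000;
INSERT INTO WikiParagraphsMyISAM SELECT * FROM WikiParagraphs WHERE ID BETWEEN 1000001 AND 2000000;
-- ...and so on...

-- 3. Add the indexes last, after giving MySQL plenty of sort memory
--    (e.g. a large myisam_sort_buffer_size) so the build does not crawl on disk
ALTER TABLE WikiParagraphsMyISAM ADD PRIMARY KEY (ID);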

I really wouldn’t recommend the ALTER TABLE; talking from experience (50 gig table), it will be slow and crummy!

I do queries similar to:
select Paragraph from WikiParagraphs where ID in (1, 2, 3, 4 … 1000)

and I’m looking at sub-second queries!

I think you should sufficiently increase the buffers you’re applying to these tables!

One thing that will help you is to do all the matching with Sphinx, do any PHP-side ordering of the data, and then only pull the data you actually need! This should help greatly, in terms of speed at least.
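Purely as an illustration of “only pull the data you actually need” (the IDs are made up; the idea is that Sphinx hands back the matching IDs and you fetch just the rows for the page you are about to display):

SELECT ID, Paragraph
FROM WikiParagraphs
WHERE ID IN (101, 205, 317, 412, 589);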

Well, since it’s a static table and my other tables are in myisam I’ve decided to move over to myisam.

The problem that I’m having tho is when I try this query:

insert into WikiParagraphs2 select * from WikiParagraphs

It drops about 40 million rows…

I tried a mysql dump but there are odd chars that make it barf.

Any ideas?

You could try forcing the charsets?

Are both tables in the same character set anyway?

Oh wait, are you SURE it dropped the rows? InnoDB does not keep a record of the actual number of rows in the table, so it may be that?

The only way (that I can think of) to check is to do something similar to:

SELECT count(*) FROM innoDBtable

But that may take a long time. You could try two separate COUNT()s against a boolean field (one count for true and one for false); that may be quicker.
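Something like this, where the flag column is purely hypothetical (any cheap indexed column you already have would do):

SELECT COUNT(*) FROM innoDBtable WHERE flag = 1;
SELECT COUNT(*) FROM innoDBtable WHERE flag = 0;
-- the two counts should add up to the full row count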

When I saw the number of rows not matching up, I did a count(*).

It took about 30 minutes to do that.

I’ve tried keeping the charsets the same as well as changing charsets.

Neither option worked.

I suppose I could write a php script to do this by hand but good lord that’ll be slowwwwwww )

Honestly I can’t see why it didn’t work. It should have; from what I can tell it’s a pretty simple query.

Some of the more proficient MySQL DBAs here may be able to assist you more though )

ALSO, I might add that I tested with MyISAM and InnoDB separately, and saw pretty much no difference between load times selecting 1000 rows using SELECT … FROM … WHERE … IN (1,2 …1000)

Just giving you the heads up, I’d say it’s HIGHLY related to your buffer pool.

[B]bluesaga wrote on Thu, 24 May 2007 13:56[/B]
ALSO, I might add that I tested with MyISAM and InnoDB separately, and saw pretty much no difference between load times selecting 1000 rows using SELECT ... FROM ... WHERE ... IN (1,2 ...1000)

Just giving you the heads up, I'd say it's HIGHLY related to your buffer pool.

Yeah, I figure you’re right but having the memory eaten by myisam as well as innodb is kind of a waste. So I’m gonna switch to myisam across the board to make sure that I can optimize for one engine instead of two! )

[B]sterin wrote on Wed, 23 May 2007 16:54[/B]

Apart from that I have a suggestion:

select Paragraph from WikiParagraphs where id = 1
UNION
select Paragraph from WikiParagraphs where id = 4
UNION
select Paragraph from WikiParagraphs where id = 6

It’s better to use UNION ALL when you know the result set will not contain any duplicates.
By default UNION implies DISTINCT (i.e., it attempts to remove all duplicate rows from the result set).

I doubt this would make much of a difference in this case, but I thought it was worth mentioning )
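In other words, something along these lines (same IDs as in the quoted example above):

select Paragraph from WikiParagraphs where id = 1
UNION ALL
select Paragraph from WikiParagraphs where id = 4
UNION ALL
select Paragraph from WikiParagraphs where id = 6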

What did you actually mean by “It drops about 40 million rows…”?

The INSERT INTO … SELECT * FROM …;

is very straightforward and I have never experienced it failing.
It might take a long time to perform sometimes, and that can be a problem. But that is why I suggested that you move the data in portions, where you select a range of primary keys to transfer each time you perform the query.

When I say it dropped rows I mean that about 40 million rows never made it to the myisam db.

count(*) doesn’t match on both db’s.

I’m doing it through php right now by selecting one row at a time and inserting it. Should only take me 24 hours or so assuming it works.

With regards to the same charset, I tried it with the same charset and without to no avail.

I’m hopeful that going through php and doing it one select/insert at a time will work even if it is a bit slower )

Hello,

I’m in the habit of playing with 50GB tables, so I will give you my own advice for what it’s worth :wink:

  1. There’s a better way to move large data into MySQL. Don’t do “INSERT INTO … SELECT * FROM”; it’s just plain inefficient.
    Dump your data into a text file. “SELECT * FROM t1 INTO OUTFILE” will do the trick.
    There you can do external editing (removing bogus chars, converting charsets…) as you wish.
    Then insert your data into your new table using “LOAD DATA INFILE”.
    This should be much faster (see the sketch after this list).

  2. About your performance problem. As someone said, 128MB is EXTREMELY INEFFECTIVE for InnoDB. InnoDB likes memory, and for that you need at least 70% of available memory allocated to the buffer pool (or if you’re on a 32-bit system, 2GB will do).

  3. I see that you will be using MyISAM. Can you post your key_buffer_size variable? The MyISAM buffer only stores indexes, so it will be smaller than InnoDB’s.

  4. Partitioning could be a key answer to your performance problems. Can you use MySQL 5.1? If so, I recommend you have a look at the partitioning stuff. You could easily partition your data by date and improve access times. Otherwise, if you stick with 5.0, you can try to divide your data into smaller tables, then use MERGE tables for the largest-span queries (see the sketch after this list).
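Rough sketches of points 1 and 4 (file paths, table names and the yearly split are purely assumptions to show the shape of the commands):

-- Point 1: dump to a text file, clean it up externally if needed, then bulk-load it
SELECT * FROM WikiParagraphs
INTO OUTFILE '/tmp/wikiparagraphs.txt'
FIELDS TERMINATED BY '\t' ESCAPED BY '\\'
LINES TERMINATED BY '\n';

LOAD DATA INFILE '/tmp/wikiparagraphs.txt'
INTO TABLE WikiParagraphsMyISAM
FIELDS TERMINATED BY '\t' ESCAPED BY '\\'
LINES TERMINATED BY '\n';

-- Point 4, MySQL 5.0 style: identical MyISAM tables combined behind a MERGE table
CREATE TABLE WikiParagraphs_2006 (ID INT NOT NULL, Paragraph TEXT, KEY (ID)) ENGINE=MyISAM;
CREATE TABLE WikiParagraphs_2007 (ID INT NOT NULL, Paragraph TEXT, KEY (ID)) ENGINE=MyISAM;
CREATE TABLE WikiParagraphs_all (ID INT NOT NULL, Paragraph TEXT, KEY (ID))
  ENGINE=MERGE UNION=(WikiParagraphs_2006, WikiParagraphs_2007) INSERT_METHOD=LAST;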

Best Regards

If the query exceeds the maximum amount of memory available, it will fail. I’ve had this happen before; in fact, I have it happen nightly at least once for a query I have yet to optimize, as MySQL is running its backup processes :wink: