Hey, we explicitly create temporary tables in order to use indexes in some of our stored procedures. Right now we do not specify a storage engine, so it defaults to InnoDB. I'm realizing that if we explicitly use the MEMORY storage engine we'll likely get much better performance. The question is: if the tables grow too big (individually or cumulatively), will MySQL automatically revert to disk-based tables in those instances (even though we explicitly created the temporary table as MEMORY), or will MySQL start returning errors?
I ran into exactly this earlier today.
By default, a MEMORY table is limited to 16 MB. You can raise that limit by setting max_heap_table_size in the configuration file, or per session with SET.
If an operation would cause an explicitly created MEMORY table to exceed that limit, MySQL returns an error (ERROR 1114: "The table ... is full") rather than silently converting the table to disk; that automatic conversion to an on-disk table happens only for the internal temporary tables MySQL creates on its own.
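For reference, a minimal sketch of raising the limit (the 256 MB figure is an arbitrary example, and you need the privileges to change the variable):

-- In my.cnf, under [mysqld]:
--   max_heap_table_size = 256M
-- Or for the current session only:
SET SESSION max_heap_table_size = 268435456;  -- 256 MB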
The trick is to limit the amount of data you actually bring into the MEMORY table. Here is a stored procedure I'm using to calculate ranks of data for users.
DELIMITER //
CREATE PROCEDURE updateranks ()
BEGIN
  -- Slim MEMORY table: only the ids, the two columns to rank on,
  -- and the two rank columns to fill in.
  CREATE TEMPORARY TABLE rank_temp (
    id    BIGINT UNSIGNED NOT NULL,
    rank1 INT NOT NULL,
    data1 DOUBLE NOT NULL,
    rank2 INT NOT NULL,
    data2 DOUBLE NOT NULL,
    PRIMARY KEY (id),
    INDEX (data1),
    INDEX (data2)
  ) ENGINE = MEMORY;

  INSERT INTO rank_temp (id, rank1, data1, rank2, data2)
    SELECT id, 0, data1, 0, data2 FROM rankme;

  -- Number the rows in data1 order, then in data2 order,
  -- using a user variable as a running counter.
  SET @i = 0;
  UPDATE rank_temp SET rank1 = (@i := @i + 1) ORDER BY data1 ASC;
  SET @i = 0;
  UPDATE rank_temp SET rank2 = (@i := @i + 1) ORDER BY data2 ASC;

  -- Write the results back to the on-disk table, joined on the primary key.
  UPDATE rank_temp, rankme
    SET rankme.rank1 = rank_temp.rank1,
        rankme.rank2 = rank_temp.rank2
    WHERE rankme.id = rank_temp.id;

  DROP TABLE rank_temp;
END;
//
DELIMITER ;

CALL updateranks();
What I do here is pull only the information needed to calculate the new ranks into RAM, keeping the table organized by id. The rest of the data stays on disk. I then update all the rows in memory and write them back to disk in primary-key order. I did this to avoid InnoDB updating in non-primary-key order, which is a total killer.
You could probably still do whatever you need in a MEMORY table, then join it to a disk table to pull the full rows you need.
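For example (a sketch only; the details column on rankme is hypothetical), inside the procedure, before the DROP TABLE, you could join the slim MEMORY table back to the on-disk table:

-- Fetch the full on-disk rows in rank order via the slim MEMORY table.
SELECT rankme.id, rankme.details, rank_temp.rank1
FROM rank_temp
JOIN rankme ON rankme.id = rank_temp.id
ORDER BY rank_temp.rank1;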
With 1,000,000 rows, the code above uses less than 100 MB in the MEMORY table. It executes in about 15 seconds on my development machine when no disk updates are needed (no records changed), or in about 2 minutes when a full disk update is needed (all records changed).
xaprb, you seem to have overlooked his signature spam