Storage engine for a datmart

I’m currently designing a data mart using MySQL. My updates and inserts will occur rarely (at most hourly) and they will always be made in bulk by first loading the data into temporary tables and then merging it with the destination table. The selects will be far more numerous, coming from a set of tightly defined end-user reporting queries. Right now I’m using MyISAM, mostly because I don’t need the transaction support of InnoDB, and I’m hoping to gain a performance advantage when loading data.

I’ve read some things that suggest that as the tables and indexes grow large, InnoDB may perform better at index caching and table joins than MyISAM. Anyone out there have any experience using either MyISAM or InnoDB for a data mart or data warehouse?

Data warehousing is a storage heavy practice. InnoDB consumes substantially more storage per than MyISAM for the same data.

The reason InnoDB can perform faster in some cases is due to the adaptive hash indexes, which creates a hash index smartly depending on the frequency of data hits.

Unless you can store a good percentage of your data in RAM I doubt it will be much benefit. Not to forget that InnoDB not only caches index data in its buffer, but also normal row data.

One of the main advantages InnoDB has over MyISAM is clustered indexes, which is often underestimated as there is often (usually?) one prime access path for reports, such as dates.

Plus, with sensible table optimization, the size advantage of MyISAM over InnoDB is often more like 60% to 100%.

For more information, and to completely plug my own blog: -reporting.html