I am working on hosting solution for a popular open source script, where users can sign up and get an own forum. We use InnoDB, and each user has an own set of tables (around 30); there are currently around 500 users registered. Despite the large count of tables, database is small, around 50 MB.
Server is dual Xeon with 4GB of RAM and MySQL uses all memory, available on the server, so there’s even nothing left for Apache.
As the user base is growing, what steps would you suggest me to do, in order to keep good database performance and make it more efficient on RAM usage? The former is more important.
First - I think you configured MySQL to consume too much memory this is why it takes all memory available. check your my.cnf
Regarding your structure - if you will get very large amount of users you will get into trouble - Innodb does not like too many tables. Thousands are OK but if it starts to be tens-hundreds of thousands you’re in trouble.
It is best to charge software so it can share tables.
First - I think you configured MySQL to consume too much memory this is why it takes all memory available. check your my.cnf
There was some misunderstanding between and the server admin, MySQL is actually taking 700 MB in memory; and looks like the problem wasn’t it. Sorry for taking your take with that.
[B]Quote:[/B]
Regarding your structure - if you will get very large amount of users you will get into trouble - Innodb does not like too many tables. Thousands are OK but if it starts to be tens-hundreds of thousands you're in trouble.
Around hundred users sign up for the service daily, so as I understand this can become an issue soon. Wouldn’t splitting database into several databases, or over several servers help in this case?
[B]Quote:[/B]
It is best to charge software so it can share tables.
Well, the software is quite huge, so that’s not really an option. I understand the approach I use is “incorrect” from the database theory standpoint, and much from the performance but it’s much harder to do it using a single table set. Are there any other options, besides switching to a shared [for all users] tables model, which would increase performance?
You obviously can split it to many MySQL servers, the question is rather efficiency at which you can handle it. Ie having 10.000 or 100.000 users per server can be a big difference
Two problems you should watch out id Innodb internal table dictionary which is never purged - so all open tables stay in memory forever and estimating stats which is done on table open first time after MySQL restart.
it can become memory issue up to the point it will take more memory than you have at all - ie 1.000.000 of tables 4K each take 4GB of RAM which is serious waste.
An idea, I came up with recently: what about setting up own views for each user with “on update” and “on delete” handlers, which would perform like they are separate tables, but make the requested changes / fetch data from a single table using a key field? The problem that comes up is that MySQL doesn’t allow to set the data change/delete handlers for views. Of course, we can use empty tables with “before” triggers instead of using views, but then it wouldn’t make much sense. Is there a workaround for this or any other implementation of this approach?
The views should be updatable in such case - if you only restrict things by user_id. What exactly fails ?
I don't know how to define "on delete/update/insert" and "on change" hadlers for views. Or can "before insert/change/delete" triggers be used for views as well?
As you mention you simply filter by user_id it should be the case, so I'm curious what exactly does not work.
Not that it doesn't work, but I do not know how to do it technically. How can I set an additional field to a fixed value when an insert/update/delete query is being issued?
It sounds to me that each of your users has a copy of the same set of tables.
Surely the best way to manage this is to just have one of each table and add a ‘user id’ field. This would reduce number of tables, and I am sure it would give you just as good performance if properly indexed.
Continuing talking about using views, would it produce a lot of overhead in comparison with adding the proper condition to queries in the application, like:
select post_id, post_title from posts where post_id = 5 and user_id = 7
?
Are there any side effects I should consider when using views?
Your views are going to be simple and so they should basically do same rewrite as you’re mentioning. If it is enough I would fix the application though.