I was just wondering what everyones views are on the optimal structure of a database for storing a forum? I know that is a very generic request, so I’ll try and be a bit more specific:
Traditionally most forum packages I’ve seen use the ‘child / parent’ approach. So a board will contain a thread, a thread will contain 1+ messages. Messages only ever belong to one thread, and one thread only ever belongs to one board. When new posts are made in the thread, it ‘floats to the top’ of the board.
This is a pretty standard set-up, and one I’ve happily coded and built, but now the forum is getting popular and we’ve in excess of 2 million messages posted by 60k active users, and it’s starting to feel the pain.
I have indexed as much as I feel I can, and swapped over some core functions to Stored Procedures, but can’t help but feel that the core issue is the original structure in the first place.
I’ve been reading up on Joe Celko’s Nested Set theory, which I feel may go some way to helping the speed when viewing an actual thread (so I could check the left/right values instead of doing a LIMIT). But perhaps this is just complicating things for the sake of it?
The latest blog entry on ‘view counters’ was fascinating, as I do update the ‘views’ record on each hit to a thread. So I’ll certainly try that technique too.
But back to my original question - how would you approach the design of a database for storing a forum system? (or if you want to make it simple, just a ‘comments’ system for a blog entry). I’m very happy to hear just your theory on this rather than specific SQL dumps, but whatever you want to share will be great - as this is quite a common requirement for web apps (comments finding themselves everywhere these days), I expect it is one a lot of readers would benefit from. Who knows, perhaps it would make a great blog entry too ) Hell, I’d pay good money for an article on optimal forum/comment system DB design in MySQL 5!