I’m looking for opinions on how to efficiently build a structure from retrieving multiple tables from MySQL, where some tables are a one-to-one relationship and others are a one-to-many relationship.
I am optimizing a few c++ programs written by someone else. One of the programs needs a structure like:
(You can see a JPG relationship diagram of these tables at: http://www.filecrunch.com/fileDownload.php?sub=9507ebc166634 2f01b534b08565b232a&fileId=144790)
vector
…TableD
…TableD(another)
…vector
…TableE
…TableH
…TableF
…TableC
…TableG
Right now, the program retrieves information from each table separately, using no joins. It builds such a structure by the following pseudo-code:
Get all TableA from MySQL
Foreach TableA retrieved
…Get TableD from MySQL related to TableA
…Get TableD(another) from MySQL related to TableA
…Get all TableB from MySQL related to TableA
…Foreach TableB retrieved
…Get TableE from MySQL related to TableB
…Get TableH from MySQL related to TableE
…Get TableF from MySQL related to TableB
…Get TableC from MySQL related to TableA
…Get TableG from MySQL related to TableC
This causes the number of MySQL queries to be: NumberOfTableA * (5 + NumberOfTableB * (4)). There can be around 4000 TableA’s in the query, and around 5 TableB’s for each TableB in the query, so there can be about 100,000 MySQL queries.
What’s the best way to build the structure given above? My initial impression is that all I can do is implement a few joins to reduce the number of MySQL queries, changing to the following pseudo-code (where + indicates a join):
Get all TableA + TableD + TableD(another) + TableC + TableG from MySQL
Foreach join retrieved
…Get all TableB related to TableA
…Foreach TableB retrieved
…Get all TableE + TableH + TableF from MySQL
This would have the number of MySQL queries to be: NumberOfTableA * (1 + NumberOfTableB * 2), which would reduce the number of MySQL queries from 100,000 to 44,000.
My initial impression is that this is far as I can go with optimizing this situation. I realize I could create one massive join through one MySQL query, but that would create so much redundant information in columns that I can’t see that being a good idea. Although the number of MySQL queries would be only one, the total size of the transmission from the server to the client would be much much much larger. Additionally, the c++ program would need to on each result figure out if it had moved into a different TableA or TableB result, which doesn’t seem like good coding at all.