I am working on redesigning an XML-type database, due to confidentiality restrictions I cannot post the exact table descriptions but here’s the problem.
with these tables:
These are very simple interpretations of the actual tables; in reality there’s 400+ tables, 10mil+ rows, with varying relationships and more nesting.
By design, a person can have any number or none at all, houses, cars, jobs. Each house, car, job, can belong to any number of persons.
When I do the join,
select * from person
join lives using (personid)
join drives using (personid)
join works using (personid)
join house using (houseid)
join car using (carid)
join job using (jobid)
I end up with every possible combination of person,house,car,job; which is exploding in memory. This works ok for selecting persons who lives,drives,or works single objects using a group by person.
jon placeA carA officeA
jon placeA carA officeB
jon placeA carB officeA
jon placeA carB officeB
jon placeB carA officeA
jon placeB carA officeB
jon placeB carB officeA
jon placeB carB officeB
I need to do a variety of queries where I can select the persons who drives carA and carB, lives in houseA or houseB, works at jobA and not jobB.
With this design I am forced to do many count() group by, union, distinct, where not in, query reinterpretation, etc.
Eventually this will be used by lower level data entry employees who will need to perform but cannot write complicated queries.
At this stage, I am about to begin writing a query interpreter function that changes a simple query into the expanded count() group by, changing or’s/and’s, headache.
What suggestions, if any, do you guys have? I am open to redesigning or trying any new ideas.
This is a web-based portal with LAMP.