XML-type Database Design

I am working on redesigning an XML-type database, due to confidentiality restrictions I cannot post the exact table descriptions but here’s the problem.

with these tables:

table person(
personid int,
name varchar)

table lives(
personid int,
houseid int)

table drives(
personid int,
carid int)

table works(
personid int,
jobid int)

table house(
houseid int,
name varchar)

table car(
carid int,
name varchar)

table job(
jobid int,
name varchar)

These are very simple interpretations of the actual tables; in reality there’s 400+ tables, 10mil+ rows, with varying relationships and more nesting.

By design, a person can have any number or none at all, houses, cars, jobs. Each house, car, job, can belong to any number of persons.

When I do the join,
select * from person
join lives using (personid)
join drives using (personid)
join works using (personid)
join house using (houseid)
join car using (carid)
join job using (jobid)

I end up with every possible combination of person,house,car,job; which is exploding in memory. This works ok for selecting persons who lives,drives,or works single objects using a group by person.

jon placeA carA officeA
jon placeA carA officeB
jon placeA carB officeA
jon placeA carB officeB
jon placeB carA officeA
jon placeB carA officeB
jon placeB carB officeA
jon placeB carB officeB

I need to do a variety of queries where I can select the persons who drives carA and carB, lives in houseA or houseB, works at jobA and not jobB.

With this design I am forced to do many count() group by, union, distinct, where not in, query reinterpretation, etc.

Eventually this will be used by lower level data entry employees who will need to perform but cannot write complicated queries.

At this stage, I am about to begin writing a query interpreter function that changes a simple query into the expanded count() group by, changing or’s/and’s, headache.

What suggestions, if any, do you guys have? I am open to redesigning or trying any new ideas.

This is a web-based portal with LAMP.

Hi there,

I’m really not sure this is a good idea. At first, in the beginning maybe, but later this is going to cost too much memory to you. And of course you are wasting memory and resources.

I recommend you to reconsider the design and try to look from your applications point. Find the queries will be executed, try to build a design which fit to serve your application. Btw don’t forget indexes.

In these case for this example you gave a simple table would be much better, faster and reliable.

But of course, I can’t see your application which means I could be wrong. Not just could be wrong, but I’m sure you have a reason you started to think this way. For mysql/innodb is better if you have one table and not joining a lots of smaller table all the time.