Hey,
I’m looking at a table here that is kinda big… but it doesn’t need to be. I’m wondering if I should suggest changing how it’s stored: inserts would gain an added layer of complexity, but the queries should be faster.
Big, big parameter table with:
parameter_id int
parameter_type_id int (points to the type of the parameter, e.g. int, string, date, large string, etc.)
parameter_value varchar(64000)?
The table is filled with data from XML; the parameter value could be as small as an int ‘1’ or as big as a large string containing log data.
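For reference, the table is basically this (a rough sketch from memory; the engine and index definitions are my guess, since a column that wide can only be indexed by prefix anyway):

CREATE TABLE parametertable (
    parameter_id      INT NOT NULL AUTO_INCREMENT,
    parameter_type_id INT NOT NULL,           -- int, string, date, large string, ...
    parameter_value   VARCHAR(64000),         -- everything ends up in here (might really have to be TEXT on older MySQL)
    PRIMARY KEY (parameter_id),
    INDEX idx_value (parameter_value(255))    -- prefix index; the full column is too wide to index
);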
If a new row gets inserted and it’s only an int, it will still take up the space for a varchar(64000), right?
And doing a search for the int ‘1’ will involve a text conversion (a fulltext-style comparison over the column), right?
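i.e. something like this, if I understand the conversion rules right (queries are just illustrations against the sketch above):

-- comparing the string column to a numeric literal makes MySQL convert the
-- column value to a number for every row, so the index on it can't be used
SELECT parameter_id FROM parametertable WHERE parameter_value = 1;

-- comparing as a string can at least use the index, but then '1', '01'
-- and '1.0' are all different values
SELECT parameter_id FROM parametertable WHERE parameter_value = '1';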
Should I split this table into one table per parameter_type?
As in: 1 table for int values, 1 table for datetime values, 1 table for small strings, 1 table for mediumtexts, etc.
something like this (a rough DDL version follows the list):
main table -
parameter_id int
parameter_type_id int (points to the type of parameter here, eg. int, string, date, large string etc)
int table -
int_parameter_id int
parameter_id int (same as maintable)
parameter_value int
datetime table -
datetime_parameter_id int
parameter_id int (same as maintable)
parameter_value datetime
large string table -
largestring_parameter_id int
parameter_id int (same as maintable)
parameter_value varchar(64000)
medium string table -
mediumstring_parameter_id int
parameter_id int (same as maintable)
parameter_value varchar(333)
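As rough DDL, something like this (only the int and datetime tables shown, the string ones follow the same pattern; the index choices are just my guess):

CREATE TABLE maintable (
    parameter_id      INT NOT NULL AUTO_INCREMENT,
    parameter_type_id INT NOT NULL,
    PRIMARY KEY (parameter_id)
);

CREATE TABLE inttable (
    int_parameter_id INT NOT NULL AUTO_INCREMENT,
    parameter_id     INT NOT NULL,          -- same as maintable.parameter_id
    parameter_value  INT NOT NULL,
    PRIMARY KEY (int_parameter_id),
    INDEX (parameter_id),
    INDEX (parameter_value)                 -- a real numeric index at last
);

CREATE TABLE datetable (
    datetime_parameter_id INT NOT NULL AUTO_INCREMENT,
    parameter_id          INT NOT NULL,     -- same as maintable.parameter_id
    parameter_value       DATETIME NOT NULL,
    PRIMARY KEY (datetime_parameter_id),
    INDEX (parameter_id),
    INDEX (parameter_value)                 -- makes date-range queries sane
);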
So doing a query would be something like:
select it.parameter_value, dt.parameter_value, st.parameter_value, mst.parameter_value
from maintable mt
left outer join inttable it on mt.parameter_id = it.parameter_id
left outer join datetable dt on mt.parameter_id = dt.parameter_id
left outer join large_strtable st on mt.parameter_id = st.parameter_id
left outer join medium_strtable mst on mt.parameter_id = mst.parameter_id
where mt.parameter_id = 3
If the parameter is an int, then dt.parameter_value, st.parameter_value and mst.parameter_value will all be null.
I know the extra joins will add to the cost of the query, but hmm, for the drastic reduction in storage it might be worth it?
This table is HUUGGGEEEE… disk-size wise, compared to the number of rows in it, and the index file for the table is huge too, with indexes on the primary key and the value.
Inserts would have an added layer of complexity too.
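i.e. every insert becomes two statements instead of one, roughly like this (the type id value is made up, and the transaction only does anything if the tables are InnoDB):

-- say parameter_type_id 1 means 'int'
START TRANSACTION;

INSERT INTO maintable (parameter_type_id) VALUES (1);

INSERT INTO inttable (parameter_id, parameter_value)
VALUES (LAST_INSERT_ID(), 42);

COMMIT;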
But looking at things like http://doc.dev.md/mysql-4.1/optimization.html#data-size …
Does anyone have a nicer way to optimize a table that stores values of different data types? I know what data type each value is; it just seems a huge waste dumping them all into the same column, and good luck trying to do a query on date ranges as it is!
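Whereas with the datetime values in their own table, a date-range query would just be something like this (table/column names as per the sketch above, dates made up):

SELECT mt.parameter_id, dt.parameter_value
FROM maintable mt
JOIN datetable dt ON dt.parameter_id = mt.parameter_id
WHERE dt.parameter_value BETWEEN '2006-01-01 00:00:00' AND '2006-01-31 23:59:59';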
Thanks :)
/Bal