I am currently implementing a system that interfaces between retailers and wholesalers and needs logging capabilities. The problem is that there will be roughly 20,000 retailers accessing about 300 wholesalers.
Obviously, a row will need to be entered into the table for each retailer accessing each wholesaler (or a potential 20,000 x 300 = 6,000,000 rows).
The problem is that I also need to log visits on a monthly basis for the last 12 months, meaning 6,000,000 x 12 = 72,000,000 rows! However, realistically not every retailer will visit every wholesaler, so I’m estimating that in the end I’m looking at about 20,000,000 rows.
There are two tables:
retailer_wholesaler_association_table:
------------------------------------------------
 association_id | retailer_id | wholesaler_id
------------------------------------------------
retailer_monthly_visits_table:
--------------------------------------
 association_id | month_id | visits
--------------------------------------
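
For reference, here is a rough sketch of how I picture the tables being defined (the column types, index names, and MySQL-flavoured syntax are just my assumptions, not final):

-- Sketch only: types, engine and index names are placeholders.
CREATE TABLE retailer_wholesaler_association (
    association_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    retailer_id    INT UNSIGNED NOT NULL,
    wholesaler_id  INT UNSIGNED NOT NULL,
    PRIMARY KEY (association_id),
    INDEX idx_retailer (retailer_id)          -- for lookups by retailer_id
);

CREATE TABLE retailer_monthly_visits (
    association_id INT UNSIGNED NOT NULL,
    month_id       INT UNSIGNED NOT NULL,
    visits         INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (association_id, month_id)    -- also covers the JOIN on association_id
);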
My question is, should I be very worried about speed? A SELECT query will look something like:
SELECT * FROM retailer_wholesaler_association
LEFT JOIN retailer_monthly_visits
  ON retailer_wholesaler_association.association_id = retailer_monthly_visits.association_id
WHERE retailer_id = %
And an UPDATE will look very similar (rough sketch below). Obviously, these are not very complex statements.
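
For example, something along these lines for the monthly visit counter (assuming the row for that association/month has already been inserted):

-- Increment the visit counter for one association in one month.
-- Assumes the (association_id, month_id) row was inserted beforehand.
UPDATE retailer_monthly_visits
SET visits = visits + 1
WHERE association_id = %
  AND month_id = %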
The server is a quad-core Intel Xeon (Harpertown) at 3.0 GHz with 16 GB of DDR2 RAM.