finding table 'hotspots'

Question: Is there a way to identify which tables / databases on a server are getting the majority of queries? Im looking to start replicating the busier tables in order to offload queries to secondary machines.

Situation: I work in a research group where we collect behavioral data from 300+ homes nationwide(US). The data comes in a variety of forms ranging from medication-schedule adherence to mouse movements to x10 sensor signals. We record ~2 billion records per year. The study will go for 5 years with probable extension.

Setup: dedicated MySQL server. 16 3Ghz cpu ‘cores’, 16Gb RAM, 1Tb RAID 6(?) for data plus an additional 3Tb for backups, logs, etc. Linux RH enterprise 2.6.18-92.1.22.el5. MySQL v5.1.20(beta)

30+ databases divided up (generally) by project. The largest of these has 200+ tables with 3+ billion records. Tables are (almost) exclusively innodb.

I have 2 other machines available with similar but slightly diminished capacity. Im looking to start replicating some of these tables so that I can move the majority of reads to other machines. The current db server is not CPU bound but the amount of IO wait is increasing over time.

Users are only allowed to write to one database as a scratch space. Most use temporary tables to store query data. The primary form of data access is via Matlab.

Percona has got some patches and I think that there is one patch that gives statistics.

You could add query: insert into stat ( database, table ) values ( ‘x’, ‘y’ ). It could work on sqlite. After a while You could group data by concat( database, ‘-’, table ) and check which table one is the busiest.

Thanks for the reply. I’ve been reading about the google mysql add-ons. I’ll take a look at the one you mentioned too.

Not sure how I would enforce people adding their queries to a secondary db app though.

What about triggers ?

Can i trigger an external app? ( ie you mentioned storing the into using sqlite )

You can trigger tables in mysql. One trigger per table with logging query.