I’ve recently performed a rather large upgrade to a modest, EC2-hosted production MySQL server. The old server was running MySQL 5.0.45 on CentOS 5.2. The new one runs Percona Server 5.5.16 on Ubuntu 10.04. The instance type is the same; the only major difference is a 10-volume RAID 10 (md + LVM) data directory.
I had high hopes for moving to a more modern distribution of MySQL and am an avid follower of Percona’s published works, online and offline. I recognized that doing a straight upgrade across such a large version gap was a bit cowboy, but the query usage is basic (Rails ORM and some custom SQL; no views, no triggers, no procedures, etc.). I conducted the upgrade by launching the new instance, configuring it via Chef recipes, loading it with data using XtraBackup, and then having it replicate from the old server to catch up. Once it was happy I cut over the app servers and query load started.
Overall the slow query log shows fewer slow queries, and several queries we previously had to force indexes on now get good plans on their own. Some queries that used to work fine no longer finish quickly, which is part of the reason for my post. But the main reason is that I’m seeing two very troubling things in New Relic’s monitoring of the app. The first is occasional long-running “SHOW TABLES” calls. Sporadically they show as taking anywhere from a few seconds to many seconds. The second, and more worrisome, is their absence from the slow query log: New Relic is showing gobs of time taken by queries that MySQL is not reporting as slow. I’m at a loss to explain this, as my previous experience suggested the slow query log was quite reliable.
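To make the discrepancy concrete, here is a minimal Python sketch of the comparison I have in mind. It is not something taken from the app. It reads the server settings that decide whether a statement reaches the slow query log, times a statement from the client side, and reports whether the client-side time exceeds `long_query_time`. The function names are hypothetical, and the cursor is assumed to be any Python DB-API cursor connected to the server.

```python
import time

# Server settings that decide whether a statement is written to the slow query log.
SLOW_LOG_VARIABLES = ("slow_query_log", "long_query_time", "min_examined_row_limit")


def fetch_slow_log_settings(cursor):
    """Return the slow-log settings as a dict, read via SHOW GLOBAL VARIABLES."""
    settings = {}
    for name in SLOW_LOG_VARIABLES:
        cursor.execute("SHOW GLOBAL VARIABLES LIKE '%s'" % name)
        rows = cursor.fetchall()
        if rows:
            # Each row is (Variable_name, Value).
            settings[rows[0][0]] = rows[0][1]
    return settings


def time_statement(cursor, sql):
    """Run sql and return the wall-clock seconds observed by the client."""
    start = time.perf_counter()
    cursor.execute(sql)
    cursor.fetchall()
    return time.perf_counter() - start


def exceeds_threshold(client_seconds, settings):
    """True if the client-side time is above the server's long_query_time.

    This only compares against the threshold. The client-side figure also
    includes network and driver time, which the server does not count.
    """
    return client_seconds > float(settings["long_query_time"])
```

If `exceeds_threshold(time_statement(cursor, "SHOW TABLES"), settings)` comes back true for a call that never appears in the slow query log, the extra time is presumably being spent somewhere the server does not count as execution time for that statement.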
Does anyone have a suggestion on what path I could take to understand this? Could there be a complication from the version jump? Client libraries incompatible? Network issues? Happy to provide more details and I could not be more grateful for any advice given.
Josh