I’ve been getting this error roughly twice a week for the past month and haven’t been able to track down the source. I’m assuming it’s some kind of InnoDB table corruption, but I haven’t been able to narrow it down to a specific table (we have over 100k tables).
I’ve left the general log on a few times when the crash happened, then checked all the databases that were in use just before the crash, and didn’t find any issues.
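For anyone who wants to repeat that kind of check, here is a minimal sketch of how the statements could be generated. It assumes you already have a list of (schema, table) pairs pulled from the general log; the helper names and the sample table names are hypothetical, and this is not necessarily how the check above was run. It only builds `CHECK TABLE` statements as text, so they can be reviewed before being fed to the `mysql` client.

```python
def quote_identifier(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_check_statements(tables):
    """Return one CHECK TABLE statement per (schema, table) pair."""
    return [
        "CHECK TABLE {}.{};".format(quote_identifier(schema), quote_identifier(table))
        for schema, table in tables
    ]


if __name__ == "__main__":
    # Hypothetical names standing in for tables seen in the general log.
    recent = [("shop_0001", "orders"), ("shop_0001", "line_items")]
    for statement in build_check_statements(recent):
        print(statement)
```

Generating the statements separately from running them keeps the list small and auditable, which matters when only a handful of the 100k+ tables were touched before a crash.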
I’ve also restored from a snapshot of our other master (via ec2-consistent-snapshot) and let it catch up via replication, but I’m still seeing the assertion failures. Five days will go by without issue, with the server taking both reads and writes, and then it’ll crash. After recovery, it will crash again about 2 days later. Rinse/repeat.
Can anybody point me in the right direction? Dumping the entire data set and building a new server is not an option, as the data is over 1.4 TB. We’re running Percona Server 5.5.22 on a CentOS 6 EC2 instance.