Hi all,
We are running an old version of TokuDB (7.5.1) based on MariaDB 5.5 that we haven’t upgraded yet. We have a decent amount of data in there, on a server with SSDs in RAID-10 and 260GB of RAM (170GB used by TokuDB). The binary logs are on separate 15k drives in RAID-10.

Things have been running quite well for a long time, but lately we’ve been having weird stalls on simple queries, about once a day at a random time. The stalls happen during non-peak times: simple queries that access a single key get deadlocked and stall. A stall lasts around 2-3 minutes, causing a bit of havoc, and then recovers on its own.

sar shows nothing out of the ordinary in terms of swap, I/O, CPU, network, or RAM usage, and I/O utilization is negligible. I’m thinking that maybe the indexes need to be rebuilt, as the DB is a number of years old, but beyond that, we’re stumped.