Since the latest Percona Server for MySQL 8.0.25 upgrade, I have been struggling with severe connection issues between two servers. I was able to track this down to the FEDERATED storage engine.
On the application/client level, the sporadic errors look like this:

Communication link failure: 1160 Got an error writing communication packets
Communication link failure: 1156 Got packets out of order
It’s hard to debug because those errors don’t appear in mysql.err, not even with log_error_verbosity = 3. It looks like the internal connections from FEDERATED tables to the remote server are not logged at all.
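One way to at least confirm the drops on the receiving side (a diagnostic suggestion on my part, not something from the setup above) is to watch the aborted-connection counters on the remote host; these increment even when nothing reaches the error log:

```sql
-- On the remote host: connections the server closed or that died
-- mid-session show up in the global status counters.
SHOW GLOBAL STATUS LIKE 'Aborted_clients';
SHOW GLOBAL STATUS LIKE 'Aborted_connects';
```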
Those errors only pop up after a couple of hours. Initially my workaround was to restart MySQL on the host with the FEDERATED tables (host backend, see below) every 2-3 hours, which made the connection drops disappear.
Here’s my setup (all on Percona Server for MySQL 8.0.25):
- backend: PHP 8.0 application (using PDO/mysqlnd) with MySQL triggers that copy data from the main application db to a second mailsync database; the mailsync database only contains tables of storage engine FEDERATED that write data to the remote host
- mx1: 1st REPLICATION SLAVE / REPLICA for the database
- mx2: 2nd REPLICATION SLAVE / REPLICA for the database
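For illustration, a minimal sketch of what such a FEDERATED link can look like. All server, host, table, and credential names here are invented, not the actual ones from my setup:

```sql
-- Hypothetical example only: define the remote endpoint once ...
CREATE SERVER mailsync_remote
  FOREIGN DATA WRAPPER mysql
  OPTIONS (HOST 'mx1.example.com', PORT 3306,
           USER 'mailsync', PASSWORD 'secret', DATABASE 'mailsync');

-- ... then create a local FEDERATED table that forwards all reads
-- and writes to the identically structured remote table.
CREATE TABLE mailsync.outbox (
  id INT NOT NULL AUTO_INCREMENT,
  payload TEXT,
  PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mailsync_remote/outbox';
```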
I have had this rather complex setup running in production for 7 months now on Percona Server for MySQL 8.0, without any issues up to and including MySQL 8.0.23. Data in the FEDERATED tables is accessed fairly frequently (5-10 times/hour). Only the latest upgrade to 8.0.25 broke it. The connection issues appear immediately, without running into any timeout, and also affect simple SELECT queries accessing very little data.
I suspect this has something to do with the connection management newly introduced in MySQL 8.0.24. From the release notes ("Connection Management Notes"):
Previously, if a client did not use the connection to the server within the period specified by the wait_timeout system variable and the server closed the connection, the client received no notification of the reason. Typically, the client would see Lost connection to MySQL server during query (CR_SERVER_LOST) or MySQL server has gone away (CR_SERVER_GONE_ERROR).
In such cases, the server now writes the reason to the connection before closing it, and the client receives a more informative error message: The client was disconnected by the server because of inactivity. See wait_timeout and interactive_timeout for configuring this behavior. (ER_CLIENT_INTERACTION_TIMEOUT)
The previous behavior still applies for client connections to older servers and connections to the server by older clients.
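The new behavior should be reproducible in a plain mysql client session against an 8.0.24+ server. A sketch (the exact error numbers are my reading of the release notes, not verified on every version):

```sql
-- Force an aggressive idle timeout for this session only:
SET SESSION wait_timeout = 5;

-- Now wait more than 5 seconds, then run any statement:
SELECT 1;
-- On 8.0.23 and older the client typically sees
--   ERROR 2013 (HY000): Lost connection to MySQL server during query
-- On 8.0.24+ the server writes the reason before closing, so the
-- client should instead see the more informative
--   The client was disconnected by the server because of inactivity.
```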
As I have the same setup of 4 related MySQL servers running in production at two different companies (my own web hosting company and a similarly scaled hosting company whose infrastructure I manage), I had a good way to prove that this is really related to MySQL > 8.0.23 and the FEDERATED storage engine. I downgraded the 4 servers of one company to Percona Server for MySQL 8.0.23 (via a full data dump, fresh MySQL reinstall, and reload of all data) - the problem no longer pops up!
On MySQL 8.0.25 I have also found a workaround that makes the problem disappear (nearly… 2 days of testing is not enough yet):

[mysqld]
interactive_timeout = 86400
wait_timeout = 86400
Raising interactive_timeout and wait_timeout from the default 8h to 24h on all 3 hosts (mainly on the FEDERATED table "server", but I also raised the values on the MySQL replication slaves, which was probably not needed) is currently my best workaround. Before that I tried lowering wait_timeout to 1h, and indeed the connection issues became much more frequent, starting 1+ hours after restarting MySQL. So it looks like MySQL internally keeps its FEDERATED connections to the remote server open, and once the remote server drops inactive connections after wait_timeout, FEDERATED still tries to re-use them and struggles with the new MySQL 8.0.24+ connection management. In my eyes, FEDERATED should silently reconnect whenever a previous server connection was dropped due to a timeout, as it did before.
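For anyone hitting the same issue: the longer timeouts can also be applied at runtime, without a restart. A sketch, with two caveats that are my assumptions: SET GLOBAL only affects sessions opened afterwards, and whether FLUSH TABLES reliably drops FEDERATED's cached remote connections is a guess on my part:

```sql
-- Apply the workaround values to all *new* sessions on the remote host:
SET GLOBAL wait_timeout = 86400;
SET GLOBAL interactive_timeout = 86400;

-- On the FEDERATED side: closing the open tables should also close
-- their cached remote connections, forcing a fresh connect on the
-- next access (assumption, not verified against the FEDERATED source).
FLUSH TABLES;
```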
Can you tell me whether this is a known bug, possibly already fixed in the latest MySQL 8.0.26, or am I the first one to report these issues?