Since the latest upgrade to Percona Server for MySQL 8.0.25, I have been struggling with severe connection issues between two servers. I could track this down to the FEDERATED storage engine.
At the application/client level, the sporadic errors look like this:
Communication link failure: 1160 Got an error writing communication packets
Communication link failure: 1156 Got packets out of order
It’s hard to debug as those errors don’t appear in mysql.err, not even with log_error_verbosity = 3. It looks like internal connections from FEDERATED tables to the server are not getting logged.
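Since the error log stays silent, the status counters on the remote side may be the next best place to look. The following is only a sketch, assuming it is run on host mail. As far as the MySQL documentation describes it, Aborted_clients also counts connections the server closed after they sat idle longer than wait_timeout, so comparing the counter before and after a failure could hint at dropped FEDERATED connections.

```sql
-- Run on host mail (the remote server the FEDERATED tables connect to).
-- Note the values, wait for the next failure, then run again and compare.
SHOW GLOBAL STATUS LIKE 'Aborted_clients';
SHOW GLOBAL STATUS LIKE 'Aborted_connects';

-- Timeouts currently in effect for new connections.
SHOW GLOBAL VARIABLES LIKE 'wait_timeout';
SHOW GLOBAL VARIABLES LIKE 'interactive_timeout';
```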
Those errors only pop up after a couple of hours. Initially the workaround was to restart MySQL on the host with FEDERATED tables (see below, host backend) every 2-3 hours, which made the connection drops disappear.
Here’s my setup (all on Percona Server for MySQL 8.0.25):
- Host backend: PHP 8.0 application (using PDO/mysqlnd) with MySQL triggers that copy data from the main application database to a second database, mailsync. The mailsync database only contains tables of storage engine FEDERATED that write data to remote host mail
- Host mail: REPLICATION MASTER for database mailsync
- Host mx1: 1st REPLICATION SLAVE / REPLICA for database mailsync
- Host mx2: 2nd REPLICATION SLAVE / REPLICA for database mailsync
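To make the data flow on host backend concrete, here is a minimal sketch of one FEDERATED table plus a trigger feeding it. All names are hypothetical and not taken from the real setup: the table mailbox, its columns, the application database app, and the user sync_user with password secret. It also assumes the FEDERATED engine is enabled on backend and that a matching regular table mailsync.mailbox already exists on host mail.

```sql
-- On host backend: local FEDERATED table that forwards rows to host mail.
-- Table, columns and credentials are hypothetical placeholders.
CREATE TABLE mailsync.mailbox (
  id      INT UNSIGNED NOT NULL,
  address VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mysql://sync_user:secret@mail:3306/mailsync/mailbox';

-- On host backend: trigger in the (hypothetical) application database app
-- that copies each new row into the FEDERATED table.
CREATE TRIGGER app.mailbox_after_insert
AFTER INSERT ON app.mailbox
FOR EACH ROW
  INSERT INTO mailsync.mailbox (id, address)
  VALUES (NEW.id, NEW.address);
```

With a layout like this, every application INSERT goes through the connection that FEDERATED holds open to mail, which is why a stale connection surfaces as an error in the PHP client.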
I have had this rather complex setup running in production for 7 months now on Percona Server for MySQL 8.0, without any issues up to and including 8.0.23. Data in FEDERATED tables is accessed more or less frequently (5-10 times/hour). Only the latest upgrade to 8.0.25 broke it. The connection issues appear immediately, without running into any timeout, and also affect simple SELECT queries accessing very little data.
I suspect this has something to do with the newly introduced connection management in MySQL 8.0.24:
https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-24.html#mysqld-8-0-24-connection-management
Connection Management Notes
Previously, if a client did not use the connection to the server within the period specified by the wait_timeout system variable and the server closed the connection, the client received no notification of the reason. Typically, the client would see Lost connection to MySQL server during query (CR_SERVER_LOST) or MySQL server has gone away (CR_SERVER_GONE_ERROR).
In such cases, the server now writes the reason to the connection before closing it, and client receives a more informative error message, The client was disconnected by the server because of inactivity. See wait_timeout and interactive_timeout for configuring this behavior. (ER_CLIENT_INTERACTION_TIMEOUT).
The previous behavior still applies for client connections to older servers and connections to the server by older clients.
I run the same setup of 4 related MySQL servers in production at two different companies (my own webhosting company and a similarly sized hosting company whose infrastructure I manage), so I have a good way to show that this really is related to MySQL > 8.0.23 and the FEDERATED storage engine. I downgraded the 4 servers of one company to Percona Server for MySQL 8.0.23 (full data dump, fresh MySQL reinstall, reloading all data), and the problem no longer pops up!
On MySQL 8.0.25 I have also found a workaround to make the problem (nearly… 2 days of testing is not enough yet) disappear:
[mysqld]
interactive_timeout = 86400
wait_timeout = 86400
Raising wait_timeout on all 3 hosts (mainly mail, the FEDERATED table “server”, but I also raised those values on the MySQL replication slaves, which was probably not needed) from the default 8h to 24h is currently my best workaround. Before that I tried lowering wait_timeout to 1h, and indeed the connection issues became much more frequent, starting 1+ hours after restarting MySQL. So it looks like MySQL internally keeps track of its FEDERATED connections to the remote server, and once the remote server drops inactive connections after wait_timeout, FEDERATED still tries to re-use them and struggles with the new MySQL 8.0.24+ connection management. In my eyes, FEDERATED should silently try to reconnect in case a previous server connection got dropped due to a timeout, as it did before.
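For anyone trying to reproduce this, the idle connections that wait_timeout would eventually close can be watched on host mail. The sketch below assumes that the FEDERATED connections show up there as ordinary client threads coming from host backend. It also shows the runtime equivalent of the my.cnf change above. Both variables are dynamic, but a new global value only applies to connections opened after the change.

```sql
-- Run on host mail: idle client connections, longest idle first.
-- For a sleeping thread, TIME is the number of seconds it has been idle.
SELECT id, user, host, db, command, time
FROM information_schema.processlist
WHERE command = 'Sleep'
ORDER BY time DESC;

-- Raise the timeouts without restarting mysqld.
-- Existing sessions keep the value they received at connect time.
SET GLOBAL wait_timeout = 86400;
SET GLOBAL interactive_timeout = 86400;
```

If a sleeping thread from backend disappears from this list right around the wait_timeout mark and the next FEDERATED query fails, that would support the theory described above.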
Can you tell if this is a known bug that was possibly already fixed in the latest MySQL 8.0.26, or am I the first one reporting these issues?
Thanks, Philip