MySQL 8.0.24 / 8.0.25 connection problems with `FEDERATED` storage engine (Communication link failure)

Since the latest Percona Server for MySQL 8.0.25 upgrade, I have been struggling with severe connection issues between two servers. I was able to track this down to the FEDERATED storage engine.

On application/client level, the sporadic errors appear like this:

Communication link failure: 1160 Got an error writing communication packets
Communication link failure: 1156 Got packets out of order

Those errors are hard to debug, as they don't appear in mysql.err, not even with log_error_verbosity = 3. It looks like the internal connections from FEDERATED tables to the remote server are not being logged.
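Since nothing reaches the error log, one indirect way to watch for dropped FEDERATED connections on the remote host mail is the aborted-connection status counters (just a diagnostic sketch; these are standard MySQL status/system variables):

```sql
-- On host mail (the FEDERATED "server" side): these counters increment
-- when the server closes or loses a client connection.
SHOW GLOBAL STATUS LIKE 'Aborted_clients';
SHOW GLOBAL STATUS LIKE 'Aborted_connects';

-- Current value of the timeout that drops idle connections:
SHOW GLOBAL VARIABLES LIKE 'wait_timeout';
```

Watching Aborted_clients climb in step with the client-side "Communication link failure" errors at least confirms that the server is the side closing the connections.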

Those errors only pop up after a couple of hours. Initially, my workaround was to restart MySQL on the host with the FEDERATED tables (host backend, see below) every 2-3 hours, which made the connection drops disappear.

Here’s my setup (all on Percona Server for MySQL 8.0.25):

  • Host backend: PHP 8.0 application (using PDO/mysqlnd) with MySQL triggers that copy data from the main application database to a second mailsync database. The mailsync database contains only tables of storage engine FEDERATED that write data to remote host mail
  • Host mail: REPLICATION MASTER for database mailsync
  • Host mx1: 1st REPLICATION SLAVE / REPLICA for database mailsync
  • Host mx2: 2nd REPLICATION SLAVE / REPLICA for database mailsync
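For reference, the backend → mail link follows the usual FEDERATED pattern; here is a minimal sketch (table, user, and host names are placeholders, not my actual schema):

```sql
-- On host mail: the real InnoDB table, which replication then
-- propagates to mx1 and mx2.
CREATE DATABASE IF NOT EXISTS mailsync;
CREATE TABLE mailsync.aliases (
  id      INT NOT NULL AUTO_INCREMENT,
  address VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- On host backend: a FEDERATED table with the identical definition,
-- pointing at the remote table on mail.
CREATE TABLE mailsync.aliases (
  id      INT NOT NULL AUTO_INCREMENT,
  address VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mysql://fed_user:secret@mail.example.com:3306/mailsync/aliases';
```

The triggers on backend then simply INSERT/UPDATE/DELETE into the FEDERATED table, and each statement goes over that internal client connection to mail.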

I have had that rather complex setup running in production for 7 months now on Percona Server for MySQL 8.0, without any issues up to and including MySQL 8.0.23. Data in the FEDERATED tables is accessed fairly frequently (5-10 times/hour). Only the latest upgrade to 8.0.25 broke it. The connection issues appear immediately, without running into any timeout, and also affect simple SELECT queries accessing very little data.

I suspect this has something to do with the newly introduced connection management in MySQL 8.0.24 (quoting the release notes):

Connection Management Notes

Previously, if a client did not use the connection to the server within the period specified by the wait_timeout system variable and the server closed the connection, the client received no notification of the reason. Typically, the client would see Lost connection to MySQL server during query (CR_SERVER_LOST) or MySQL server has gone away (CR_SERVER_GONE_ERROR).

In such cases, the server now writes the reason to the connection before closing it, and the client receives a more informative error message, The client was disconnected by the server because of inactivity. See wait_timeout and interactive_timeout for configuring this behavior. (ER_CLIENT_INTERACTION_TIMEOUT).

The previous behavior still applies for client connections to older servers and connections to the server by older clients.
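The new behavior is easy to observe manually on a throwaway session (a sketch; I'm assuming a plain mysql client against an 8.0.24+ server):

```sql
-- Shrink the idle timeout for this session only:
SET SESSION wait_timeout = 5;

-- Now wait more than 5 seconds without issuing a query, then run:
SELECT 1;
-- A pre-8.0.24 client/server pair typically reported CR_SERVER_GONE_ERROR
-- ("MySQL server has gone away") here; with 8.0.24+ on both sides the
-- client should instead see ER_CLIENT_INTERACTION_TIMEOUT.
```

My suspicion is that the FEDERATED engine's internal client connection does not handle this new disconnect notification gracefully when it tries to reuse a timed-out connection.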

As I have the same setup of 4 related MySQL servers running in production at two different companies (my own webhosting company, and a similarly scaled hosting company whose infrastructure I manage), I had a great way to prove that this is really related to MySQL > 8.0.23 and the FEDERATED storage engine. I downgraded the 4 servers of one company to Percona Server for MySQL 8.0.23 (via a full data dump, a fresh MySQL reinstall, and reloading all data) - and the problem no longer pops up!

On MySQL 8.0.25 I have also found a workaround that makes the problem disappear (nearly… 2 days of testing is not enough yet):

interactive_timeout     = 86400
wait_timeout            = 86400

Raising wait_timeout on all 3 hosts from the default of 8h to 24h is currently my best workaround (mainly on mail, the FEDERATED table “server”, but I also raised those values on the MySQL replicas, which was probably not needed). Before that, I tried lowering wait_timeout to 1h, and indeed the connection issues became much more frequent, starting 1+ hours after restarting MySQL. So it looks like MySQL internally keeps track of its FEDERATED connections to the remote server, and once the remote server drops inactive connections after wait_timeout, FEDERATED still tries to re-use them and struggles with the new MySQL 8.0.24+ connection management. In my eyes, FEDERATED should silently reconnect when a previous server connection was dropped due to a timeout, as it did before.
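For completeness, the same workaround can be applied at runtime without a restart (a sketch using standard MySQL syntax; SET PERSIST also survives a later restart):

```sql
-- Applied on host mail (and, probably unnecessarily, on the replicas):
-- raise both timeouts from the 8h default (28800s) to 24h.
SET PERSIST wait_timeout        = 86400;
SET PERSIST interactive_timeout = 86400;
```

This only stretches the window before FEDERATED hits a timed-out connection; it doesn't fix the underlying reconnect behavior.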

Can you tell me whether this is a known bug that was possibly already fixed in the latest MySQL 8.0.26, or whether I am the first one reporting these issues?

Thanks, Philip


Hi, same issue here. Are you sure this is connected to the MySQL update to > 8.0.24?

On Debian, the default timeout settings were very long. After changing them, we also started to get these error messages on FEDERATED tables.


This bug was identified and fixed in 8.0.28: [PS-7999] FEDERATED engine not reconnecting on wait_timeout exceeded - Percona JIRA


Very nice, thanks for the link! It hasn’t occurred for quite some time, so I can confirm it was fixed in MySQL 8.0.28.

I have now reverted wait_timeout and interactive_timeout from 24h back to their defaults (8h). That was my workaround, which probably never had a real effect.


While this bug never popped up in MySQL 8.0.28 and 8.0.29, it has started bugging us again since we upgraded to Percona Server for MySQL 8.0.30 (8.0.30-22, tested on Debian Bullseye).

So it looks like there is a regression of [PS-7999] FEDERATED engine not reconnecting on wait_timeout exceeded in MySQL 8.0.30.
@Kamil_Holubicki Can you please reopen that issue?


Hello @onlime
Thank you for the info.
Could you please create a new Jira ticket? Please provide detailed steps that allow us to reproduce the problem.

If possible, could you please try MySQL 8.0.31 (Oracle) and confirm you see the problem as well?
