Mysqldump: Got error: 1160: Got an error writing communication packets when using LOCK TABLES

bitone · May 22, 2024, 12:59pm

The backup worked fine the last few days. But today, I had this in the CMS log:

Seven communication errors from different statements… What’s going on here?

This is from a second CMS with another DB, but on the same server and Percona instance.

The monitoring of aborted connects doesn’t show any change in the same time frame:

I think this issue is not related to aborted connects.
I checked some of the other monitored data but there’s nothing conspicuous in any of it.

How can we get to the bottom of this problem?

bitone · May 24, 2024, 11:58am

The problem gets worse by the day. Today we had the first complaints from a client.
But the monitoring of server resources still doesn’t show anything noticeable that correlates with it.

Now we also had something in the mysql/error.log for the first time:

2024-05-24T11:38:24.927349Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we’re sending the information to the error-log instead: MY-001160 - Got an error writing communication packets
2024-05-24T11:38:24.927380Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we’re sending the information to the error-log instead: MY-001156 - Got packets out of order
2024-05-24T11:38:24.927484Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we’re sending the information to the error-log instead: MY-001160 - Got an error writing communication packets
2024-05-24T11:38:24.927501Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we’re sending the information to the error-log instead: MY-001156 - Got packets out of order

I didn’t count them, but there must be a few hundred within a single minute.
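To get actual numbers instead of an estimate, something like the following could count these entries per minute. This is only a minimal sketch: the function name `errors_per_minute` is made up for illustration, and it assumes every entry starts with an ISO timestamp as in the lines above.

```shell
#!/bin/sh
# Hypothetical helper: count MY-013129 error entries per minute.
# Reads a MySQL error log on stdin and prints "<YYYY-MM-DDTHH:MM> <count>".
errors_per_minute() {
    awk '/\[ERROR\] \[MY-013129\]/ {
        minute = substr($1, 1, 16)   # keep only "2024-05-24T11:38"
        count[minute]++
    }
    END {
        for (m in count) print m, count[m]
    }' | sort
}
```

It would be called as `errors_per_minute < /var/log/mysql/error.log`, where the log path is an assumption and depends on the `log_error` setting.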

And again we have an accumulation of these errors in the CMS log. But they don’t correlate with the MySQL error log:

matthewb · May 24, 2024, 6:47pm

What has changed recently in your environment? As you said above, things were working just fine. Then suddenly things stopped working. That doesn’t simply happen without other factors. All of the errors we see are client related. Have you recently upgraded any libraries, or other components of your code, or infrastructure? Even something innocuous, like zlib, or openssl, could be causing issues, even for socket-based connections as sockets still use the mysql client library and underlying libs.

bitone · May 27, 2024, 1:11pm

@matthewb Thank you again for your commitment.

These are exactly my thoughts, too.

On this (physical, not virtual) server we are still using Debian 10 with LTS. We do the regular package updates with apt. All from official sources, including Percona repository.

So IMHO it’s nothing special about this server besides the fact that the Linux distribution is a bit old.

We use the same Percona MySQL version 8.0.36-28 on a few other production servers (virtual instances) with Debian 11. No issues there.

Then we have one Debian 10 VM for testing that we haven’t updated for a while. It has Percona MySQL 8.0.34-26. No issues there either.

I compared the loaded libraries on both Debian 10 servers and there’s no difference:

~# pgrep mysqld
509
~# awk '$NF!~/\.so/{next} {$0=$NF} !a[$0]++' /proc/509/maps
/usr/lib/x86_64-linux-gnu/libnss_mdns4_minimal.so.2
/usr/lib/x86_64-linux-gnu/libnss_dns-2.28.so
/usr/lib/x86_64-linux-gnu/libresolv-2.28.so
/usr/lib/x86_64-linux-gnu/libnss_files-2.28.so
/usr/lib/x86_64-linux-gnu/libc-2.28.so
/usr/lib/x86_64-linux-gnu/libgcc_s.so.1
/usr/lib/x86_64-linux-gnu/libm-2.28.so
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
/usr/lib/x86_64-linux-gnu/libdl-2.28.so
/usr/lib/x86_64-linux-gnu/libnuma.so.1.0.0
/usr/lib/x86_64-linux-gnu/libaio.so.1.0.1
/usr/lib/mysql/private/libprotobuf-lite.so.3.19.4
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
/usr/lib/x86_64-linux-gnu/libssl.so.1.1
/usr/lib/x86_64-linux-gnu/librt-2.28.so
/usr/lib/x86_64-linux-gnu/libpthread-2.28.so
/usr/lib/mysql/plugin/component_reference_cache.so
/usr/lib/x86_64-linux-gnu/ld-2.28.so

I also checked some of the package versions. libssl is exactly the same. libc is 2.28-10+deb10u2 (VM) vs 2.28-10+deb10u3 (physical). libstdc++ is the same.
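Instead of checking single packages by hand, the complete package lists of both machines could be compared. The following is a rough sketch, not what I actually ran: `diff_package_versions` is a hypothetical helper, and it assumes each machine's list was exported beforehand with `dpkg-query -W -f='${binary:Package} ${Version}\n'` into a file.

```shell
#!/bin/sh
# Hypothetical helper: list packages installed on both hosts with different versions.
# Each input file holds "<package> <version>" lines, one file per host.
# Output: "<package> <version in first file> <version in second file>".
diff_package_versions() {
    awk 'NR == FNR { seen[$1] = $2; next }          # first file: remember versions
         ($1 in seen) && seen[$1] != $2 {           # second file: report mismatches
             print $1, seen[$1], $2
         }' "$1" "$2"
}
```

Called as `diff_package_versions vm.pkgs physical.pkgs` (file names are placeholders), it prints nothing when all common packages match. Packages that exist on only one host are ignored in this sketch.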

I will now update the Deb 10 VM to the same versions as the physical machine. Let’s see if we start to get the same issues afterwards.

bitone · July 3, 2024, 9:40pm

Hi there

I have been running the two machines, one physical production server and one virtual test server, for over a month now with the exact same distribution and package versions installed.

The production server continues to have these communication packet issues at least once a day. On the virtual machine there were no such issues at all.

I really can’t imagine a reason why it would have issues on a physical server while on a VM it’s running totally fine…

IMHO there’s only one significant difference between both servers, and that’s the load.
It’s hard to reproduce the real-life load of a production server on a test machine…

But then I also discovered an old issue with FEDERATED engine, which we use on those servers.
I already reported this and helped to fix it a while ago. Now it looks like it’s back!

I’m not sure yet if it’s the exact same problem and if the issue we have here is related.
But the FEDERATED engine opens a lot of remote connections (TCP) and leaves them open until the remote server closes them. With the default connection timeout (wait_timeout) of 8 hours, i.e. 28800 seconds, this can lead to a lot of waiting TCP sockets.
Maybe this hits some internal limits of the MySQL daemon?
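To see how many TCP sockets are actually piling up, the output of `ss -tan` can be summarized by connection state. A minimal sketch, with `tcp_states` being a hypothetical helper name; it assumes the usual `ss` layout with one header line and the state in the first column.

```shell
#!/bin/sh
# Hypothetical helper: summarize `ss -tan` output (read from stdin) by TCP state.
# Prints "<state> <count>" per state, sorted by state name.
tcp_states() {
    awk 'NR > 1 { n[$1]++ }                 # skip the header line, count by state
         END { for (s in n) print s, n[s] }' | sort
}
```

Usage would be `ss -tan | tcp_states`, run on the server that holds the FEDERATED tables. A steadily growing number of ESTAB or CLOSE-WAIT entries towards the remote MySQL port would support the theory.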

Unfortunately, I can’t just disable FEDERATED on the production server to see if the problem goes away.
Therefore I’m still trying to reproduce these errors on the test server.

bitone · August 6, 2024, 5:39pm

Things didn’t get better.
The same error over and over again, in every DB client: CMS, phpMyAdmin, mysqldump… But only on this bare-metal server.

Until a few days ago, when I did two things:
I did a complete reinstall of the Percona packages, but with telemetry disabled.
And the other thing I did was disabling the Zabbix scripts for MySQL.

Since that day, the problem has disappeared.

The Zabbix monitoring has always been there, including on other servers (VMs) which don’t have the issue.

Is it really the telemetry module that induces these problems?!

I will wait a few more days and then re-enable the Zabbix scripts.

bitone · January 15, 2025, 10:32pm

It’s been quite a while… And things got even worse.
The DB is mostly unusable. Now I get those warnings almost every minute. So everything has to be done at least twice to be successful.

In the meantime I have upgraded Debian to Bullseye. But the issue remains.

But I finally figured out what the difference is between this physical server and the VMs with exactly the same OS and package versions:
The VMs are shut down once a week for backup. The bare-metal server, however, is rarely restarted.
And indeed: after restarting the MySQL daemon on the bare-metal server, the DB works for several days without those write errors.

In my opinion, this means that something is escalating over time. Some resources may not be released, in this case most likely socket handles.

For this reason I started monitoring open sockets and system call errors. I also looked in the Percona source code for the places where socket system calls are made.
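For the socket part, the idea is roughly the following. This is a simplified sketch and not the actual monitoring script; `count_socket_fds` is a hypothetical helper that counts the descriptors in a `/proc/<pid>/fd` directory whose symlink target starts with `socket:`.

```shell
#!/bin/sh
# Hypothetical helper: count socket descriptors in a /proc/<pid>/fd directory.
# Every entry there is a symlink; sockets point at "socket:[<inode>]".
count_socket_fds() {
    fd_dir=$1
    n=0
    for fd in "$fd_dir"/*; do
        case "$(readlink "$fd" 2>/dev/null)" in
            socket:*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}
```

It can be called as `count_socket_fds "/proc/$(pgrep -o -x mysqld)/fd"`, which needs root or the mysql user to read the directory. Logging the result once a minute together with a timestamp should show whether the number keeps growing between restarts.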

But I think this thread is already too long and too old and nobody reads it anymore. So I’ll either open an issue on Github or start a new thread. Or both.
