Disk Divergence across Master and Slaves

We have deployed a MySQL cluster, version 5.7.22-22, on Debian GNU/Linux 9: 1 master and 2 replicas (slaves). All the instances are identical in terms of hardware, software, and MySQL configuration. Both slaves are read-only, with no direct writes happening to them.

We have observed that for some tables with the same number of rows on the master and the slaves, the .ibd file size for the table differs between master and slave. For example, if the table employee is 100GB on the master, it will be 120GB on the slaves. The difference varies from 10-20%, and sometimes even more. Generally, we have observed this issue on partitioned schemas or schemas with varchar/text/longtext column types.
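The size difference can also be tracked from inside MySQL rather than from the filesystem. A hedged sketch, run on each node (the database and table names below are placeholders, not from our setup; note that table_rows is only an estimate for InnoDB):

```shell
# Hypothetical: compare logical table sizes from information_schema on
# each node. data_free shows unreclaimed space, which hints at fragmentation.
mysql -u root -p -e "
  SELECT table_name,
         table_rows,                                          -- estimate only
         ROUND(data_length  / 1024 / 1024 / 1024, 1) AS data_gb,
         ROUND(index_length / 1024 / 1024 / 1024, 1) AS index_gb,
         ROUND(data_free    / 1024 / 1024 / 1024, 1) AS free_gb
  FROM information_schema.tables
  WHERE table_schema = 'mydb' AND table_name = 'employee';"
```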

To fix this issue, we have been running table optimization, or sometimes rebuilding the slaves from the master if the size difference is large.

Sample Schema:
CREATE TABLE messages (
id int(11) NOT NULL AUTO_INCREMENT,
message_id varchar(100) NOT NULL,
relayed tinyint(1) DEFAULT NULL,
relayed_at datetime DEFAULT NULL,
exchange_name varchar(100) NOT NULL,
exchange_type varchar(10) NOT NULL DEFAULT 'queue',
message mediumtext,
inbound_message_id varchar(100) DEFAULT NULL,
app_id varchar(50) DEFAULT NULL,
correlation_id varchar(100) DEFAULT NULL,
group_id varchar(100) NOT NULL,
http_method varchar(10) DEFAULT NULL,
http_uri varchar(4096) DEFAULT NULL,
reply_to varchar(100) DEFAULT NULL,
reply_to_http_method varchar(10) DEFAULT NULL,
reply_to_http_uri varchar(4096) DEFAULT NULL,
txn_id varchar(100) DEFAULT NULL,
routing_key varchar(100) DEFAULT NULL,
context text,
destination_response_status int(11) DEFAULT NULL,
relay_error varchar(255) DEFAULT NULL,
retries int(11) DEFAULT '0',
custom_headers text,
created_at datetime NOT NULL,
updated_at datetime NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY message_id_idx (message_id),
KEY exchange_name_idx (exchange_name),
KEY group_id_idx (group_id),
KEY relayed_idx (relayed)
) ENGINE=InnoDB AUTO_INCREMENT=671810 DEFAULT CHARSET=latin1

Is anyone else observing the same phenomenon in your setups? We want to understand what is causing this issue.


Hi Sachin,

First of all, I would suggest you run pt-table-checksum from the Percona Toolkit.

With pt-table-checksum you can check whether there are any data inconsistencies.
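A minimal invocation might look like the sketch below. The hostnames, user, and database names are placeholders, not from this thread; adjust them for your setup:

```shell
# Hypothetical pt-table-checksum run, executed against the master.
# The tool writes per-chunk checksums into percona.checksums, and the
# replicas recompute the same checksums via replication.
pt-table-checksum \
  --replicate=percona.checksums \
  --databases=mydb \
  h=master.example.com,u=checksum_user,p=secret

# On each replica, list tables whose chunks differ from the master:
mysql -e "SELECT db, tbl, COUNT(*) AS bad_chunks
          FROM percona.checksums
          WHERE master_cnt <> this_cnt
             OR master_crc <> this_crc
             OR ISNULL(master_crc) <> ISNULL(this_crc)
          GROUP BY db, tbl;"
```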

Then, a file-size difference might happen, for example, because of fragmentation, a table rebuild with different MySQL versions (rebuilding a table on 5.6 vs. 5.7 uses a different algorithm, so the result might differ), a difference in "innodb_fill_factor", data differences, extra indexes on one of the servers, etc.

Do you still see size differences after rebuilding the table with the same version, configuration, and data set size?


@CTutte

As mentioned earlier, all the nodes in the cluster are identical.
There are no differences in schema, indexes, or configuration between the master and the slaves.

After rebuilding the slaves from the master, the sizes stay the same for 1 or 2 days; after that, the size difference starts showing up for the table, and it increases as the data grows.
Here is a screenshot showing the difference in ".ibd" file size between the master and a slave of the same cluster


Screenshot of schema on master and slave


@sachin.japate,
Run pt-table-checksum to ensure that the actual row data is indeed the same between source and replica.
After that, I would re-create the table without the COMPRESSION=zlib and then observe file size changes. The compression algorithm could be a factor here for unknown reasons, but we need to rule it out.
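A sketch of ruling out compression, assuming MySQL 5.7 transparent page compression is what is enabled (database, table, and credentials below are placeholders):

```shell
# Hypothetical: disable page compression on the table and rebuild it,
# then watch whether the .ibd sizes still diverge over the following days.
mysql -u root -p mydb <<'SQL'
ALTER TABLE messages COMPRESSION='None';
-- OPTIMIZE TABLE rebuilds the tablespace so the change takes effect
OPTIMIZE TABLE messages;
SQL
```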


@matthewb Here is another example we have tried on the test cluster

Screenshot of schema on master and slave


IBD file size and records count on master and slave


pt-table-checksum report

Sorry for spamming with multiple replies to the same thread; I was not allowed to upload multiple screenshots in a single post.


Hello @sachin.japate,
Thanks for providing all of that info. I can't think of any reasons off-hand why the files would be different. If you truly want to understand, you'll need to use the innodb_ruby tools and inspect the .ibd file at the page level. Specifically, I would compare the 'space-page-type-summary' and 'space-extents-illustrate' output between the two servers. Good luck!
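A sketch of what that comparison could look like, assuming file-per-table tablespaces (the datadir path and schema/table names are placeholders):

```shell
# Hypothetical: innodb_space comes from the innodb_ruby gem
# (gem install innodb_ruby). Run the same commands on master and slave,
# then diff the outputs to see which page types account for the extra size.
innodb_space -f /var/lib/mysql/mydb/messages.ibd space-page-type-summary \
  > master_page_summary.txt
innodb_space -f /var/lib/mysql/mydb/messages.ibd space-extents-illustrate \
  > master_extents.txt
# On the slave, write slave_page_summary.txt / slave_extents.txt, then:
# diff master_page_summary.txt slave_page_summary.txt
```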


@matthewb Attached a screenshot; I ran the innodb_space tool on both master and slave.


Instead of using screenshots, please consider using cut & paste to include that information (and format it as "Preformatted Text") in your replies. This way the content of your screen output is searchable, too, which might help others find your posting when they are facing a similar issue, and it makes it easier to reply/comment on anything noteworthy in the output.


Why do you assume the files on disk should be the same size? There can be many things causing the size to differ. The secondary could simply have long-running read transactions blocking the purge, resulting in more page splits. As long as pt-table-checksum reports no diff, I wouldn't care at all. In addition, compression=zlib is implemented using hole punching, which not all filesystems support.
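The hole-punching point matters when comparing sizes: on filesystems with sparse-file support, a file's apparent size (what ls -l shows) and its allocated size (what du counts) can differ wildly, so comparing only one of them across servers can mislead. A small self-contained illustration on Linux (the filename is arbitrary):

```shell
# Create a 100 MB sparse file: its apparent size is 100 MB, but no data
# blocks are allocated yet, so the two numbers below diverge.
truncate -s 100M sparse_demo.bin

apparent=$(stat -c %s sparse_demo.bin)                 # bytes, as ls -l shows
allocated=$(( $(stat -c %b sparse_demo.bin) * 512 ))   # bytes actually on disk

echo "apparent:  $apparent"
echo "allocated: $allocated"
# On filesystems with sparse-file support, allocated is far smaller
# than apparent for this file.
rm sparse_demo.bin
```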


@yves.trudeau The file size for the table should be the same, as the schema and data on the master and slaves are the same.
In our case we don't use the secondary (hot standby) for reads; that node is only used for managing master failover. We provision dedicated read replicas for that purpose.

Disk usage on the slave nodes is much higher than on the master node. In most of the clusters where this issue is observed, slaves go down because they fill up to 100% while the master is at ~80%. We then end up rebuilding all the slaves in the cluster, and while they are rebuilding the cluster runs with only the master, so availability is at risk.

compression=zlib is enabled on only 1-2 clusters, but the issue has been seen without compression in most of the clusters we manage.
