Analyzing Discrepancies in Write I/O Between Percona 5.7 and 8.1 Replication

Hello all,
I am running Percona 8.1.0-1 on Gentoo.
The setup and intentions are pretty simple:
I’m running replication:
Source: Percona 5.7.43-47
Destination: Percona 8.1.0-1
The source is production; using replication, I replicate all changes to the new 8.1 server so I can check different aspects of the planned upgrade of our system.
On the destination, there is very little and controlled activity; therefore, I can be sure that any write I/O on the destination is a result of a write on the source.
To my surprise, the amount of I/O on 8.1 is double that on 5.7.
I am checking using iostat and PMM, comparing the I/O statistics between the source (Percona 5.7.43-47) and the destination (Percona 8.1.0-1).
For reference (write figures only; reads are not relevant, as no application runs on the destination):

Source (5.7.43-47):
Disk operations: write: 7.5 k ops/s average
Disk bandwidth: write: 125 MB/s average

Destination (8.1.0-1):
Disk operations: write: 21 k ops/s average
Disk bandwidth: write: 309 MB/s average

Any thoughts? Ideas?
Thank you very much!

Hello @egel,
Do you have PMM set up and monitoring both servers? If so, you can look at the Performance Schema dashboards in PMM, which will show you which subsystems are using I/O. That might help identify a misconfiguration in your 8.1 system compared to 5.7.

Yes, I have PMM set up.
Sorry, I misread the answer …
No, the Performance Schema I still need to set up.
I will try it and return with results.
But meanwhile, do you have specific tables in mind that I can run SELECTs against to understand which subsystems are using I/O?


Any tables with I/O information are located in performance_schema.
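For example, a minimal sketch against `performance_schema.file_summary_by_event_name` (the `LIKE` filter and the `LIMIT` are just one way to narrow it down; the MB columns are derived from the byte counters):

```sql
-- Top file I/O consumers by bytes written.
SELECT EVENT_NAME,
       COUNT_WRITE,
       ROUND(SUM_NUMBER_OF_BYTES_WRITE / 1024 / 1024) AS MB_WRITE,
       COUNT_READ,
       ROUND(SUM_NUMBER_OF_BYTES_READ / 1024 / 1024)  AS MB_READ
FROM performance_schema.file_summary_by_event_name
WHERE EVENT_NAME LIKE 'wait/io/file/%'
ORDER BY SUM_NUMBER_OF_BYTES_WRITE DESC
LIMIT 10;
```

`performance_schema.file_summary_by_instance` carries the same counters broken down per file (`FILE_NAME`), which helps pin the I/O to a specific relay log, binlog, or InnoDB file.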

It seems I’ve found the source of the additional I/O.
Thank you, it was a good idea to look at file_summary_by_event_name and file_summary_by_instance.

  1. The amount of I/O is only double (recovery was still running on the destination, and once the replica caught up, the amount of I/O dropped dramatically).
  2. I am running replication, and the destination is in fact a Group Replication cluster with the binlog enabled, so every single write on the source becomes two writes on the destination:
    one write to the relay log and another to the binlog.
    innodb_log_file is the redo log, and innodb_data_file is (as per my understanding) the cumulative amount of data written to all InnoDB data files.

The “double” write will most probably continue, as GR uses the relay log for replication. It seems I need to prepare for more I/O after the upgrade.

| EVENT_NAME                           | COUNT_WRITE | MB_WRITE | COUNT_READ | MB_READ |
|--------------------------------------|------------:|---------:|-----------:|--------:|
| wait/io/file/sql/relaylog            |  3425334320 |   304837 |  892180438 |  304843 |
| wait/io/file/sql/binlog              |   628001778 |   312037 |    4076733 |    7280 |
| wait/io/file/innodb/innodb_temp_file |     5317924 |   412119 |       1052 |     369 |
| wait/io/file/innodb/innodb_log_file  |   207218899 |  1167804 |       2272 |     379 |
| wait/io/file/innodb/innodb_data_file |   202241081 |  3175202 |   11542415 | 2066085 |
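As a quick sanity check (a sketch against the same summary table), the relay-log and binlog write volumes can be compared directly; on a GR replica with the binlog enabled the ratio should hover around 1:

```sql
-- Relay log vs. binlog bytes written; a ratio near 1 confirms that
-- each replicated write lands on disk twice (relay log + binlog).
SELECT
  SUM(IF(EVENT_NAME = 'wait/io/file/sql/relaylog',
         SUM_NUMBER_OF_BYTES_WRITE, 0)) /
  SUM(IF(EVENT_NAME = 'wait/io/file/sql/binlog',
         SUM_NUMBER_OF_BYTES_WRITE, 0)) AS relay_to_binlog_write_ratio
FROM performance_schema.file_summary_by_event_name;
```

With the figures above, that works out to roughly 304837 / 312037 ≈ 0.98, consistent with the double-write explanation.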