MySQL table data count mismatch post DB restoration via xtrabackup 8.4

Hi Everyone,

We performed a backup and restoration of a MySQL 8.4 database using XtraBackup 8.4. During the backup process, the application was stopped, meaning no changes were made to the database.

After the restoration, we validated the data against the source database and observed discrepancies in table counts across multiple tables. However, the row counts match when queried from the information_schema.tables table. Additionally, the table status information is consistent between the source and restored databases.

We later configured replication using the filename and position recorded in the XtraBackup info file.
And later after few hours the data count got matched.

This behavior is unexpected, as the backup was effectively a cold backup (no data changes during the process), yet the restored data shows inconsistencies. This is an unusual issue that we have not encountered before.
This is an extremely unexpected behavior and something I have never seen in my DBA career

Environment details:
Database: MySQL
Version: 8.4.8
XtraBackup Version: 8.4
OS: RHEL 8.1

Any assistance on this matter would be highly appreciated.

Hi @Rahul93 Welcome to Percona community.

To understand this you performed the following:

  1. Stop application (no more writes)
  2. Take Xtrabackup
  3. Restore backup to new server
  4. Compare counts
    • count(*) - are different
    • information_schama.table ROW_COUNT matches
    • show table status - matches
  5. Configured replication
    • after sometime the count(*) matched

Now, at step 1. The apps were stopped but we surely had no left over connections? May be issue a flush tables with read lock?

at step 4. Did you try to review the max(PK) value of the table that showed mismatch in count? Did we verify if data was actually inconsistent or is it just the meta data reflecting wrong stats? Did you try to run analyze table and re-verify?

about “cold backup”, no - I will still consider this as a hotbackup since your dirty pages would still be in the buffers but again that’s not the point of concern here.

At first glance I just see this as incorrect stats problem and it is not entirely unexpected… I see that innodb_stats_persistent is on by default but sample_pages may not be enough?
After replication was set, the tables in question would have caused some purging/stats calculation etc activity bringing the counts to your expected levels.

Try to answer the question raised above and let’s see what we find out.

Thanks,

Kedar.

Hi @kedarpercona ,

Thank you for your detailed response and analysis

Pls find below details as asked.

Now, at step 1. The apps were stopped but we surely had no left over connections? May be issue a flush tables with read lock?

- → Yes, we confirmed there were no remaining connections. This was validated by monitoring sessions and the transaction log (binlog file name and position) on the source, which remained constant throughout the restoration and replication setup.

at step 4. Did you try to review the max(PK) value of the table that showed mismatch in count? Did we verify if data was actually inconsistent or is it just the meta data reflecting wrong stats? Did you try to run analyze table and re-verify?

- → Yes, we checked by reviewing the CREATE TABLE output and found that the auto-increment values were identical on both servers. We also ran OPTIMIZE TABLE and ANALYZE TABLE on a few smaller tables; however, the mismatch persisted.

@Rahul93 ,

With this information the issue is not really the data consistency but the stats difference… and your question becomes “stats mismatch post db restoration”.

As long as you’re using defaults I believe persistant stats is ON but verify that once… there are few guidelines for for this innodb-persistent-stats.

Finally why this stats are important for you?

Thanks,
K

IS (information_schema) tables are aproximates (sampling) rather than accurate data. If you want to do a consistent check then you should do a count(*).

If you do analyze table in both environments you might see some changes in stats in IS tables even though the amount of rows have not changed

Regards