MySQL Version: Percona 5.7.39-42
Issue Description:
We are managing a MySQL cluster with one Master, one HSB (Hot Standby), and multiple read replicas using GTID replication. Recently, we encountered a hardware failure causing the master virtual machine to go down. As a result, our auto-failover process kicked in, promoting the HSB to master. During this process:
- Some replicas were able to reconnect to the new master based on GTID alignment.
- Other replicas had to be rebuilt due to GTID inconsistencies.
At the time of failover, all replicas (including HSB) had more than 10 minutes of replication lag. Our auto-failover process, however, waits until the HSB lag is synced before triggering the failover.
Problem:
After the failover, some replicas that reconnected to the newly promoted master (HSB) have exhibited data inconsistencies. Specifically, for the same set of values, data is different between the newly promoted master and the replicas. We are unable to identify the cause of this inconsistency.
Details on Data Inconsistency:
We parsed the binary logs of both the newly promoted master and the affected replica:
- Master binary logs: The data appears correct.
- Replica binary logs: Even though the logs suggest that the data should match the master’s, the replica shows inconsistent data.
Additional validations showed no GTID inconsistencies or errant GTIDs between the master and replica, and there are no direct inserts into the replica (read-only is enabled).
Moreover, the table in question only involves insert transactional DMLs (i.e., no updates or deletes).
Sample Table Structure:
```CREATE TABLE analytics_data
(
start_time
bigint(20) NOT NULL,
entity_id
varchar(30) NOT NULL,
metric_1
int(11) NOT NULL DEFAULT ‘0’,
metric_2
int(11) NOT NULL DEFAULT ‘0’,
spend_amount
decimal(10,2) DEFAULT ‘0.00’,
region
varchar(30) NOT NULL,
type
varchar(30) NOT NULL,
inserted_at
timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
modified_at
datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
UNIQUE KEY index_1
(entity_id
,start_time
,region
),
KEY index_2
(entity_id
,start_time
),
KEY index_3
(modified_at
,type
),
KEY index_4
(start_time
)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
**Insert Query:**
```INSERT INTO analytics_data (start_time, entity_id, metric_1, metric_2, spend_amount, region, type)
VALUES (...)
ON DUPLICATE KEY UPDATE
metric_1 = metric_1 + VALUES(metric_1),
metric_2 = metric_2 + VALUES(metric_2),
spend_amount = spend_amount + VALUES(spend_amount);
Sample Data Difference Between Master and Replica:
+-------------+-----------+---------+-----------+-----------+--------+-------+---------------------+---------------------+
| start_time | entity_id | metric_1| metric_2 | spend_amount | region | type | inserted_at | modified_at |
+-------------+-----------+---------+-----------+-----------+--------+-------+---------------------+---------------------+
| 202409301830| xxxxxxxxx | 237294 | 4353 | 27992.68 | DC1 | BRAND_xxx | 2024-09-30 21:23:20 | 2024-09-30 23:38:38 |
| 202409301830| xxxxxxxxx | 21704 | 1019 | 7581.40 | DC2 | BRAND_xxx | 2024-09-30 21:23:23 | 2024-09-30 23:38:40 |
| 202409301830| xxxxxxxxx | 230272 | 3975 | 25546.36 | DC3 | BRAND_xxx | 2024-09-30 21:23:21 | 2024-09-30 23:38:39 |
+-------------+-----------+---------+-----------+-----------+--------+-------+---------------------+---------------------+
replica(with data inconsistency)
+-------------+-----------+---------+-----------+-----------+--------+-------+---------------------+---------------------+
| start_time | entity_id | metric_1| metric_2 | spend_amount | region | type | inserted_at | modified_at |
+-------------+-----------+---------+-----------+-----------+--------+-------+---------------------+---------------------+
| 202409301830| xxxxxxxxx | 451848 | 8534 | 54924.84 | DC1 | BRAND_xxx | 2024-09-30 21:23:20 | 2024-09-30 23:38:38 |
| 202409301830| xxxxxxxxx | 42153 | 2018 | 15016.56 | DC2 | BRAND_xxx | 2024-09-30 21:23:23 | 2024-09-30 23:38:40 |
| 202409301830| xxxxxxxxx | 438160 | 7799 | 50167.20 | DC3 | BRAND_xxx | 2024-09-30 21:23:21 | 2024-09-30 23:38:39 |
+-------------+-----------+---------+-----------+-----------+--------+-------+---------------------+---------------------+
We can see that for the same entity_id, with the same inserted_at and modified_at, the values for metric_1, metric_2, and spend_amount are different in the replica compared to the master which is strange.
Suspected Causes & Investigation:
We reviewed the following scenarios that could lead to data inconsistencies in mixed replication format:
- Half-applied transactions (when a transaction updates non-transactional tables and applies some but not all changes):
- We verified that all tables are transactional (InnoDB).
- Gaps in replication (common with multithreaded replicas):
- We have already enabled
slave_preserve_commit_order
and have multithreaded replication set correctly. slave_parallel_type=LOGICAL_CLOCK
, and bothlog-bin
andlog-slave-updates
are enabled.Reference: https://dev.mysql.com/doc/refman/5.7/en/replication-features-transaction-inconsistencies.html
- INSERT … ON DUPLICATE KEY UPDATE with multiple primary/unique keys:
- We confirmed no non-deterministic functions are used, and the query executes as expected.Reference: https://dev.mysql.com/doc/refman/5.7/en/replication-rbr-safe-unsafe.html
But below one we suspect might be related to our usecase:
Link: https://dev.mysql.com/doc/refman/5.7/en/replication-features-transaction-inconsistencies.html
6.If the last transaction in the relay log is only half-received and the multithreaded replica coordinator has started to schedule the transaction to a worker, then STOP SLAVE waits up to 60 seconds for the transaction to be received. After this timeout, the coordinator gives up and aborts the transaction. If the transaction is mixed, it may be left half-completed.
We tried to reproduce the scenario in our stage env, but was unable to.
Next Steps:
We would appreciate your insights into the following:
- Possible causes of the data inconsistency between master and replica.
- Any additional checks we can perform to narrow down the cause.
- Steps to avoid such inconsistencies in the future. (As per percona and other refered articles we have planned to moved to ROW format to avoid this in future)
Let me know if you need any further details.