Is this a BUG or an intended behavior, pt-archiver misses rows when archiving with a secondary index and option --no-delete

pt-archiver version is 3.5.4 and OS is Centos 7.9

When given a non-primary index to pt-archiver, it archives the wrong data set.

Steps to repeat:

  1. create test tables
create table t(id int not null primary key,c1 int,c2 int, index idx_c2(c2));
mysql> select * from t;
+-----+------+------+
| id  | c1   | c2   |
+-----+------+------+
| 141 |  142 |    2 |
| 142 |  143 |    2 |
| 143 |  144 |    2 |
| 144 |  145 |    2 |
| 145 |  146 |    2 |
| 146 |  147 |    2 |
| 147 |  148 |    2 |
| 148 |  149 |    2 |
| 149 |  150 |    2 |
| 150 |  151 |    2 |
| 151 |  152 |    2 |
| 152 |  153 |    2 |
| 153 |  154 |    2 |
| 154 |  155 |    2 |
| 155 |  156 |    2 |
| 156 |  157 |    2 |
| 157 |  158 |    2 |
| 158 |  159 |    2 |
| 159 |  160 |    2 |
| 160 |  161 |    2 |
| 161 |  162 |    2 |
| 162 |  163 |    2 |
| 163 |  164 |    2 |
| 164 |  165 |    2 |
| 165 |  166 |    2 |
| 166 |  167 |    2 |
| 167 |  168 |    2 |
| 168 |  169 |    2 |
| 169 |  170 |    2 |
| 170 |  171 |    2 |
| 171 |  172 |    2 |
| 172 |  173 |    2 |
| 173 |  174 |    2 |
| 174 |  175 |    2 |
| 175 |  176 |    2 |
| 176 |  177 |    2 |
| 177 |  178 |    2 |
| 178 |  179 |    2 |
| 179 |  180 |    2 |
| 180 |  181 |    2 |
| 181 |  182 |    2 |
| 182 |  183 |    2 |
| 183 |  184 |   10 |
| 184 |  185 |   10 |
| 185 |  186 |   10 |
| 186 |  187 |   10 |
| 187 |  188 |   10 |
| 188 |  189 |   10 |
| 189 |  190 |   10 |
| 190 |  191 |   10 |
| 191 |  192 |   10 |
| 192 |  193 |   10 |
| 193 |  194 |   10 |
| 194 |  195 |   10 |
| 195 |  196 |   10 |
| 196 |  197 |   10 |
| 197 |  198 |   10 |
| 198 |  199 |   10 |
| 199 |  200 |   10 |
| 200 |  201 |   10 |
| 201 |  202 |   10 |
| 202 |  203 |   10 |
| 203 |  204 |   30 |
| 204 |  205 |   30 |
| 205 |  206 |   30 |
| 206 |  207 |   30 |
| 207 |  208 |   30 |
| 208 |  209 |   30 |
| 209 |  210 |   30 |
| 210 |  211 |   30 |
| 211 |  212 |   30 |
| 212 |  213 |   30 |
| 213 |  214 |   30 |
| 214 |  215 |   30 |
| 215 |  216 |   30 |
| 216 |  217 |   30 |
| 217 |  218 |   30 |
| 218 |  219 |   30 |
| 219 |  220 |   30 |
| 220 |  221 |   30 |
| 221 |  222 |   30 |
| 222 |  223 |   30 |
| 223 |  224 |   30 |
| 224 |  225 |   30 |
| 225 |  226 |   30 |
| 226 |  227 |   30 |
| 227 |  228 |   30 |
| 228 |  229 |   30 |
| 229 |  230 |   30 |
| 230 |  231 |   30 |
| 231 |  232 |   30 |
| 232 |  233 |   30 |
+-----+------+------+
92 rows in set (0.20 sec)

create table t2 like t;
  1. archive rows with c2>11 to table t2 using the following command:
pt-archiver --source h=host,u=user,D=test,t=t,i=idx_c2 --dest t=t2 --no-delete --limit 8 --commit-each --ask-pass --where "c2>11"
  1. check the target table t2:
mysql> select * from t2;
+-----+------+------+
| id  | c1   | c2   |
+-----+------+------+
| 203 |  204 |   30 |
| 204 |  205 |   30 |
| 205 |  206 |   30 |
| 206 |  207 |   30 |
| 207 |  208 |   30 |
| 208 |  209 |   30 |
| 209 |  210 |   30 |
| 210 |  211 |   30 |
+-----+------+------+
8 rows in set (0.19 sec)

It only archive 8 rows, and this is obviously wrong, there are 30 rows matching c2>11.

I enabled the general log and these are the SELECTS pt-archiver generated:

SELECT /*!40001 SQL_NO_CACHE */ `id`,`c1`,`c2` FROM `test`.`t` FORCE INDEX(`idx_c2`) WHERE (c2>11) ORDER BY `c2` LIMIT 8

SELECT /*!40001 SQL_NO_CACHE */ `id`,`c1`,`c2` FROM `test`.`t` FORCE INDEX(`idx_c2`) WHERE (c2>11) AND (((('30' IS NULL AND `c2` IS NOT NULL) OR (`c2` > '30')))) ORDER BY `c2` LIMIT 8

And the second generated SQL is definitely wrong. I raised a related ticket at Could pt-archiver use a non-unique key as chunk index? - #3 by Luke03011

Does this mean pt-archiver can not work on non-primary(unique) indexes(with --no-delete option)?

And when I remove the option --no-delete, it can archive correctly. And it generated the following SELECTS:

SELECT /*!40001 SQL_NO_CACHE */ `id`,`c1`,`c2` FROM `test`.`t` FORCE INDEX(`idx_c2`) WHERE (c2>11) ORDER BY `c2` LIMIT 8
... deletion omitted
SELECT /*!40001 SQL_NO_CACHE */ `id`,`c1`,`c2` FROM `test`.`t` FORCE INDEX(`idx_c2`) WHERE (c2>11) AND ((('30' IS NULL OR `c2` >= '30'))) ORDER BY `c2` LIMIT 8
..
SELECT /*!40001 SQL_NO_CACHE */ `id`,`c1`,`c2` FROM `test`.`t` FORCE INDEX(`idx_c2`) WHERE (c2>11) AND ((('30' IS NULL OR `c2` >= '30'))) ORDER BY `c2` LIMIT 8
...
SELECT /*!40001 SQL_NO_CACHE */ `id`,`c1`,`c2` FROM `test`.`t` FORCE INDEX(`idx_c2`) WHERE (c2>11) AND ((('30' IS NULL OR `c2` >= '30'))) ORDER BY `c2` LIMIT 8
...
SELECT /*!40001 SQL_NO_CACHE */ `id`,`c1`,`c2` FROM `test`.`t` FORCE INDEX(`idx_c2`) WHERE (c2>11) AND ((('30' IS NULL OR `c2` >= '30'))) ORDER BY `c2` LIMIT 8

I am curious why --no-delete affects the generated SELECTs? In my understanding, they should be exactly the same.

If it’s intended, please update the doc at pt-archiver — Percona Toolkit Documentation to warn users it may cause incorrect archive - missing data or duplicated data in the destination.

Hello @Luke03011,

No, it is not wrong. Look at your command:

pt-archiver ... --limit 8

You literally told pt-archiver to only archive 8 rows. The SQL from the general log also confirmed that, LIMIT 8

Are you sure? I think the --limit option is to tell pt-archiver to fetch this many rows per statement but not the total rows to archive. And if this specifies the total rows, why the the second command without --no-delete archives 30 rows?

This is from the pt-archiver doc:

–limit

type: int; default: 1

Number of rows to fetch and archive per statement.

Limits the number of rows returned by the SELECT statements that retrieve rows to archive. Default is one row. It may be more efficient to increase the limit, but be careful if you are archiving sparsely, skipping over many rows; this can potentially cause more contention with other queries, depending on the storage engine, transaction isolation level, and options such as --for-update.