Hello!
I am using pt-online-schema-change on an AWS Aurora 2.x (MySQL 5.7-compatible) database to rebuild a table. It works against my staging database, but when I run the following command against my production database, it partially works and then fails with an error like this:
# trying to do a table repair
$ pt-online-schema-change \
    --alter "FORCE" \
    "D=$DATABASE,t=$TABLE" \
    --host="$HOST" \
    --user="$USER" \
    --password="$PASSWORD" \
    --execute
Cannot connect to h=10.XXX.XXX.XXX,p=...,u=awsmaster: DBI connect(';host=10.XXX.XXX.XXX;mysql_read_default_group=client','awsmaster',...) failed: Can't connect to MySQL server on '10.XXX.XXX.XXX' (111) at /usr/bin/pt-online-schema-change line 2345.
What’s strange is that the IP in the error doesn’t match my host. Even when I set --host to a hard-coded IP that I know is my database, the initial table creation works (which requires a connection), but the run fails at the row-copy step. Here’s more complete output of the operation, run from a bastion host that has privileges on my database:
Running pt-online-schema-change on myschema.mytable
Cannot connect to h=10.XXX.XXX.XXX,p=...,u=awsmaster: DBI connect(';host=10.XXX.XXX.XXX;mysql_read_default_group=client','awsmaster',...) failed: Can't connect to MySQL server on '10.XXX.XXX.XXX' (111) at /usr/bin/pt-online-schema-change line 2345.
No slaves found. See --recursion-method if host ip-XXX-XXX-XXX-XXX has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `myschema`.`mytable`...
Creating new table...
Created new table myschema._mytable_new OK.
Altering new table...
Altered `myschema`.`_mytable_new` OK.
2023-08-16T20:32:57 Creating triggers...
2023-08-16T20:32:57 Created triggers OK.
2023-08-16T20:32:57 Copying approximately 3205 rows...
Cannot connect to h=10.XXX.XXX.XXX,p=...,u=awsmaster: DBI connect(';host=10.XXX.XXX.XXX;mysql_read_default_group=client','awsmaster',...) failed: Can't connect to MySQL server on '10.XXX.XXX.XXX' (111) at /usr/bin/pt-online-schema-change line 2345.
2023-08-16T20:32:57 Dropping triggers...
2023-08-16T20:32:57 Dropped triggers OK.
2023-08-16T20:32:57 Dropping new table...
2023-08-16T20:32:57 Dropped new table OK.
`myschema`.`mytable` was not altered.
(in cleanup) 2023-08-16T20:32:57 Error copying rows from `myschema`.`mytable` to `myschema`.`_mytable_new`: Threads_running=18446744073665823305 exceeds its critical threshold 200
2023-08-16T20:32:57 Dropping triggers...
2023-08-16T20:32:57 Dropped triggers OK.
`myschema`.`mytable` was not altered.
Completed myschema.mytable
All tables processed!
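One detail I noticed in the output above: the Threads_running value in the cleanup error is very close to 2^64. My guess (an assumption on my part, not something the tool reports) is that a negative number is being read as an unsigned 64-bit integer. A minimal Python sketch of that reinterpretation; the helper name as_signed_64 is hypothetical and only for illustration:

```python
def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed two's-complement value."""
    value &= (1 << 64) - 1  # keep only the low 64 bits
    # Values with the top bit set represent negative numbers.
    return value - (1 << 64) if value >= (1 << 63) else value


# Threads_running value copied from the error message above.
reported = 18446744073665823305
print(as_signed_64(reported))  # -43728311
```

If that reading is right, the critical-threshold check is tripping on a bogus counter rather than on real load, but I have not confirmed this.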
My version of Percona Toolkit:
$ pt-online-schema-change --version
pt-online-schema-change 3.5.4
Has anyone else run into this?