Pt-online-schema-change stops after 24 hours on AWS Aurora Serverless V1 MySQL 5.7

Hello,

We are using the pt-online-schema-change (version 3.5.4) to increase the size of a column in a very large table (320 million rows) hosted on AWS RDS Aurora Serverless V1 MySQL (version 5.7).

A little after 24 hours, the command fails due to the INSERT error below…

2023-08-10T21:05:45 Error copying rows from schema.table to schema._table_new: 2023-08-10T21:05:45 DBD::mysql::st execute failed: Lost connection to MySQL server during query [for Statement "INSERT LOW_PRIORITY IGNORE INTO schema._table_new……….]

Along with the other required command line options we also use these when running the pt-online-schema-change: —statistics --progress time,120 --print --tries copy_rows:100:5

We see the following stats in the stdout….

Event Count

================= =====

INSERT 160808

connection_killed 100

On the MySQL side (looking in the general log), we see the last INSERT about 500 seconds before the above time; followed right away by a CONNECT & then another connect at 21:05:45. Nothing otherwise.

It looks that Serverless V1 terminates connections after 24 hours; which would explain why we always see the script stopping a little after that time period. But, we think the pt-online-schema-change should be able to reconnect and continue inserting…

Can we please get some guidance on what to try next?

Thank you in advance for your time.

Hello itt_sha_lkt,
First of all, thanks for posting in the Percona Community Forum.
Now, regarding your issue. Can you please share the following details:

  • Full pt-online-schema-change command.
  • Structure of the affected table (SHOW CREATE TABLE …)

Thank you.
Cheers

Hello @itt_sha_lkt,
From the documentation:

In the case of lost and killed connections, the tool will automatically reconnect.

If pt-osc is not reconnecting, we will need to see debug logs to determine why it is not reconnecting.

Thank you for your responses.

It looks like the tool does reconnect right after the INSERT stmt which errors out due to the “Lost Connection” error; but maybe the new connection object is not being used by the function that is performing the INSERT.

In any case, I am trying to reproduce the issue on a test table and will get back with the debug logs and other requested information.

Thanks again

Hi,

I have attached the debug log, the command used & table DDL as requested. The issue was reproduced on a test table.

Thank you for your time in helping us

alter_table_enlarge_id_test_percona_table.log.txt (2.0 MB)
table_ddl.txt (148 Bytes)
alter_table_enlarge_id_test_percona_table.txt (223 Bytes)

Hello,

Just following up to check if the logs etc that I provided were useful. If I can provide any other relevant information, please let me know.

Thank you

Hi,

Wanted to check if someone got a chance to look at this issue…

Thanks

Hello @itt_sha_lkt,
The forums are not an official support channel. They are best-effort and volunteer time from community members and some Percona employees. If you need focused assistance, you are encouraged to engage in a support contract.

Hello itt_sha_lkt
I looked at the files you attached to the post.
I noticed that you are trying to extend a varchar column from 64 to 70. This is actually an online operation and should only affect the metadata, this means it should be pretty fast to execute online. documentation here

The number of length bytes required by a VARCHAR column must remain the same. For VARCHAR columns of 0 to 255 bytes in size, one length byte is required to encode the value. For VARCHAR columns of 256 bytes in size or more, two length bytes are required. As a result, in-place ALTER TABLE only supports increasing VARCHAR column size from 0 to 255 bytes, or from 256 bytes to a greater size. In-place ALTER TABLE does not support increasing the size of a VARCHAR column from less than 256 bytes to a size equal to or greater than 256 bytes. In this case, the number of required length bytes changes from 1 to 2, which is only supported by a table copy (ALGORITHM=COPY ).

With that being said, I would ask you to try the following:

ALTER TABLE test_percona MODIFY COLUMN id varchar(70), ALGORITHM=INPLACE, LOCK=NONE;

If you still want to continue with pt-osc, can you confirm if you are running the pt-osc from the same database host or from a remote host?

Hi,

While MySQL 5.7 supports the in-place upgrade for the new size that we want; AWS RDS Aurora Serverless V1 MySQL (version 5.7) does not support that.

That was the reason we decided to use the Percona osc utility.

Also, we are running the script from a EC2 instance that connects to the Aurora DB. So I think that would be a remote host.

Thanks