I am archiving my RDS MySQL database to a secondary database using pt-archival from percona but during the archival activity the CPU and DB connections of the database increases. The archival activity consists of deleting data from the main database and copying it to the secondary database. Is there some way this can be reduced since this is a production database.
The pt-archive command is -
pt-archiver --source h=source-database.com,u=username,p=password,D=dbName,t=tableName --where "creation_time< current_date - interval 30 day" --dest h=dest-database.com,u=username,p=password --limit 1000 --txn-size 1000 --progress 1000 --optimize=s
Check for --sleep-coef and --sleep options which can be used to have pt-archiver to go slower and let database catchup with traffic
The sleep command just puts pt-archive to sleep for sometime right between selects, but is there anything to reduce the database connections and CPU used? Also without the sleep database traffic will not be hindered right, given there is sufficient CPU and Db connections? If you could also explain why exactly pt-archive is making so many db-connections that would be great!, as in one database it is making 7 connections and in the other it is making 100, is it based on the number of tables in the database or something?
Please provide SHOW PROCESSLIST output - what do you mean by pt-archiver creating 100 connections…
I’m not sure if this will help but this is what i would see as soon as I’d run the pt-archive command
Also I don’t think I would be able to provide the processlist command as this is in production and will not be accessible by me.I had done the same experiment on a smaller database and had seen the spike in database connections there as well but the spike was not as large. Does the size of database matter? On average how many connections would pt-archive make.
Also coming to the CPU part of the question is there anything that can be done for that? Would reducing the number of records being processed at a time(currently 1000) help in that.
This shows what there is just spike of connections… not what pt-archive has created it. I would imagine Application created those connections due to database slowness… which you can reduce by using -sleep etc flags I suggested
Also coming to the CPU part of the question is there anything that can be done for that? Would reducing the number of records being processed at a time(currently 1000) help in that. Or is there any parameter I can add which will reduce CPU consumption. This spike usually happens due to the optimize command. I understand that internally pt-archive is just running the usual mysql optimize command in the backend but is there any thing else from percona i can use to not see the spike while doing the same job optimize does?