I tried to run pt-table-checksum on one database at a time, using the --databases and --no-check-replication-filters options.
Command:
replication@replication:/$ pt-table-checksum --databases=SUBS_270315_9NKD3 --no-check-replication-filters --host=192.168.104.197 --port=3306 --user=root --password= --recursion-method dsn=D=percona,t=dsns
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Is this correct? Shouldn't it show its progress as it runs?
It should show a list of the tables it checked once it has finished. Was the tool still running when you wrote the above? You could add --progress=time,300 to see whether it is still working on something. Large tables can take quite a while.
Yes. When I run this command:
pt-table-checksum --progress=time,300 --databases=subs_270315_9nkd3 --no-check-replication-filters --host=192.168.104.197 --port=3306 --user=root --password= --recursion-method dsn=D=percona,t=dsns
The output is:
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
Waiting for the --replicate table to replicate to replication…
I ran it for a whole day and still got the same output.
Any advice?
Depending on your table sizes and the number of tables, it could take a long time to run. I have not seen your specific output message before, so I am not sure whether it is stuck waiting on that indefinitely or just printing that message off and on while it does other work. To get a better idea of what it is doing, you could try it in debug mode, but that will output a lot of data, so be careful.
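A debug run is easier to read if the output goes to a file and you pull out only the failures afterwards. A minimal sketch, assuming the debug output is saved to a log file and that failed database calls contain the text "failed:" (as DBD::mysql errors usually do). The helper name show_failures is made up for this example.

```shell
# Hypothetical helper: print, with line numbers, every line of a saved
# debug log that records a failed database call.
# Assumption: such lines contain the literal text "failed:".
show_failures() {
    grep -n 'failed:' "$1"
}

# Example run, commented out because it needs a live server.
# Replace <your usual options> with the options you already use:
#   PTDEBUG=1 pt-table-checksum <your usual options> > /tmp/pt.log 2>&1
#   show_failures /tmp/pt.log
```

Debug mode writes a lot, so point the log at a disk with plenty of free space and stop the run once you have captured the part you need.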
TableParser:4445 18067 SHOW TABLES FROM percona LIKE ‘checksums’
TableParser:4451 18067 DBD::mysql::db selectrow_arrayref failed: Unknown database ‘percona’ [for Statement “SHOW TABLES FROM percona LIKE ‘checksums’”] at /usr/bin/pt-table-checksum line 4448.
It looks like it is complaining about a missing percona database. The issue is probably that your percona database is not getting replicated to the slave because of your replication filters. You could try manually creating the percona database on your slave, then run it again and see if that works. If it does not, you will likely need to add “percona” to the list of databases to replicate, using the replicate-do-db option in your my.cnf configuration file.
I was able to run it after manually creating the percona database on the slave (I put the master IP in dsns) and adding “percona” to the list of databases to replicate with the replicate-do-db option in the my.cnf configuration file. Now I get this output:
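To check whether the slave's configuration already lists the percona database, something like the following could help. It is only a sketch: the helper name has_replicate_do_db is made up, the path to my.cnf is an assumption, and it only looks at one file (it does not follow include directives or see filters set at runtime).

```shell
# Hypothetical helper: succeed if the given option file contains a
# replicate-do-db line for the given database.
# MySQL accepts both dashes and underscores in option names, so match either.
has_replicate_do_db() {
    cnf_file=$1
    db_name=$2
    grep -Eq "^[[:space:]]*replicate[-_]do[-_]db[[:space:]]*=[[:space:]]*${db_name}[[:space:]]*$" "$cnf_file"
}

# Example use on the slave (the path is an assumption, adjust to your system):
#   if has_replicate_do_db /etc/mysql/my.cnf percona; then
#       echo "percona is listed"
#   else
#       echo "percona is missing: add a line 'replicate-do-db = percona' and restart"
#   fi
```

Each replicate-do-db line names a single database, so percona needs its own line next to the ones you already have.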
06-09T19:24:50 Skipping table subs_270315_9nkd3.gl_gstreturntrack because it has problems on these replicas:
Table subs_270315_9nkd3.gl_gstreturntrack does not exist on replica replication
This can break replication. If you understand the risks, specify --no-check-slave-tables to disable this check.
06-09T19:24:50 Error checksumming table subs_270315_9nkd3.gl_gstreturntrack: DBD::mysql::db selectrow_hashref failed: Table ‘subs_270315_9nkd3.gl_gstreturntrack’ doesn’t exist [for Statement “EXPLAIN SELECT * FROM subs_270315_9nkd3.gl_gstreturntrack WHERE 1=1”] at /usr/bin/pt-table-checksum line 6521.
06-09T19:24:59 Skipping table subs_270315_9nkd3.imp_gl_doubleentry because it has problems on these replicas:
Table subs_270315_9nkd3.imp_gl_doubleentry on replica replication is missing these columns: projectcode, docdate
This can break replication. If you understand the risks, specify --no-check-slave-tables to disable this check.
06-09T19:24:59 Skipping table subs_270315_9nkd3.imp_gl_doubleentrykoff because it has problems on these replicas:
Table subs_270315_9nkd3.imp_gl_doubleentrykoff does not exist on replica replication
This can break replication. If you understand the risks, specify --no-check-slave-tables to disable this check.
06-09T19:24:59 Error checksumming table subs_270315_9nkd3.imp_gl_doubleentrykoff: DBD::mysql::db selectrow_hashref failed: Table ‘subs_270315_9nkd3.imp_gl_doubleentrykoff’ doesn’t exist [for Statement “EXPLAIN SELECT * FROM subs_270315_9nkd3.imp_gl_doubleentrykoff WHERE 1=1”] at /usr/bin/pt-table-checksum line 6521.
The dsns table on the master should have the connection information for the slave. The dsns table on the slave can be the same as the master since normally it would be replicated.
The tool is saying that those tables do not exist or have a different structure on the slave. If you think that is wrong, you can try the --no-check-slave-tables option it mentions. But as it tells you, that will break replication if the tables are actually not there or have the wrong structure, so use it at your own risk.
I would also verify the connection information in the master's dsns table and make sure that the IP and username/password actually work for connecting to the slave from the host where you run pt-table-checksum.
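As a rough illustration of what the dsns row should look like, here is a sketch that prints the INSERT statement for one slave. The helper name dsn_insert_sql, the address 192.0.2.10 and the user checksum_user are placeholders, not values from this thread. It assumes the dsns table follows the layout in the pt-table-checksum documentation (id, parent_id, dsn).

```shell
# Hypothetical helper: print the SQL that registers one slave in percona.dsns.
# The DSN uses the usual Percona Toolkit keys: h=host, P=port, u=user.
# The password is left out on purpose; add p=... if your setup needs it.
dsn_insert_sql() {
    host=$1
    port=$2
    user=$3
    printf "INSERT INTO percona.dsns (dsn) VALUES ('h=%s,P=%s,u=%s');\n" \
        "$host" "$port" "$user"
}

# Placeholder values: use the slave's address here, not the master's.
dsn_insert_sql 192.0.2.10 3306 checksum_user

# To confirm the same details really connect from the checksum host
# (commented out because it needs a live server):
#   mysql -h 192.0.2.10 -P 3306 -u checksum_user -p -e 'SELECT 1'
```

Run the printed statement on the master. The point is that the host in the dsn column is the slave, because that is where the tool connects to compare checksums.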
When I tried to check all databases, it returned the output below:
:~$pt-table-checksum --no-check-replication-filters --host=192.168.104.197 --port=3306 --user=root --password= --recursion-method dsn=D=percona,t=dsns
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
Waiting to check replicas for differences: 0% 00:00 remain
I tried to debug it with the command below:
:~$PTDEBUG=1 pt-table-checksum --no-check-replication-filters --host=192.168.104.197 --port=3306 --user=root --password= --recursion-method dsn=D=percona,t=dsns > /tmp/pt.log 2>&1
Waiting to check replicas for differences: 0% 00:00 remain
pt_table_checksum:11144 5583 Sleep 1.25 waiting for chunks
pt_table_checksum:11111 5583 replication max chunk: undef
pt_table_checksum:11144 5583 Sleep 1.25 waiting for chunks
pt_table_checksum:11111 5583 replication max chunk: undef
Should I wait longer for it to get past "Waiting to check replicas for differences"?
Are your replication filters still preventing some of the databases from being replicated? If it works on one of them, then the problem is likely in your replication configuration.
06-25T18:19:31 Skipping table subs_270315_dbpxq.mc_postcode because on the master it would be checksummed in one chunk but on these replicas it has too many rows:
2819 rows on replication
The current chunk size limit is 2794 rows (chunk size=1397 * chunk size limit=2.0).
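The numbers in that message can be checked by hand: the row limit is the chunk size multiplied by the chunk size limit factor. A small sketch of the arithmetic, using the figures from the message. The helper names max_rows and fits_in_limit are made up, and the tool's internal check may differ in detail.

```shell
# Hypothetical helpers that reproduce the arithmetic from the message above.
# max_rows: chunk size multiplied by the chunk-size-limit factor.
max_rows() {
    awk -v size="$1" -v factor="$2" 'BEGIN { printf "%d\n", size * factor }'
}

# fits_in_limit ROWS CHUNK_SIZE FACTOR: succeed if ROWS is within the limit.
fits_in_limit() {
    [ "$1" -le "$(max_rows "$2" "$3")" ]
}

max_rows 1397 2.0    # prints 2794, the limit quoted in the message
max_rows 1397 3      # prints 4191

# 2819 rows on the replica is over 2794 but under 4191.
if fits_in_limit 2819 1397 2.0; then echo "checked"; else echo "skipped"; fi
if fits_in_limit 2819 1397 3;   then echo "checked"; else echo "skipped"; fi
```

So with the factor at 2.0 the table is 25 rows over the limit, and a factor of 3 leaves room for it.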
Try raising --chunk-size-limit to 3 and give it another go. Before you do, take a look at the link below so you know what is happening, and note the warning that it can cause additional load on your system: