Combining --busy-time and --match-info in pt-kill

I can’t get pt-kill to work as expected when using --match-info to select a specific problematic query (the query is probably problematic because of a bug in MySQL, but that’s another question).

When I combine --busy-time and --match-info, pt-kill sometimes kills matching queries regardless of how long they have been running. Below is the output of a command that does not behave as expected:

$ pt-kill --verbose --busy-time=5s --user=root --victims=all --interval=5 --run-time=1s --match-info "MAX_EXECUTION_TIME.5000." --print --nostrip-comments
# 2019-01-16T11:39:31 /usr/bin/pt-kill starting
# 2019-01-16T11:39:31 Connected to host jed
# 2019-01-16T11:39:31 Find spec: $VAR1 = {
all => undef,
busy_time => '5',
idle_time => undef,
ignore => {
Command => undef,
Host => undef,
Id => 360385160,
Info => undef,
State => 'Locked',
User => undef,
db => undef
},
match => {
Command => undef,
Host => undef,
Info => 'MAX_EXECUTION_TIME.5000.',
State => undef,
User => undef,
db => undef
},
replication_threads => undef
};

# 2019-01-16T11:39:31 Run-time: 1 seconds at 5 second intervals
# 2019-01-16T11:39:31 Checking processlist
# 2019-01-16T11:39:31 Matched 1 queries
# 2019-01-16T11:39:31 KILL 360385156 (Execute 0 sec) SELECT
/*+ MAX_EXECUTION_TIME(5000) */
*
FROM
SomeTable
WHERE 
MATCH (body) AGAINST ('+abc*' IN BOOLEAN MODE)
ORDER BY quality DESC
LIMIT 1000
# 2019-01-16T11:39:31 Sleeping 5 seconds after normal interval
# 2019-01-16T11:39:36 /usr/bin/pt-kill ending

(The MAX_EXECUTION_TIME hint not taking effect is the likely MySQL bug.)

However, when I try to reproduce the problem with a synthetic query, pt-kill works as expected.

I have found the following bugs that are probably relevant:
[PT-1492] pt-kill in version 3.0.7 seems not to respect busy-time any longer - Percona JIRA
[PT-548] LP #1016272: pt-kill kills prepared statements without checking busy-time - Percona JIRA

But I cannot find a workaround that works. Does anyone know how I can get pt-kill to work for this application?

Regards,
Patrik

Try --each-busy-time instead of --busy-time=5s

Hi Carlos,
Thanks for your reply! I will try your suggestion.

However, I don’t really understand the docs when it comes to --each-busy-time. If every currently running query is in the default group, won’t any properly running query (i.e. a query running for much less than 1s) that executes at the same time as the problematic ones (e.g. running for more than 10s) prevent pt-kill from catching them? It seems to me that on a server running at several thousand QPS there is a high likelihood that the problematic queries run at the same time as properly running queries, which would prevent pt-kill from triggering.
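To make the concern concrete, here is a minimal sketch of how I read the "each" and "any" class semantics. It is not pt-kill code: class_match is a hypothetical helper that only simulates the rule on a list of Time values, and it assumes that "longer than" means strictly greater than and that all listed queries end up in the same class. Whether pt-kill builds the class before or after applying --match-info is something I have not verified.

```shell
# Hypothetical helper (not part of pt-kill): reads one query Time value
# (in seconds) per line on stdin and reports whether every query ("each")
# or at least one query ("any") has run longer than the limit in $1.
class_match() {
  awk -v limit="$1" '
    { n++; if (($1 + 0) > (limit + 0)) over++ }
    END {
      each = ((n > 0 && over == n) ? "yes" : "no")
      any  = ((over > 0) ? "yes" : "no")
      printf "each=%s any=%s\n", each, any
    }'
}

# One slow query (38s) alongside two fast ones (0s and 1s), limit 10s.
printf '38\n0\n1\n' | class_match 10
```

Under these assumptions the fast queries stop the "each" condition from holding for the class, even though one query is well over the limit.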

Regards,

Switching to the --each-busy-time option does not seem to work either. I have run the command for several days and can see in my pt-stalk logs that there are queries that should have been killed, such as the one below.

*************************** 9. row ***************************
Id: 18333911
User: xyz
Host: 1.2.3.4:58624
db: xyz
Command: Execute
Time: 38
State: Creating sort index
Info: SELECT
/*+ MAX_EXECUTION_TIME(5000) */
*
FROM
Suggestions
WHERE
MATCH (someColumn) AGAINST ('+somestring*' IN BOOLEAN MODE)
ORDER BY quality DESC
LIMIT 1000
Rows_sent: 0
Rows_examined: 0

The pt-kill command I used was the following.

pt-kill --each-busy-time=10s --victims=all --interval=5 --run-time=100000s --match-info "MAX_EXECUTION_TIME.5000." --kill --nostrip-comments
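
As far as I can tell from the pt-kill documentation, --busy-time applies to threads in Command=Query status, while the threads shown above are in Command: Execute, which may be related to what I am seeing. To cross-check which threads match on Info and Time alone, regardless of Command, something like the sketch below could be used. build_kill_sql is a hypothetical helper, not a pt-kill feature; it assumes information_schema.processlist is available and that the mysql client can connect with credentials from ~/.my.cnf.

```shell
# Hypothetical helper (not part of pt-kill): builds a query that lists
# KILL statements for threads selected on Info and Time only.
# $1 = minimum Time in seconds, $2 = SQL LIKE pattern for Info
build_kill_sql() {
  printf "SELECT CONCAT('KILL ', id, ';') FROM information_schema.processlist WHERE id <> CONNECTION_ID() AND time >= %d AND info LIKE '%s';\n" "$1" "$2"
}

# Print the generated SQL for inspection.
build_kill_sql 10 '%MAX_EXECUTION_TIME(5000)%'

# To list the candidate KILL statements for review (nothing is killed here),
# assuming a reachable server:
# mysql -N -e "$(build_kill_sql 10 '%MAX_EXECUTION_TIME(5000)%')"
```

The id <> CONNECTION_ID() condition keeps the checking query from matching itself, since its own Info also contains the pattern.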