
PMM MySQL low-res exporter : context deadline exceeded

babine (Entrant)

Hello,

I have a (mostly) working PMM2 (2.10.0) install. However, I noticed that some dashboards are mostly empty or show very few metrics (lots of gaps).

After investigating, it seems that all the "low-res" Prometheus targets take too long to scrape (they show as "Unhealthy" on the http://pmmserver/prometheus/targets page), for example:

Get "http://devdbserver:42000/metrics?collect%5B%5D=binlog_size&collect%5B%5D=custom_query.lr&collect%5B%5D=engine_tokudb_status&collect%5B%5D=global_variables&collect%5B%5D=heartbeat&collect%5B%5D=info_schema.clientstats&collect%5B%5D=info_schema.innodb_tablespaces&collect%5B%5D=info_schema.userstats&collect%5B%5D=perf_schema.eventsstatements&collect%5B%5D=perf_schema.file_instances": context deadline exceeded


This is a dev server on which we have "quite some" tables (which is nowhere near what is on production servers) :

# find /var/lib/mysql -name '*.ibd' | wc -l
98301


Getting the metrics locally (to avoid possible network issues) takes around 17s :

# time curl http://localhost:42000/metrics-lr -u 'pmm:/agent_id/********' > /tmp/out.txt
 % Total  % Received % Xferd Average Speed  Time  Time   Time Current
                 Dload Upload  Total  Spent  Left Speed
100 145M  0 145M  0   0 8585k   0 --:--:-- 0:00:17 --:--:-- 35.3M
curl http://localhost:42000/metrics-lr -u  > /tmp/out.txt  0.02s user 0.14s system 0% cpu 17.394 total

The output is 145 MB, for around 1 million lines.


The most common metrics are the following:

$ grep -v '^#' /tmp/out.txt | cut -f1 -d'{' | sort | uniq -c | sort -h | tail
    250 mysql_perf_schema_events_statements_sort_rows_total
    250 mysql_perf_schema_events_statements_tmp_disk_tables_total
    250 mysql_perf_schema_events_statements_tmp_tables_total
    250 mysql_perf_schema_events_statements_total
    250 mysql_perf_schema_events_statements_warnings_total
  98145 mysql_info_schema_innodb_tablespace_allocated_size_bytes
  98145 mysql_info_schema_innodb_tablespace_file_size_bytes
  98145 mysql_info_schema_innodb_tablespace_space_info
 396528 mysql_perf_schema_file_instances_bytes
 396528 mysql_perf_schema_file_instances_total

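The count above shows how many samples each metric has, but not how much of the 145 MB each one accounts for. A minimal sketch that adds a byte total per metric name, assuming the dump is in the Prometheus text format and saved as in the curl command above (the function name `summarize_metrics` is made up for illustration):

```shell
# Summarize a Prometheus text-format dump: bytes and sample count per metric name.
# Hypothetical helper, not part of PMM. Usage: summarize_metrics /tmp/out.txt
summarize_metrics() {
  awk '
    /^#/ || /^$/ { next }            # skip HELP/TYPE comments and blank lines
    {
      name = $0
      sub(/[{ ].*/, "", name)        # metric name ends at the first "{" or space
      count[name]++
      bytes[name] += length($0) + 1  # +1 for the trailing newline
    }
    END {
      # columns: bytes, samples, metric name
      for (n in count) printf "%d\t%d\t%s\n", bytes[n], count[n], n
    }
  ' "$1" | sort -n
}
```

Running `summarize_metrics /tmp/out.txt | tail` would list the heaviest metrics last, in the same spirit as the `uniq -c` pipeline.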

Is there a way to increase the timeout or maybe not export some of these metrics ?


(For the record HR and MR targets take less than 0.1s and 1s respectively from a remote server)
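To see which low-res collector is responsible for most of the scrape time, each one can be requested on its own. This is only a sketch: the collector names are copied from the failing target URL above, while the helper names `build_metrics_url` and `time_collectors` are hypothetical, and it assumes the exporter accepts a request with a single `collect[]` parameter.

```shell
# Collector names copied from the failing low-res target URL.
LR_COLLECTORS="binlog_size custom_query.lr engine_tokudb_status global_variables heartbeat info_schema.clientstats info_schema.innodb_tablespaces info_schema.userstats perf_schema.eventsstatements perf_schema.file_instances"

# build_metrics_url BASE COLLECTOR...
# Print BASE/metrics with one URL-encoded collect[] parameter per collector.
build_metrics_url() {
  base=$1
  shift
  query=""
  for c in "$@"; do
    query="${query:+$query&}collect%5B%5D=$c"
  done
  printf '%s/metrics?%s\n' "$base" "$query"
}

# time_collectors BASE USER:PASSWORD
# Scrape each collector alone and print "seconds<TAB>collector", slowest last.
time_collectors() {
  for c in $LR_COLLECTORS; do
    t=$(curl -s -o /dev/null -u "$2" -w '%{time_total}' "$(build_metrics_url "$1" "$c")")
    printf '%s\t%s\n' "$t" "$c"
  done | sort -n
}
```

For example: `time_collectors http://localhost:42000 'pmm:/agent_id/********'`. The URL is always passed quoted, so the shell does not interpret the `&` separators.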

Answers

  • Agustin G (Percona Staff)

    Hi babine,

    A few days ago we created the following bug to track this: https://jira.percona.com/browse/PMM-6744, which is most likely what you are seeing. Can you double-check whether collecting everything except perf_schema.file_instances makes the curl command take less than 10 seconds for you too?

    time curl -u 'pmm:/agent_id/********' 'http://devdbserver:42000/metrics?collect%5B%5D=binlog_size&collect%5B%5D=custom_query.lr&collect%5B%5D=engine_tokudb_status&collect%5B%5D=global_variables&collect%5B%5D=heartbeat&collect%5B%5D=info_schema.clientstats&collect%5B%5D=info_schema.innodb_tablespaces&collect%5B%5D=info_schema.userstats&collect%5B%5D=perf_schema.eventsstatements' >/dev/null
    


    Best,

    Agustín.

  • babine (Entrant)

    Thanks for the answer, good news !


    On the dev server itself :

    # time curl 'http://localhost:42000/metrics?collect%5B%5D=binlog_size&collect%5B%5D=custom_query.lr&collect%5B%5D=engine_tokudb_status&collect%5B%5D=global_variables&collect%5B%5D=heartbeat&collect%5B%5D=info_schema.clientstats&collect%5B%5D=info_schema.innodb_tablespaces&collect%5B%5D=info_schema.userstats&collect%5B%5D=perf_schema.eventsstatement' -u 'pmm:/agent_id/********' > /tmp/out.txt 
     % Total  % Received % Xferd Average Speed  Time  Time   Time Current
                     Dload Upload  Total  Spent  Left Speed
    100 37.4M  0 37.4M  0   0 11.4M   0 --:--:-- 0:00:03 --:--:-- 11.4M
    curl -u 'pmm:/agent_id/********' > /tmp/out.txt 0.00s user 0.05s system 1% cpu 3.336 total
    

    It takes a bit more than 3 seconds for 38 MB (locally).


    I tried modifying the prometheus.yml file in the PMM Server Docker container to change the timeout to 30 seconds, but it was somehow reverted to 10 seconds upon restart.

  • Agustin G (Percona Staff)

    Great! I suggest you follow that JIRA ticket, then, to get the latest updates on when it will be resolved.

    Regarding:

    >I tried modifying the prometheus.yml file in the PMM server docker to change the timeout to 30 seconds but it was somehow reverted to 10 seconds upon restart.

    Unfortunately, there is a maximum scrape_timeout set to 10s globally. Even if you change the scrape_interval to something greater, this is currently the maximum allowed (and it will be overwritten automatically if you manually change the config file, as you have already noted). You can always create a new feature request, if you think it will be worth it.
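Since the timeout cannot be raised, a quick way to tell whether a given collector set fits is to compare the measured scrape time against the 10s limit. A rough sketch, with hypothetical helper names (`fits_timeout`, `check_scrape`); the limit value is taken from the reply above, and a local measurement is only a lower bound on what the PMM server sees over the network:

```shell
# SCRAPE_TIMEOUT mirrors the 10s maximum mentioned above (assumed to be in seconds).
SCRAPE_TIMEOUT=10

# fits_timeout SECONDS
# Succeed if SECONDS (may be fractional) is strictly below SCRAPE_TIMEOUT.
fits_timeout() {
  awk -v t="$1" -v max="$SCRAPE_TIMEOUT" 'BEGIN { exit !(t + 0 < max + 0) }'
}

# check_scrape URL USER:PASSWORD
# Time one scrape with curl and report whether it fits within the timeout.
check_scrape() {
  t=$(curl -s -o /dev/null -u "$2" -w '%{time_total}' "$1")
  if fits_timeout "$t"; then
    echo "OK: ${t}s < ${SCRAPE_TIMEOUT}s"
  else
    echo "TOO SLOW: ${t}s >= ${SCRAPE_TIMEOUT}s"
  fi
}
```

With the timings reported in this thread, `fits_timeout 3.336` succeeds and `fits_timeout 17.394` fails.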


MySQL, InnoDB, MariaDB and MongoDB are trademarks of their respective owners.
Copyright ©2005 - 2020 Percona LLC. All rights reserved.