
My graphs in Grafana sometimes work and sometimes don't

seagen (Contributor)
I installed PMM 1.0.4 yesterday. At first, with only one or two test hosts, the graphs worked. They stopped working after I added several more hosts to pmm-server one by one.
# pmm-admin list
RUNNING is all YES

# pmm-admin check-network --no-emoji
* Client --> Server all is OK
* Client <-- Server all is PROBLEM (I already checked the firewall; it is stopped.)

The following is one of my graphs.

Please give me any advice, thanks so much. BTW, PMM is so cool:P

Comments

  • weber (Advisor)
    Open http://server/prometheus/targets to see the status of each endpoint.
    You can also check the log file: enter the container with "docker exec -ti pmm-server bash", then run "vi /var/log/prometheus.log".

    This usually happens when there is network latency between the server and the clients.

    Another thing to test is whether 1s resolution is too much for the system resources of the monitoring server (where the container runs) and for the network latency.
    You can try 5s and see if it works better: https://www.percona.com/doc/percona-monitoring-and-management/faq.html#what-resolution-is-used-for-metrics
  • seagen (Contributor)
    I found the following error in prometheus.log while the graphs were not working.
    --
    time="2016-09-28T08:13:57Z" level=error msg="Storage needs throttling. Scrapes and rule evaluations will be skipped." chunksToPersist=78816 maxChunksToPersist=524288 maxToleratedMemChunks=288358 memoryChunks=300294 source="storage.go:707"

    This afternoon I also tried changing settings in prometheus.yml, such as scrape_interval and scrape_timeout, and then restarted pmm-server, but the issue is still there. :(

    As you suggested, I added the option; the output is below (docker create hits the same error).

    # docker run -d -p 80:80 -m METRICS_RESOLUTION=5s --volumes-from pmm-data2 --name pmm-server2 --restart always percona/pmm-server:1.0.4
    docker: invalid size: 'METRICS_RESOLUTION=5s'.
    See 'docker run --help'.
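
The "invalid size" error above comes from the flag rather than from PMM: -m is docker's memory-limit option, so docker tried to read METRICS_RESOLUTION=5s as a size. Environment variables are passed with -e. A minimal sketch of the corrected command, reusing the container names and image tag from the failing command; the pmm_run_cmd helper is hypothetical and only prints the command so it can be reviewed before running:

```shell
#!/bin/sh
# pmm_run_cmd RESOLUTION: print a docker run command that passes the
# metrics resolution as an environment variable (-e) instead of -m.
# Names, port and image tag are copied from the failing command above.
pmm_run_cmd() {
    echo "docker run -d -p 80:80 -e METRICS_RESOLUTION=$1" \
         "--volumes-from pmm-data2 --name pmm-server2 --restart always" \
         "percona/pmm-server:1.0.4"
}

# Dry run: prints the command. Pipe the output to sh to execute it.
pmm_run_cmd 5s
```
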
  • weber (Advisor)
    How many endpoints do you have in Prometheus? Or time series? (See the Prometheus dashboard.)
    Maybe the 256M of memory dedicated to Prometheus is not enough: https://www.percona.com/doc/percona-monitoring-and-management/faq.html#how-to-control-memory-consumption-for-prometheus
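
The throttling message quoted earlier in the thread carries its own limits, so it can be checked directly. The sketch below pulls the counters out of that log line and reports which one is over its limit. It assumes, as the field names suggest, that throttling starts when memoryChunks exceeds maxToleratedMemChunks or chunksToPersist exceeds maxChunksToPersist; the helper names are made up for illustration.

```shell
#!/bin/sh
# Log line copied verbatim from prometheus.log as quoted in this thread.
line='time="2016-09-28T08:13:57Z" level=error msg="Storage needs throttling. Scrapes and rule evaluations will be skipped." chunksToPersist=78816 maxChunksToPersist=524288 maxToleratedMemChunks=288358 memoryChunks=300294 source="storage.go:707"'

# field NAME LINE: print the integer value of " NAME=<digits>" in LINE.
field() {
    printf '%s\n' "$2" | sed -n "s/.*[[:space:]]$1=\([0-9][0-9]*\).*/\1/p"
}

# throttle_reason LINE: report which counter is above its limit.
# Assumption: these two comparisons are what triggers throttling.
throttle_reason() {
    persist=$(field chunksToPersist "$1")
    max_persist=$(field maxChunksToPersist "$1")
    mem=$(field memoryChunks "$1")
    max_mem=$(field maxToleratedMemChunks "$1")
    if [ "$mem" -gt "$max_mem" ]; then
        echo "memoryChunks $mem > maxToleratedMemChunks $max_mem"
    elif [ "$persist" -gt "$max_persist" ]; then
        echo "chunksToPersist $persist > maxChunksToPersist $max_persist"
    else
        echo "no limit exceeded"
    fi
}

throttle_reason "$line"
```

For the quoted line this reports that memoryChunks (300294) is above maxToleratedMemChunks (288358), which fits the suggestion that Prometheus has too little memory.
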
  • seagen (Contributor)
    Morning Roman, thanks so much for your helpful hints.

    Yesterday I removed all hosts from pmm-server and then added 5 new hosts back. So far all graphs are working. As you suggested, I went to prometheus/targets and found that all endpoints are UP except one (42002/metrics-lr), which is DOWN with the error "context deadline exceeded".

    My PMM server is a virtual machine with 4G of RAM and 2 cores. mysqld used to run on it as well, but I stopped it yesterday. I am not sure whether that is enough resources for Prometheus.

    Also, what do metrics-hr, metrics-mr and metrics-lr mean?
  • weber (Advisor)
    metrics-hr - 1s resolution metrics
    metrics-mr - 5s resolution metrics
    metrics-lr - 60s resolution metrics

    metrics-lr includes global variables and more intensive stats like table stats, user stats etc.

    "pmm-admin add mysql --help" has the following flags:
    --disable-binlogstats disable binlog statistics
    --disable-processlist disable process state metrics
    --disable-tablestats disable table statistics (disabled automatically with 10000+ tables)
    --disable-userstats disable user statistics

    How many tables do you have? SELECT COUNT(*) FROM information_schema.tables

    For 5 hosts I recommend bumping Prometheus memory to 1024M, since you say the VM has 4G.
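
To apply the 1024M suggestion, the memory limit can be passed when the container is created. This is a sketch under assumptions: it supposes the METRICS_MEMORY variable described in the FAQ linked earlier in the thread, and that it takes a value in kilobytes; check the FAQ for your PMM version. Container names and image tag are reused from earlier in the thread, and both helper functions are hypothetical.

```shell
#!/bin/sh
# mb_to_kb MB: convert megabytes to kilobytes (1 MB = 1024 KB).
mb_to_kb() {
    echo $(( $1 * 1024 ))
}

# pmm_run_with_memory MB: print a docker run command that sets the
# Prometheus memory limit. Assumption: METRICS_MEMORY expects kilobytes.
pmm_run_with_memory() {
    echo "docker run -d -p 80:80 -e METRICS_MEMORY=$(mb_to_kb "$1")" \
         "--volumes-from pmm-data2 --name pmm-server2 --restart always" \
         "percona/pmm-server:1.0.4"
}

# Dry run: prints the command for 1024M instead of executing it.
pmm_run_with_memory 1024
```
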
  • seagen (Contributor)
    There are almost 5000 tables in the instance.

    Aye, I have already started learning Prometheus, which is such a huge and powerful system...
  • weber (Advisor)
    It looks like 5000 tables is still too many to return various metrics for each one. Disabling table stats (re-adding mysql:metrics with --disable-tablestats) should bring the mysql-lr job back up.
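
The re-add described above could look like the sketch below. It assumes that "pmm-admin rm mysql:metrics" removes the existing service and that no extra connection flags are needed on this host; both are assumptions, so compare with "pmm-admin --help" first. The run helper is hypothetical and prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove the metrics service and add it back with table stats disabled.
# Assumption: "rm mysql:metrics" is the matching removal command.
readd_without_tablestats() {
    run pmm-admin rm mysql:metrics
    run pmm-admin add mysql:metrics --disable-tablestats
}

readd_without_tablestats
```
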
  • seagen (Contributor)
    You are right, Roman. metrics-lr is now up with --disable-tablestats. Thanks a lot!
  • weber (Advisor)
    Thanks for checking. I think we should lower the table count at which table stats are disabled automatically.
  • seagen (Contributor)
    That would be nice.

    Roman, another issue has come up, this time with the mongodb graphs :( I remember that the first time I added a mongodb server, all graphs worked. Today I added the mongo server to the pmm server again, but not all graphs are working: command operations sec, document operations, getLastError-xxx, oplog insert time, Memory fault and others show no graph.

    I then went to http://server/prometheus/graph and ran the metric queries manually, and I did get values. Please give me some advice.

    Thanks.
  • weber (Advisor)
    If you added the mongodb instance w/o the nodetype, replset flags etc., then you should see the graphs only on the Standalone instance dashboard. We plan to make nodetype and replset auto-discovered so that this is not needed.
  • seagen (Contributor)
    I already added --replset repset --nodetype mongod --uri mongodb://xxxx

    Before, I could get all graphs on the ReplSet type. Right now the graphs mentioned above are empty on both the Standalone instance and Replica set dashboards.
This discussion has been closed.
