Could we stop Prometheus in docker gracefully?

bckim · January 10, 2018, 1:24am

Hi,

I have one question about stopping Prometheus in Docker.

When I restart the PMM Docker container, Prometheus can crash like this:

.
.
.
time="2018-01-10T06:53:00Z" level=info msg="5170000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5180000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5190000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5200000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5210000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5220000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5230000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5240000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5250000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5260000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5270000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5280000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5290000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5300000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5310000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5320000 archived metrics checked." source="crashrecovery.go:418"
.
.
.

I think this log comes from the crash recovery process in Prometheus.

But the recovery takes too long.

So I would like to know how to stop Prometheus gracefully, without a crash.

Thanks.

Peter · January 10, 2018, 9:00am

Hi,

Which PMM version are you running? Also, how much memory have you allocated for metrics?

bckim · January 10, 2018, 6:37pm

Hi,

My PMM version is 1.5.3.

I allocated 40 GB of memory for Prometheus,

and the Prometheus data files total 174 GB.

Thanks.

$ docker run -d \
  -p 80:80 \
  -e METRICS_MEMORY=48318382080 -e METRICS_RETENTION=2160h -e QUERIES_RETENTION=2160h -e METRICS_RESOLUTION=5s \
  --volumes-from pmm-data \
  --name pmm-server \
  --restart always \
  percona/pmm-server:latest

Mykola · January 11, 2018, 1:31am

Hi bckim,

The root of the issue: the "docker stop" command sends a TERM signal, Docker waits 10 seconds, and then it sends a KILL signal.
A very large Prometheus installation cannot dump gigabytes of memory to disk in 10 seconds.
Workaround: use the "docker stop -t 300 pmm-server" command, which extends the wait to 300 seconds.
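
The TERM, wait, KILL sequence described here can be reproduced on an ordinary process, without Docker. The following is only a minimal sketch: `stop_with_grace` and `start_slow_worker` are hypothetical stand-ins for "docker stop" and for Prometheus, and the 2-second flush time is an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper imitating the sequence described for "docker stop":
# send TERM, wait up to a grace period (seconds), then send KILL.
# Returns 0 if the process exited on its own, 1 if it had to be killed.
stop_with_grace() {
    pid=$1
    grace=$2
    kill -TERM "$pid" 2>/dev/null || return 0   # process is already gone
    waited=0
    while kill -0 "$pid" 2>/dev/null; do        # still running?
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null       # grace period used up
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Stand-in for a server that needs about 2 seconds to flush data on TERM.
# The trailing sleep gives the worker time to install its TERM handler.
start_slow_worker() {
    sh -c 'trap "sleep 2; exit 0" TERM; while :; do sleep 1; done' &
    worker_pid=$!
    sleep 1
}

# A 1-second grace period is shorter than the flush, so KILL is sent.
start_slow_worker
if stop_with_grace "$worker_pid" 1; then short_result=clean; else short_result=killed; fi

# A 10-second grace period leaves enough time for a clean exit.
start_slow_worker
if stop_with_grace "$worker_pid" 10; then long_result=clean; else long_result=killed; fi

echo "grace 1s: $short_result, grace 10s: $long_result"
```

The same reasoning applies to the container: the grace period has to be longer than the time Prometheus needs to write its in-memory data to disk.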

Mykola · January 11, 2018, 10:07am

Also, you can add the "--stop-timeout 600" option to the "docker run" command. In that case, a plain "docker stop" command should work better.
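
Putting the two suggestions together, a small wrapper along these lines may be convenient. It is a sketch, not an official PMM script: the function names and the `DOCKER` override are assumptions made for illustration, and the `-e` options from the original "docker run" command are left out for brevity. The exit-code check relies on the usual convention that a process ended by KILL reports 128 + 9 = 137.

```shell
#!/bin/sh
# Set DOCKER=echo to print the commands instead of running them (dry run).
DOCKER=${DOCKER:-docker}

# Start the PMM server with a default stop timeout of $1 seconds.
# Add the -e METRICS_* options from your own setup as needed.
run_pmm_server() {
    $DOCKER run -d \
        -p 80:80 \
        --volumes-from pmm-data \
        --name pmm-server \
        --restart always \
        --stop-timeout "$1" \
        percona/pmm-server:latest
}

# Stop container $1, waiting up to $2 seconds before KILL is sent.
stop_container() {
    $DOCKER stop -t "$2" "$1"
}

# True if exit code $1 is 137, the value reported after a KILL signal.
exit_code_means_killed() {
    [ "$1" -eq 137 ]
}

# Example usage (commented out so that nothing runs by accident):
#   run_pmm_server 600
#   stop_container pmm-server 300
#   code=$(docker inspect --format '{{.State.ExitCode}}' pmm-server)
#   exit_code_means_killed "$code" && echo "container was killed, not stopped cleanly"
```

If the exit code after a stop is still 137, the timeout was probably too short for the amount of data held in memory.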

bckim · January 11, 2018, 6:47pm

Thank you.

I was able to solve this problem with your help.
