Can we stop Prometheus in Docker gracefully?

Hi,

I have a question about stopping Prometheus in Docker.

When I restart the PMM Docker container, Prometheus sometimes crashes, and the log looks like this:

.
.
.
time="2018-01-10T06:53:00Z" level=info msg="5170000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5180000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5190000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5200000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5210000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5220000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5230000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:01Z" level=info msg="5240000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5250000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5260000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5270000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5280000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5290000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5300000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5310000 archived metrics checked." source="crashrecovery.go:418"
time="2018-01-10T06:53:02Z" level=info msg="5320000 archived metrics checked." source="crashrecovery.go:418"
.
.
.

I think this log comes from the crash recovery process in Prometheus, but the recovery takes too long.

So I would like to know how to stop Prometheus gracefully, without a crash.

Thanks.

Hi,

What PMM version are you running? Also, how much memory have you allocated for metrics?

Hi,

My PMM version is 1.5.3.

I allocated 40 GB of memory for Prometheus, and the Prometheus data files total 174 GB.

Thanks.

$ docker run -d \
    -p 80:80 \
    -e METRICS_MEMORY=48318382080 -e METRICS_RETENTION=2160h -e QUERIES_RETENTION=2160h -e METRICS_RESOLUTION=5s \
    --volumes-from pmm-data \
    --name pmm-server \
    --restart always \
    percona/pmm-server:latest

Hi bckim,

The root of the issue: the "docker stop" command sends a TERM signal, waits 10 seconds by default, and then sends a KILL signal.
A very large Prometheus installation cannot dump gigabytes of memory to disk in 10 seconds.
Workaround: use the "docker stop -t 300 pmm-server" command.

You can also add the "--stop-timeout 600" option to the "docker run" command; in that case a plain "docker stop" should work better.
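
As a rough sketch of the workaround above, the shell functions below stop a container with a longer timeout and then check whether Docker still had to kill it. The names graceful_stop and was_sigkilled are made up for illustration. The check assumes the container's exit code is 137 (128 + 9) when its main process was ended by the KILL signal; that is the usual convention, but it is worth verifying on your own setup.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Docker or PMM.

# Exit code 137 = 128 + 9 (SIGKILL). Assumption: this means the stop
# timeout expired before the container finished shutting down.
was_sigkilled() {
    [ "$1" -eq 137 ]
}

# Usage: graceful_stop CONTAINER [TIMEOUT_SECONDS]
graceful_stop() {
    container="$1"
    timeout="${2:-300}"

    # Send TERM, then wait up to $timeout seconds before Docker sends KILL.
    docker stop -t "$timeout" "$container" >/dev/null || return 1

    # Read the exit code Docker recorded for the stopped container.
    code=$(docker inspect -f '{{.State.ExitCode}}' "$container") || return 1

    if was_sigkilled "$code"; then
        echo "$container was killed after ${timeout}s; try a larger timeout" >&2
        return 2
    fi
    echo "$container exited with code $code"
}

# Example: graceful_stop pmm-server 300
```

If the function returns 2, the timeout was probably still too short for Prometheus to flush its data, and crash recovery will likely run again on the next start.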

Thank you.

I was able to solve this problem with your help.