Had to re-run pmm-admin config and pmm-admin add mysql after upgrade

Hi,
I just upgraded my PMM from 2.2.0 to 2.2.1, and after the upgrade the server had no nodes or anything else in the GUI.
I had to re-run pmm-admin config and pmm-admin add mysql on all nodes for them to reappear.

Is that really how it should work, or is PMM 2.0 still considered unstable?

All historical data for the nodes was gone as well. After re-adding the graphs got populated again.

BR
Johan

Has no one else encountered this?

Hm. Very strange. What PMM deployment method did you use? Docker? Virtual image, etc.?

Also, why would you upgrade to 2.2.1 instead of 2.6.0, which is the latest?

How did you perform the upgrade? If you did a container upgrade, there are a number of ways you can lose data (a non-persistent storage volume, or a different ‘docker run’ command to convert the image to a container), but if you did it via the upgrade panel in the UI, I’m not aware of any issues that would have led to data loss. I once had an update through the UI fail when I lost internet in the middle of the process, but we’ve since added checks to handle that more gracefully. Can you provide any more details? I could certainly try to recreate this in my lab to see if there’s a workflow we didn’t consider.

@“steve.hoffman” I used the same commands I used when upgrading PMM v1:
$ sudo docker stop pmm-server
$ sudo docker rm pmm-server
$ sudo docker run -d -i -t -p 80:80 --volumes-from pmm-data --name pmm-server --restart always -e METRICS_RETENTION=Xh --env METRICS_RESOLUTION=5s --env METRICS_MEMORY=Y percona/pmm-server:<version>
@Peter: I haven’t tried any more upgrades since installing 2.1.0 and then upgrading to 2.2.0 and 2.2.1. Both the 2.2.0 and 2.2.1 upgrades meant losing data and having to reconnect all clients. I run it on 60 QA databases, so it takes a while to do.
Correction, the version now is 2.4.0

That command looks correct, which makes me wonder if there’s a problem with the pmm-data container. I suppose the first thing to check is whether it exists (it won’t be running, so you’ll need ‘docker ps -a’ to see it). If it’s there, do you happen to have the command you created the data container with in the first place? What you describe could be the result of changing the /srv path to something else, say /data. We store all the DBs in /srv by default, so if your persistent volume maps to something like /data while we put the Prometheus, PostgreSQL and ClickHouse DBs in /srv, nothing is preserved when you delete your container. Also, I’m going to see if I can move this post over to the PMM2 board so that others can see this discussion and possibly weigh in.
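
For anyone who wants to script that first check, here is a minimal sketch. The helper name container_exists is made up for illustration. It lists every container name with ‘docker ps -a’ and looks for an exact match, so a data container that is stopped or was never started still counts.

```shell
# container_exists NAME
# Succeeds if a container with exactly this name exists, running or not.
container_exists() {
    # 'docker ps -a' also lists stopped and never-started containers;
    # grep -x matches the whole line, -F treats the name as a literal string.
    docker ps -a --format '{{.Names}}' | grep -qxF -- "$1"
}

# Example use (not run automatically):
#   container_exists pmm-data && echo "pmm-data is present"
```

An exact match matters here because a plain substring search for pmm-server would also hit a container such as pmm-server-backup.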

Hi @“steve.hoffman”
Thanks for looking into this. The output from ‘docker ps -a’ is:
root@pmm-server:[QA]:~# docker ps -a
CONTAINER ID   IMAGE                       COMMAND                CREATED         STATUS                      PORTS                                      NAMES
749009c7dd0b   percona/pmm-server:2.4.0    "/opt/entrypoint.sh"   7 weeks ago     Up 2 days (unhealthy)       0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   pmm-server
45b0fd9580de   percona/pmm-server:2.2.0    "/bin/true -g /var…"   3 months ago    Created                                                                pmm-data
68f1bee12b82   percona/pmm-server:1.17.1   "/opt/entrypoint.sh"   15 months ago   Exited (137) 3 months ago                                              pmm-server-backup
The command used to create the docker instance is the same as above:
 sudo docker run -d -i -t -p 80:80 --volumes-from pmm-data --name pmm-server --restart always -e METRICS_RETENTION=Xh --env METRICS_RESOLUTION=5s --env METRICS_MEMORY=Y percona/pmm-server:<version>

So I see the pmm-data container DOES exist, so the only other thing I can think of is that it is mounted to a different location than /srv.
Check your command history to see if you can find anything along the lines of:

docker create -v /srv --name pmm-data percona/pmm-server:2.2.0 /bin/true

If you can’t find that, you can run ‘docker inspect pmm-server’ and look in the “Mounts” section to see where your data volume maps to. It should be /srv. I once typed it as /srvb and went through the same agony as you: only after I had re-registered and enabled a dozen systems did I look back at my command history and spot my fat-fingered mistake. I had a data volume, but PMM was writing nothing to it. Sorry to start off so basic in the hunt, but I want to eliminate the obvious stuff.
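
A rough sketch of the ‘docker inspect’ check described above, assuming you only care about the mount destinations rather than the full JSON. The function names container_mounts and mounts_srv are hypothetical. The format string walks the Mounts list that ‘docker inspect’ reports and prints one destination per line.

```shell
# container_mounts NAME
# Prints one mount destination per line for the given container.
container_mounts() {
    docker inspect --format '{{ range .Mounts }}{{ println .Destination }}{{ end }}' "$1"
}

# mounts_srv NAME
# Succeeds only if one of the container's mounts is exactly /srv,
# so a mistyped path such as /srvb does not pass.
mounts_srv() {
    container_mounts "$1" | grep -qx '/srv'
}

# Example use (not run automatically):
#   mounts_srv pmm-server || echo "no /srv mount on pmm-server"
```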

Hi @“steve.hoffman” The only create command I have used is this:
sudo docker create -v /opt/prometheus/data -v /opt/consul-data -v /var/lib/mysql -v /var/lib/grafana --name pmm-data percona/pmm-server:2.2.0 /bin/true -g /var/lib/docker
I used the same type of command when creating the docker container for PMM v1, and upgrades have worked there from 1.0.7 to 1.17.1.
No worries about starting basic, it’s usually where the errors are discovered.

Well, I see the issue. And I have to own this one :disappointed: The docs were not clear about which PMM version a given set of instructions applied to, so it was very easy to follow the PMM1 docs, simply change the version in the commands, and experience what felt like success. (We have made, and continue to make, improvements here to reduce this confusion!)

Long story short, PMM2’s storage has been centralized in /srv: Prometheus, ClickHouse, PostgreSQL, Alertmanager, Grafana, logs, configs, you name it all go into one location, whereas PMM1 needed separate volume maps for MySQL, Prometheus, etc. That means your pmm-server container for PMM2 was correctly importing the volumes from your data container, but PMM2 never wrote data to any of the paths in your -v options above. It was all going to /srv inside the pmm-server container, which was removed by the ‘docker rm pmm-server’ command you referenced above.

If you run docker rm again on your current setup, you will lose all your data again. Let me talk to the team to see if there’s a way to create a new pmm-data container that maps the volume /srv (a MUST for PMM2) AND get the data out of the pmm-server container’s /srv directory into the pmm-data container. That way, when you start pmm-server, it just picks up where it left off, not realizing that /srv used to be local to the pmm-server container but is now a mount from the pmm-data container.
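
One possible shape for such a migration, as an untested sketch and not a procedure confirmed by the PMM team: copy /srv out of the stopped pmm-server container with ‘docker cp’, create a new data container that declares /srv, and copy the files in. The function name migrate_srv, the new container name and the work directory are all assumptions supplied by the caller. File ownership inside /srv may still need checking afterwards.

```shell
# migrate_srv OLD_CONTAINER NEW_DATA_CONTAINER IMAGE WORKDIR
# Copies /srv out of OLD_CONTAINER into WORKDIR, then into a freshly
# created data-only container that declares /srv as a volume.
migrate_srv() {
    old=$1
    data=$2
    image=$3
    workdir=$4

    docker stop "$old" || return 1        # stop writes before copying
    mkdir -p "$workdir" || return 1
    # -a (archive mode) asks docker cp to keep uid/gid information.
    docker cp -a "$old":/srv "$workdir" || return 1
    # Data-only container with the single volume path PMM2 writes to.
    docker create -v /srv --name "$data" "$image" /bin/true || return 1
    # The trailing /. copies the directory contents into the new /srv.
    docker cp -a "$workdir"/srv/. "$data":/srv || return 1
}

# Example use (hypothetical names, not run automatically):
#   migrate_srv pmm-server pmm-data-new percona/pmm-server:2.4.0 /root/pmm-srv-copy
```

Nothing here removes the old container, so the original /srv stays available if the copy turns out to be incomplete.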

Hi,
No worries. I have PMM2 only in QA, and if I wipe it all again it doesn’t really matter.
Is this the correct way of setting up a docker image for PMM2 and then updating it?

Setting up PMM 2.0
--------------------------
$ docker pull percona/pmm-server:2.4.0
$ docker create -v /srv --name pmm-data percona/pmm-server:2.4.0 /bin/true -g /var/lib/docker
$ docker run -d -i -t -p 80:80 -p 443:443 --volumes-from pmm-data --name pmm-server --restart always -e METRICS_RETENTION=192h --env METRICS_RESOLUTION=5s --env METRICS_MEMORY=2097152 percona/pmm-server:2.4.0

Upgrading PMM 2.0
---------------------------
$ docker pull percona/pmm-server:2.6.0
$ docker run -d -i -t -p 80:80 -p 443:443 --volumes-from pmm-data --name pmm-server --restart always -e METRICS_RETENTION=192h --env METRICS_RESOLUTION=5s --env METRICS_MEMORY=2097152 percona/pmm-server:2.6.0

You’ve got the idea (replace 2.6.0 with 2.6.1 though!). You’ll need a ‘docker rm’ in the upgrade sequence or, if nothing else, rename your existing pmm-server container to something else, or the ‘docker run’ will fail because the name is already taken. We do have the in-place upgrade, which upgrades the container from the inside instead of replacing the old container with a new one, but I upgrade the same way you have outlined because of something unique about my setup that we’ve not been able to put our finger on just yet. You also don’t technically need the “METRICS_RESOLUTION” env variable, since this is now configurable on the “pmm-settings” page and you can adjust it on the fly.
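
Put together, the container-replacement upgrade could look roughly like the sketch below. It assumes /srv already lives in the pmm-data container, and it takes the rename route so the old container is kept as a fallback. The function name upgrade_pmm_server and the pmm-server-pre-<version> naming are made up for illustration, and the optional METRICS_* variables are left out.

```shell
# upgrade_pmm_server VERSION
# Replaces the pmm-server container with one started from the given image
# tag, reusing the /srv volume held by the pmm-data container.
upgrade_pmm_server() {
    version=$1
    old_name="pmm-server-pre-$version"

    docker pull "percona/pmm-server:$version" || return 1
    docker stop pmm-server || return 1
    # Renaming (instead of 'docker rm') frees the name for 'docker run'
    # while keeping the old container around.
    docker rename pmm-server "$old_name" || return 1
    # Prevent the old container from coming back up when the Docker
    # daemon restarts and competing for ports 80 and 443.
    docker update --restart=no "$old_name" || return 1
    docker run -d -p 80:80 -p 443:443 \
        --volumes-from pmm-data \
        --name pmm-server --restart always \
        "percona/pmm-server:$version"
}

# Example use (not run automatically):
#   upgrade_pmm_server 2.6.1
```

Once the new version has been verified, the renamed container can be removed by hand.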

Alright, thanks for all your help here, Steve. Much appreciated :smile:
Tomorrow (or early next week) I’ll wipe everything in my QA setup and redo it to check my documentation and verify the steps.

Hi @“steve.hoffman”
I just upgraded the server and clients to 2.6.1 and everything looks great. My documentation has been updated, so I consider this solved.
Thanks again for your help.