Hello team,
pbm version:
Version:   1.1.1
Platform:  linux/amd64
GitCommit: 457bc0eaf861c8c15c997333ce1d8108a138874b
GitBranch: master
BuildTime: 2020-01-31_08:16_UTC
GoVersion: go1.12.9
pbm_config.yaml
storage:
  type: filesystem
  filesystem:
    path: /backup/
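(For reference, a config in this shape is typically applied to the cluster with the pbm config command; the invocation below is only a sketch, the connection string is a placeholder, and exact flags may vary between PBM versions.)

    pbm config --file pbm_config.yaml \
        --mongodb-uri "mongodb://pbmuser:secretpwd@localhost:27017/?replicaSet=rs0"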
I have a 3-node PSS replica set with pbm-agent set up on all 3 nodes. I take backups onto an NFS mount point. Permissions look correct, and PBM backups complete successfully.
When I try to restore, it automatically chooses the primary for the restore, but it fails at the end with a "not master" error. The MongoDB URI contains all 3 nodes, and one of them is the primary.
2020/04/13 12:55:12 Got command restore 2020-04-13T11:21:15Z
2020/04/13 12:55:12 [INFO] Restore of '2020-04-13T11:21:15Z' started
2020-04-13T12:55:12.860+0000 preparing collections to restore from
2020-04-13T12:55:12.879+0000 finished restoring admin.myOutput (0 documents, 0 failures)
2020/04/13 12:55:12 [ERROR] restore: restore mongo dump (successes: 0 / fails: 0): admin.myOutput: error dropping collection: (NotMaster) not master
I then changed the URI to point only to the primary and tried again, but I get the same error.
- Any idea how to debug this or what might be causing it? The online docs do not say how to resolve this.
- Should we be restoring onto a new, empty cluster or onto the existing one?
Hi.
It sounds as though the "not master" error message might be misleading, if the pbm-agent doing the restore was the one on the server with the primary node. A misleading error message is a bug in its own right, but I'll try to guess at the problem behind it.
Two hypotheses:
- The pbm-agent connected to the wrong node. Is it started with a config whose MongoDB URI lists all three nodes? That would be a problem. Each agent should connect in standalone/direct style to only the mongod on its own host. If it is given a replica-set-style URI, it will connect to the primary wherever that is, i.e. all three pbm-agents would act as the one on the primary (see the URI sketch after this list).
- I notice the filesystem type of storage is being used. This is very easy to set up incompletely. (Basically I don't recommend it, but I see it as the first thing most users try when testing.) I wonder if the node that performed the restore couldn't read the backup files because they weren't present on whichever remote fileserver mount is at /backup/ on that node.
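To make the first point concrete, here is a rough sketch of the two URI styles (host names and credentials are placeholders):

    # Replica-set-style URI: the driver discovers the topology and always follows
    # the primary, so every agent started this way effectively acts on the primary.
    pbm-agent --mongodb-uri="mongodb://pbmuser:secretpwd@node1:27017,node2:27017,node3:27017/?replicaSet=rs0"

    # Standalone/direct-style URI: the agent talks only to the mongod on its own host.
    pbm-agent --mongodb-uri="mongodb://pbmuser:secretpwd@localhost:27017/"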
Cheers,
Akira
Hello Akira,
Thanks!! I changed the pbm-agent MongoDB URI to point only to the node it is running on, and that resolved the issue.
Continuing with the pbm-agent service file: if we use the systemd service file, I don't see any logs.
The logs only start appearing when we run the agent with nohup:

    nohup pbm-agent --mongodb-uri="mongodb://pbmuser:secretpwd@$HOSTNAME:27017" > /var/log/pbm-agent.log 2>&1 &
Can you please share a sample systemd service file with logging?
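(For illustration only, a minimal systemd unit along these lines should get the agent logging to a file; the service user, binary path, and URI are assumptions to adapt to your installation. Note that StandardOutput=append: requires systemd 240 or newer; on older systems the output goes to journald and can be read with journalctl -u pbm-agent.)

    [Unit]
    Description=pbm-agent (Percona Backup for MongoDB)
    After=network.target

    [Service]
    # Assumed service user and binary path; adjust to your installation.
    User=mongod
    Environment="PBM_MONGODB_URI=mongodb://pbmuser:secretpwd@localhost:27017/"
    ExecStart=/usr/bin/pbm-agent
    Restart=on-failure
    # append: requires systemd >= 240; otherwise rely on journald (journalctl -u pbm-agent).
    StandardOutput=append:/var/log/pbm-agent.log
    StandardError=append:/var/log/pbm-agent.log

    [Install]
    WantedBy=multi-user.target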