PBM physical backup restore fails with "too many open files"

Hi all,

One more question. When I restore a physical backup, PBM reports a “too many open files” error and the restore fails. The log looks like this:

Sep 19 13:13:23 stb-mongob-140 pbm-agent: 2022-09-19T13:13:23.000+0800 E [restore/2022-09-18T08:39:39Z] mark restore as failed `copy files: create destination file </usr/local/mongodb/collection-1626--4342777260254070919.wt>: open /usr/local/mongodb/collection-1626--4342777260254070919.wt: too many open files`: set backup state: server selection error: server selection timeout, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: 192.168.11.40:27019, Type: Unknown, Last error: connection() error occured during connection handshake: dial tcp 192.168.11.40:27019: connect: connection refused }, { Addr: 192.168.11.41:27019, Type: Unknown, Last error: connection() error occured during connection handshake: dial tcp 192.168.11.41:27019: connect: connection refused }, { Addr: 192.168.1.40:27019, Type: Unknown, Last error: connection() error occured during connection handshake: dial tcp 192.168.1.40:27019: connect: connection refused }, ] }
Sep 19 13:13:23 stb-mongob-140 pbm-agent: 2022-09-19T13:13:23.000+0800 I [restore/2022-09-18T08:39:39Z] restore finished copy files: create destination file </usr/local/mongodb/collection-1626--4342777260254070919.wt>: open /usr/local/mongodb/collection-1626--4342777260254070919.wt: too many open files
Sep 19 13:13:23 stb-mongob-140 pbm-agent: 2022-09-19T13:13:23.000+0800 E [restore/2022-09-18T08:39:39Z] copy files: create destination file </usr/local/mongodb/collection-1626--4342777260254070919.wt>: open /usr/local/mongodb/collection-1626--4342777260254070919.wt: too many open files
Sep 19 13:13:23 stb-mongob-140 pbm-agent: 2022-09-19T13:13:23.000+0800 I change stream was closed
Sep 19 13:13:23 stb-mongob-140 pbm-agent: 2022/09/19 13:13:23 Exit: <nil>

My MongoDB environment:
server1: mongod port 27018, config server port 27019, mongos port 27020
server2: mongod port 27018, config server port 27019, mongos port 27020
server3: mongod port 27018, config server port 27019, mongos port 27020

The three mongod instances form a PSS replica set named stb1. I set up an NFS service on a Linux server, and the backup is saved to the NFS mount point, /backup.
I’m not sure where the “too many open files” error comes from. Is it returned by the NFS server or by PBM?

Could you please advise?

Thanks,
Dillon

Hi Dillon,

The error means PBM hit the maximum open files (file descriptor) limit on your server while trying to copy one of the backup files.
You can check and set system limits with the ulimit command. To inspect the files that are currently open, you can use lsof.
PBM shouldn’t open many files, because all copying during the restore is done sequentially and file descriptors are closed right away.
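
As a rough illustration of the check described above, here is a minimal Python sketch that reads the open-file limit of the current process and counts its open descriptors. It assumes a Linux host with /proc mounted. The function names are hypothetical and not part of PBM, and by default it inspects the Python process itself rather than pbm-agent.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(pid="self"):
    """Count the open file descriptors of a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft={soft} hard={hard} open={count_open_fds()}")
```

Passing the PID of another process to count_open_fds() gives a rough equivalent of counting lsof entries for it, provided the caller has permission to read that process's /proc entry.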

Thank you for the reply. After I saw this error, I changed “Max open files” for root to 1048576. I use root to start mongod, the config server, mongos and pbm-agent. Following your notes, I went back and checked /proc/pid/limits for all of these services.
Indeed, pbm-agent shows “max open files” as 1024 and 4096. It looks like I have to change the “max open files” setting of the systemd service.

I have updated the systemd service and tried again.
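
For anyone repeating the /proc/pid/limits check, a small Python sketch along these lines can pull out the relevant row. It assumes the usual Linux layout of that file (limit name, soft limit, hard limit, units), and the function names are made up for illustration.

```python
from pathlib import Path


def parse_max_open_files(limits_text):
    """Extract the two 'Max open files' values from /proc/<pid>/limits text.

    Values are returned as strings because a limit may read 'unlimited'.
    Returns None if the row is not present.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns: soft limit, hard limit, units
            fields = line[len(prefix):].split()
            return fields[0], fields[1]
    return None


def max_open_files_for(pid):
    """Read and parse the limits of a running process (Linux only)."""
    return parse_max_open_files(Path(f"/proc/{pid}/limits").read_text())
```

Calling max_open_files_for() with the PID of pbm-agent shows the limit the running agent actually has, which can differ from the shell's ulimit. For a service started by systemd, the limit is normally set with the LimitNOFILE= directive in the [Service] section of the unit, followed by a daemon reload and a service restart.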

Thanks. I fixed the “max open files” setting of the pbm-agent service, and the restore works now.
