I tried to deploy PMM on our AWS ECS cluster, in FARGATE mode.
I created a single container whose /srv is mounted on an external storage volume (AWS EFS).
The container starts successfully and its task reaches the ‘running’ state, but eventually the container fails to stay healthy, with the following errors in the log:
“2020-07-16 16:50:23,196 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.”
“2020-07-16 16:50:24,561 INFO exited: nginx (exit status 1; not expected)”
“2020-07-16 16:50:31,629 INFO gave up: nginx entered FATAL state, too many start retries too quickly”
What version of PMM are you using? There was an issue with one version (2.7.0, maybe?) where a container without internet access would cause nginx to die a fiery death, and there wasn’t a simple recovery short of going inside and modifying the offending entry (related to the “blog” panel on the default home dashboard). That’d be my first guess. My second (and this is a guess, because the logs you posted above show no connection) is that the privileges were not granted by the host, even though the container hints that it stayed as root, so nginx couldn’t make the necessary bindings to ports 80/443. Again, just a guess there, but it’s Friday…maybe I’ll get lucky ;-)
I’m using percona/pmm-server:latest and then tried percona/pmm-server:1, got the same issue on both images.
Any idea how to debug nginx in this image?
Thanks a lot.
The problem is the persistent storage (AWS EFS) that I’m trying to add, as I mentioned above.
The container cannot create the /srv folder in the volume that I attached.
I gave the root folder 777 permissions, and it still doesn’t work.
However, when I removed the attached volume from my configuration, PMM started and worked like a charm.
Does anyone have an idea what else is needed to make it work on an external volume?
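One way to narrow this down is to check, from inside the container, which user the process runs as, who owns the mount point, and whether a directory can actually be created there. Below is a minimal sketch, assuming a Python 3 interpreter is available in the image (it may not be). The helper name probe_writable is made up for illustration, and /srv is simply the mount path from the post above. It only reports what it sees; it does not fix anything.

```python
import os
import stat
import tempfile


def probe_writable(path):
    """Report process identity, ownership of `path`, and whether a directory can be created in it."""
    report = {
        "path": path,
        "process_uid": os.getuid(),
        "process_gid": os.getgid(),
        "owner_uid": None,
        "owner_gid": None,
        "mode": None,
        "can_create_dir": False,
        "error": None,
    }
    try:
        info = os.stat(path)
        report["owner_uid"] = info.st_uid
        report["owner_gid"] = info.st_gid
        report["mode"] = stat.filemode(info.st_mode)
        # mkdtemp creates a uniquely named directory, so existing data is not touched.
        probe_dir = tempfile.mkdtemp(prefix="pmm-probe-", dir=path)
        os.rmdir(probe_dir)
        report["can_create_dir"] = True
    except OSError as exc:
        # Keep the exact failure (permission denied, read-only file system, missing path, ...).
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report


if __name__ == "__main__":
    # /srv is the mount path mentioned in this thread; adjust if yours differs.
    for key, value in probe_writable("/srv").items():
        print(f"{key}: {value}")
```

If can_create_dir comes back False, the error field should show whether this is a permissions problem or something else (for example a read-only mount), and comparing process_uid with owner_uid shows whether the process and the owner of the mount point are the same user.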