PMM on AWS fails when adding RDS MySQL remote instance

We have PMM running nicely on an EC2 instance and I can log in as the admin. But after I fill in all the RDS details in the setup form and click ‘Add service’, I get the following error:

pmm-agent with ID “pmm-server” is not currently connected

I ran pmm-admin status on the EC2 instance and got the following:

Agent ID: /agent_id/agent-id-string
Node ID : /node_id/node-id-string

PMM Server:
URL : https://IPaddress:443/
Version: 2.19.0

PMM Client:
Connected : true
Latency : 289.252µs
pmm-admin version: 2.19.0
pmm-agent version: 2.19.0

Agents:
/agent_id/agent-id-string node_exporter Running
/agent_id/agent-id-string vmagent Running

So, everything seems fine, but I am unable to add the remote RDS MySQL instance to start the monitoring.

Any help would be appreciated.

After a fruitless search, I came across this article:

Based on that, I logged into my AWS EC2 instance running PMM and hit the suggested url:
https://IPaddress/prometheus/targets

and found that several node_exporter_agents and postgres_exporter_agents are all in the DOWN state with the following common error:

cannot read data: cannot scrape “http://127.0.0.1:0/metrics?collect[]=bonding&collect[]=entropy&collect[]=textfile.lr&collect[]=uname”: Get “http://127.0.0.1:0/metrics?collect[]=bonding&collect[]=entropy&collect[]=textfile.lr&collect[]=uname”: dial tcp4 127.0.0.1:0: connect: connection refused; try -enableTCP6 command-line flag if you scrape ipv6 addresses

Could someone please give a clue on how to fix this?

Of course, the specific metric varies, but the above error is common to all the exporter agents that are down. Note: The EC2 instance is in a subnet that doesn’t have IPv6. I doubt that’s the issue, though.

Sorry to bump this up, but could someone at Percona please throw some light on a way to fix this error?

I am wondering if this has something to do with the ingress/egress rules of the security group that Percona’s install process created. We went with ‘seller recommended settings’, and that opened up just ports 80, 443 and 22.

So the PMM instance is running on EC2, installed directly from the Marketplace, and you’re trying to add RDS remote instance monitoring through the UI’s Add Instance option, and on the final step you get the error?

The basics of how remote instance monitoring works: PMM Server ALWAYS gets its data from a client. Most of the time the client is installed on remote nodes and communicates back to the server, but in the case of remote monitoring the client actually resides on the pmm-server itself (hence the 127.0.0.1 as the dial address). This says to me that the agent inside the pmm-server isn’t running. That’s easy to verify: get a prompt on the machine, run supervisorctl status (as root) and look at the pmm-agent entry to see if it’s running.

If it’s not, you can dig into /srv/logs/pmm-agent.log to see if there’s any hint as to why it couldn’t start, and we can hopefully troubleshoot from there. If it is running, then I’d be suspicious about internal communication, and we might have to get basic: looking at the config file /usr/local/percona/pmm2/config/pmm-agent.yaml, telnetting to ports, etc.
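
The check described above can be sketched as a small shell helper. This is only an illustration: service_state is a hypothetical function name, and it assumes the usual supervisorctl status layout where the first column is the program name and the second is its state.

```shell
#!/bin/sh
# service_state NAME: read `supervisorctl status` output on stdin and
# print the state (second column) of the program called NAME.
# Hypothetical helper; assumes one "name STATE details..." entry per line.
# Exits non-zero if NAME does not appear in the input.
service_state() {
    awk -v name="$1" '$1 == name { print $2; found = 1 } END { exit !found }'
}

# Usage on the PMM Server host (as root):
#   supervisorctl status | service_state pmm-agent
#   tail -n 50 /srv/logs/pmm-agent.log   # worth reading if the state is not RUNNING
```

Anything other than RUNNING for pmm-agent would point at the log file first; RUNNING would point at the config and internal communication instead.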

Thanks! Yes, the error is in the very last step.

I ran supervisorctl status and got the following results. It looks like pmm-agent is running:

[admin@pmm ~]$ sudo supervisorctl status
alertmanager RUNNING pid 1512, uptime 0:00:58
clickhouse RUNNING pid 1506, uptime 0:00:58
cron RUNNING pid 1509, uptime 0:00:58
dashboard-upgrade EXITED Jul 20 12:49 PM
dbaas-controller STOPPED Not started
grafana RUNNING pid 1507, uptime 0:00:58
nginx RUNNING pid 1508, uptime 0:00:58
pmm-agent RUNNING pid 1516, uptime 0:00:58
pmm-managed RUNNING pid 1515, uptime 0:00:58
pmm-update-perform STOPPED Not started
postgresql RUNNING pid 1505, uptime 0:00:58
prometheus STOPPED Not started
qan-api2 RUNNING pid 1560, uptime 0:00:38
victoriametrics RUNNING pid 1510, uptime 0:00:58
vmalert RUNNING pid 1511, uptime 0:00:58
[admin@pmm ~]$

As pmm-agent is running, based on your suggestion, I looked at /usr/local/percona/pmm2/config/pmm-agent.yaml and found the following:

id: /agent_id/9b751828-3ffc-49b4-8d52-a421441a39e3
listen-address: 127.0.0.1
listen-port: 7777
server:

Does this mean Port 7777 has to be opened up on the EC2 instance?

No, you shouldn’t have to. Do me a favor and install telnet (yum install telnet), then execute telnet 127.0.0.1 7777. If it’s successful you’ll just get an “Escape character is ‘^]’” or similar (hit ctrl+] and type quit if you’re not familiar with telnet). But I don’t think pmm-server tries to leverage 7777 (although it may; I’ll ask internally). Because you’re making a call for ‘metrics’, that’s pmm-server trying to talk to the agent, and it should say something like 127.0.0.1:4200x.

Take a look at the following page: PMM → PMM Inventory → Agents tab. Can you send a screenshot of that? You should not have to enable any ports internally. It should just work, so it seems a config is messed up somewhere, I’m just not sure where. Going to see if anyone else has ideas.

I installed telnet and got the following for the telnet command:

[admin@pmm ~]$ telnet 127.0.0.1 7777
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Attaching a screenshot of the Agents tab under PMM inventory.

Thanks, again, for the assistance.

Could you tell us what you did after installing PMM Server? Did you install the client or update pmm-agent.yaml manually? The id in your pmm-agent.yaml should be pmm-server for remote monitoring to work correctly.

Thank you.

Sure. I only remember doing the following after setting up PMM:

  1. Ran the following command with the updated password for the PMM admin user:

pmm-admin config --server-insecure-tls --server-url=https://admin:newpasswd@:443

  2. Modified the hostname of the PMM EC2 instance with sudo hostnamectl set-hostname hostname, added a DNS record at Route53 to make this the public DNS, updated the reverse DNS of the Elastic IP to point to this and rebooted the instance.
  3. Created an SSL certificate for this hostname through ACM, but didn’t deploy it as it would require an ELB in front and I didn’t want to add costs.

That’s it. So, pretty much the only thing I did on the PMM server was to run the config command a second time.

P.S: Just checked .bash_history of the PMM server and confirmed that #1 & #2 above are the only commands I ran on the PMM server before running into this problem.

Thank you for your answer.
It looks like

pmm-admin config --server-insecure-tls --server-url=https://admin:newpasswd@:443

overwrote pmm-agent.yaml. That’s why you don’t have a pmm-agent with the id pmm-server, which is required to monitor remote services.
Now you should update your pmm-agent.yaml and replace the id there with pmm-server to make it work. Then just restart the pmm-agent service.
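
The edit described above could be scripted roughly as follows. This is a sketch under assumptions, not an official procedure: set_agent_id is a hypothetical helper name, the default path is the one mentioned earlier in this thread, and it only touches the config file, so anything the pmm-admin config command changed elsewhere is left as it is.

```shell
#!/bin/sh
# set_agent_id [FILE]: rewrite the top-level "id:" line of a pmm-agent
# config to "id: pmm-server", keeping the previous version in FILE.bak.
# Hypothetical helper; FILE defaults to the path used in this thread.
set_agent_id() {
    file="${1:-/usr/local/percona/pmm2/config/pmm-agent.yaml}"
    # Keep a backup so the original id can be restored if needed.
    cp "$file" "$file.bak" || return 1
    # Only lines that start with "id:" are replaced; everything else is copied as is.
    sed 's|^id:.*|id: pmm-server|' "$file.bak" > "$file"
}

# Usage on the PMM Server host (as root):
#   set_agent_id
#   supervisorctl restart pmm-agent
```

The backup copy makes it easy to compare the old and new id values after the restart.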

Thanks. I went ahead and modified /usr/local/percona/pmm2/config/pmm-agent.yaml, replacing the id there as below:

# Updated by pmm-agent setup.
#id: /agent_id/9b751828-3ffc-49b4-8d52-a421441a39e3
id: pmm-server
listen-address: 127.0.0.1
listen-port: 7777

Not sure if this is the right way to go about it as I get the same error when adding the RDS MySQL instance for monitoring. pmm-agent is running fine after the restart.

Could you share the Inventory page?

Thanks. Sure, I am attaching screenshots of each of the tabs (Services, Agents, Nodes) of the Inventory page.

One thing I noticed is that, in the Nodes tab, the agent points to the internal hostname and private IP address of the EC2 instance. I don’t know if that’s how it’s supposed to be.

Just to clarify again, I have followed verbatim the instructions on these two pages:

  1. AWS Marketplace - Percona Monitoring and Management
  2. Amazon RDS - Percona Monitoring and Management


Hmmm, I think our instructions may have been unclear and need to be fixed!

The part that says:
authentication between PMM Server and PMM Clients - you will re-use these credentials when configuring PMM Client for the first time on a server, for example:

pmm-admin config --server-insecure-tls --server-url=https://admin:admin@<IP Address>:443

is NOT meant to be run on your newly installed PMM server. That’s only needed if you install pmm-client on another server and want your new PMM instance to also monitor it and the DBs on it. (Trying to get that documentation updated ASAP.)

Running that command locally is what broke PMM’s connection with itself. At this point it may be easier to blow the PMM instance away and recreate it from scratch, since that pmm-admin command not only changes the pmm-agent.yaml file but also attempts to register the agent in the internal database.

I was wondering if that might be the case!

Another part that would be good to fix is the set of commands for creating the pmm user (@Vadim_Yalovets helped me with that) on the RDS instance. They don’t work on MySQL 8.x.x. Or, maybe a distinction could be made between 5.7.x and 8.x.x.

Thanks to all the Percona staff who pitched in to try and help me on this post. Appreciate your efforts very much.

I will go ahead and create a new instance.

We are on it! Thanks for the feedback and keep us posted!

Not sure why I can’t edit my post, but I submitted updates for both issues, so once they’re approved hopefully no one else runs into this.

Thanks for acting quickly on this.

For what it’s worth, I noticed the same error in the documentation for PMM 1.x on AWS. It may be useful to correct that, too:

Checked just now and the old documentation still persists. Hope it can be corrected soon.
