PMM Alerting on Google Chat

Description:

Notifications received on Google Chat (webhook URL) contain only the title, even with default settings. They used to include the Title, Labels, Summary, etc. This worked earlier, but I can’t remember when it started to happen (I recently migrated from Docker to a new server with Docker Compose, but notifications from before the migration look the same). Percona Alerting is enabled.

I tried re-creating the Google Chat contact (new webhook URL), removing the template, calling the template in the message, etc.

Version:

2.42.0 (Docker Compose)

Additional Information:

On email or Telegram I get the whole information (and with a template I can get more specific values if I want). Again, the issue is only with Google Chat (we recently migrated to Google Chat from Slack, so it is the only platform where we receive notifications).

Example

When PMM agent is down or unreachable, all I get is this:

[FIRING:1] pmm_agent_down Alerting Rule (/agent_id/612b8cd5-9ee1-4042-91d6-5a64b389d922 pmm-agent 0 OS pmm-server pmm-managed /node_id/fc82eddc-d03e-42f0-a0ff-da285110e755 **name of the VM** 1 custom critical pmm_agent_down 2.42.0)

No Values, Summary or anything.

@Apostolos_Karavias, from a quick look, I suspect this is because of the older version of Grafana used in PMM for Alerting.
We use Grafana 9 and will migrate to Grafana 11 in PMM v3 next quarter.
Please follow us and test the PMM v3 beta when it’s out this quarter to confirm the problem is gone.

Thank you @Roma_Novikov for your reply. I’ll wait for it, because getting the right alerts is crucial for us, as we have migrated our DB monitoring fully to PMM.

I have absolutely no idea what you are talking about. I am far from AIT.

Hello again,

I looked for some older notifications, and it seems that this issue first occurred on version 2.42, as 2.41 was still sending the whole message:

[FIRING:1] pmm_agent_down Alerting Rule (/agent_id/437486c0-e000-4520-81f0-230621d90ac4 pmm-agent 0 OS pmm-server pmm-managed /node_id/c1886df7-fe9e-4860-94db-d8673f9490e1 node_name 1 critical pmm_agent_down 2.41.2)

[FIRING:1] pmm_agent_down Alerting Rule (/agent_id/437486c0-e000-4520-81f0-230621d90ac4 pmm-agent 0 OS pmm-server pmm-managed /node_id/c1886df7-fe9e-4860-94db-d8673f9490e1 Node_name 1 critical pmm_agent_down 2.41.2)
**Firing**

Value: [ var='A' labels={agent_id=/agent_id/437486c0-e000-4520-81f0-230621d90ac4, agent_type=pmm-agent, disabled=0, instance=pmm-server, job=pmm-managed, node_id=/node_id/c1886df7-fe9e-4860-94db-d8673f9490e1, node_name=node_name, version=2.41.2} value=1 ]
Labels:
 - alertname = pmm_agent_down Alerting Rule
 - agent_id = /agent_id/437486c0-e000-4520-81f0-230621d90ac4
 - agent_type = pmm-agent
 - disabled = 0
 - grafana_folder = OS
 - instance = pmm-server
 - job = pmm-managed
 - node_id = /node_id/c1886df7-fe9e-4860-94db-d8673f9490e1
 - node_name = node_name
 - percona_alerting = 1
 - severity = critical
 - template_name = pmm_agent_down
 - version = 2.41.2
Annotations:
 - description = PMM agent on node 'node_name', node ID '/node_id/c1886df7-fe9e-4860-94db-d8673f9490e1', cannot be reached. Host may be down.
 - summary = PMM agent on node node_name' cannot be reached. Host may be down.

The FIRING:1 line is shown twice because one copy comes from the title and the other from the message.

Any idea why this is happening? Clearly it is not an issue with the Percona/Grafana version, as it worked before, so it’s not something that will be fixed by a newer Grafana version or PMM 3.

Thank you for your time looking into this.

I just tried the notification on email instead of Google Chat and I get all the information properly, so either something changed in Google Chat webhooks or in how alerting works with Google Chat webhooks?

EDIT: the Grafana logs show only the following:

logger=alerting.notifier.googlechat t=2024-09-03T10:29:24.476235608Z level=error msg="Missing receiver"
logger=alerting.notifier.googlechat t=2024-09-03T10:29:24.476272735Z level=error msg="Missing group labels"
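
One way to tell those two possibilities apart is to post to the webhook directly, bypassing Grafana. The following is a minimal Python sketch, assuming the usual Google Chat incoming-webhook behaviour (an HTTP POST with a JSON body that has a `text` field). The function names and the sample message are made up for illustration, and the webhook URL has to be supplied by you. Note that it sends a plain text message rather than the card format Grafana uses, so it only checks the webhook side.

```python
import json
import urllib.request


def build_chat_payload(title: str, body: str) -> bytes:
    """Encode a simple text message as the JSON body for a Chat webhook."""
    message = {"text": f"{title}\n{body}"}
    return json.dumps(message).encode("utf-8")


def send_to_google_chat(webhook_url: str, title: str, body: str) -> int:
    """POST the message to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=build_chat_payload(title, body),
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

If a call such as `send_to_google_chat(url, "[FIRING:1] test", "Summary: test body")` shows both lines in the space, the webhook itself accepts multi-line content and the problem is more likely in what the alerting side sends.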

I’ve done some extra testing with other instances. Note that on the other channels (email, Telegram) I get the right information.

So for Google Chat:

Grafana Cloud Trial: Works as expected (Title and Values)
PMM (Grafana 9.20) - Docker Compose - HTTPS: Only Title
Grafana 11.20 - Docker Compose OSS - Traefik for HTTPS: Only Title
Grafana 11.20 - Docker OSS Image without https: Works as Expected.

In the end, the issue was in the Docker Compose file, where the value of the GF_ environment variable for the URL was wrapped in double quotes (" "). After removing the quotes, the Google Chat notifications worked as expected.
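
A likely explanation for why the quotes matter: when `environment:` entries are written in list form (`- KEY="value"`), the entry is split on the first `=` and everything after it reaches the container, quotes included. The Python sketch below only imitates that splitting to show the effect; it is not Compose's actual parser. `GF_SERVER_ROOT_URL` is an assumption about which variable was involved (the post only says the GF_ variable for the URL), and the hostname is a placeholder.

```python
def split_env_entry(entry: str) -> tuple[str, str]:
    """Split a list-style `KEY=VALUE` entry on the first '=' only.

    Nothing is unquoted, so surrounding quotes stay in the value.
    """
    key, _, value = entry.partition("=")
    return key, value


def has_literal_quotes(value: str) -> bool:
    """Return True if the value is wrapped in matching quote characters."""
    return len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'"


# Hypothetical entries; the variable name and hostname are assumptions.
quoted = split_env_entry('GF_SERVER_ROOT_URL="https://pmm.example.com"')
plain = split_env_entry("GF_SERVER_ROOT_URL=https://pmm.example.com")

for key, value in (quoted, plain):
    status = "literal quotes in value" if has_literal_quotes(value) else "ok"
    print(f"{key}={value} -> {status}")
```

A check like `has_literal_quotes` can be run against the output of `docker compose config` or the container's environment to spot values that still carry their quotes.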