Normalized CPU Load Alert Template - Failed to parse rule template error

I am legitimately lost here. I went from Grafana v5.1.3 to v8.3.5. The older version was already in place when I started using it, and alerts were very simple. Most, if not all of my alerts were based on node_load1, and the usual threshold was 2.0. It would send an alert to Slack with the node_name, a brief description, and the value. The alert URL itself sent you to the panel of the server that was alerting. There was no template creation.

Grafana_5_1_3_Slack_Alert_Example

Now, alerts look horrible since I’m unsure of how to properly create a template.

Grafana_8_3_5_Slack_Alert_Example

I decided to try and replicate this type of alert using Integrated Alerts, and the only option is Node High CPU Load, which is a percentage. I need the value of node_load1.

So, while in Alert Rule Templates, I’ve tried to add a very simplistic version just to get something going where I can expand on it later with more details. I am completely new to this.

I tried adding the following to a new Alert Rule Template:

---
templates:
  - name: CPU Normalized Load
    summary: Load Alert
    expr: |-
      node_load1{node_name=~".*admin.*", mode!="idle"} 
      > [[ .threshold ]]
    params:
      - name: threshold
        summary: The value of the normalized load
        type: float
        range: [0, 3]
        value: 2
    for: 5m
    severity: warning
    annotations:
      summary: Currently experiencing load issues. ({{ $labels.service_name }})
      description: |-
        Normalized load for {{ $labels.node_name }} is currently {{ $value }}

However, I get a “Failed to parse rule template” error.

How can I create a simple alert that gives me the node_name, a description, a URL that takes me to the panel in question (in view mode), and that shows the value reported?

1 Like

You were close…just missing the template version…try this:

templates:
  - name: CPU Normalized Load
    version: 1
    summary: Load Alert
    expr: |-
      node_load1{node_name=~".*admin.*", mode!="idle"} 
      > [[ .threshold ]]
    params:
      - name: threshold
        summary: The value of the normalized load
        type: float
        range: [0, 3]
        value: 2
    for: 5m
    severity: warning
    annotations:
      summary: Currently experiencing load issues. ({{ $labels.service_name }})
      description: |-
        Normalized load for {{ $labels.node_name }} is currently {{ $value }}

I will create (if I can’t find one already) a jira ticket to show the error on the screen…I found it by:

docker exec -it pmm-server bash then tail -f /srv/logs/pmm-managed.log and submitting it…that resulted in the following error:

failed to parse rule template form request: +unexpected version 0 component=management/ia/templates

so even that wouldn’t have been super helpful without knowing but perhaps this will help someone else in the future creating a template.

Oh, and P.S. there are improvements coming to alerting that will make it a bit easier to get rules created faster without being a YAML guru.

2 Likes

You’re not going to believe this, but it’s been so long since I’ve come back to this, I can’t find the location of the template file now. Lol. I still accept your answer though!

1 Like