QAN PostgresSQL pg_stat_monitor not working

I am trying to get QAN monitoring for my Postgres database. I have the Postgres server talking to PMM and I am getting OS and Postgres metrics but not QAN.

database version: PostgreSQL 12.14

pmm-admin list

Service type Service name Address and port Service ID
PostgreSQL dev-rms-db1-postgresql 127.0.0.1:5432 /service_id/b40789d4-2fa4-42ab-91ff-d1e1737a97d7

Agent type Status Metrics Mode Agent ID Service ID Port
pmm_agent Connected /agent_id/fcae9b4e-36f1-4870-a380-28a18d933e35 0
node_exporter Running push /agent_id/6f27184d-2164-4f84-b4d5-cd96a79c7294 42000
postgres_exporter Running push /agent_id/ba4c4f4d-d3e9-40f4-83ce-e7584619af43 /service_id/b40789d4-2fa4-42ab-91ff-d1e1737a97d7 42002
postgresql_pgstatmonitor_agent Waiting /agent_id/c4a94e90-7c2d-4c46-abd1-fed90eef9d41 /service_id/b40789d4-2fa4-42ab-91ff-d1e1737a97d7 0
vmagent Running push /agent_id/fbe0a404-e60a-4744-b6bc-d3ec03fdd4af 42001

I am seeing this error in the logs:
pmm-agent[11021]: ERRO[2023-04-17T19:42:24.687-05:00] getNewBuckets failed: failed to query pg_stat_monitor: pq: column pg_stat_monitor.plans does not exist agentID=/agent_id/c4a94e90-7c2d-4c46-abd1-fed90eef9d41 component=agent-builtin type=qan_postgresql_pgstatmonitor_agent

and in the Postgres logs:
2023-04-17 19:14:24.514 CDT [11205] ERROR: column pg_stat_monitor.plans does not exist at character 1170
2023-04-17 19:14:24.514 CDT [11205] HINT: Perhaps you meant to reference the column “pg_stat_monitor.planid”.
2023-04-17 19:14:24.514 CDT [11205] STATEMENT: SELECT /* pmm-agent:pgstatmonitor */ “pg_stat_monitor”.“bucket”, “pg_stat_monitor”.“client_ip”, “pg_stat_monitor”.“query”, “pg_stat_monitor”.“calls”, “pg_stat_monitor”.“shared_blks_hit”, “pg_stat_monitor”.“shared_
blks_read”, “pg_stat_monitor”.“shared_blks_dirtied”, “pg_stat_monitor”.“shared_blks_written”, “pg_stat_monitor”.“local_blks_hit”, “pg_stat_monitor”.“local_blks_read”, “pg_stat_monitor”.“local_blks_dirtied”, “pg_stat_monitor”.“local_blks_written”, “pg_stat_monit
or”.“temp_blks_read”, “pg_stat_monitor”.“temp_blks_written”, “pg_stat_monitor”.“blk_read_time”, “pg_stat_monitor”.“blk_write_time”, “pg_stat_monitor”.“resp_calls”, “pg_stat_monitor”.“cpu_user_time”, “pg_stat_monitor”.“cpu_sys_time”, “pg_stat_monitor”.“rows”, “p
g_stat_monitor”.“relations”, “pg_stat_monitor”.“datname”, “pg_stat_monitor”.“userid”, “pg_stat_monitor”.“top_queryid”, “pg_stat_monitor”.“planid”, “pg_stat_monitor”.“query_plan”, “pg_stat_monitor”.“top_query”, “pg_stat_monitor”.“application_name”, “pg_stat_moni
tor”.“cmd_type”, “pg_stat_monitor”.“cmd_type_text”, “pg_stat_monitor”.“elevel”, “pg_stat_monitor”.“sqlcode”, “pg_stat_monitor”.“message”, “pg_stat_monitor”.“pgsm_query_id”, “pg_stat_monitor”.“plans”, “pg_stat_monitor”.“dbid”, “pg_stat_monitor”.“total_time”, “pg
_stat_monitor”.“min_time”, “pg_stat_monitor”.“max_time”, “pg_stat_monitor”.“mean_time”, “pg_stat_monitor”.“stddev_time”, “pg_stat_monitor”.“bucket_start_time” FROM “pg_stat_monitor” WHERE queryid IS NOT NULL AND query IS NOT NULL AND bucket_done

This does not show that column:
\d pg_stat_monitor;
View “public.pg_stat_monitor”
Column | Type | Collation | Nullable | Default
---------------------±-------------------------±----------±---------±--------
bucket | bigint | | |
bucket_start_time | timestamp with time zone | | |
userid | oid | | |
username | text | | |
dbid | oid | | |
datname | text | | |
client_ip | inet | | |
pgsm_query_id | bigint | | |
queryid | bigint | | |
top_queryid | bigint | | |
query | text | | |
comments | text | | |
planid | bigint | | |
query_plan | text | | |
top_query | text | | |
application_name | text | | |
relations | text | | |
cmd_type | integer | | |
cmd_type_text | text | | |
elevel | integer | | |
sqlcode | text | | |
message | text | | |
calls | bigint | | |
total_time | double precision | | |
min_time | double precision | | |
max_time | double precision | | |
mean_time | double precision | | |
stddev_time | double precision | | |
rows | bigint | | |
shared_blks_hit | bigint | | |
shared_blks_read | bigint | | |
shared_blks_dirtied | bigint | | |
shared_blks_written | bigint | | |
local_blks_hit | bigint | | |
local_blks_read | bigint | | |
local_blks_dirtied | bigint | | |
local_blks_written | bigint | | |
temp_blks_read | bigint | | |
temp_blks_written | bigint | | |
blk_read_time | double precision | | |
blk_write_time | double precision | | |
resp_calls | text | | |
cpu_user_time | double precision | | |
cpu_sys_time | double precision | | |
bucket_done | boolean | | |

Do you have any ideas? Should I try adding the column? Did something not get installed correctly?

Thanks,
Luke

Hey Luke

I think I had somewhat similar problem to the one described by you. In my case I solve it with multiple steps:

  • re-installing the packages pmm-agent and percona-pg_stat_monitor12 (pmm-agent was updated but not the pg_stat_monitor)
  • reconfigure the extension based on the latest config from https://docs.percona.com/pg-stat-monitor/
  • DROP/CREATE EXTENSION pg_stat_monitor;
  • double check the permissions of the dedicated monitoring user and make sure it’s working
  • delete and re-add the pmm service - pmm-admin remove/add postgresql [SERVICE_NAME]

Probably all of this wasn’t going to be needed if I simply execute the update procedure of the PG extension - Upgrade - pg_stat_monitor Documentation.
May be you could give it a shot.

Hope this is somewhat helpful.
Cheers

1 Like

Make sure you connect to your postgres database and create the pg_stat_monitor extension in there.