Need Advice on Disk Space Cleanup for PMM Server 2.x (Docker)

Hi PMM Community,

We are running PMM Server 2.x in Docker on RHEL and noticed that the PMM data volume has grown significantly over time.

Current environment:

  • PMM Server: 2.x (Docker deployment)
  • Docker image: percona/pmm-server:2
  • Uptime: approximately 2 years
  • Docker volume size (output of docker system df -v):

Local Volumes space usage:

VOLUME NAME   LINKS   SIZE
pmm-data      1       110.1GB

The container writable layer is only about 5.5 GB, so most of the storage consumption appears to be inside the persistent pmm-data volume.

Before performing any cleanup, we would like to understand the recommended and supported way to reclaim disk space safely.

Questions:

  1. What is the recommended procedure to identify which PMM component is consuming the most storage (VictoriaMetrics, ClickHouse/QAN, logs, etc.)?
  2. Is there a supported method to reduce storage usage without losing all historical monitoring data?
  3. Can retention policies be adjusted retroactively to reclaim existing space?
  4. Are there any PMM maintenance commands or cleanup procedures recommended by Percona?
  5. Has anyone successfully reduced a large PMM data volume while preserving recent monitoring history?

Any guidance or best practices would be greatly appreciated.

Thank you.

@Ly_Kimmeng

What is the recommended procedure to identify which PMM component is consuming the most storage (VictoriaMetrics, ClickHouse/QAN, logs, etc.)?

Run the command below inside the PMM Server container to see which component is taking up most of the storage space.

**du -sh /srv/* | sort -hr**
93M	/srv/grafana
87M	/srv/postgres14
27M	/srv/clickhouse
13M	/srv/victoriametrics
6.3M	/srv/logs
52K	/srv/nginx
8.0K	/srv/prometheus
8.0K	/srv/pmm-agent
8.0K	/srv/nomad
8.0K	/srv/alerting
4.0K	/srv/pmm-encryption.key
4.0K	/srv/pmm-distribution
4.0K	/srv/backup
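
If the listing is long, a small helper can show only the top consumers. This is a rough sketch, not an official PMM tool: the function name largest_entries is made up for illustration, and the container name pmm-server in the usage comment is an assumption, so substitute your own.

```shell
# Hypothetical helper: print the N largest entries directly under a directory.
largest_entries() {
    dir=$1
    count=${2:-5}
    # -k reports sizes in KiB, so a plain numeric sort is enough
    du -sk "$dir"/* 2>/dev/null | sort -nr | head -n "$count"
}

# Assumed usage from the Docker host (container name is an assumption).
# The glob has to expand inside the container, hence the sh -c wrapper:
#   docker exec pmm-server sh -c 'du -sk /srv/* | sort -nr | head -n 5'
```

Running du a second time on the largest directory (for example /srv/clickhouse or /srv/victoriametrics) narrows it down further.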

Is there a supported method to reduce storage usage without losing all historical monitoring data?
Can retention policies be adjusted retroactively to reclaim existing space?

If the /srv/logs directory is what consumes the space, the files in it can be truncated. For the other PMM components, however, any cleanup may result in the loss of historical metrics.
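
For the log case, truncating in place is generally preferable to deleting: a process that still holds a deleted file open keeps the space allocated until it closes the file. A minimal sketch, assuming the log files end in .log (the pattern and the function name truncate_logs are illustrative, not something PMM provides):

```shell
# Hypothetical helper: empty every *.log file in a directory without deleting it.
truncate_logs() {
    dir=$1
    for f in "$dir"/*.log; do
        # Skip the unexpanded pattern when no file matches
        [ -f "$f" ] || continue
        # Truncate to zero bytes; the file and its permissions stay in place
        : > "$f"
    done
}

# Assumed usage inside the PMM Server container:
#   truncate_logs /srv/logs
```

Check what is in the directory first, since truncation discards the log contents.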

The documentation below covers how to deal with high PMM disk usage. You can shorten the retention period, or limit or disable tablestat collection if that is feasible for you.

VictoriaMetrics: downsampling feature

Limiting or disabling tablestat, especially if a large number of tables is being monitored

Metric history/Data retention

Are there any PMM maintenance commands or cleanup procedures recommended by Percona?

If you could clarify exactly which component is taking up space, we may be able to better guide you on the appropriate solution. Please share the output of the initial command we shared above.

Has anyone successfully reduced a large PMM data volume while preserving recent monitoring history?

Yes, the procedure I shared above will help reduce the high disk usage to some extent. Keep in mind, though, that if many clients (say 500 or more) connect to a single PMM Server, disk consumption will be considerably higher than in a setup with only a few PMM Client/database hosts. How many PMM Clients/Agents connect to this PMM Server?

Also, some changes introduced in PMM ≥ 2.41.0 can cause rapid growth of the table asynchronous_metric_log. By the way, which PMM version are you using?
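
To check whether that table is what has grown, one option is to ask ClickHouse for per-table on-disk sizes. This is only a sketch: it assumes asynchronous_metric_log lives in ClickHouse (it is normally a ClickHouse system table), that clickhouse-client is available inside the container, and that the container is named pmm-server. The function name table_size_query is made up for illustration.

```shell
# Hypothetical helper: print a ClickHouse query listing the largest tables on disk.
table_size_query() {
    limit=${1:-10}
    cat <<EOF
SELECT database, table, formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC
LIMIT $limit
EOF
}

# Assumed usage from the Docker host (container name and client availability
# are assumptions):
#   docker exec pmm-server clickhouse-client --query "$(table_size_query 5)"
```

If asynchronous_metric_log shows up near the top of that list, please include the output in your reply.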