Conditions for ExternalVM are skipped when HA_mode is true

ddkozyreva · March 17, 2026, 10:50pm

Hi! I have two problems with HA mode on large installations. Both are probably caused by this commit, which added `&& !HA.Enabled` to the condition that previously skipped config validation for external VM setups. I’d like to know why these conditions were added and whether this is a mistake. Both problems are described below.

First, the change appears to have accidentally re-enabled a local dry-run validation of the full scrape config on every update. The dry run spawns a local `victoriametrics` process, which seems unnecessary when using an external VM, since the actual consumer of the scrape config is `vmagent`, a different binary. On large installations (3000+ agents) the validation takes several minutes, exceeds the context timeout, and repeatedly crashes the leader, destabilizing Raft quorum.

Proposed fix: skip `validateConfig` when ExternalVM is configured, keeping the `reload` call intact.

The second problem comes from the same commit and likely has the same root cause. Two conditions in `populateConfig` both carry `&& !HA.Enabled` guards. With ExternalVM and HA both enabled, both conditions evaluate to false, so neither scrape config block is added. The generated `victoriametrics-promscrape.yml` ends up with only internal jobs and no agent targets, so metrics collection stops silently. Proposed fix: remove the `!HA.Enabled` constraint from both conditions, as HA state doesn’t seem to affect which scrape configs should be generated.
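To make the boolean logic concrete, here is a small standalone sketch that mirrors the two guards and prints which block would be appended for each combination of flags. The returned strings are just the names of the functions called in those blocks; `currentJobs` and `proposedJobs` are hypothetical helpers, and the "proposed" variant only reflects the suggestion above, not a confirmed fix.

```go
package main

import "fmt"

// currentJobs mirrors the two guarded conditions as they are today.
func currentJobs(externalVM, haEnabled bool) []string {
	var jobs []string
	if !externalVM && !haEnabled {
		jobs = append(jobs, "scrapeConfigForVictoriaMetrics")
	}
	if externalVM && !haEnabled {
		jobs = append(jobs, "scrapeConfigForInternalVMAgent")
	}
	return jobs
}

// proposedJobs drops the HA guard, so the choice depends on ExternalVM only.
func proposedJobs(externalVM bool) []string {
	if externalVM {
		return []string{"scrapeConfigForInternalVMAgent"}
	}
	return []string{"scrapeConfigForVictoriaMetrics"}
}

func main() {
	for _, externalVM := range []bool{false, true} {
		for _, ha := range []bool{false, true} {
			fmt.Printf("externalVM=%t ha=%t current=%v proposed=%v\n",
				externalVM, ha, currentJobs(externalVM, ha), proposedJobs(externalVM))
		}
	}
}
```

In this sketch, every row with `ha=true` has an empty `current` list, which matches the behaviour described above.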

Tested on PMM 3.6.0, Docker HA cluster, 3 nodes, ~6900 agents, external PostgreSQL, external VictoriaMetrics.

ddkozyreva · March 17, 2026, 11:09pm

github.com/percona/pmm

managed/services/victoriametrics/victoriametrics.go

v3


      
          	svc.RequestConfigurationUpdate()
          }
          
          // ID returns the service identifier.
          func (svc *Service) ID() string {
          	return "victoriametrics"
          }
          
          // updateConfiguration updates VictoriaMetrics configuration.
          func (svc *Service) updateConfiguration(ctx context.Context) error {
          	if svc.params.ExternalVM() && !svc.haService.Params().Enabled {
          		return nil
          	}
          	start := time.Now()
          	defer func() {
          		if dur := time.Since(start); dur > time.Second {
          			svc.l.Warnf("updateConfiguration took %s.", dur)
          		}
          	}()
          
          	cfg, err := svc.buildVMConfig()

github.com/percona/pmm

managed/services/victoriametrics/victoriametrics.go

v3


      
          			pmmServerNodeName = svc.haService.Params().NodeID
          		}
          
          		resolutions := settings.MetricsResolutions
          		if cfg.GlobalConfig.ScrapeInterval == 0 {
          			cfg.GlobalConfig.ScrapeInterval = config.Duration(resolutions.LR)
          		}
          		if cfg.GlobalConfig.ScrapeTimeout == 0 {
          			cfg.GlobalConfig.ScrapeTimeout = ScrapeTimeout(resolutions.LR)
          		}
          		if !svc.params.ExternalVM() && !svc.haService.Params().Enabled {
          			cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, scrapeConfigForVictoriaMetrics(svc.l, resolutions.HR, svc.params))
          		}
          		if svc.params.ExternalVM() && !svc.haService.Params().Enabled {
          			cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, scrapeConfigForInternalVMAgent(resolutions.HR, svc.baseURL.Host))
          		}
          		cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, scrapeConfigForVMAlert(resolutions.HR, pmmServerNodeName))
          		cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, addInternalServicesToScrape(resolutions, svc, pmmServerNodeName)...)
          		if pointer.GetBool(settings.Nomad.Enabled) {
          			cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, scrapeConfigForNomadServer(resolutions.MR, pmmServerNodeName))
          		}

github.com/percona/pmm

managed/services/victoriametrics/victoriametrics.go

v3


      
          		resolutions := settings.MetricsResolutions
          		if cfg.GlobalConfig.ScrapeInterval == 0 {
          			cfg.GlobalConfig.ScrapeInterval = config.Duration(resolutions.LR)
          		}
          		if cfg.GlobalConfig.ScrapeTimeout == 0 {
          			cfg.GlobalConfig.ScrapeTimeout = ScrapeTimeout(resolutions.LR)
          		}
          		if !svc.params.ExternalVM() && !svc.haService.Params().Enabled {
          			cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, scrapeConfigForVictoriaMetrics(svc.l, resolutions.HR, svc.params))
          		}
          		if svc.params.ExternalVM() && !svc.haService.Params().Enabled {
          			cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, scrapeConfigForInternalVMAgent(resolutions.HR, svc.baseURL.Host))
          		}
          		cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, scrapeConfigForVMAlert(resolutions.HR, pmmServerNodeName))
          		cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, addInternalServicesToScrape(resolutions, svc, pmmServerNodeName)...)
          		if pointer.GetBool(settings.Nomad.Enabled) {
          			cfg.ScrapeConfigs = append(cfg.ScrapeConfigs, scrapeConfigForNomadServer(resolutions.MR, pmmServerNodeName))
          		}
          		// In HA mode, skip external exporter agents if this node is not the leader
          		skipExternalAgents := !svc.haService.IsLeader()
          		return AddScrapeConfigs(svc.l, cfg, tx.Querier, &resolutions, nil, false, skipExternalAgents)
