Pmm-managed restarting - runtime error - 2.26.0

Since upgrading to PMM 2.26.0, pmm-managed has been restarting every few seconds, panicking with a runtime error each time. Please see the logs below. The error message is “panic: runtime error: invalid memory address or nil pointer dereference”.

INFO[2022-02-24T13:50:17.423+00:00] vmalert configuration not changed, doing nothing. component=supervisord
INFO[2022-02-24T13:50:17.423+00:00] alertmanager configuration not changed, doing nothing. component=supervisord
INFO[2022-02-24T13:50:17.423+00:00] qan-api2 configuration not changed, doing nothing. component=supervisord
INFO[2022-02-24T13:50:17.423+00:00] grafana configuration not changed, doing nothing. component=supervisord
INFO[2022-02-24T13:50:17.423+00:00] dbaas-controller configuration not changed, doing nothing. component=supervisord
INFO[2022-02-24T13:50:17.423+00:00] prometheus configuration not changed, doing nothing. component=supervisord
INFO[2022-02-24T13:50:17.423+00:00] victoriametrics configuration not changed, doing nothing. component=supervisord
INFO[2022-02-24T13:50:17.423+00:00] Checking VictoriaMetrics… component=setup
INFO[2022-02-24T13:50:17.424+00:00] Checking VMAlert… component=setup
INFO[2022-02-24T13:50:17.424+00:00] Checking Alertmanager… component=setup
INFO[2022-02-24T13:50:17.425+00:00] Setup completed. component=setup
INFO[2022-02-24T13:50:17.776+00:00] Starting services… component=main
INFO[2022-02-24T13:50:17.776+00:00] Starting… component=vmalert
INFO[2022-02-24T13:50:17.776+00:00] Starting… component=victoriametrics
INFO[2022-02-24T13:50:17.776+00:00] Starting… component=checks
INFO[2022-02-24T13:50:17.776+00:00] Starting… component=alertmanager
INFO[2022-02-24T13:50:17.776+00:00] Starting server on http://127.0.0.1:7772/ … component=JSON
INFO[2022-02-24T13:50:17.776+00:00] Starting server on http://127.0.0.1:7771/ … component=gRPC
INFO[2022-02-24T13:50:17.776+00:00] Starting server on http://127.0.0.1:7773/debug
Registered handlers:
http://127.0.0.1:7773/debug/metrics
http://127.0.0.1:7773/debug/vars
http://127.0.0.1:7773/debug/requests
http://127.0.0.1:7773/debug/events
http://127.0.0.1:7773/debug/pprof component=debug
INFO[2022-02-24T13:50:17.777+00:00] Using default SaaS host “check.percona.com”.
INFO[2022-02-24T13:50:17.777+00:00] Using SaaS host “check.percona.com”.
INFO[2022-02-24T13:50:17.777+00:00] Environment variable “PERCONA_PLATFORM_API_TIMEOUT” is not set, using “30s” as a default timeout for platform API. component=auth
INFO[2022-02-24T13:50:18.413+00:00] Starting Stream /agent.Agent/Connect … agent_id=pmm-server request=b3a418f8-9578-11ec-8421-1260e5508cc3
INFO[2022-02-24T13:50:18.415+00:00] Connected pmm-agent: &{ID:pmm-server Version:2.26.0 MetricsPort:0}. agent_id=pmm-server request=b3a418f8-9578-11ec-8421-1260e5508cc3
INFO[2022-02-24T13:50:18.415+00:00] Starting runStateChangeHandler … agent_id=pmm-server request=b3a418f8-9578-11ec-8421-1260e5508cc3
INFO[2022-02-24T13:50:18.780+00:00] Configuration reloaded. component=vmalert
INFO[2022-02-24T13:50:18.813+00:00] Configuration reloaded. component=alertmanager
INFO[2022-02-24T13:50:18.969+00:00] Done. component=victoriametrics
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x11afc0e]

goroutine 948 [running]:
github.com/percona/pmm-managed/services/victoriametrics.AddScrapeConfigs(0xc0006e2180, 0xc0006e2180, 0x14ff92f, 0x8, 0x187b300, 0x0)
/home/builder/rpm/BUILD/pmm-managed-6914083707b5478605e7bc816de8fc68b5f511f6/src/github.com/percona/pmm-managed/services/victoriametrics/prometheus.go:90 +0x68e
github.com/percona/pmm-managed/services/victoriametrics.(*Service).populateConfig.func1(0xc001030210)
/home/builder/rpm/BUILD/pmm-managed-6914083707b5478605e7bc816de8fc68b5f511f6/src/github.com/percona/pmm-managed/services/victoriametrics/victoriametrics.go:337 +0x625
gopkg.in/reform%2ev1.(*DB).InTransactionContext(0xc0011f5c01, {0x18630c8, 0xc00003e038}, 0x1863138, 0xc0010dcd90)
/home/builder/go/pkg/mod/gopkg.in/reform.v1@v1.5.1/db.go:93 +0xa2
gopkg.in/reform%2ev1.(*DB).InTransaction(…)
/home/builder/go/pkg/mod/gopkg.in/reform.v1@v1.5.1/db.go:74
github.com/percona/pmm-managed/services/victoriametrics.(*Service).populateConfig(0xc000572120, 0xc07e01276e5c34cb)
/home/builder/rpm/BUILD/pmm-managed-6914083707b5478605e7bc816de8fc68b5f511f6/src/github.com/percona/pmm-managed/services/victoriametrics/victoriametrics.go:322 +0x56
github.com/percona/pmm-managed/services/victoriametrics.(*Service).marshalConfig(0xc0001e8140, 0x0)
/home/builder/rpm/BUILD/pmm-managed-6914083707b5478605e7bc816de8fc68b5f511f6/src/github.com/percona/pmm-managed/services/victoriametrics/victoriametrics.go:215 +0x25
github.com/percona/pmm-managed/services/victoriametrics.(*Service).updateConfiguration(0xc0001e8140, {0x1863100, 0xc0005699e0})
/home/builder/rpm/BUILD/pmm-managed-6914083707b5478605e7bc816de8fc68b5f511f6/src/github.com/percona/pmm-managed/services/victoriametrics/victoriametrics.go:157 +0xb5
github.com/percona/pmm-managed/services/victoriametrics.(*Service).Run(0xc0001e8140, {0x1863138, 0xc0009f85d0})
/home/builder/rpm/BUILD/pmm-managed-6914083707b5478605e7bc816de8fc68b5f511f6/src/github.com/percona/pmm-managed/services/victoriametrics/victoriametrics.go:130 +0x325
main.main.func6()
/home/builder/rpm/BUILD/pmm-managed-6914083707b5478605e7bc816de8fc68b5f511f6/src/github.com/percona/pmm-managed/main.go:838 +0x65
created by main.main
/home/builder/rpm/BUILD/pmm-managed-6914083707b5478605e7bc816de8fc68b5f511f6/src/github.com/percona/pmm-managed/main.go:836 +0x3fe5

2 Likes

Hi @odemark1 , thanks for posting to the Percona forums!

I’ve shared your report with our engineers, as I don’t see the immediate cause of the panic either :frowning:

1 Like

@odemark1 thanks a lot for the information. I believe I have managed to reproduce this issue and have seen similar logs from pmm-managed on my instance. I need you to confirm two things so I can be sure we have the same preconditions. See [PMM-9657] Upgrading to PMM-Server 2.26.0 causes pmm-managed Panic errors - Percona JIRA

  1. Were you using PMM-Server 2.25.0 before the upgrade?
  2. Did you have remote MongoDB TLS/SSL instances connected on 2.25.0 before the upgrade?

1 Like

Hi Puneet-

Yes, PMM was upgraded from 2.25.0 to 2.26.0. We have upgraded our AWS Marketplace PMM environments on a monthly basis for the last few years and have never had an issue with PMM upgrades.

MongoDB is not being used.

Thanks, Mark

1 Like

Sure. I tried to reproduce your issue on an AMI instance with no MongoDB instance added and could not.

It would be a great help if you could tell me more about your setup. What did it look like?

Regards,
Puneet Kala

1 Like

We use the AWS Marketplace PMM setup, so the PMM installation is already done for us. PMM uses SSL to connect to the instances it monitors: RDS MariaDB, RDS MySQL, Aurora MySQL, and also MySQL instances.

1 Like

Hi,
Could you log into the Docker container, open a psql shell, and show the result of the following query?
SELECT agent_id, agent_type, pmm_agent_id, status, version FROM agents;

Thank you.

1 Like

The table has 680 rows. Would you like to narrow down the records?

1 Like

Is there any pmm-agent without a version?
Is there any agent with a non-existent pmm-agent ID?

1 Like

Most rows have a null version, except for the data below.

There is an agent_id listed in the agents table for every row.

1 Like

We are looking for agents of type ‘pmm-agent’ with no version, or for orphan agents that have a corrupted pmm_agent_id field.
Could you share the results of the following queries?
SELECT agent_id FROM agents WHERE agent_type = 'pmm-agent' AND version IS NULL;

SELECT agent_id FROM agents WHERE agent_type = 'pmm-agent';
SELECT DISTINCT pmm_agent_id FROM agents;
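
For readers following along, the two checks described above (pmm-agents with no version, and agents whose pmm_agent_id points at no existing pmm-agent) can be tried out on toy data before touching a real server. This is only a sketch: it uses an in-memory SQLite database as a stand-in for the PMM Server database, and the table layout, agent IDs, and the `mysqld_exporter` rows are hypothetical. Only the column names come from the queries in this thread.

```python
import sqlite3

# In-memory SQLite stand-in for the PMM Server database (an assumption for
# illustration only). All rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agents (agent_id TEXT PRIMARY KEY, agent_type TEXT, "
    "pmm_agent_id TEXT, version TEXT)"
)
conn.executemany(
    "INSERT INTO agents VALUES (?, ?, ?, ?)",
    [
        ("pmm-server", "pmm-agent", None, "2.26.0"),
        ("agent-a", "pmm-agent", None, None),  # pmm-agent with no version
        ("exporter-1", "mysqld_exporter", "agent-a", None),
        ("exporter-2", "mysqld_exporter", "agent-gone", None),  # orphan
    ],
)

# Check 1: pmm-agents that have no version recorded.
no_version = [row[0] for row in conn.execute(
    "SELECT agent_id FROM agents "
    "WHERE agent_type = 'pmm-agent' AND version IS NULL "
    "ORDER BY agent_id"
)]

# Check 2: agents whose pmm_agent_id matches no existing pmm-agent row.
orphans = [row[0] for row in conn.execute(
    "SELECT a.agent_id FROM agents a "
    "WHERE a.pmm_agent_id IS NOT NULL AND NOT EXISTS ("
    "  SELECT 1 FROM agents p "
    "  WHERE p.agent_type = 'pmm-agent' AND p.agent_id = a.pmm_agent_id) "
    "ORDER BY a.agent_id"
)]

print("pmm-agents without a version:", no_version)
print("orphan agents:", orphans)
```

The second check folds the last two queries into one statement so the comparison does not have to be done by eye across 680 rows.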

1 Like

Here are the results.

image

image

1 Like

Could you check the pmm-agent version on the instances from the first query’s results by running the following command in the CLI?
pmm-admin --version
Then put that value into the version field of the agents table for those agents.
TBH it looks really weird that these agents don’t have a version.

Thanks
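
A rough sketch of the update suggested above, again against an in-memory SQLite stand-in rather than a real PMM Server database. The agent IDs, the `reported` mapping, and the reduced table layout are hypothetical; on a real server the equivalent UPDATE would be run in the psql shell, ideally after taking a backup.

```python
import sqlite3

# In-memory stand-in for the agents table (hypothetical rows and layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agents (agent_id TEXT PRIMARY KEY, agent_type TEXT, version TEXT)"
)
conn.executemany(
    "INSERT INTO agents VALUES (?, ?, ?)",
    [
        ("agent-a", "pmm-agent", None),
        ("agent-b", "pmm-agent", "2.25.0"),
        ("exporter-1", "mysqld_exporter", None),
    ],
)

# Versions as read from `pmm-admin --version` on each host (hypothetical values).
reported = {"agent-a": "2.26.0"}

with conn:  # commits on success, rolls back if an exception is raised
    for agent_id, version in reported.items():
        # Only touch pmm-agent rows that currently have no version.
        conn.execute(
            "UPDATE agents SET version = ? "
            "WHERE agent_id = ? AND agent_type = 'pmm-agent' AND version IS NULL",
            (version, agent_id),
        )

# Verify that no pmm-agent is left without a version.
remaining = conn.execute(
    "SELECT COUNT(*) FROM agents "
    "WHERE agent_type = 'pmm-agent' AND version IS NULL"
).fetchone()[0]
versions = dict(conn.execute("SELECT agent_id, version FROM agents"))
print(remaining, versions)
```

Restricting the UPDATE to rows where the version is null keeps it from overwriting versions that agents have already reported, and leaves non-pmm-agent rows untouched.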

1 Like

Here is the version information.
image

The null version fields have been updated. There are no nulls anymore; 2.26.0 was added in their place.

1 Like

This solved the problem. Thank you!

2 Likes