Fix for "database is locked" issue and how to migrate existing data to PMM 2.40

Description:

Like other users, we are facing issues with SQLite and Grafana in our PMM 2.39 setup for MySQL: PMM crashes randomly with a 503 Service Unavailable error followed by a traceID error. We found that many other users are facing the same issue. The issue has not reappeared since we changed the SQLite journal_mode to WAL, and we are continuing to monitor it.
Below are the links for reference:
Percona issue link: Database is Locked pmm 2.38.1 - #8 by Naresh9999
After Adding 100 Servers PMM Load is Too High
After upgrade to 2.39.0, the grafana takes too much CPU - #5 by Luke03011
Percona jira link: [PMM-12415] PMM dashboard has a high CPU load and UI is unresponsive after adding 100 Servers. - Percona JIRA
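
For readers who want to see what the journal_mode change mentioned above looks like, here is a minimal sketch using Python's standard sqlite3 module. The thread does not say where the Grafana database file lives inside PMM or how the change was applied, so the helper name enable_wal is hypothetical and the example runs against a throwaway file instead of a live database. Take a backup and stop the service before trying anything like this on a real file.

```python
import os
import sqlite3
import tempfile


def enable_wal(db_path: str) -> str:
    """Switch a SQLite database file to WAL journaling and return the active mode."""
    conn = sqlite3.connect(db_path)
    try:
        # The PRAGMA returns a single row holding the journal mode now in effect.
        (mode,) = conn.execute("PRAGMA journal_mode=WAL;").fetchone()
        return mode
    finally:
        conn.close()


# Demonstrate on a temporary database, not on a real Grafana file
# (the real path is an assumption the thread does not confirm).
with tempfile.TemporaryDirectory() as tmp:
    demo_db = os.path.join(tmp, "demo.db")
    wal_mode = enable_wal(demo_db)
    print(wal_mode)
```

WAL mode lets readers proceed while a single writer is active, which is why it tends to reduce "database is locked" errors under concurrent access. It does not remove SQLite's single-writer limit.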

Steps to Reproduce:

We started seeing the issue once the total number of onboarded MySQL servers exceeded 110.

Version:

PMM 2.39

Logs:

logger=context userId=0 orgId=0 uname= t=2023-09-28T10:31:53.778351685Z level=error msg="Request Completed" method=GET path=/api/auth/key status=500 remote_addr=127.0.0.1 time_ms=8160 duration=8.160297511s size=67 referer=
logger=context t=2023-09-28T10:31:53.778486655Z level=error msg="invalid API key" error="context canceled" traceID=
logger=context userId=0 orgId=0 uname= t=2023-09-28T10:31:53.778665025Z level=error msg="Request Completed" method=GET path=/api/auth/key status=500 remote_addr=127.0.0.1 time_ms=5031 duration=5.031578726s size=67 referer=
logger=context userId=0 orgId=1 uname= t=2023-09-28T10:31:53.778753786Z level=warn msg="failed to update last use date for api key" id=94
logger=context userId=0 orgId=1 uname= t=2023-09-28T10:31:53.779290502Z level=error msg= error="context canceled" traceID=
logger=context userId=0 orgId=1 uname= t=2023-09-28T10:31:53.779452903Z level=info msg="Request Completed" method=GET path=/api/auth/key status=403 remote_addr=127.0.0.1 time_ms=8177 duration=8.177843464s size=39 referer=

size=39 referer=
logger=context userId=0 orgId=1 uname= t=2023-09-28T10:31:53.661641609Z level=warn msg="failed to update last use date for api key" id=31
logger=context t=2023-09-28T10:31:53.662593353Z level=error msg="invalid API key" error="database is locked" traceID=
logger=context userId=0 orgId=0 uname= t=2023-09-28T10:31:53.66277235Z level=error msg="Request Completed" method=GET path=/api/auth/key status=500 remote_addr=127.0.0.1 time_ms=5005 duration=5.005612613s size=67 referer=
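
To gauge how often the lock error occurs compared with other failures, log lines like the ones above can be tallied by their error field. The following is a rough sketch, not a PMM or Grafana tool: parse_line and count_errors are hypothetical helper names, and the parser assumes the simple key=value layout seen in these lines. It also normalizes curly quotes, since logs pasted through a forum often have them.

```python
import re
from collections import Counter

# Matches key=value pairs where the value is either "quoted" or a bare token.
PAIR = re.compile(r'(\w+)=("([^"]*)"|\S*)')


def parse_line(line):
    """Parse one key=value log line into a dict of strings."""
    # Forum software may turn straight quotes into curly ones; undo that first.
    line = line.replace("\u201c", '"').replace("\u201d", '"')
    fields = {}
    for key, raw, quoted in PAIR.findall(line):
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


def count_errors(lines):
    """Count log lines by the value of their error field, skipping lines without one."""
    counts = Counter()
    for line in lines:
        error = parse_line(line).get("error")
        if error:
            counts[error] += 1
    return counts


SAMPLE = """\
logger=context t=2023-09-28T10:31:53.662593353Z level=error msg="invalid API key" error="database is locked" traceID=
logger=context t=2023-09-28T10:31:53.778486655Z level=error msg=\u201cinvalid API key\u201d error=\u201ccontext canceled\u201d traceID=
logger=context userId=0 orgId=1 uname= t=2023-09-28T10:31:53.661641609Z level=warn msg="failed to update last use date for api key" id=31
"""

counts = count_errors(SAMPLE.splitlines())
for error, n in counts.most_common():
    print(f"{n}\t{error}")
```

Running the same function over a full log file before and after a configuration change gives a simple way to check whether the "database is locked" count actually drops.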

Expected Result:

According to PMM's recommendations, PMM should not crash even with 1000+ onboarded servers.
Our hardware is substantial: a 32-core CPU, 128 GB RAM, and a 6 TB SSD.

Questions

When will PMM 2.40 be released?
Will PMM 2.40 have an HA feature?
Will the above crash issue be fixed in PMM 2.40?
We already have 100+ clients onboarded to PMM 2.39, running in a VM, with a lot of data. If we upgrade to PMM 2.40 to fix the above issue, how do we migrate the existing data from 2.39 to 2.40?
If we upgrade PMM Server from 2.39 to 2.40, is it mandatory to upgrade all the PMM clients to 2.40 as well, or is PMM Server 2.40 compatible with older client versions?

Hi @pravata_dash ,

Please find my inline answers to your questions below:

When will PMM 2.40 be released?

It should be released within a week.

Will the above crash issue be fixed in PMM 2.40?

Yes, it will fix the above crash issue. Please see the JIRA link below.
https://jira.percona.com/browse/PMM-12173

We already have 100+ clients onboarded to PMM 2.39, running in a VM, with a lot of data. If we upgrade to PMM 2.40 to fix the above issue, how do we migrate the existing data from 2.39 to 2.40?

I assume you are using Docker for PMM Server. Please refer to the documentation for the upgrade procedure.

If we upgrade PMM Server from 2.39 to 2.40, is it mandatory to upgrade all the PMM clients to 2.40 as well, or is PMM Server 2.40 compatible with older client versions?

You will be able to keep using pmm-client 2.39, but it is highly recommended to run the same version as PMM Server.

Regards,
Parag

@Parag_Bhayani
Thanks for sharing all the details.

Could you please also share the following?

When is Percona planning to release High Availability for PMM 2.X?

Hi @pravata_dash ,

Just an update: PMM 2.40 has been released. You can find the release notes here.

HA is currently on the roadmap and will be available soon (no ETA as of now).

Regards,
Parag

Thanks for the clarification.