I am running a 3-node PXC cluster using Everest. After a few days I start getting this error:
terminated
Reason: OOMKilled - exit code: 137
for the pmm-client container alone, and only on the first node.
Any help on how to address this, or can someone point me to how to debug it?
OOMKilled means the container ran out of memory: either it hit its own memory limit or the node itself ran out. Can you increase the amount of memory? Use PMM to track memory usage and see which process is using too much.
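For example, to confirm which container was killed and what its resource limits are (the pod and namespace names are placeholders for your setup, and kubectl top needs metrics-server installed):

kubectl describe pod <cluster-name>-pxc-0 -n <namespace> | grep -B2 -A6 'Last State'

# live per-container memory usage
kubectl top pod <cluster-name>-pxc-0 -n <namespace> --containers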
This is a 32 GB instance, and there are no other significant pods scheduled on the node except this 4 CPU / 7 GB pxc-0 DB pod.
I did check in PMM; sharing the screenshots.
My question here would be: can the resources for the pmm-client alone be increased?
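I imagine something along these lines against the PerconaXtraDBCluster CR (spec.pmm.resources is the field from the PXC operator's cr.yaml; I don't know whether Everest would reconcile a manual patch back, so treat this as a sketch):

kubectl patch pxc <cluster-name> -n <namespace> --type merge -p '
spec:
  pmm:
    resources:
      requests:
        memory: 200M
      limits:
        memory: 400M
'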
Also, I have no idea why the pxc-0 memory is hitting its limit so quickly either. I was doing some imports into tables, but as you can see, storage is only about 10 GB, so it is not a big database.
Later I turned off PMM monitoring since the restarts kept happening. When I turn it back on, it starts happening again.
Previously I hit the same issue, uninstalled Everest, and did a full install again. It was fine for a week, and then this happened after a data load of a few GB.
Hi @beta-auction,
Can you share:
K8s platform and version
Everest version
PMM server version
PXC version
How did you create the cluster in Everest? I mean the resources and nodes, and whether you set any custom database settings.
Thank you!
I am using an RKE2 cluster, version 1.30.2.
Everest was 1.1.0 (updated to 1.1.1; still the same issue after turning monitoring off and then on).
PMM is 2.42 (the latest version).
PXC version: 8.0.36-28.1 (the latest available in Everest).
The cluster is created on AWS with 3 DB nodes (all m6a.2xlarge). The Percona cluster is the 3-node medium config as specified in the Everest UI.
The DB settings I have are:
[mysqld]
max_connections = 2000
thread_cache_size = 48
thread_pool_size = 8
table_open_cache_instances = 8
wait_timeout = 900
interactive_timeout = 900
default_time_zone = '+00:00'
Apart from the usual DB operations for my apps, I had done a few imports (each less than a few GB). As you can see in the screenshots above, the max DB size is less than 10 GB. A few tables have around a million rows.
I notice similar behavior with an OVH cluster.
Everest version: 1.2.0
Pod description:
Init Containers:
  pxc-init:
    [...]
    Image:          docker.io/percona/percona-xtradb-cluster-operator:1.15.0
    Image ID:       docker.io/percona/percona-xtradb-cluster-operator@sha256:6f7d8d4e472b8c4d166573cc7bb714bbb0fdf1535142b6138c62fdecbf881df9
    Port:           <none>
    Host Port:      <none>
    Command:
      /pxc-init-entrypoint.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 07 Oct 2024 08:03:34 +0200
      Finished:     Mon, 07 Oct 2024 08:03:37 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     50m
      memory:  50M
    Requests:
      cpu:     50m
      memory:  50M
    Environment:  <none>
    Mounts:
      /var/lib/mysql from datadir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4p64t (ro)
Containers:
  pmm-client:
    [...]
    Image:          percona/pmm-client:2
    Image ID:       docker.io/percona/pmm-client@sha256:18dea613445566c9037134335a74f0ff2f93c5612054d4f83dfd4e0e89e2bbc6
    Ports:          7777/TCP, 30100/TCP, 30101/TCP, 30102/TCP, 30103/TCP, 30104/TCP, 30105/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    State:          Running
      Started:      Mon, 07 Oct 2024 08:27:14 +0200
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Mon, 07 Oct 2024 08:03:39 +0200
      Finished:     Mon, 07 Oct 2024 08:27:12 +0200
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     100m
      memory:  107374182400m
    Requests:
      cpu:     95m
      memory:  101994987520m
    Liveness:  http-get http://:7777/local/Status delay=60s timeout=5s period=10s #success=1 #failure=3
    Environment Variables from:
      <DB_NAME>-env-vars-pxc  Secret  Optional: true
    Environment:
      PMM_SERVER:                     <PMM_URL>
      CLIENT_PORT_LISTEN:             7777
      CLIENT_PORT_MIN:                30100
      CLIENT_PORT_MAX:                30105
      POD_NAME:                       <POD_NAME> (v1:metadata.name)
      POD_NAMESPASE:                  <NAMESPACE> (v1:metadata.namespace)
      PMM_AGENT_SERVER_ADDRESS:       <PMM_URL>
      PMM_AGENT_SERVER_USERNAME:      api_key
      PMM_AGENT_SERVER_PASSWORD:      <set to the key 'pmmserverkey' in secret 'internal-<DB_NAME>'>  Optional: false
      PMM_AGENT_LISTEN_PORT:          7777
      PMM_AGENT_PORTS_MIN:            30100
      PMM_AGENT_PORTS_MAX:            30105
      PMM_AGENT_CONFIG_FILE:          /usr/local/percona/pmm2/config/pmm-agent.yaml
      PMM_AGENT_SERVER_INSECURE_TLS:  1
      PMM_AGENT_LISTEN_ADDRESS:       0.0.0.0
      PMM_AGENT_SETUP_METRICS_MODE:   push
      PMM_AGENT_SETUP:                1
      PMM_AGENT_SETUP_FORCE:          1
      PMM_AGENT_SETUP_NODE_TYPE:      container
      PMM_AGENT_SETUP_NODE_NAME:      $(POD_NAMESPASE)-$(POD_NAME)
      DB_TYPE:                        mysql
      DB_USER:                        monitor
      DB_PASSWORD:                    <set to the key 'monitor' in secret 'internal-<DB_NAME>'>  Optional: false
      DB_ARGS:                        --query-source=perfschema
      DB_CLUSTER:                     pxc
      DB_HOST:                        localhost
      DB_PORT:                        33062
      CLUSTER_NAME:                   <CLUSTERNAME>
      PMM_ADMIN_CUSTOM_PARAMS:
      PMM_AGENT_PRERUN_SCRIPT:        /var/lib/mysql/pmm-prerun.sh
      PMM_AGENT_SIDECAR:              true
      PMM_AGENT_SIDECAR_SLEEP:        5
      PMM_AGENT_PATHS_TEMPDIR:        /tmp
    Mounts:
      /var/lib/mysql from datadir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4p64t (ro)
  pxc:
    [...]
    Image:          percona/percona-xtradb-cluster:8.0.36-28.1
    Image ID:       docker.io/percona/percona-xtradb-cluster@sha256:b5cc4034ccfb0186d6a734cb749ae17f013b027e9e64746b2c876e8beef379b3
    Ports:          3306/TCP, 4444/TCP, 4567/TCP, 4568/TCP, 33062/TCP, 33060/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      /var/lib/mysql/pxc-entrypoint.sh
    Args:
      mysqld
    State:          Running
      Started:      Mon, 07 Oct 2024 08:03:40 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     1500m
      memory:  5G
    Requests:
      cpu:     1500m
      memory:  5G
    Liveness:   exec [/var/lib/mysql/liveness-check.sh] delay=300s timeout=450s period=10s #success=1 #failure=3
    Readiness:  exec [/var/lib/mysql/readiness-check.sh] delay=15s timeout=450s period=30s #success=1 #failure=5
    Environment Variables from:
      <DB_NAME>-env-vars-pxc  Secret  Optional: true
    Environment:
      PXC_SERVICE:                    <DB_NAME>-pxc-unready
      MONITOR_HOST:                   %
      MYSQL_ROOT_PASSWORD:            <set to the key 'root' in secret 'internal-<DB_NAME>'>  Optional: false
      XTRABACKUP_PASSWORD:            <set to the key 'xtrabackup' in secret 'internal-<DB_NAME>'>  Optional: false
      MONITOR_PASSWORD:               <set to the key 'monitor' in secret 'internal-<DB_NAME>'>  Optional: false
      CLUSTER_HASH:                   1541532
      OPERATOR_ADMIN_PASSWORD:        <set to the key 'operator' in secret 'internal-<DB_NAME>'>  Optional: false
      LIVENESS_CHECK_TIMEOUT:         450
      READINESS_CHECK_TIMEOUT:        450
      DEFAULT_AUTHENTICATION_PLUGIN:  caching_sha2_password
    Mounts:
      /etc/my.cnf.d from auto-config (rw)
      /etc/mysql/init-file from mysql-init-file (rw)
      /etc/mysql/mysql-users-secret from mysql-users-secret-file (rw)
      /etc/mysql/ssl from ssl (rw)
      /etc/mysql/ssl-internal from ssl-internal (rw)
      /etc/mysql/vault-keyring-secret from vault-keyring-secret (rw)
      /etc/percona-xtradb-cluster.conf.d from config (rw)
      /tmp from tmp (rw)
      /var/lib/mysql from datadir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4p64t (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
The PMM agent gets OOM-killed when I try to import a table structure.
The RAM request also seems rather tight for the agent, doesn't it?
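For reference, Kubernetes reads a trailing m as milli-units, so (my own back-of-the-envelope conversion) those odd-looking values decode to:

107374182400m bytes = 107374182400 / 1000 bytes ≈ 107.4 MB (about 102.4 MiB) limit
101994987520m bytes = 101994987520 / 1000 bytes ≈ 102.0 MB (about 97.3 MiB) request

So the agent gets roughly 100 MiB, which does look tight once its exporters get busy.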