Memory management and RAM OOM kill for PXC pods

Hello,

In our company, we use the Percona XtraDB Cluster Operator to provide one MySQL cluster per customer, for internal reasons not worth diving into right now.

As our tests approach the final stages of development, we set Kubernetes resource limits to cap the RAM each PXC pod can use, but we had to remove that configuration because the PXC pods don't seem to respect any memory boundary. Once the OOM killer terminates a PXC pod, usually in the middle of a regular workload, the cluster can't come back online on its own, which is precisely why we have turned the limit off for the time being.
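For reference, the limits were applied in the pxc section of the cluster CR, following the operator's example cr.yaml. The sketch below shows the intent rather than our literal manifest (the exact field layout is from memory; 1G is the value from the example):

pxc:
  resources:
    requests:
      memory: 1G
    limits:
      memory: 1G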

Our understanding of RAM usage in MySQL is limited, PXC adds Galera replication on top, and the memory management done by Kubernetes could also be a factor here. So we are reaching out to ask what configuration could be applied to mysqld and/or the other programs so that they never allocate more RAM than they should, avoiding the OOM kill in the first place.

Currently, we are using my.cnf to configure the PXC pods, but to no avail. Any time we restore a database with huge INSERT INTO statements, or run similar operations, the pod allocates however much memory it wants regardless of the settings. Also, once allocated, the memory is never released, which may be Kubernetes behavior after a memory spike.

This is a big problem for us: there will be a lot of clusters in our environment, and since CPU/RAM is the bulk of our cost structure we need to keep memory usage reasonably low for each pod. The exact limit is somewhat arbitrary, but the fact that we can't even hold the pods to the 1GB from the example is what we are trying to solve.

Regards from Brazil, DAVI

Comments

  • DGB
    Hi Davi!
    Did you set both pxc.resources.requests.memory and pxc.resources.limits.memory?
  • davi
    I did, yes.
    Keeping to the example configuration, I set them both to 1GB of RAM.
    The OOM kill occurs once the pod goes beyond that limit. We see this happening anytime a mysqldump is used to restore a database.
  • DGB
    All right, thanks for confirming. This is indeed suspicious. Would you mind filing a bug report here: https://jira.percona.com/projects/K8SPXC ? If not, I can do it for you. Thanks!
  • vadimtk (Percona Staff)
    Can you show your my.cnf?
    If you are using huge INSERT INTO statements, that may force Galera to allocate memory to accommodate your transaction. How huge is huge?
  • davi
    @DGB I've opened a report, the link is: https://jira.percona.com/browse/K8SPXC-441
    @vadimtk I'm attaching the my.cnf file below, and I ran more tests to gather information that I hope helps in some capacity.
    From one of the mysqldump files we have, I isolated one table that usually triggers the memory spike and restored only that table. The SQL file (plaintext) is ~1.3GB split into 1255 INSERT INTO queries, roughly 1.04MB per query (a simple division). This specific table does not have any BLOB column; it's all text.
    Using the configuration at the end, I imported the 1.3GB SQL file. The result was a database with a total size of 1.72 GB (308.55 MB of index, 1.42 GB of data). RAM usage now sits permanently at 1180 MB, about 18% above the would-be limit of 1GB from the example CR configuration.
    Here is the file (mainly the default values we see in PXC):
    $ cat /etc/percona-xtradb-cluster.conf.d/init.cnf 
    [mysqld]
    default-time-zone='-03:00'
    default_authentication_plugin=mysql_native_password
    lower_case_table_names=1
    partial_revokes=on
    sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
    tls_version=TLSv1,TLSv1.1,TLSv1.2
    innodb_buffer_pool_size = 640M
    innodb_flush_method     = O_DIRECT
    join_buffer_size        = 256K
    key_buffer_size         = 8M
    max_allowed_packet      = 24M
    max_connections         = 20
    read_buffer_size        = 128K
    read_rnd_buffer_size    = 256K
    sort_buffer_size        = 256K
    thread_cache_size       = 10
    thread_stack            = 256K
    tmp_table_size          = 16M
    log_bin_trust_function_creators=1
    wsrep_provider_options="gcache.size=1G; gcache.recover=yes"
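    For what it's worth, here is my own back-of-the-envelope reading of these values against a 1GB limit, treating each setting as an upper bound; this is a ballpark, not an exact model of mysqld's memory:

    innodb_buffer_pool_size                      640M
    key_buffer_size                                8M
    per-connection buffers (~1.1M each) x 20     ~23M
    ---------------------------------------------------
    ~670M before in-memory temporary tables (up to 16M
    each), Galera's write-set buffers and the 1G gcache
    are even counted, leaving little headroom under a
    1GB pod limit.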


  • vadimtk (Percona Staff)
    @davi I think this is the amount of memory that Galera requires to process the mysqldump. It would be better if replication could respect memory limits, but it will allocate whatever memory it needs to handle the transactions.
  • lucasfernandes
    Hi @vadimtk,
    We are facing a similar issue to @davi's, trying to build some kind of "resource manager" for our pods.
    Doesn't Galera have any explicit limit on resource allocation? We found some performance tips, but no hard limits.
    The memory seems to only grow and is never released, and the most memory-hungry query (like a big INSERT or UPDATE) sets the watermark (~1.5GB RAM).
    Thanks
