PXC on Kubernetes - ProxySQL load balancing

Hi Community,

We’re currently working on a MySQL cluster setup to replace our current standalone MySQL server.
We set up a K8s cluster last year, and we wanted to test PXC on it.

We started from the PXC operator Helm chart + PXC db Helm chart to make things easier.
The DB cluster seems OK: I have 3 db pods running and synchronised, with fault tolerance.

However, I am surprised by the native ProxySQL configuration (which I enabled in place of HAProxy), more specifically by the default load-balancing configuration.

bash-4.4$ proxysql-admin --status

mysql_galera_hostgroups row for writer-hostgroup: 11
+--------+--------+---------------+---------+--------+-------------+-----------------------+------------------+
| writer | reader | backup-writer | offline | active | max_writers | writer_is_also_reader | max_trans_behind |
+--------+--------+---------------+---------+--------+-------------+-----------------------+------------------+
| 11     | 10     | 12            | 13      | 1      | 1           | 1                     | 100              |
+--------+--------+---------------+---------+--------+-------------+-----------------------+------------------+

mysql_servers rows for this configuration
+---------------+-------+------------------------------------------------------------------------------------------+------+---------+---------+----------+---------+-----------+
| hostgroup     | hg_id | hostname                                                                                 | port | status  | weight  | max_conn | use_ssl | gtid_port |
+---------------+-------+------------------------------------------------------------------------------------------+------+---------+---------+----------+---------+-----------+
| reader        | 10    | db-cluster01-prod-pxc-pxc-0.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local | 3306 | ONLINE  | 1000000 | 1000     | 0       | 0         |
| reader        | 10    | db-cluster01-prod-pxc-pxc-1.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local | 3306 | ONLINE  | 1000    | 1000     | 0       | 0         |
| reader        | 10    | db-cluster01-prod-pxc-pxc-2.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local | 3306 | ONLINE  | 1000    | 1000     | 0       | 0         |
| writer        | 11    | db-cluster01-prod-pxc-pxc-0.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local | 3306 | ONLINE  | 1000000 | 1000     | 0       | 0         |
| writer        | 11    | db-cluster01-prod-pxc-pxc-1.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local | 3306 | SHUNNED | 1000    | 1000     | 0       | 0         |
| writer        | 11    | db-cluster01-prod-pxc-pxc-2.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local | 3306 | SHUNNED | 1000    | 1000     | 0       | 0         |
| backup-writer | 12    | db-cluster01-prod-pxc-pxc-1.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local | 3306 | ONLINE  | 1000    | 1000     | 0       | 0         |
| backup-writer | 12    | db-cluster01-prod-pxc-pxc-2.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local | 3306 | ONLINE  | 1000    | 1000     | 0       | 0         |
+---------------+-------+------------------------------------------------------------------------------------------+------+---------+---------+----------+---------+-----------+

The same pod (pxc-0) is used for both write and read requests because of the weight. I can understand this config for the writer hostgroup (to avoid conflicts), but for better performance and scalability I expected SELECT requests to be balanced across the other pods.

So here are my questions:

  • Is it a bad idea to try to split writes and selects? Percona knows MySQL better than I do, so I guess this default config exists for a good reason…
  • If it is OK, what is the best way to perform this split? I tried 'UPDATE mysql_servers SET weight=100 WHERE hostgroup_id=10 AND hostname='db-cluster01-prod-pxc-pxc-0.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local';' but the value is instantly reset to 1000000.
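For reference, this is the full sequence I would expect to apply and persist such a weight change through the ProxySQL admin interface (a sketch only, using the hostgroup ID and hostname from the status output above; in my case the operator resets the value anyway):

```sql
-- Lower the reader-hostgroup weight of pxc-0 so SELECTs prefer the other pods.
UPDATE mysql_servers SET weight = 100
 WHERE hostgroup_id = 10
   AND hostname = 'db-cluster01-prod-pxc-pxc-0.db-cluster01-prod-pxc-pxc.db-cluster1-prod.svc.cluster.local';

-- Push the change to the running config, then persist it to disk.
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
```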

I read the following page, which seems to describe the same problem, but apparently there is no solution? That seems strange; ProxySQL is not very useful if it cannot split SQL requests.
https://jira.percona.com/browse/K8SPXC-735

Thanks :slight_smile:

– Helm chart:

allowUnsafeConfigurations: false
enableCRValidationWebhook: false
finalizers:
  - delete-pxc-pods-in-order
  - delete-proxysql-pvc
  - delete-pxc-pvc
fullnameOverride: ''
haproxy:
  enabled: false
ignoreAnnotations: null
ignoreLabels: null
initImage: ''
logcollector:
  enabled: true
  image: ''
  imagePullSecrets: []
  resources:
    limits: {}
    requests:
      cpu: 200m
      memory: 100M
nameOverride: ''
operatorImageRepository: percona/percona-xtradb-cluster-operator
pause: false
pmm:
  enabled: true
  image:
    repository: percona/pmm-client
    tag: 2.28.0
  imagePullSecrets: []
  resources:
    limits: {}
    requests:
      cpu: 300m
      memory: 150M
  serverHost: xxxxxxxxxx
  serverUser: admin
proxysql:
  affinity:
    antiAffinityTopologyKey: kubernetes.io/hostname
  annotations: {}
  enabled: true
  gracePeriod: 30
  image: ''
  imagePullSecrets: []
  labels: {}
  livenessDelaySec: 300
  nodeSelector: {}
  persistence:
    accessMode: ReadWriteOnce
    enabled: true
    size: 7Gi
  podDisruptionBudget:
    maxUnavailable: 1
  readinessDelaySec: 15
  resources:
    limits: {}
    requests:
      cpu: 800m
      memory: 2G
  sidecarPVCs: []
  sidecarResources:
    limits: {}
    requests: {}
  sidecarVolumes: []
  sidecars: []
  size: 3
  tolerations: []
  configuration: |
    datadir="/var/lib/proxysql"
    admin_variables =
    {
      admin_credentials="proxyadmin:XXXXXXXXXXXXXX"
      mysql_ifaces="0.0.0.0:6032"
      refresh_interval=2000

      cluster_username="proxyadmin"
      cluster_password="XXXXXXXXXXXX"
      checksum_admin_variables=false
      checksum_ldap_variables=false
      checksum_mysql_variables=false
      cluster_check_interval_ms=200
      cluster_check_status_frequency=100
      cluster_mysql_query_rules_save_to_disk=true
      cluster_mysql_servers_save_to_disk=true
      cluster_mysql_users_save_to_disk=true
      cluster_proxysql_servers_save_to_disk=true
      cluster_mysql_query_rules_diffs_before_sync=1
      cluster_mysql_servers_diffs_before_sync=1
      cluster_mysql_users_diffs_before_sync=1
      cluster_proxysql_servers_diffs_before_sync=1
    }

    mysql_variables=
    {
      monitor_password="XXXXXXXXXXXX"
      monitor_galera_healthcheck_interval=1000
      threads=2
      max_connections=2048
      default_query_delay=0
      default_query_timeout=600000
      poll_timeout=2000
      interfaces="0.0.0.0:3306;0.0.0.0:33062"
      default_schema="information_schema"
      stacksize=1048576
      connect_timeout_server=10000
      monitor_history=60000
      monitor_connect_interval=20000
      monitor_ping_interval=10000
      ping_timeout_server=200
      commands_stats=true
      sessions_sort=true
      have_ssl=false
      ssl_p2s_ca=""
      ssl_p2s_cert=""
      ssl_p2s_key=""
      ssl_p2s_cipher="ECDHE-RSA-AES128-GCM-SHA256"
    }
pxc:
  affinity:
    antiAffinityTopologyKey: kubernetes.io/hostname
  annotations: {}
  autoRecovery: true
  certManager: false
  disableTLS: true
  gracePeriod: 600
  image:
    repository: percona/percona-xtradb-cluster
    tag: 8.0.27-18.1
  imagePullSecrets: []
  labels: {}
  livenessDelaySec: 300
  livenessProbes:
    failureThreshold: 3
    initialDelaySeconds: 300
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 5
  nodeSelector: {}
  persistence:
    accessMode: ReadWriteOnce
    enabled: true
    size: 220Gi
  podDisruptionBudget:
    maxUnavailable: 1
  readinessDelaySec: 15
  readinessProbes:
    failureThreshold: 5
    initialDelaySeconds: 15
    periodSeconds: 30
    successThreshold: 1
    timeoutSeconds: 15
  resources:
    limits: {}
    requests:
      cpu: 2
      memory: 30G
  sidecarPVCs: []
  sidecarResources:
    limits: {}
    requests: {}
  sidecarVolumes: []
  sidecars: []
  size: 3
  tolerations: []
  configuration: >
    [mysqld]

     ### Specific mysql.cnf settings
secrets:
  passwords: xxxxxxxx
  tls: {}
tls: {}
updateStrategy: SmartUpdate
upgradeOptions:
  apply: 8.0-recommended
  schedule: 0 4 * * *
  versionServiceEndpoint: https://check.percona.com

I think it’s similar to my post about HG query rules:

Hi, I just faced a similar issue. It seems the content of runtime_mysql_servers frequently overwrites mysql_servers; I don’t know why. However, if you are quick, e.g. using:

mysql> UPDATE mysql_servers SET weight = 1000 WHERE hostgroup_id = 10; LOAD MYSQL SERVERS TO RUNTIME;

it works roughly 90% of the time.
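One way to confirm it really is a resync overwriting the change (a sketch against the ProxySQL admin interface; table names are standard ProxySQL admin tables):

```sql
-- Compare the saved vs. runtime weights for the reader hostgroup.
-- If something keeps resyncing, runtime_mysql_servers will show the
-- original 1000000 again shortly after the LOAD.
SELECT hostgroup_id, hostname, weight FROM mysql_servers         WHERE hostgroup_id = 10;
SELECT hostgroup_id, hostname, weight FROM runtime_mysql_servers WHERE hostgroup_id = 10;
```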

I agree with you guys. I can see several issues here:

  1. It seems weird to me that the Percona ProxySQL Admin Tool has a different default config than the PXC Operator.
  2. I haven’t found any configuration option in the PXC CRD to change WRITERS_ARE_READERS='yes' back to 'backup'. In my opinion 'backup' sounds more reasonable: if we have a single writer node, it can still serve SELECTs that belong to transactions containing other operations. The backup writers are less utilized, so they should be preferred for read-only operations.
  3. The ProxySQL Admin tool has the SQL: "UPDATE mysql_servers SET weight = 1000000 WHERE hostname = '$writer_ws_ip' AND port = $writer_ws_port AND hostgroup_id = $WRITER_HOSTGROUP_ID;". But I haven’t seen any line updating the weight of the writer node in the reader hostgroup. This looks like a bug to me, which should be fixed by Percona.
  4. I found that WRITERS_ARE_READERS was set to "yes" in K8SPXC-553. According to the ticket description, this was a workaround for an edge case where the cluster runs on a single node, but this parameter obviously affects the whole cluster’s performance. I recommend reconsidering a real fix for this issue, or at least giving users the option to change these config parameters.
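If nothing kept resyncing the config, the change described in point 2 could in principle be made directly in ProxySQL’s mysql_galera_hostgroups table (a sketch, assuming the writer hostgroup ID 11 from the status output above; in ProxySQL, writer_is_also_reader = 2 corresponds to the "backup writers act as readers" behaviour):

```sql
-- writer_is_also_reader: 0 = no, 1 = yes, 2 = only backup writers are readers
UPDATE mysql_galera_hostgroups SET writer_is_also_reader = 2
 WHERE writer_hostgroup = 11;

LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
```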

Yes, the PXC operator frequently reverts manual changes, so this SQL is only useful to confirm the configuration issue.

I’m wondering if anyone is using PXC & ProxySQL in production, because this thread is more than 2 years old now.