XtraDB cross site replication not working

Hi,

I’ve recently started exploring XtraDB to replace a MySQL RDS instance. I’ve deployed the Percona Operator to my Kubernetes cluster and deployed a PXC cluster. The next requirement is to configure cross-site replication to improve availability. I have a single Kubernetes cluster that spans several DCs/locations, and I’m using taints and tolerations to control where my workloads run.

Using the operator, I’ve configured replication on my primary PXC cluster:

pxc:
  size: 3
  image:
    repository: percona/percona-xtradb-cluster
    tag: 8.0.32-24.2
  autoRecovery: true
  expose:
    enabled: true
    type: ClusterIP
  replicationChannels:
  - name: cluster1_to_cluster2
    isSource: true

I then configured my secondary/standby pxc cluster:

pxc:
  size: 3
  image:
    repository: percona/percona-xtradb-cluster
    tag: 8.0.32-24.2
  autoRecovery: true
  replicationChannels:
  - name: cluster1_to_cluster2
    isSource: false
    configuration:
      sourceRetryCount: 3
      sourceConnectRetry: 60
      ssl: true
      sslSkipVerify: true
      ca: '/etc/mysql/ssl/ca.crt'
    sourcesList:
    - host: cluster1-pxc-0
      port: 3306
      weight: 100
    - host: cluster1-pxc-1
      port: 3306
      weight: 90
    - host: cluster1-pxc-2
      port: 3306
      weight: 80

However, I can’t get replication to work and always see “certificate verify failed” messages, even though “sslSkipVerify” is set to true.

The certificates on each cluster are generated by cert-manager, and both clusters have their own CA. So it makes sense that cluster2 cannot use its CA to verify the certificate from cluster1.

{"log":"2023-11-13T11:56:30.759345Z 32 [ERROR] [MY-010584] [Repl] Slave I/O for channel 'cluster1_to_cluster2': error connecting to master 'replication@cluster1-pxc-1:3306' - retry-time: 60 retries: 1 message: SSL connection error: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed, Error_code: MY-002026\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2023-11-13T11:57:30.819491Z 32 [ERROR] [MY-010584] [Repl] Slave I/O for channel 'cluster1_to_cluster2': error connecting to master 'replication@cluster1-pxc-1:3306' - retry-time: 60 retries: 2 message: SSL connection error: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed, Error_code: MY-002026\n","file":"/var/lib/mysql/mysqld-error.log"}
{"log":"2023-11-13T11:58:30.882113Z 32 [ERROR] [MY-010584] [Repl] Slave I/O for channel 'cluster1_to_cluster2': error connecting to master 'replication@cluster1-pxc-1:3306' - retry-time: 60 retries: 3 message: SSL connection error: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed, Error_code: MY-002026\n","file":"/var/lib/mysql/mysqld-error.log"}
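A quick way to confirm that the two clusters really do have different CAs is to compare the fingerprints of their ca.crt files. The following is only a minimal sketch, assuming ca.crt has already been exported from each cluster's TLS secret as PEM text; the function names are made up for illustration and are not part of the operator.

```python
import hashlib
import ssl

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def ca_fingerprint(pem_text: str) -> str:
    """Return the SHA-256 fingerprint of the first certificate in a PEM string."""
    start = pem_text.index(PEM_HEADER)
    end = pem_text.index(PEM_FOOTER, start) + len(PEM_FOOTER)
    # Convert the PEM block to DER bytes so formatting differences do not matter.
    der = ssl.PEM_cert_to_DER_cert(pem_text[start:end])
    return hashlib.sha256(der).hexdigest()


def same_ca(pem_a: str, pem_b: str) -> bool:
    """True when both PEM strings start with the same CA certificate."""
    return ca_fingerprint(pem_a) == ca_fingerprint(pem_b)
```

Reading the two exported files and passing their contents to same_ca() should return False in the situation described above, where each cluster has its own CA.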

My questions are:

  1. Why is “sslSkipVerify: true” not working?
  2. Can I get both clusters to use the same CA?
  3. Any additional guidance on how to configure cross site replication?

OK, I’ve managed to answer one of my questions. In the values.yaml file, right at the bottom, there is a secret section with tls. You can specify the secret that holds the first/primary cluster’s CA here; the chart will then not create a new CA for the secondary cluster and will use the primary cluster’s CA instead.

....
  tls:
    # This should be the name of a secret that contains certificates.
    # it should have the following keys: `ca.crt`, `tls.crt`, `tls.key`
    # If not set the Helm chart will attempt to create certificates
    # for you [not recommended for prod]:
    cluster: cluster1-ca-cert

    # This should be the name of a secret that contains certificates.
    # it should have the following keys: `ca.crt`, `tls.crt`, `tls.key`
    # If not set the Helm chart will attempt to create certificates
    # for you [not recommended for prod]:
    internal: cluster1-ssl-internal

With the above in place, I’ve managed to move past the certificate verify failed error, but I’m now getting an auth error:

{"log":"2023-11-13T13:03:34.435554Z 27 [ERROR] [MY-010584] [Repl] Slave I/O for channel 'cluster1_to_cluster2': error connecting to master 'replication@cluster1:3306' - retry-time: 60 retries: 1 message: Access denied for user 'replication'@'cluster2' (using password: YES), Error_code: MY-001045\n","file":"/var/lib/mysql/mysqld-error.log"}
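Since the TLS failure and the auth failure look very similar in the log, it can help to pull the error code out of each line. This is a rough sketch that parses the JSON log lines in the format shown in this thread; the regular expression and the "tls"/"auth" labels are my own, and only the two error codes seen here (MY-002026 and MY-001045) are mapped.

```python
import json
import re

# Pattern for the replication I/O error lines shown in this thread.
_REPL_ERROR = re.compile(
    r"channel '(?P<channel>[^']+)': error connecting to master "
    r"'(?P<source>[^']+)' - retry-time: (?P<retry_time>\d+) "
    r"retries: (?P<retries>\d+) message: (?P<message>.*), "
    r"Error_code: (?P<code>MY-\d+)"
)

# Only the two codes seen in this thread; anything else is reported as "other".
_KINDS = {"MY-002026": "tls", "MY-001045": "auth"}


def parse_repl_error(line: str):
    """Parse one JSON log line; return a dict, or None if it does not match."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(entry, dict):
        return None
    match = _REPL_ERROR.search(entry.get("log", ""))
    if match is None:
        return None
    info = match.groupdict()
    info["retries"] = int(info["retries"])
    info["retry_time"] = int(info["retry_time"])
    info["kind"] = _KINDS.get(info["code"], "other")
    return info
```

Feeding each line of mysqld-error.log through parse_repl_error() gives the channel, source, retry count, and whether the failure is a TLS or an auth problem.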

The auth error occurs because the secondary cluster is trying to use its own “replication” user password to log in to the primary cluster. To get around this, set the passwords for all users manually in the values.yaml file, so that both clusters use the same ones.

secrets:
  ## You should be overriding these with your own or specify name for clusterSecretName.
  passwords:
    root: xxxxxxxxxxxxxxx
    xtrabackup: xxxxxxxxxxxxxxx
    monitor: xxxxxxxxxxxxxxx
    clustercheck: xxxxxxxxxxxxxxx
    replication: xxxxxxxxxxxxxxx
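To double-check that both clusters ended up with the same passwords, the data fields of the two password Secrets can be compared (for example, taken from the JSON output of kubectl get secret). This is only a sketch under the assumption that each Secret holds one key per user, as in the passwords block above; the function names are hypothetical. Kubernetes stores Secret values base64-encoded, so they are decoded first.

```python
import base64


def decode_secret_data(data: dict) -> dict:
    """Decode the base64-encoded values of a Kubernetes Secret's data field."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in data.items()
    }


def mismatched_users(primary_data: dict, secondary_data: dict) -> list:
    """Return sorted user names whose passwords differ or exist on one side only."""
    primary = decode_secret_data(primary_data)
    secondary = decode_secret_data(secondary_data)
    users = set(primary) | set(secondary)
    return sorted(u for u in users if primary.get(u) != secondary.get(u))
```

If "replication" shows up in the result of mismatched_users(), the secondary cluster will hit the same Access denied error as above.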

I guess these are all noob errors, but the documentation was not very clear on this. Anyway, all is working now.
