Node unable to join 8.0 cluster - docker

I just managed to replicate this on a different, freshly created set of servers running inside a network without any firewall restrictions.

The original servers run Ubuntu 20.04.3 LTS with Docker version 24.0.4, build 3713ee1.
The new servers run Ubuntu 22.04.2 LTS with the same Docker version (Docker version 24.0.4, build 3713ee1).

Steps to reproduce the issue:

  • Perform an xtrabackup of the original database
  • Copy the xtrabackup files to the new server
  • Make sure the /etc/hosts file is filled in so that elk1, elk2 and elk3 point to the new servers' IP addresses
  • Create docker-compose.yml with the same structure as on the original servers (the contents of the file are in the original post)
  • Start the server with version 5.7
  • Join the other two nodes - successful.

Things I did on the original nodes:

  • Stop all nodes
  • Change image tag to use 8.0
  • Change the parameter inside command from --tls-version=invalid to --tls-version=""
  • Start the first node - upgrade successful
  • Start the second node - the issue occurs as described in the original post. The same issue occurs with the third node.

Things I did on the new nodes:

  1. Stop node 1
  2. Change image tag to use 8.0
  3. Change the parameter inside command from --tls-version=invalid to --tls-version=""
  4. Start node 1, allow it to join 5.7 cluster
  5. Upgrade successful, repeat steps 1-4 on node 2
  6. Upgrade successful, repeat steps 1-3 on node 3 and allow it to join the 8.0 cluster
  7. The issue occurs as described in the original post

Now, when I stop the second node and let it rejoin the 8.0 cluster, the same issue happens with it, just as it did with node 3.

So there appears to be an issue with joining an 8.0 cluster. There might be something I'm missing, but I cannot figure out what.

Following the 'Things to be cautious about' section of this post, I set pxc_encrypt_cluster_traffic to OFF and pxc_strict_mode to PERMISSIVE in both cases.
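
For reference, if these two settings are kept in a configuration file rather than passed on the container command line (an assumption; the original post may set them differently), the fragment would look roughly like this:

```ini
[mysqld]
# Assumption: set in a .cnf file instead of on the container command line.
# Disable encryption of cluster traffic between nodes.
pxc_encrypt_cluster_traffic=OFF
# Log warnings instead of rejecting operations that strict mode would block.
pxc_strict_mode=PERMISSIVE
```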

This appears to be due to the '--wsrep_node_address' parameter.

Note: if I do not set this parameter, the node binds to the Docker container address and the other containers cannot reach each other. And still, this works in 5.7, so why not in 8.0?

EDIT:
It appears that I may have found a workaround, though I am still unsure why it is needed in 8.0 but not in 5.7.

  • In docker-compose.yml, add hostname: <hostname_of_docker_host>
  • Create a file in the folder that is mounted at /etc/percona-xtradb-cluster.conf.d in the container; its name should (perhaps) end in .cnf
  • Add the following to it:
[mysqld]
wsrep_provider_options='ist.recv_addr=<hostname_of_docker_host>:4568;'

If you already have a similar file, just add the wsrep_provider_options… line under the existing [mysqld] section.
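
As a filled-in sketch of that file: assuming the Docker host is the one reachable as elk1 (my guess based on the host names in /etc/hosts above) and a hypothetical file name of ist.cnf, it might look like this:

```ini
# Hypothetical file name: ist.cnf, placed in the folder that is mounted at
# /etc/percona-xtradb-cluster.conf.d in the container.
[mysqld]
# Assumption: elk1 is the host name of this Docker host; the other two hosts
# would use elk2 and elk3 in their own copies of the file.
# 4568 is the port from the workaround above.
wsrep_provider_options='ist.recv_addr=elk1:4568;'
```

Provider options are passed as a single semicolon-separated string, so if the file already sets wsrep_provider_options, the ist.recv_addr entry most likely has to be appended to that existing string rather than added as a second line.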