“Galera Cluster: Same server_id on all nodes and different gtid_domain_id — is this safe?”

I have a MariaDB (Galera) cluster that has been running in production for about 5 years without any noticeable issues. The cluster consists of 5 nodes.

However, I recently noticed an unusual configuration:

  • All nodes in the cluster have the same server_id = 1

  • Each node has a different gtid_domain_id (for example: node1 = 11, node2 = 22, etc.)

  • There are external replicas replicating from this cluster

My questions are:

  1. Is it correct or safe to have the same server_id on all nodes in a Galera cluster?

  2. Is it a valid approach to assign different gtid_domain_id values per node in this setup?

  3. What potential problems or risks could arise from this configuration, especially regarding replication and failover?

  4. Could this setup lead to conflicts, data inconsistency, or issues with GTID-based replication in the future?

I would appreciate any clarification or best practices regarding this kind of configuration.

@Sergey_DSV

  1. Is it correct or safe to have the same server_id on all nodes in a Galera cluster?

It’s totally fine. PXC/Galera uses its own certification-based replication to sync changes between nodes rather than the binary logs.

Are you using it for performing any non-Galera-based writes or any kind of multi-source replication on any of the Galera nodes?

Since you are using a MariaDB cluster, let me add a link to the MariaDB manual, which clearly explains the use of server_id in Galera-based setups. I am sure this will clear up your doubt.
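As a quick sanity check (an illustrative query, not from the thread), you can compare these identifiers on each node; in this setup you would expect server_id and wsrep_gtid_domain_id to match across nodes, while gtid_domain_id differs:

```sql
-- Run on each Galera node and compare the results.
SELECT @@server_id,             -- same on all nodes here (1)
       @@gtid_domain_id,        -- unique per node (11, 22, ...)
       @@wsrep_gtid_domain_id,  -- should be identical cluster-wide
       @@wsrep_node_name;
```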

  2. Is it a valid approach to assign different gtid_domain_id values per node in this setup?

Yes, it’s fine to have a different gtid_domain_id. This prevents the node from using the same domain as Galera-based write sets when assigning GTIDs to non-Galera transactions.

On the other hand, wsrep_gtid_domain_id should be the same across all nodes within a cluster so that each node uses the same domain when assigning GTIDs for Galera Cluster-based write sets.

Reference - Using MariaDB GTIDs with MariaDB Galera Cluster | Galera Cluster | MariaDB Documentation
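To make the split concrete, here is a minimal sketch of the per-node GTID settings described above (values are illustrative, mirroring your naming scheme):

```ini
# node1 — gtid_domain_id is unique per node and covers local,
# non-Galera transactions; wsrep_gtid_domain_id is shared by the
# whole cluster and covers Galera write sets.
[mariadb]
gtid_domain_id       = 11
wsrep_gtid_domain_id = 1

# node2 would use gtid_domain_id = 22 with the same
# wsrep_gtid_domain_id = 1, and so on for the other nodes.
```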

  3. What potential problems or risks could arise from this configuration, especially regarding replication and failover?

To ensure the async replica applies all change streams, it is recommended to enable [log_slave_updates/log_replica_updates] on the Galera cluster so the binary logs contain all GTID sets. Since gtid_domain_id is already set to a different value on each node, the transaction origin can still be identified and differentiated.

How exactly are your async replicas connected? Are they always connected to a dedicated Galera node, or is there some failover mechanism to switch between Galera nodes?
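If the replicas use GTID-based replication, repointing one to another Galera node after a failure is usually straightforward, since every node (with log_slave_updates enabled) carries the full GTID history. A hedged sketch, with placeholder host and credentials:

```sql
-- Illustrative failover of an async replica to a surviving Galera node.
-- Host, user, and password are placeholders, not from this thread.
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST     = '10.0.0.2',
  MASTER_USER     = 'repl',
  MASTER_PASSWORD = '...',
  MASTER_USE_GTID = slave_pos;  -- resume from the replica's own GTID position
START SLAVE;
```

With file/position-based replication instead, the binlog file names and offsets differ between Galera nodes, so such a switch would require manually locating the matching position on the new source.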

  4. Could this setup lead to conflicts, data inconsistency, or issues with GTID-based replication in the future?

As mentioned, there are some edge cases/requirements (see Using MariaDB Replication with MariaDB Galera Cluster | Galera Cluster | MariaDB Documentation) where having different server_id values makes sense. You can verify whether any of them apply to your setup.

I don’t see any other issue.

Further, can you please clarify how exactly your external replicas are connected to the Galera cluster? Is any multi-source replication topology also involved? Are you using GTID or file/position-based replication?

Please share your Galera/async configuration files as well.

There are replications only from one of the cluster nodes. No data is replicated to the cluster.

I’m attaching the configurations of two cluster nodes. The rest are the same.


[mysqld]
# Don't resolve hostnames. All hostnames are IP's or 'localhost'
skip-name-resolve

# Binary logs will be purged after expire_logs_days days
expire_logs_days = 7

# Binary log will be rotated automatically when the size exceeds this value
max_binlog_size = 100M

# The number of simultaneous clients allowed
max_connections = 3000
max_connect_errors = 1000000
back_log = 512
thread_cache_size = 100

innodb_io_capacity = 2000
innodb_io_capacity_max = 4000


query_cache_type = 0
query_cache_size = 0

thread_stack = 512K
thread_handling = pool-of-threads
thread_pool_size = 16
thread_pool_max_threads = 2000

max_allowed_packet = 512M

[mariadb]
default_time_zone = 'UTC'
sql_mode=ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
key_buffer_size=10M
event_scheduler=ON

innodb_buffer_pool_size = 60G
innodb_buffer_pool_instances = 16
innodb_log_file_size = 2G
innodb_log_files_in_group = 2
innodb_log_buffer_size = 64M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1
innodb_open_files = 4096
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_thread_concurrency = 16
innodb_autoinc_lock_mode = 2

sort_buffer_size=2M
read_buffer_size=256K
read_rnd_buffer_size=256K
join_buffer_size=256K
max_heap_table_size = 64M
tmp_table_size = 64M

# Slow query
log_output=FILE
slow_query_log
slow_query_log_file=slow-queries.log
long_query_time=1

# Error log
log_error=/var/log/mysql_error.log

# Tells the slave to log the updates from the slave thread to the binary log
log_slave_updates=ON
log_bin=/var/log/mariadb/log.bin

# What form of binary logging the master will use
binlog_format=ROW

default-storage-engine=innodb
innodb_autoinc_lock_mode=2

gtid_domain_id=11
server_id=1

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_gtid_mode=ON
wsrep_gtid_domain_id=1

# Galera Cluster Configuration
wsrep_cluster_name="ringostat_cluster"
wsrep_cluster_address="gcomm://10.0.0.1,10.0.0.2,10.0.0.3,10.0.0.4,10.0.0.5"
# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="10.0.0.1"
wsrep_node_name="dbn1"

[mysqld]
# Don't resolve hostnames. All hostnames are IP's or 'localhost'
skip-name-resolve

# Binary logs will be purged after expire_logs_days days
expire_logs_days = 7

# Binary log will be rotated automatically when the size exceeds this value
max_binlog_size = 100M

# The number of simultaneous clients allowed
max_connections = 3000
max_connect_errors = 1000000
back_log = 512
thread_cache_size = 100

innodb_io_capacity = 2000
innodb_io_capacity_max = 4000

query_cache_type = 0
query_cache_size = 0

thread_stack = 512K
thread_handling = pool-of-threads
thread_pool_size = 16
thread_pool_max_threads = 2000

max_allowed_packet = 512M

[mariadb]
default_time_zone = 'UTC'
sql_mode=ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
key_buffer_size=10M # 2G
event_scheduler=ON

innodb_buffer_pool_size = 60G
innodb_buffer_pool_instances = 16
innodb_log_file_size = 2G
innodb_log_files_in_group = 2
innodb_log_buffer_size = 64M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1
innodb_open_files = 4096
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_thread_concurrency = 16
innodb_autoinc_lock_mode = 2

sort_buffer_size=2M
read_buffer_size=256K
read_rnd_buffer_size=256K
join_buffer_size=256K
#max_heap_table_size=16M
max_heap_table_size = 64M
tmp_table_size = 64M

# Slow query
log_output=FILE
slow_query_log
slow_query_log_file=slow-queries.log
long_query_time=1

# Error log
log_error=/var/log/mysql_error.log

# Tells the slave to log the updates from the slave thread to the binary log
log_slave_updates=ON
log_bin=/var/log/mariadb/log.bin

# What form of binary logging the master will use
binlog_format=ROW

default-storage-engine=innodb
innodb_autoinc_lock_mode=2

gtid_domain_id=22
server_id=1

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_gtid_mode=ON
wsrep_gtid_domain_id=1

# Galera Cluster Configuration
wsrep_cluster_name="ringostat_cluster"
wsrep_cluster_address="gcomm://10.0.0.1,10.0.0.2,10.0.0.3,10.0.0.4,10.0.0.5"
# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="10.0.0.2"
wsrep_node_name="dbn2"

@Sergey_DSV

There are replications only from one of the cluster nodes. No data is replicated to the cluster.

This looks fine.

The other parameters also look rightly placed. Beyond that, I would suggest going through the official docs shared earlier, in case you have any specific use case.

log_slave_updates=ON

wsrep_gtid_domain_id=1

Just a side note: if you plan any configuration changes, always test in a lower/testing environment to better assess behaviour in advance.

I currently have version 10.4.32 and I’m preparing for the upgrade in stages. What issues might I encounter during the upgrade? Did I choose the right upgrade stages?

10.4 → 10.5
10.5 → 10.6
10.6 → 10.11
10.11 → 11.4
11.4 → 11.8