Hi,
We are one step further. I think we assumed a commit either failed or succeeded, and that’s it. Further testing revealed that sometimes, halfway through the transaction, we got error 1213 (“WSREP detected deadlock/conflict and aborted the transaction. Try restarting the transaction”).
Now, if a transaction is aborted halfway and you don’t stop immediately, the statements that follow end up in a new transaction. That explains why some records got inserted and some didn’t: when you eventually reach your COMMIT, the new transaction commits fine. But in my mind this defeats the purpose of transactions; I would like to get a failed COMMIT at the end instead of an abort halfway through the transaction. Supporting such a halfway abort would mean changing code in hundreds if not thousands of places, and it would basically mean we’re replicating transaction logic in the application. If WSREP did not close the transaction but just let everything pass and then failed the COMMIT at the end, all would be fine.
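To make concrete what handling such a halfway abort in the application would involve, here is a minimal sketch in Python. It is not our actual code. The names `run_transaction`, `DeadlockError` and `work` are hypothetical, and `conn` is assumed to be any connection object that offers `begin()`, `commit()` and `rollback()` and whose driver exceptions expose the MySQL error code as an `errno` attribute. The idea is that on error 1213 the whole unit of work is rolled back and restarted from the beginning, so no later statement can slip into a fresh transaction.

```python
import time

DEADLOCK_ERRNO = 1213  # the error code quoted in the post above


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver exception that carries a MySQL errno."""

    def __init__(self, errno, msg=""):
        super().__init__(msg)
        self.errno = errno


def run_transaction(conn, work, max_attempts=3, backoff_seconds=0.0):
    """Run work(conn) as one transaction and restart it on error 1213.

    Any exception stops the unit of work immediately and triggers a
    rollback, so the remaining statements never run outside the
    transaction they were meant for.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            conn.begin()
            result = work(conn)
            conn.commit()
            return result
        except Exception as exc:
            # Always roll back before deciding whether to retry.
            conn.rollback()
            is_deadlock = getattr(exc, "errno", None) == DEADLOCK_ERRNO
            if not is_deadlock or attempt == max_attempts:
                raise
            # Simple linear backoff before restarting the whole transaction.
            time.sleep(backoff_seconds * attempt)
```

A caller would wrap each unit of work in a function and pass it in, for example `run_transaction(conn, insert_order)`. Doing this for every transaction in the code base is exactly the kind of change described above.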
But maybe… we just messed up some setting which causes this behaviour, and maybe it can be mitigated. Below is our my.cnf. Is there a way to avoid this?
Regards,
Hidde
===========================================================
[mysql]
# CLIENT #
port = 3306
socket = /var/run/mysqld/mysqld.sock
[mysqld]
sql_mode = only_full_group_by,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
show_compatibility_56 = on
ssl-ca = [redacted]
ssl-cert = [redacted]
ssl-key = [redacted]
# GENERAL #
user = mysql
default_storage_engine = innodb
socket = /var/run/mysqld/mysqld.sock
pid_file = /var/run/mysqld/mysqld.pid
tmpdir = [redacted]
# MyISAM #
key_buffer_size = 32M
# SAFETY #
max_allowed_packet = 128M
max_connect_errors = 1000000
sysdate_is_now = 1
innodb = FORCE
innodb_strict_mode = 1
log_bin_trust_function_creators = 1
# DATA STORAGE #
datadir = [redacted]
# BINARY LOGGING #
log_bin = [redacted]
expire_logs_days = 1
sync_binlog = 1
# CACHES AND LIMITS #
tmp_table_size = 32M
max_heap_table_size = 32M
query_cache_type = 0
query_cache_size = 0
max_connections = 200
thread_cache_size = 50
open_files_limit = 65535
table_definition_cache = 4096
table_open_cache = 10240
# INNODB #
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 512M
innodb_flush_log_at_trx_commit = 1
innodb_file_per_table = 1
innodb_buffer_pool_size = 50G
innodb_stats_sample_pages = 100
innodb_stats_persistent_sample_pages=100
innodb_stats_transient_sample_pages=100
# LOGGING #
log_error = [redacted]mysql-error.log
log_queries_not_using_indexes = 0
slow_query_log = 0
slow_query_log_file = [redacted]mysql-slow.log
server_id=1
wsrep_cluster_address="gcomm://node2,node3"
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_provider_options = "gmcast.listen_addr=tcp://node1; gmcast.segment=1; evs.keepalive_period=PT1S; evs.inactive_check_period=PT0.5S; evs.suspect_timeout=PT5S; evs.inactive_timeout=PT15S; evs.install_timeout=PT15S; socket.ssl_cert=/[redacted]/percona-cert.pem; socket.ssl_key=/[redacted]/percona-key.pem; socket.ssl_cipher=AES128-SHA; socket.ssl_compression=no; evs.send_window=512; evs.user_send_window=512; gmcast.time_wait=PT1M; gcache.size=256M"
wsrep_slave_threads=2
wsrep_cluster_name=[redacted]
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=[redacted]:[redacted]
wsrep_node_name=node1
wsrep_node_incoming_address="node1:4567"
wsrep_sst_receive_address="node1:4444"
wsrep_node_address="node1"
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
binlog_format=ROW
wsrep_notify_cmd=[redacted]
wsrep_retry_autocommit=20
wsrep_auto_increment_control=OFF
auto_increment_increment=3
auto_increment_offset=1
[mysqldump]
quick
quote-names
max_allowed_packet = 1024M
[sst]
inno-backup-opts="--skip-ssl"
tca=/[redacted]/clusternodessl.crt
tcert=/[redacted]/clusternodessl.pem
encrypt=2