Not the answer you need?
Register and ask your own question!

2 of 3 nodes crash

sergeymsergeym EntrantCurrent User Role Beginner
Hi,

we have cluster in LAN, 3 nodes.We run group of test(create DB, create tables, run coherently a lot of transactions, Drop DB).All write/read requests are going to first node. Sometimes, other two nodes simultaneously crash with such errors on both :
2015-06-23 21:36:30 30464 [ERROR] Slave SQL: Could not execute Write_rows event on table test-rain-17165.machine_services; Cannot add or update a child row: a foreign key constraint fails (`test-rain-17165`.`machine_services`, CONSTRAINT `machine_registration_fk` FOREIGN KEY (`machine_id`) REFERENCES `machines` (`id
`) ON DELETE NO ACTION ON UPDATE NO ACTION), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 209, Error_code: 1452
2015-06-23 21:36:30 30464 [Warning] WSREP: RBR event 3 Write_rows apply warning: 151, 2948798
2015-06-23 21:36:30 30464 [ERROR] Slave SQL: Could not execute Write_rows event on table test-rain-17165.machine_services; Cannot add or update a child row: a foreign key constraint fails (`test-rain-17165`.`machine_services`, CONSTRAINT `machine_registration_fk` FOREIGN KEY (`machine_id`) REFERENCES `machines` (`id
`) ON DELETE NO ACTION ON UPDATE NO ACTION), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 209, Error_code: 1452
2015-06-23 21:36:30 30464 [Warning] WSREP: RBR event 3 Write_rows apply warning: 151, 2948796
2015-06-23 21:36:30 30464 [Warning] WSREP: Failed to apply app buffer: seqno: 2948798, status: 1
         at galera/src/trx_handle.cpp:apply():351
Retrying 3th time
2015-06-23 21:36:30 30464 [Warning] WSREP: Failed to apply app buffer: seqno: 2948796, status: 1
         at galera/src/trx_handle.cpp:apply():351
Retrying 3th time
2015-06-23 21:36:30 30464 [ERROR] Slave SQL: Could not execute Write_rows event on table test-rain-17165.machine_services; Cannot add or update a child row: a foreign key constraint fails (`test-rain-17165`.`machine_services`, CONSTRAINT `machine_registration_fk` FOREIGN KEY (`machine_id`) REFERENCES `machines` (`id
`) ON DELETE NO ACTION ON UPDATE NO ACTION), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 209, Error_code: 1452
2015-06-23 21:36:30 30464 [Warning] WSREP: RBR event 3 Write_rows apply warning: 151, 2948798
2015-06-23 21:36:30 30464 [ERROR] Slave SQL: Could not execute Write_rows event on table test-rain-17165.machine_services; Cannot add or update a child row: a foreign key constraint fails (`test-rain-17165`.`machine_services`, CONSTRAINT `machine_registration_fk` FOREIGN KEY (`machine_id`) REFERENCES `machines` (`id
`) ON DELETE NO ACTION ON UPDATE NO ACTION), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 209, Error_code: 1452
2015-06-23 21:36:30 30464 [Warning] WSREP: RBR event 3 Write_rows apply warning: 151, 2948796
2015-06-23 21:36:30 30464 [Warning] WSREP: Failed to apply app buffer: seqno: 2948798, status: 1
         at galera/src/trx_handle.cpp:apply():351
Retrying 4th time
2015-06-23 21:36:30 30464 [ERROR] Slave SQL: Could not execute Write_rows event on table test-rain-17165.machine_services; Cannot add or update a child row: a foreign key constraint fails (`test-rain-17165`.`machine_services`, CONSTRAINT `machine_registration_fk` FOREIGN KEY (`machine_id`) REFERENCES `machines` (`id
`) ON DELETE NO ACTION ON UPDATE NO ACTION), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 209, Error_code: 1452
2015-06-23 21:36:30 30464 [Warning] WSREP: RBR event 3 Write_rows apply warning: 151, 2948798
But it doesn't reproduce 100%, we can run similar test two or three time(each time new DB) and only after fourth time 2 or 1 nodes crashed.
Also,there is no errors on first node(where we run test) and all test finished succesfully.
I tried to look into audit logs of global logs, but there is nothing. What else I can do? and what this errors means?

Comments

  • sergeymsergeym Entrant Current User Role Beginner
    OS:centos 6.6
    Percona 5.6 latest.
  • ShahriyarRShahriyarR Contributor Inactive User Role Beginner
    Seems to be the same with BUG report:
    https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1264992

    Can you provide all my.cnf files from 3 nodes? as well as detailed information about "secondary" or "donor" nodes?
    Was there any crash recovery with your nodes?
  • sergeymsergeym Entrant Current User Role Beginner
    All three nodes have similar HW and SW configurations. After crash I just restart failed node with full SST.
    Configurations below:

    Node 1
    [mysqld_safe]
    
    [mysqld]
    datadir = /percona_data
    user = mysql
    
    collation-server=utf8_unicode_ci
    binlog_format=ROW
    
    wsrep_provider=/usr/lib64/libgalera_smm.so
    wsrep_cluster_address=gcomm://10.251.0.55,10.251.0.56,10.251.0.57
    
    # Node #1 address
    wsrep_node_address = 10.251.0.55
    #Cluser name
    wsrep_node_name=ru-db-01
    wsrep_slave_threads=32
    wsrep_cluster_name=001
    
    wsrep_retry_autocommit = 3
    
    # SST method
    #wsrep_sst_method=xtrabackup
    #wsrep_sst_auth=replication_user:jGFcv97JnkMJRw
    wsrep_sst_method=rsync
    
    # MyISAM storage engine has only experimental support
    
    default_storage_engine=InnoDB
    
    # This is a recommended tuning variable for performance
    
    innodb_locks_unsafe_for_binlog=1
    
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode=2
    
    character-set-server=utf8
    
    log-error=/var/log/mysql-errors.log
    
    # innodb engine settings
    innodb-file-per-table = 1
    innodb_buffer_pool_size = 256M
    innodb_additional_mem_pool_size = 128M
    innodb_flush_log_at_trx_commit = 0
    innodb_flush_method = O_DIRECT
    transaction-isolation = READ-COMMITTED
    innodb-data-file-path = ibdata1:10M:autoextend
    innodb-log-file-size = 128M
    innodb_lock_wait_timeout = 3600
    
    # performance settings
    skip-name-resolve
    
    query_cache_size                = 32M
    query_cache_type                = 1
    query_cache_min_res_unit        = 1024
    
    join_buffer_size        = 1M
    key_buffer              = 160M
    table_open_cache             = 4000
    table_definition_cache  = 4000
    sort_buffer             = 1M
    thread_cache            = 128
    thread_stack            = 262144
    tmp_table_size          = 128M
    max_heap_table_size     = 128M
    wait-timeout            = 120
    read_buffer_size        = 32M
    max_connections         = 3500
    
    table_open_cache = 3000
    read-buffer-size = 524288
    
    # network
    connect_timeout = 600000
    wait_timeout = 28800
    max_allowed_packet = 512M
    max_connect_errors = 1000
    net_read_timeout=600000
    connect_timeout=600000
    net_write_timeout=600000
    
    # safety settings
    skip-show-database = 1
    skip-symbolic-links = 1
    local-infile = 0
    
    event_scheduler=on
    #optimizer_switch='derived_merge=off,derived_with_keys=off'
    #wsrep_provider_options="gcache.size=5G; repl.commit_order=1;" #gmcast.segment=2"
    wsrep_provider_options="gcache.size=5G;"
    
    long_query_time = 4
    slow_query_log=1
    slow_query_log_file=/var/log/mysql-slow-query.log
    #log-slow-queries = /var/log/mysqld-slow-query.log
    

    Node 2
    [mysqld_safe]
    
    [mysqld]
    datadir = /percona_data
    user = mysql
    
    collation-server=utf8_unicode_ci
    binlog_format=ROW
    
    wsrep_provider=/usr/lib64/libgalera_smm.so
    wsrep_cluster_address=gcomm://10.251.0.55,10.251.0.56,10.251.0.57
    
    # Node #1 address
    wsrep_node_address = 10.251.0.56
    #Cluser name
    wsrep_node_name=ru-db-02
    wsrep_slave_threads=32
    wsrep_cluster_name=001
    
    wsrep_retry_autocommit = 3
    
    # SST method
    #wsrep_sst_method=xtrabackup
    wsrep_sst_method=rsync
    wsrep_sst_auth=replication_user:jGFcv97JnkMJRw
    
    #wsrep_sst_method=rsync
    
    # MyISAM storage engine has only experimental support
    
    default_storage_engine=InnoDB
    
    # This is a recommended tuning variable for performance
    
    innodb_locks_unsafe_for_binlog=1
    
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode=2
    
    character-set-server=utf8
    
    log-error=/var/log/mysql-errors.log
    
    # innodb engine settings
    innodb-file-per-table = 1
    innodb_buffer_pool_size = 256M
    innodb_additional_mem_pool_size = 128M
    innodb_flush_log_at_trx_commit = 0
    innodb_flush_method = O_DIRECT
    transaction-isolation = READ-COMMITTED
    innodb-data-file-path = ibdata1:10M:autoextend
    innodb-log-file-size = 128M
    innodb_lock_wait_timeout=3600
    
    # performance settings
    skip-name-resolve
    
    query_cache_size                = 32M
    query_cache_type                = 1
    query_cache_min_res_unit        = 1024
    
    join_buffer_size        = 1M
    key_buffer              = 160M
    table_open_cache             = 4000
    table_definition_cache  = 4000
    sort_buffer             = 1M
    thread_cache            = 128
    thread_stack            = 262144
    tmp_table_size          = 128M
    max_heap_table_size     = 128M
    wait-timeout            = 120
    read_buffer_size        = 32M
    max_connections         = 2000
    
    table_open_cache = 3000
    read-buffer-size = 524288
    
    # network
    connect_timeout = 600000
    wait_timeout = 28800
    max_allowed_packet = 512M
    max_connect_errors = 1000
    net_read_timeout=600000
    connect_timeout=600000
    net_write_timeout=600000
    
    # safety settings
    skip-show-database = 1
    skip-symbolic-links = 1
    local-infile = 0
    
    event_scheduler=on
    #optimizer_switch='derived_merge=off,derived_with_keys=off'
    #wsrep_provider_options="gcache.size=5G; repl.commit_order=1;" #gmcast.segment=2"
    wsrep_provider_options="gcache.size=5G;"
    
    long_query_time = 0
    slow_query_log=1
    slow_query_log_file=/var/log/mysql-slow-query.log
    #log-slow-queries = /var/log/mysqld-slow-query.log
    
    Node 3
    [mysqld_safe]
    
    [mysqld]
    datadir = /percona_data
    user = mysql
    
    collation-server=utf8_unicode_ci
    binlog_format=ROW
    log-bin = mysql-bin
    
    wsrep_provider=/usr/lib64/libgalera_smm.so
    wsrep_cluster_address=gcomm://10.251.0.55,10.251.0.56,10.251.0.57
    
    # Node #1 address
    wsrep_node_address = 10.251.0.57
    #Cluser name
    wsrep_node_name=ru-db-03
    wsrep_slave_threads=32
    wsrep_cluster_name=001
    
    wsrep_retry_autocommit = 3
    
    # SST method
    #wsrep_sst_method=xtrabackup
    wsrep_sst_method=rsync
    wsrep_sst_auth=replication_user:jGFcv97JnkMJRw
    
    #wsrep_sst_method=rsync
    
    # MyISAM storage engine has only experimental support
    
    default_storage_engine=InnoDB
    
    # This is a recommended tuning variable for performance
    
    innodb_locks_unsafe_for_binlog=1
    
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode=2
    
    character-set-server=utf8
    
    log-error=/var/log/mysql-errors.log
    
    # innodb engine settings
    innodb-file-per-table = 1
    innodb_buffer_pool_size = 256M
    innodb_additional_mem_pool_size = 128M
    innodb_flush_log_at_trx_commit = 0
    innodb_flush_method = O_DIRECT
    transaction-isolation = READ-COMMITTED
    innodb-data-file-path = ibdata1:10M:autoextend
    innodb-log-file-size = 128M
    innodb_lock_wait_timeout=3600
    
    # performance settings
    skip-name-resolve
    
    query_cache_size                = 32M
    query_cache_type                = 1
    query_cache_min_res_unit        = 1024
    
    join_buffer_size        = 1M
    key_buffer              = 160M
    table_open_cache             = 4000
    table_definition_cache  = 4000
    sort_buffer             = 1M
    thread_cache            = 128
    thread_stack            = 262144
    tmp_table_size          = 128M
    max_heap_table_size     = 128M
    wait-timeout            = 120
    read_buffer_size        = 32M
    max_connections         = 2000
    
    table_open_cache = 3000
    read-buffer-size = 524288
    
    # network
    connect_timeout = 600000
    wait_timeout = 28800
    max_allowed_packet = 512M
    max_connect_errors = 1000
    net_read_timeout=600000
    connect_timeout=600000
    net_write_timeout=600000
    
    # safety settings
    skip-show-database = 1
    skip-symbolic-links = 1
    local-infile = 0
    
    event_scheduler=on
    #optimizer_switch='derived_merge=off,derived_with_keys=off'
    #wsrep_provider_options="gcache.size=5G; repl.commit_order=1;" #gmcast.segment=2"
    wsrep_provider_options="gcache.size=5G;"
    long_query_time = 0
    slow_query_log=1
    slow_query_log_file=/var/log/mysql-slow-query.log
    #log-slow-queries = /var/log/mysqld-slow-query.log
    
    server-id=1
    
  • ShahriyarRShahriyarR Contributor Inactive User Role Beginner
    Error indicates Foreign key issue :
    Slave SQL: Could not execute Write_rows event on table test-rain-17165.machine_services; Cannot add or update a child row: a foreign key constraint fails (`test-rain-17165`.`machine_services`, CONSTRAINT `machine_registration_fk` FOREIGN KEY (`machine_id`) REFERENCES `machines` (`id `) ON DELETE NO ACTION ON UPDATE NO ACTION), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 209, Error_code: 1452 2015-06-23 21:36:30 30464 [Warning] WSREP: RBR event 3 Write_rows apply warning: 151, 294879
    
  • sergeymsergeym Entrant Current User Role Beginner
    GRA file on failed node look like this:
    /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
    /*!40019 SET @@session.max_insert_delayed_threads=0*/;
    /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
    DELIMITER /*!*/;
    # at 4
    #140909 16:18:49 server id 2  end_log_pos 120 CRC32 0x186105da     Start: binlog v 4, server v 5.6.20-68.0-56-log created 140909 16:18:49 at startup
    # Warning: this binlog is either in use or was not closed properly.
    ROLLBACK/*!*/;
    BINLOG '
    KfAOVA8CAAAAdAAAAHgAAAABAAQANS42LjIwLTY4LjAtNTYtbG9nAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAp8A5UEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAdoF
    YRg=
    '/*!*/;
    # at 120
    #140909 16:18:49 server id 2  end_log_pos 231 CRC32 0x746c0eae     Previous-GTIDs
    # 7b2c47a7-34d4-11e4-8087-0800272e951c:1,
    # 88de2b68-cb2b-ee1b-4ac5-b9ce06787689:1-25
    # at 231
    #150624 14:59:51 server id 0  end_log_pos 83 CRC32 0x4e494745     Query    thread_id=658    exec_time=0    error_code=0
    SET TIMESTAMP=1435147191/*!*/;
    SET @@session.pseudo_thread_id=658/*!*/;
    SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
    SET @@session.sql_mode=1073741824/*!*/;
    SET @@session.auto_increment_increment=3, @@session.auto_increment_offset=1/*!*/;
    /*!\C utf8 *//*!*/;
    SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=192/*!*/;
    SET @@session.lc_time_names=0/*!*/;
    SET @@session.collation_database=DEFAULT/*!*/;
    B
    /*!*/;
    # at 314
    #150624 14:59:51 server id 0  end_log_pos 151 CRC32 0x04000303     Table_map: `test-rain-1002`.`machine_services` mapped to number 2489
    # at 382
    #150624 14:59:51 server id 0  end_log_pos 207 CRC32 0x00000001     Write_rows: table id 2489 flags: STMT_END_F
    
    BINLOG '
    t5uKVRMAAAAARAAAAJcAAAAAALkJAAAAAAEADnRlc3QtcmFpbi0xMDAyABBtYWNoaW5lX3NlcnZp
    Y2VzAAQICAMDAAQ=
    t5uKVR4AAAAAOAAAAM8AAAAAALkJAAAAAAEAAgAE//AcAAAAAAAAAIsAAAAAAAAAAQAAAAEAAAA=
    '/*!*/;
    # at 438
    #150624 14:59:51 server id 0  end_log_pos 275 CRC32 0x04000303     Table_map: `test-rain-1002`.`machine_services` mapped to number 2489
    # at 506
    #150624 14:59:51 server id 0  end_log_pos 331 CRC32 0x00000001     Write_rows: table id 2489 flags: STMT_END_F
    
    BINLOG '
    t5uKVRMAAAAARAAAABMBAAAAALkJAAAAAAEADnRlc3QtcmFpbi0xMDAyABBtYWNoaW5lX3NlcnZp
    Y2VzAAQICAMDAAQ=
    t5uKVR4AAAAAOAAAAEsBAAAAALkJAAAAAAEAAgAE//AfAAAAAAAAAIsAAAAAAAAAAgAAAAEAAAA=
    '/*!*/;
    DELIMITER ;
    # End of log file
    ROLLBACK /* added by mysqlbinlog */;
    /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
    /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
    
  • sergeymsergeym Entrant Current User Role Beginner
    Here is more detailed:
    ysqlbinlog -vvv GRA_bin.log 
    /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
    /*!40019 SET @@session.max_insert_delayed_threads=0*/;
    /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
    DELIMITER /*!*/;
    # at 4
    #140909 16:18:49 server id 2  end_log_pos 120 CRC32 0x186105da     Start: binlog v 4, server v 5.6.20-68.0-56-log created 140909 16:18:49 at startup
    # Warning: this binlog is either in use or was not closed properly.
    ROLLBACK/*!*/;
    BINLOG '
    KfAOVA8CAAAAdAAAAHgAAAABAAQANS42LjIwLTY4LjAtNTYtbG9nAAAAAAAAAAAAAAAAAAAAAAAA
    AAAAAAAAAAAAAAAAAAAp8A5UEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAdoF
    YRg=
    '/*!*/;
    # at 120
    #140909 16:18:49 server id 2  end_log_pos 231 CRC32 0x746c0eae     Previous-GTIDs
    # 7b2c47a7-34d4-11e4-8087-0800272e951c:1,
    # 88de2b68-cb2b-ee1b-4ac5-b9ce06787689:1-25
    # at 231
    #150624 14:59:51 server id 0  end_log_pos 83 CRC32 0x4e494745     Query    thread_id=658    exec_time=0    error_code=0
    SET TIMESTAMP=1435147191/*!*/;
    SET @@session.pseudo_thread_id=658/*!*/;
    SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
    SET @@session.sql_mode=1073741824/*!*/;
    SET @@session.auto_increment_increment=3, @@session.auto_increment_offset=1/*!*/;
    /*!\C utf8 *//*!*/;
    SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=192/*!*/;
    SET @@session.lc_time_names=0/*!*/;
    SET @@session.collation_database=DEFAULT/*!*/;
    B
    /*!*/;
    # at 314
    #150624 14:59:51 server id 0  end_log_pos 151 CRC32 0x04000303     Table_map: `test-rain-1002`.`machine_services` mapped to number 2489
    # at 382
    #150624 14:59:51 server id 0  end_log_pos 207 CRC32 0x00000001     Write_rows: table id 2489 flags: STMT_END_F
    
    BINLOG '
    t5uKVRMAAAAARAAAAJcAAAAAALkJAAAAAAEADnRlc3QtcmFpbi0xMDAyABBtYWNoaW5lX3NlcnZp
    Y2VzAAQICAMDAAQ=
    t5uKVR4AAAAAOAAAAM8AAAAAALkJAAAAAAEAAgAE//AcAAAAAAAAAIsAAAAAAAAAAQAAAAEAAAA=
    '/*!*/;
    ### INSERT INTO `test-rain-1002`.`machine_services`
    ### SET
    ###   @1=28 /* LONGINT meta=0 nullable=0 is_null=0 */
    ###   @2=139 /* LONGINT meta=0 nullable=0 is_null=0 */
    ###   @3=1 /* INT meta=0 nullable=0 is_null=0 */
    ###   @4=***Corrupted replication event was detected. Not printing the value***
    # at 438
    #150624 14:59:51 server id 0  end_log_pos 275 CRC32 0x04000303     Table_map: `test-rain-1002`.`machine_services` mapped to number 2489
    # at 506
    #150624 14:59:51 server id 0  end_log_pos 331 CRC32 0x00000001     Write_rows: table id 2489 flags: STMT_END_F
    
    BINLOG '
    t5uKVRMAAAAARAAAABMBAAAAALkJAAAAAAEADnRlc3QtcmFpbi0xMDAyABBtYWNoaW5lX3NlcnZp
    Y2VzAAQICAMDAAQ=
    t5uKVR4AAAAAOAAAAEsBAAAAALkJAAAAAAEAAgAE//AfAAAAAAAAAIsAAAAAAAAAAgAAAAEAAAA=
    '/*!*/;
    ### INSERT INTO `test-rain-1002`.`machine_services`
    ### SET
    ###   @1=31 /* LONGINT meta=0 nullable=0 is_null=0 */
    ###   @2=139 /* LONGINT meta=0 nullable=0 is_null=0 */
    ###   @3=2 /* INT meta=0 nullable=0 is_null=0 */
    ###   @4=***Corrupted replication event was detected. Not printing the value***
    DELIMITER ;
    # End of log file
    ROLLBACK /* added by mysqlbinlog */;
    /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
    /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
    
  • ShahriyarRShahriyarR Contributor Inactive User Role Beginner
Sign In or Register to comment.

MySQL, InnoDB, MariaDB and MongoDB are trademarks of their respective owners.
Copyright ©2005 - 2020 Percona LLC. All rights reserved.