
Is my cluster back in sync?

nightfly
Hello,

I have a two-node percona-xtradb-cluster-56 running in multi-master mode. Both nodes crashed and the cluster was then bootstrapped, possibly in the wrong order, since users noticed afterwards that some data was not up to date. How we got here is not really the point; what I would like to know is whether the cluster is back in normal synchronized operation now, or whether it is still in some flaky, unpredictable state.
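
This is the kind of output I am looking at. To pull the key synchronization indicators from both nodes in one go, something along these lines can be used (the hostnames and credentials are placeholders, not our real values):

for host in node1.example.com node2.example.com; do
  echo "== $host =="
  mysql -h "$host" -u root -p -e "SHOW GLOBAL STATUS WHERE Variable_name IN
    ('wsrep_cluster_state_uuid','wsrep_last_committed','wsrep_local_state_comment',
     'wsrep_cluster_size','wsrep_cluster_status','wsrep_ready');"
done

From what I understand, if wsrep_cluster_state_uuid and wsrep_cluster_status match on both nodes and wsrep_last_committed keeps advancing in step, the nodes are at least members of the same Primary component. The full output is pasted below.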

Here are the variables for NODE1:

+------------------------------+-------+
| Variable_name | Value |
+------------------------------+-------+
| wsrep_local_state_uuid | 53ef5e93-33de-11e5-adb5-1b37c6b643bd |
| wsrep_protocol_version | 6 |
| wsrep_last_committed | 912789 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_repl_keys | 0 |
| wsrep_repl_keys_bytes | 0 |
| wsrep_repl_data_bytes | 0 |
| wsrep_repl_other_bytes | 0 |
| wsrep_received | 776 |
| wsrep_received_bytes | 56548807 |
| wsrep_local_commits | 0 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_max | 2 |
| wsrep_local_send_queue_min | 0 |
| wsrep_local_send_queue_avg | 0.111111 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_max | 151 |
| wsrep_local_recv_queue_min | 0 |
| wsrep_local_recv_queue_avg | 15.442010 |
| wsrep_local_cached_downto | 912023 |
| wsrep_flow_control_paused_ns | 458 |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_sent | 1 |
| wsrep_flow_control_recv | 1 |
| wsrep_cert_deps_distance | 24.422425 |
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 1.000000 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 1.000000 |
| wsrep_local_state | 4 |
| wsrep_local_state_comment | Synced |
| wsrep_cert_index_size | 7 |
| wsrep_causal_reads | 0 |
| wsrep_cert_interval | 0.234681 |
| wsrep_incoming_addresses | <removed> |
| wsrep_evs_delayed | |
| wsrep_evs_evict_list | |
| wsrep_evs_repl_latency | 0.000237055/0.000330571/0.00051501/8.63108e-05/13 |
| wsrep_evs_state | OPERATIONAL |
| wsrep_gcomm_uuid | fa66fe2c-34f5-11e5-8c84-035250e696ad |
| wsrep_cluster_conf_id | 4 |
| wsrep_cluster_size | 2 |
| wsrep_cluster_state_uuid | 53ef5e93-33de-11e5-adb5-1b37c6b643bd |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_index | 1 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <[email protected]> |
| wsrep_provider_version | 3.8(rf6147dd) |
| wsrep_ready | ON |
+------------------------------+-------+

Comments

  • nightfly
    Here are the variables for NODE2:

    +------------------------------+-------+
    | Variable_name | Value |
    +------------------------------+-------+
    | wsrep_local_state_uuid | 53ef5e93-33de-11e5-adb5-1b37c6b643bd |
    | wsrep_protocol_version | 6 |
    | wsrep_last_committed | 922781 |
    | wsrep_replicated | 922780 |
    | wsrep_replicated_bytes | 23855221085 |
    | wsrep_repl_keys | 3071856 |
    | wsrep_repl_keys_bytes | 45801377 |
    | wsrep_repl_data_bytes | 6849937076 |
    | wsrep_repl_other_bytes | 0 |
    | wsrep_received | 7261 |
    | wsrep_received_bytes | 60045 |
    | wsrep_local_commits | 922777 |
    | wsrep_local_cert_failures | 0 |
    | wsrep_local_replays | 0 |
    | wsrep_local_send_queue | 0 |
    | wsrep_local_send_queue_max | 10 |
    | wsrep_local_send_queue_min | 0 |
    | wsrep_local_send_queue_avg | 0.002493 |
    | wsrep_local_recv_queue | 0 |
    | wsrep_local_recv_queue_max | 2 |
    | wsrep_local_recv_queue_min | 0 |
    | wsrep_local_recv_queue_avg | 0.003443 |
    | wsrep_local_cached_downto | 921845 |
    | wsrep_flow_control_paused_ns | 43461847996 |
    | wsrep_flow_control_paused | 0.000357 |
    | wsrep_flow_control_sent | 0 |
    | wsrep_flow_control_recv | 72 |
    | wsrep_cert_deps_distance | 19.291316 |
    | wsrep_apply_oooe | 0.057334 |
    | wsrep_apply_oool | 0.000002 |
    | wsrep_apply_window | 1.086623 |
    | wsrep_commit_oooe | 0.000000 |
    | wsrep_commit_oool | 0.000000 |
    | wsrep_commit_window | 1.029616 |
    | wsrep_local_state | 4 |
    | wsrep_local_state_comment | Synced |
    | wsrep_cert_index_size | 29 |
    | wsrep_causal_reads | 0 |
    | wsrep_cert_interval | 0.093278 |
    | wsrep_incoming_addresses | <REMOVED> |
    | wsrep_evs_delayed | |
    | wsrep_evs_evict_list | |
    | wsrep_evs_repl_latency | 0.000275245/0.00125926/0.00935097/0.000899017/1104 |
    | wsrep_evs_state | OPERATIONAL |
    | wsrep_gcomm_uuid | 53eef647-33de-11e5-9145-eade2fa688ff |
    | wsrep_cluster_conf_id | 4 |
    | wsrep_cluster_size | 2 |
    | wsrep_cluster_state_uuid | 53ef5e93-33de-11e5-adb5-1b37c6b643bd |
    | wsrep_cluster_status | Primary |
    | wsrep_connected | ON |
    | wsrep_local_bf_aborts | 0 |
    | wsrep_local_index | 0 |
    | wsrep_provider_name | Galera |
    | wsrep_provider_vendor | Codership Oy <[email protected]> |
    | wsrep_provider_version | 3.8(rf6147dd) |
    | wsrep_ready | ON |
    +------------------------------+-------+


    wsrep_local_state_comment says Synced, but it says that even when I shut one node down. When I create a database and load data on either node, it replicates to the other one just fine. What worries me is the missing data the users reported, and what I found in the documentation under "How to recover a PXC cluster, Scenario 6".

    If I look at the grastate.dat file on the nodes:

    # GALERA saved state
    version: 2.1
    uuid: 53ef5e93-33de-11e5-adb5-1b37c6b643bd
    seqno: -1
    cert_index:

    The seqno is -1 instead of the last valid sequence number, and I don't know whether that is normal.
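
    From what I have read, a seqno of -1 in grastate.dat is what you see while mysqld is running or after an unclean shutdown; the last valid sequence number is only written out on a clean shutdown. One way to find the actual last committed position of a stopped node seems to be the --wsrep-recover mode; a rough sketch, where the service name and error-log path are guesses for a typical PXC 5.6 install:

    # With the node stopped, recover the last committed position without
    # joining the cluster (service name and log path may differ on your setup).
    service mysql stop
    mysqld_safe --wsrep-recover
    # The recovered position ends up in the error log as a line like:
    #   WSREP: Recovered position: <cluster-uuid>:<seqno>
    grep 'Recovered position' /var/log/mysqld.log | tail -n 1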

    The logs were also full of warnings such as this one (on node2):

    2015-07-28 06:25:02 30682 [Warning] InnoDB: Cannot open table mydb/field_revision_field_decision_nl_mydb_one from the internal data dictionary of InnoDB though the .frm file for the table exists. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html for how you can resolve the problem.

    I would just really like to know whether this is back to a normal state, so we know whether we can continue the work or not.

    Can someone help?

    Thanks
  • jrivera (Percona Support Engineer)
    You can clean up node2's datadir and then restart node2 so it performs an SST from node1; that way you know the data is consistent between the nodes. Also check that you are not using MyISAM tables; if you are, convert them to InnoDB. Finally, add a garbd (Galera Arbitrator) node to avoid split-brain when one node cannot communicate with the other.
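
    A rough sketch of those steps, assuming a typical PXC 5.6 install with the datadir at /var/lib/mysql (service name, paths and tool options are assumptions and may differ on the actual systems; mydb is just the database from the warning above, used as an example):

    # On node2 only; take a backup before removing anything.
    service mysql stop
    mv /var/lib/mysql /var/lib/mysql.bak
    mkdir /var/lib/mysql
    chown mysql:mysql /var/lib/mysql
    service mysql start   # node2 rejoins the cluster and requests a full SST from node1

    # Afterwards, Percona Toolkit's pt-table-checksum can verify that the nodes
    # really hold the same data (it has PXC-specific caveats; check its docs first).
    pt-table-checksum h=localhost,u=root --ask-pass --databases mydb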