2 nodes + 1 arbitrator: remaining node switched to non-primary when I shut down 1 node

ethaniel
Hello,

I run Percona Cluster on 2 nodes: 10.0.0.91 (pxc-01) and 10.0.0.92 (pxc-02) + 1 arbitrator (garb) on 10.0.0.10.
Today, I shut down the 10.0.0.92 (pxc-02) machine.
10.0.0.91 (pxc-01) became non-primary.

I thought the arbitrator was supposed to step in and keep 10.0.0.91 (pxc-01) primary so that it would keep working?

After about 1 minute I had to run "SET GLOBAL wsrep_provider_options='pc.bootstrap=true';" on 10.0.0.91 to resolve the situation.
Any ideas why?
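
For reference, the majority rule I was expecting can be sketched roughly as below. This is only a minimal illustration, assuming a component stays primary when it holds more than half of the weight of the last primary component and that every member (both nodes and garb) has weight 1. The function name `has_quorum` is hypothetical and not part of Galera.

```python
def has_quorum(present_weights, previous_weights):
    """Return True if the members still present hold a strict majority
    of the weight of the previous primary component.

    Assumption: plain majority rule with per-member weights; this is an
    illustration, not Galera's actual implementation.
    """
    return 2 * sum(present_weights) > sum(previous_weights)


previous = [1, 1, 1]  # pxc-01, pxc-02, garb (assumed weight 1 each)

# pxc-01 and garb still see each other: 2 of 3
print(has_quorum([1, 1], previous))  # True

# a member that ends up alone in its view: 1 of 3
print(has_quorum([1], previous))     # False
```

By this arithmetic, pxc-01 plus garb should have been enough, which is why the "memb_num = 1" views in the logs below surprise me.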

Here are the logs from the arbitrator:
2016-03-24 05:39:35.073  INFO: (b23eccb0, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.0.0.92:4567
2016-03-24 05:39:36.573  INFO: (b23eccb0, 'tcp://0.0.0.0:4567') reconnecting to c2c3d173 (tcp://10.0.0.92:4567), attempt 0
2016-03-24 05:39:36.972  INFO: evs::proto(b23eccb0, OPERATIONAL, view_id(REG,0a880fcf,7)) suspecting node: c2c3d173
2016-03-24 05:39:36.972  INFO: evs::proto(b23eccb0, OPERATIONAL, view_id(REG,0a880fcf,7)) suspected node without join message, declaring inactive
2016-03-24 05:39:44.472  WARN: evs::proto(b23eccb0, GATHER, view_id(REG,0a880fcf,7)) install timer expired
2016-03-24 05:39:44.472  INFO: no install message received
2016-03-24 05:39:44.472  INFO: view(view_id(NON_PRIM,0a880fcf,7) memb {
        b23eccb0,0
} joined {
} left {
} partitioned {
        0a880fcf,2
        c2c3d173,2
})
2016-03-24 05:39:44.472  INFO: view(view_id(NON_PRIM,b23eccb0,8) memb {
        b23eccb0,0
} joined {
} left {
} partitioned {
        0a880fcf,2
        c2c3d173,2
})
2016-03-24 05:39:44.472  INFO: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2016-03-24 05:39:44.472  INFO: Flow-control interval: [9999999, 9999999]
2016-03-24 05:39:44.472  INFO: Received NON-PRIMARY.
2016-03-24 05:39:44.472  INFO: Shifting SYNCED -> OPEN (TO: 154125793)
2016-03-24 05:39:44.472  INFO: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2016-03-24 05:39:44.472  INFO: Flow-control interval: [9999999, 9999999]
2016-03-24 05:39:44.472  INFO: Received NON-PRIMARY.
2016-03-24 05:40:03.978  INFO: declaring 0a880fcf at tcp://10.0.0.91:4567 stable
2016-03-24 05:40:03.979  INFO: view(view_id(NON_PRIM,0a880fcf,9) memb {
        0a880fcf,2
        b23eccb0,0
} joined {
} left {
} partitioned {
        c2c3d173,2
})
2016-03-24 05:40:03.979  INFO: New COMPONENT: primary = no, bootstrap = no, my_idx = 1, memb_num = 2
2016-03-24 05:40:03.979  INFO: Flow-control interval: [9999999, 9999999]
2016-03-24 05:40:03.979  INFO: Received NON-PRIMARY.
2016-03-24 05:40:50.271  INFO: view(view_id(PRIM,0a880fcf,9) memb {
        0a880fcf,2
        b23eccb0,0
} joined {
} left {
} partitioned {
        c2c3d173,2
})
2016-03-24 05:40:50.271  INFO: save pc into disk
2016-03-24 05:40:50.271  WARN: open file(./gvwstate.dat.tmp) failed(Permission denied)
2016-03-24 05:40:50.271  INFO: forgetting c2c3d173 (tcp://10.0.0.92:4567)
2016-03-24 05:40:50.271  INFO: New COMPONENT: primary = yes, bootstrap = yes, my_idx = 1, memb_num = 2
2016-03-24 05:40:50.271  INFO: (b23eccb0, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-03-24 05:40:50.271  INFO: STATE EXCHANGE: Waiting for state UUID.
2016-03-24 05:40:50.272  INFO: STATE EXCHANGE: sent state msg: d2125aae-f169-11e5-a54f-2f2fb5523976
2016-03-24 05:40:50.272  INFO: STATE EXCHANGE: got state msg: d2125aae-f169-11e5-a54f-2f2fb5523976 from 0 (pxc-01)
2016-03-24 05:40:50.272  INFO: STATE EXCHANGE: got state msg: d2125aae-f169-11e5-a54f-2f2fb5523976 from 1 (garb)
2016-03-24 05:40:50.272  WARN: Quorum: No node with complete state:

2016-03-24 05:40:50.272  INFO: Partial re-merge of primary b28e0200-efcc-11e5-894c-7bf586b54a55 found: 2 of 3.
2016-03-24 05:40:50.272  INFO: Quorum results:
        version    = 3,
        component  = PRIMARY,
        conf_id    = 7,
        members    = 2/2 (joined/total),
        act_id     = 154125793,
        last_appl. = 154125696,
        protocols  = 0/7/3 (gcs/repl/appl),
        group UUID = 7f16b5ae-d1f6-11e5-9382-4ec823ecea7a
2016-03-24 05:40:50.272  INFO: Flow-control interval: [9999999, 9999999]
2016-03-24 05:40:50.272  INFO: Restored state OPEN -> SYNCED (154125793)
2016-03-24 05:40:53.076  INFO:  cleaning up c2c3d173 (tcp://10.0.0.92:4567)

Here are the logs from the remaining node (pxc-01):
2016-03-24 05:39:35 1811 [Note] WSREP: (0a880fcf, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.0.0.92:4567
2016-03-24 05:39:36 1811 [Note] WSREP: (0a880fcf, 'tcp://0.0.0.0:4567') reconnecting to c2c3d173 (tcp://10.0.0.92:4567), attempt 0
2016-03-24 05:40:01 1811 [Note] WSREP: evs::proto(0a880fcf, GATHER, view_id(REG,0a880fcf,7)) suspecting node: c2c3d173
2016-03-24 05:40:01 1811 [Note] WSREP: evs::proto(0a880fcf, GATHER, view_id(REG,0a880fcf,7)) suspected node without join message, declaring inactive
2016-03-24 05:40:02 1811 [Note] WSREP: view(view_id(NON_PRIM,0a880fcf,7) memb {
        0a880fcf,2
} joined {
} left {
} partitioned {
        b23eccb0,0
        c2c3d173,2
})
2016-03-24 05:40:02 1811 [Note] WSREP: view(view_id(NON_PRIM,0a880fcf,8) memb {
        0a880fcf,2
} joined {
} left {
} partitioned {
        b23eccb0,0
        c2c3d173,2
})
2016-03-24 05:40:02 1811 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2016-03-24 05:40:02 1811 [Note] WSREP: Flow-control interval: [16, 16]
2016-03-24 05:40:02 1811 [Note] WSREP: Received NON-PRIMARY.
2016-03-24 05:40:02 1811 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 154125793)
2016-03-24 05:40:02 1811 [Note] WSREP: New cluster view: global state: 7f16b5ae-d1f6-11e5-9382-4ec823ecea7a:154125793, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3
2016-03-24 05:40:02 1811 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2016-03-24 05:40:02 1811 [Note] WSREP: Flow-control interval: [16, 16]
2016-03-24 05:40:02 1811 [Note] WSREP: Received NON-PRIMARY.
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 1119, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 336, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 1816, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 516, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 504, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 504, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 528, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 504, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 504, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 914, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 336, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 873, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 336, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 347, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 347, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 1570, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 771, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 783, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 843, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 440, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 440, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 409, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 440, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 415, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 509, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 500, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 538, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Note] WSREP: New cluster view: global state: 7f16b5ae-d1f6-11e5-9382-4ec823ecea7a:154125793, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3
2016-03-24 05:40:03 1811 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 805, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Warning] WSREP: Send action {(nil), 538, TORDERED} returned -107 (Transport endpoint is not connected)
2016-03-24 05:40:03 1811 [Note] WSREP: declaring b23eccb0 at tcp://10.0.0.10:4567 stable
2016-03-24 05:40:03 1811 [Note] WSREP: view(view_id(NON_PRIM,0a880fcf,9) memb {
        0a880fcf,2
        b23eccb0,0
} joined {
} left {
} partitioned {
        c2c3d173,2
})
2016-03-24 05:40:03 1811 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 2
2016-03-24 05:40:03 1811 [Note] WSREP: Flow-control interval: [23, 23]
2016-03-24 05:40:03 1811 [Note] WSREP: Received NON-PRIMARY.
2016-03-24 05:40:03 1811 [Note] WSREP: New cluster view: global state: 7f16b5ae-d1f6-11e5-9382-4ec823ecea7a:154125793, view# -1: non-Primary, number of nodes: 2, my index: 0, protocol version 3
2016-03-24 05:40:03 1811 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-24 05:40:50 1811 [Note] WSREP: view(view_id(PRIM,0a880fcf,9) memb {
        0a880fcf,2
        b23eccb0,0
} joined {
} left {
} partitioned {
        c2c3d173,2
})
2016-03-24 05:40:50 1811 [Note] WSREP: save pc into disk
2016-03-24 05:40:50 1811 [Note] WSREP: forgetting c2c3d173 (tcp://10.0.0.92:4567)
2016-03-24 05:40:50 1811 [Note] WSREP: deleting entry tcp://10.0.0.92:4567
2016-03-24 05:40:50 1811 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = yes, my_idx = 0, memb_num = 2
2016-03-24 05:40:50 1811 [Note] WSREP: (0a880fcf, 'tcp://0.0.0.0:4567') turning message relay requesting off
2016-03-24 05:40:50 1811 [Note] WSREP: STATE_EXCHANGE: sent state UUID: d2125aae-f169-11e5-a54f-2f2fb5523976
2016-03-24 05:40:50 1811 [Note] WSREP: STATE EXCHANGE: sent state msg: d2125aae-f169-11e5-a54f-2f2fb5523976
2016-03-24 05:40:50 1811 [Note] WSREP: STATE EXCHANGE: got state msg: d2125aae-f169-11e5-a54f-2f2fb5523976 from 0 (pxc-01)
2016-03-24 05:40:50 1811 [Note] WSREP: STATE EXCHANGE: got state msg: d2125aae-f169-11e5-a54f-2f2fb5523976 from 1 (garb)
2016-03-24 05:40:50 1811 [Warning] WSREP: Quorum: No node with complete state:


        Version      : 3
        Flags        : 0x7
        Protocols    : 0 / 7 / 3
        State        : NON-PRIMARY
        Prim state   : SYNCED
        Prim UUID    : b28e0200-efcc-11e5-894c-7bf586b54a55
        Prim  seqno  : 7
        First seqno  : 154012102
        Last  seqno  : 154125793
        Prim JOINED  : 3
        State UUID   : d2125aae-f169-11e5-a54f-2f2fb5523976
        Group UUID   : 7f16b5ae-d1f6-11e5-9382-4ec823ecea7a
        Name         : 'pxc-01'
        Incoming addr: '10.0.0.91:3306'

        Version      : 3
        Flags        : 0xe
        Protocols    : 0 / 127 / 127
        State        : NON-PRIMARY
        Prim state   : SYNCED
        Prim UUID    : b28e0200-efcc-11e5-894c-7bf586b54a55
        Prim  seqno  : 7
        First seqno  : -1
        Last  seqno  : 154125793
        Prim JOINED  : 3
        State UUID   : d2125aae-f169-11e5-a54f-2f2fb5523976
        Group UUID   : 7f16b5ae-d1f6-11e5-9382-4ec823ecea7a
        Name         : 'garb'
        Incoming addr: ''

2016-03-24 05:40:50 1811 [Note] WSREP: Partial re-merge of primary b28e0200-efcc-11e5-894c-7bf586b54a55 found: 2 of 3.
2016-03-24 05:40:50 1811 [Note] WSREP: Quorum results:
        version    = 3,
        component  = PRIMARY,
        conf_id    = 7,
        members    = 2/2 (joined/total),
        act_id     = 154125793,
        last_appl. = 154125672,
        protocols  = 0/7/3 (gcs/repl/appl),
        group UUID = 7f16b5ae-d1f6-11e5-9382-4ec823ecea7a
2016-03-24 05:40:50 1811 [Note] WSREP: Flow-control interval: [23, 23]
2016-03-24 05:40:50 1811 [Note] WSREP: Restored state OPEN -> SYNCED (154125793)
2016-03-24 05:40:50 1811 [Note] WSREP: New cluster view: global state: 7f16b5ae-d1f6-11e5-9382-4ec823ecea7a:154125793, view# 8: Primary, number of nodes: 2, my index: 0, protocol version 3
2016-03-24 05:40:50 1811 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-24 05:40:50 1811 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-03-24 05:40:50 1811 [Note] WSREP: Service thread queue flushed.
2016-03-24 05:40:50 1811 [Note] WSREP: Assign initial position for certification: 154125793, protocol version: 3
2016-03-24 05:40:50 1811 [Note] WSREP: Service thread queue flushed.
2016-03-24 05:40:50 1811 [Note] WSREP: Synchronized with group, ready for connections
2016-03-24 05:40:50 1811 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-24 05:40:53 1811 [Note] WSREP:  cleaning up c2c3d173 (tcp://10.0.0.92:4567)
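
One thing that stands out when the two logs are lined up is that garb and pxc-01 started suspecting c2c3d173 at different times. Below is a small sketch for measuring that gap, assuming only the two timestamp formats seen above (milliseconds in the garb log, whole seconds in the mysqld log). The helper names `parse_ts` and `first_event` are hypothetical, and the two sample lines are copied from the logs in this post.

```python
import re
from datetime import datetime

# Leading timestamp, with optional fractional seconds (garb log has them,
# the mysqld log does not).
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.(\d+))?")


def parse_ts(line):
    """Parse the timestamp at the start of a log line, or return None."""
    m = TS.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    frac = m.group(2)
    if frac:
        # Pad or truncate the fraction to microseconds.
        ts = ts.replace(microsecond=int(frac.ljust(6, "0")[:6]))
    return ts


def first_event(lines, needle):
    """Timestamp of the first line containing `needle`, or None."""
    for line in lines:
        if needle in line:
            ts = parse_ts(line)
            if ts is not None:
                return ts
    return None


garb_log = [
    "2016-03-24 05:39:36.972  INFO: evs::proto(b23eccb0, OPERATIONAL, "
    "view_id(REG,0a880fcf,7)) suspecting node: c2c3d173",
]
pxc01_log = [
    "2016-03-24 05:40:01 1811 [Note] WSREP: evs::proto(0a880fcf, GATHER, "
    "view_id(REG,0a880fcf,7)) suspecting node: c2c3d173",
]

needle = "suspecting node: c2c3d173"
gap = (first_event(pxc01_log, needle) - first_event(garb_log, needle)).total_seconds()
print(f"pxc-01 suspected the node {gap:.3f} s after garb")
```

In practice the lists would be replaced by the full log files read line by line. The roughly 24-second gap falls between garb's "install timer expired" at 05:39:44 and pxc-01's own non-primary view at 05:40:02, so each of them ended up alone in a one-member view.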
