vadimtk: This one is a little better. I cannot delete the previous post, but I deleted the data dir and started over, because the donor had not been crashing before (I am still setting this up). This is a brand-new instance run with the same commands. The log below is from the donor:
2017-03-20T21:10:44.506085Z 0 [Note] WSREP: (6673b6dc, 'tcp://0.0.0.0:4567') connection established to ae6ea4bf tcp://10.20.1.35:4567
2017-03-20T21:10:44.509764Z 0 [Note] WSREP: (6673b6dc, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2017-03-20T21:10:45.005918Z 0 [Note] WSREP: declaring ae6ea4bf at tcp://10.20.1.35:4567 stable
2017-03-20T21:10:45.009498Z 0 [Note] WSREP: Node 6673b6dc state prim
2017-03-20T21:10:45.012880Z 0 [Note] WSREP: view(view_id(PRIM,6673b6dc,12) memb {
6673b6dc,0
ae6ea4bf,0
} joined {
} left {
} partitioned {
})
2017-03-20T21:10:45.012910Z 0 [Note] WSREP: save pc into disk
2017-03-20T21:10:45.013390Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
2017-03-20T21:10:45.013884Z 0 [Note] WSREP: STATE_EXCHANGE: sent state UUID: aebdc4ca-0db1-11e7-9c0f-db7e1711d337
2017-03-20T21:10:45.017229Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: aebdc4ca-0db1-11e7-9c0f-db7e1711d337
2017-03-20T21:10:45.020509Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: aebdc4ca-0db1-11e7-9c0f-db7e1711d337 from 0 (c8e314fdc5d4)
2017-03-20T21:10:45.504119Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: aebdc4ca-0db1-11e7-9c0f-db7e1711d337 from 1 (92495e38bf2d)
2017-03-20T21:10:45.504162Z 0 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 11,
members = 1/2 (joined/total),
act_id = 14,
last_appl. = 0,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = 581a154c-0db1-11e7-9a69-ff24de2d16d2
2017-03-20T21:10:45.504175Z 0 [Note] WSREP: Flow-control interval: [23, 23]
2017-03-20T21:10:45.504421Z 4 [Note] WSREP: New cluster view: global state: 581a154c-0db1-11e7-9a69-ff24de2d16d2:14, view# 12: Primary, number of nodes: 2, my index: 0, protocol version 3
2017-03-20T21:10:45.504444Z 4 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-03-20T21:10:45.504476Z 4 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-03-20T21:10:45.504488Z 4 [Note] WSREP: Assign initial position for certification: 14, protocol version: 3
2017-03-20T21:10:45.504511Z 0 [Note] WSREP: Service thread queue flushed.
2017-03-20T21:10:45.993086Z 0 [Note] WSREP: Member 1.0 (92495e38bf2d) requested state transfer from '*any*'. Selected 0.0 (c8e314fdc5d4)(SYNCED) as donor.
2017-03-20T21:10:45.993133Z 0 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 14)
2017-03-20T21:10:45.993306Z 4 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-03-20T21:10:45.993431Z 0 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.17.0.2:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' '' --gtid '581a154c-0db1-11e7-9a69-ff24de2d16d2:14''
2017-03-20T21:10:45.994039Z 4 [Note] WSREP: sst_donor_thread signaled with 0
WSREP_SST: [INFO] The xtrabackup version is 2.4.6 (20170320 21:10:46.038)
WSREP_SST: [INFO] Streaming with xbstream (20170320 21:10:46.229)
WSREP_SST: [INFO] Using socat as streamer (20170320 21:10:46.232)
WSREP_SST: [INFO] Using /tmp/tmp.eS5cYXKuIu as innobackupex temporary directory (20170320 21:10:46.245)
WSREP_SST: [INFO] Streaming GTID file before SST (20170320 21:10:46.250)
WSREP_SST: [INFO] Evaluating xbstream -c ${FILE_TO_STREAM} | socat -u stdio TCP:172.17.0.2:4444; RC=( ${PIPESTATUS[@]} ) (20170320 21:10:46.252)
2017/03/20 21:10:46 socat[2225] E connect(6, AF=2 172.17.0.2:4444, 16): Connection refused
WSREP_SST: [ERROR] Error while sending data to joiner node: exit codes: 141 1 (20170320 21:10:46.258)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20170320 21:10:46.260)
WSREP_SST: [INFO] Cleaning up temporary directories (20170320 21:10:46.263)
2017-03-20T21:10:46.269260Z 0 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.17.0.2:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' '' --gtid '581a154c-0db1-11e7-9a69-ff24de2d16d2:14'
2017-03-20T21:10:46.269308Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.17.0.2:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' '' --gtid '581a154c-0db1-11e7-9a69-ff24de2d16d2:14': 32 (Broken pipe)
2017-03-20T21:10:46.269393Z 0 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.17.0.2:4444/xtrabackup_sst//1' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' '' --gtid '581a154c-0db1-11e7-9a69-ff24de2d16d2:14'
2017-03-20T21:10:46.273444Z 0 [Warning] WSREP: 0.0 (c8e314fdc5d4): State transfer to 1.0 (92495e38bf2d) failed: -32 (Broken pipe)
2017-03-20T21:10:46.273473Z 0 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 14)
2017-03-20T21:10:46.276951Z 0 [Note] WSREP: Member 0.0 (c8e314fdc5d4) synced with group.
2017-03-20T21:10:46.276965Z 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 14)
2017-03-20T21:10:46.277035Z 4 [Note] WSREP: Synchronized with group, ready for connections
2017-03-20T21:10:46.277050Z 4 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-03-20T21:10:47.278086Z 0 [Note] WSREP: forgetting ae6ea4bf (tcp://10.20.1.35:4567)
2017-03-20T21:10:47.278171Z 0 [Note] WSREP: Node 6673b6dc state prim
2017-03-20T21:10:47.278221Z 0 [Note] WSREP: view(view_id(PRIM,6673b6dc,13) memb {
6673b6dc,0
} joined {
} left {
} partitioned {
ae6ea4bf,0
})
2017-03-20T21:10:47.278237Z 0 [Note] WSREP: save pc into disk
2017-03-20T21:10:47.278757Z 0 [Note] WSREP: forgetting ae6ea4bf (tcp://10.20.1.35:4567)
2017-03-20T21:10:47.278776Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
2017-03-20T21:10:47.279500Z 0 [Note] WSREP: STATE_EXCHANGE: sent state UUID: b0176ff9-0db1-11e7-895f-d653150a4e73
2017-03-20T21:10:47.279531Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: b0176ff9-0db1-11e7-895f-d653150a4e73
2017-03-20T21:10:47.279540Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: b0176ff9-0db1-11e7-895f-d653150a4e73 from 0 (c8e314fdc5d4)
2017-03-20T21:10:47.279551Z 0 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 12,
members = 1/1 (joined/total),
act_id = 14,
last_appl. = 0,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = 581a154c-0db1-11e7-9a69-ff24de2d16d2
2017-03-20T21:10:47.279558Z 0 [Note] WSREP: Flow-control interval: [16, 16]
2017-03-20T21:10:47.279714Z 1 [Note] WSREP: New cluster view: global state: 581a154c-0db1-11e7-9a69-ff24de2d16d2:14, view# 13: Primary, number of nodes: 1, my index: 0, protocol version 3
2017-03-20T21:10:47.279743Z 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-03-20T21:10:47.279756Z 1 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-03-20T21:10:47.279766Z 1 [Note] WSREP: Assign initial position for certification: 14, protocol version: 3
2017-03-20T21:10:47.279787Z 0 [Note] WSREP: Service thread queue flushed.
2017-03-20T21:10:47.761362Z 0 [Note] WSREP: (6673b6dc, 'tcp://0.0.0.0:4567') turning message relay requesting off
2017-03-20T21:10:51.006660Z 0 [Note] WSREP: (6673b6dc, 'tcp://0.0.0.0:4567') connection established to ae6ea4bf tcp://10.20.1.35:4567
2017-03-20T21:10:51.006697Z 0 [Warning] WSREP: discarding established (time wait) ae6ea4bf (tcp://10.20.1.35:4567)
2017-03-20T21:10:52.762135Z 0 [Note] WSREP: cleaning up ae6ea4bf (tcp://10.20.1.35:4567)
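
In the log above, the donor's socat tries to stream to 172.17.0.2:4444 and gets "Connection refused", while the joining peer connected from 10.20.1.35. A minimal sketch for checking this from the donor host, assuming Python 3 is available there: it pulls the target address out of the socat error line and tries a plain TCP connection to it. The function names `find_failed_sst_targets` and `can_connect` are made up for this example and are not part of any Percona or Galera tooling.

```python
import re
import socket

# Matches socat's connect error as it appears in the donor log, e.g.
# "... socat[2225] E connect(6, AF=2 172.17.0.2:4444, 16): Connection refused"
SOCAT_CONNECT_ERR = re.compile(
    r"socat\[\d+\] E connect\(\d+, AF=\d+ ([\d.]+):(\d+), \d+\): (.+)"
)


def find_failed_sst_targets(log_text):
    """Return a list of (host, port, reason) for each socat connect error."""
    return [
        (m.group(1), int(m.group(2)), m.group(3).strip())
        for m in SOCAT_CONNECT_ERR.finditer(log_text)
    ]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network, and timeouts.
        return False


# Example usage (the log path is hypothetical):
# with open("/var/log/mysql/error.log") as f:
#     for host, port, reason in find_failed_sst_targets(f.read()):
#         print(host, port, reason, "reachable now:", can_connect(host, port))
```

If `can_connect` returns False for the address the joiner advertised but True for the address the peer actually connected from, that would suggest the joiner is announcing an SST address the donor cannot reach. Note that the joiner only listens on the SST port while it is waiting for a transfer, so the check is only meaningful during a join attempt.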