Full explanation for log files

Is there anywhere documentations to full explanation of log files? It’s kinda hard to understand what is actually happening in a cluster.

For example:
2017-01-17T12:59:24.715516Z 0 [Note] WSREP: evs:: proto(e28fb8ab, OPERATIONAL, view_id(REG,09a67365,13)): delayed 09a67365 requesting range [3640680,3640682]
2017-01-17T12:59:34.715786Z 0 [Note] WSREP: evs:: proto(e28fb8ab, OPERATIONAL, view_id(REG,09a67365,13)): delayed 09a67365 requesting range [3640680,3640686]

What is evs:: proto, what are these hashes e28fb8ab or ids 09a67365,13, ids of what, for what etc. As i understand 09a67365 is some peer id, where it’s generated. How to get all peer ids of all cluster nodes? 13 is some kind of status?

2017-01-17T12:59:43.714341Z 0 [Note] WSREP: (e28fb8ab, ‘tcp://0.0.0.0:4567’) turning message relay requesting on, nonlive peers: tcp://161.171.42.50:4567

What is it turning on, what relay, to where?

==========================================Problematic Log================================================

mysql3 was down for 3 minutes:
mysql3 log:
2017-01-17T12:59:24.715516Z 0 [Note] WSREP: evs:: proto(e28fb8ab, OPERATIONAL, view_id(REG,09a67365,13)): delayed 09a67365 requesting range [3640680,3640682]
2017-01-17T12:59:34.715786Z 0 [Note] WSREP: evs:: proto(e28fb8ab, OPERATIONAL, view_id(REG,09a67365,13)): delayed 09a67365 requesting range [3640680,3640686]
2017-01-17T12:59:43.711936Z 0 [Note] WSREP: (e28fb8ab, ‘tcp://0.0.0.0:4567’) connection to peer 09a67365 with addr tcp://mysql3_ip:4567 timed out, no messages seen in PT30S
2017-01-17T12:59:43.714341Z 0 [Note] WSREP: (e28fb8ab, ‘tcp://0.0.0.0:4567’) turning message relay requesting on, nonlive peers: tcp://mysql3_ip:4567
2017-01-17T12:59:44.716015Z 0 [Note] WSREP: evs:: proto(e28fb8ab, OPERATIONAL, view_id(REG,09a67365,13)): delayed 09a67365 requesting range [3640680,3640689]
2017-01-17T12:59:44.753789Z 0 [Note] WSREP: (e28fb8ab, ‘tcp://0.0.0.0:4567’) reconnecting to 09a67365 (tcp://mysql3_ip:4567), attempt 0
2017-01-17T12:59:44.758887Z 0 [Note] WSREP: (e28fb8ab, ‘tcp://0.0.0.0:4567’) connection established to 09a67365 tcp://mysql3_ip:4567
2017-01-17T13:00:14.806302Z 0 [Note] WSREP: (e28fb8ab, ‘tcp://0.0.0.0:4567’) turning message relay requesting off

mysql2 log:
2017-01-17T12:59:24.500776Z 0 [Note] WSREP: evs:: proto(09a67365, OPERATIONAL, view_id(REG,09a67365,13)): delayed 2e1e43d6 requesting range [3640681,3640683]
2017-01-17T12:59:24.500888Z 0 [Note] WSREP: evs:: proto(09a67365, OPERATIONAL, view_id(REG,09a67365,13)): delayed e28fb8ab requesting range [3640681,3640683]
2017-01-17T12:59:34.501010Z 0 [Note] WSREP: evs:: proto(09a67365, OPERATIONAL, view_id(REG,09a67365,13)): delayed 2e1e43d6 requesting range [3640681,3640686]
2017-01-17T12:59:34.501126Z 0 [Note] WSREP: evs:: proto(09a67365, OPERATIONAL, view_id(REG,09a67365,13)): delayed e28fb8ab requesting range [3640681,3640686]
2017-01-17T12:59:44.503415Z 0 [Note] WSREP: evs:: proto(09a67365, OPERATIONAL, view_id(REG,09a67365,13)): delayed 2e1e43d6 requesting range [3640681,3640689]
2017-01-17T12:59:44.503531Z 0 [Note] WSREP: evs:: proto(09a67365, OPERATIONAL, view_id(REG,09a67365,13)): delayed e28fb8ab requesting range [3640681,3640689]
2017-01-17T12:59:44.740189Z 0 [Note] WSREP: (09a67365, ‘tcp://0.0.0.0:4567’) connection established to 2e1e43d6 tcp://mysql2_ip:4567
2017-01-17T12:59:44.752165Z 0 [Note] WSREP: (09a67365, ‘tcp://0.0.0.0:4567’) turning message relay requesting on, nonlive peers:
2017-01-17T12:59:44.758794Z 0 [Note] WSREP: (09a67365, ‘tcp://0.0.0.0:4567’) connection established to e28fb8ab tcp://mysql3_ip:4567
2017-01-17T13:00:14.983109Z 0 [Note] WSREP: (09a67365, ‘tcp://0.0.0.0:4567’) turning message relay requesting off

mysql1 log:
2017-01-17T12:59:24.532756Z 0 [Note] WSREP: evs:: proto(2e1e43d6, OPERATIONAL, view_id(REG,09a67365,13)): delayed 09a67365 requesting range [3640680,3640682]
2017-01-17T12:59:34.532994Z 0 [Note] WSREP: evs:: proto(2e1e43d6, OPERATIONAL, view_id(REG,09a67365,13)): delayed 09a67365 requesting range [3640680,3640686]
2017-01-17T12:59:43.703099Z 0 [Note] WSREP: (2e1e43d6, ‘tcp://0.0.0.0:4567’) connection to peer 09a67365 with addr tcp://mysql3_ip:4567 timed out, no messages seen in PT30S
2017-01-17T12:59:43.705747Z 0 [Note] WSREP: (2e1e43d6, ‘tcp://0.0.0.0:4567’) turning message relay requesting on, nonlive peers: tcp://mysql3_ip:4567
2017-01-17T12:59:44.534839Z 0 [Note] WSREP: evs:: proto(2e1e43d6, OPERATIONAL, view_id(REG,09a67365,13)): delayed 09a67365 requesting range [3640680,3640689]
2017-01-17T12:59:44.705858Z 0 [Note] WSREP: (2e1e43d6, ‘tcp://0.0.0.0:4567’) reconnecting to 09a67365 (tcp://mysql3_ip:4567), attempt 0
2017-01-17T12:59:44.747527Z 0 [Note] WSREP: (2e1e43d6, ‘tcp://0.0.0.0:4567’) connection established to 09a67365 tcp://mysql3_ip:4567
2017-01-17T13:00:15.210251Z 0 [Note] WSREP: (2e1e43d6, ‘tcp://0.0.0.0:4567’) turning message relay requesting off

Would be nice to find full spec or docs about logs. I want to know what is happening in the cluster. Soon i’ll have to make monitoring for cluster and maybe visualize cluster state. Would be nice to parse logs as well for this, but without knowing what is what…

Anyone is alive in this forum? Or it is dead?

Hi,

We’re looking into simplifying and explaining/documenting some of these logs but it’s a work in progress.

evs :: proto is the implementation of the Extended Virtual Synchrony (EVS), this is built on top of their implementation of the Totem consensus protocol. Unfortunately, it’s not that well documented.

From the logs above, e28f8ab is part of the UUID for the node. I believe this is the wsrep_gcomm_uuid status variable.

The view information is (view_type, view_id, view_seqno). In your logs above, the view_seqno is 13, which means there has been 13 changes to the view (or group membership).