First node fails to restart after bootstrap

Vincent (Contributor)
Hello,

I'm following the instructions to set up a Percona XtraDB Cluster with 3 nodes. After bootstrapping the first node, I stop and start the mysql service, and it fails to start again. Relevant information below:

/etc/mysql/my.cnf
https://paste.kde.org/p2v9sdfg4

/var/log/mysql/error.log
https://paste.kde.org/ptsjrg0xx

At this point the mysql service is running after the bootstrap process. Now I stop the service with "systemctl stop mysql" and the log continues like this:
https://paste.kde.org/psoixjhhg

At this point the mysql service is stopped. Now I try to start the service with "systemctl start mysql" and it fails. The log continues like this:
https://paste.kde.org/ptnihg5kr

Any idea what could be the problem?

Thanks in advance.

Comments

  • vadimtk (Contributor, Percona Staff)
    Vincent,

    Are these the correct addresses?
    wsrep_cluster_address=gcomm://pxc-node-1.zone-a.mydomain.com,pxc-node-2.zone-b.mydomain.com,pxc-node-3.zone-b.mydomain.com

    You need to use valid IP addresses or hostnames of your nodes.
  • Vincent (Contributor)
    vadimtk wrote: »
    Vincent,

    Are these the correct addresses?
    wsrep_cluster_address=gcomm://pxc-node-1.zone-a.mydomain.com,pxc-node-2.zone-b.mydomain.com,pxc-node-3.zone-b.mydomain.com

    You need to use valid IP addresses or hostnames of your nodes.

    Yes, of course, I'm using valid hostnames there. I just changed them a little bit to preserve my privacy.
    $ ping pxc-node-1
    PING pxc-node-1.****.****.org (192.168.1.170) 56(84) bytes of data.
    64 bytes from pxc-node-1.****.****.org (192.168.1.170): icmp_seq=1 ttl=64 time=0.487 ms
    64 bytes from pxc-node-1.****.****.org (192.168.1.170): icmp_seq=2 ttl=64 time=0.263 ms
    64 bytes from pxc-node-1.****.****.org (192.168.1.170): icmp_seq=3 ttl=64 time=0.296 ms
    64 bytes from pxc-node-1.****.****.org (192.168.1.170): icmp_seq=4 ttl=64 time=0.285 ms
    ^C
    --- pxc-node-1.****.****.org ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3000ms
    rtt min/avg/max/mdev = 0.263/0.332/0.487/0.092 ms
    
  • Vincent (Contributor)
    Any idea why this could be happening?
  • vadimtk (Contributor, Percona Staff)
    The lines
    2017-03-07T15:52:01.532235Z 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():158
    2017-03-07T15:52:01.532259Z 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
    2017-03-07T15:52:01.532352Z 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1407: Failed to open channel 'pxc-cluster-1' at 'gcomm://pxc-node-1.zone-a.mydomain.com,pxc-node-2.zone-b.mydomain.com,pxc-node-3.zone-b.mydomain.com': -110 (Connection timed out)
    2017-03-07T15:52:01.532370Z 0 [ERROR] WSREP: gcs connect failed: Connection timed out
    2017-03-07T15:52:01.532382Z 0 [ERROR] WSREP: wsrep::connect(gcomm://pxc-node-1.zone-a.mydomain.com,pxc-node-2.zone-b.mydomain.com,pxc-node-3.zone-b.mydomain.com) failed: 7
    indicate that there is a problem with the network.
    Please check that the nodes are actually reachable, that no firewall is blocking connections, and that the TCP ports are accessible.
    https://www.percona.com/doc/percona-...xtradb-cluster
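
A minimal sketch of the port check suggested above, assuming bash (for its /dev/tcp redirection) and the coreutils `timeout` command are available. The port list is an assumption based on the usual Percona XtraDB Cluster defaults: 3306 (client), 4444 (SST), 4567 (group communication) and 4568 (IST); only 3306, 4444 and 4567 actually appear in this thread. The function names are hypothetical.

```shell
# Report whether a single TCP port on a host accepts connections.
# Uses bash's /dev/tcp pseudo-device; gives up after 3 seconds.
check_port() {
    local host="$1"
    local port="$2"
    if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "$host:$port open"
    else
        echo "$host:$port closed or filtered"
    fi
}

# Check the ports a PXC node is assumed to need (see the note above).
check_pxc_ports() {
    local host="$1"
    local port
    for port in 3306 4444 4567 4568; do
        check_port "$host" "$port"
    done
}
```

Running `check_pxc_ports pxc-node-1` from another node prints one line per port. Ports 4444 and 4568 are normally open only while a state transfer is in progress, so "closed" there is not necessarily a fault.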
  • unixronin (Contributor)
    Vincent,
    Pardon me if I'm missing something here, but this is what it appears you are describing doing:

    1. Start node 1 using 'bootstrap-pxc' to create a cluster
    2. Stop node 1 before joining any additional nodes, terminating the cluster
    3. Start node 1 again using 'start' instead of 'bootstrap-pxc' and expect it to join a cluster that doesn't have any running nodes

    Have you tried joining additional nodes to the cluster BEFORE you stop the bootstrapped node? Do they successfully join?
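
The point behind that question can be sketched as a small helper, assuming the Debian init script actions used in this thread (`bootstrap-pxc` and `start`). The function name and the idea of passing in a count of running peers are hypothetical, and bootstrapping is only appropriate on a node whose data is known to be the most recent.

```shell
# Decide which init-script action fits a node, given how many OTHER
# cluster members are currently running. With no running peers there
# is no cluster to join, so a plain "start" would time out.
pxc_start_action() {
    local running_peers="$1"
    if [ "$running_peers" -eq 0 ]; then
        echo "bootstrap-pxc"
    else
        echo "start"
    fi
}
```

A possible use would be `/etc/init.d/mysql "$(pxc_start_action 0)"` on the first node and `/etc/init.d/mysql "$(pxc_start_action 1)"` on every node that joins afterwards.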
  • Vincent (Contributor)
    vadimtk wrote: »
    The lines
    2017-03-07T15:52:01.532235Z 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():158
    2017-03-07T15:52:01.532259Z 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
    2017-03-07T15:52:01.532352Z 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1407: Failed to open channel 'pxc-cluster-1' at 'gcomm://pxc-node-1.zone-a.mydomain.com,pxc-node-2.zone-b.mydomain.com,pxc-node-3.zone-b.mydomain.com': -110 (Connection timed out)
    2017-03-07T15:52:01.532370Z 0 [ERROR] WSREP: gcs connect failed: Connection timed out
    2017-03-07T15:52:01.532382Z 0 [ERROR] WSREP: wsrep::connect(gcomm://pxc-node-1.zone-a.mydomain.com,pxc-node-2.zone-b.mydomain.com,pxc-node-3.zone-b.mydomain.com) failed: 7
    indicate that there is a problem with the network.
    Please check that the nodes are actually reachable, that no firewall is blocking connections, and that the TCP ports are accessible.
    https://www.percona.com/doc/percona-...xtradb-cluster

    That cannot be the problem, since there isn't any firewall blocking connections to those ports. Anyway, to simplify things I started from scratch and installed Percona XtraDB Cluster on a new Debian machine. The configuration is simple, with just this one node in the cluster. So, let's bootstrap and then start this first (and only) node:
    root@pxc-node-1:~# /etc/init.d/mysql bootstrap-pxc
    [ ok ] Bootstrapping Percona XtraDB Cluster database server: mysqld ..
    root@pxc-node-1:~# /etc/init.d/mysql start        
    [ ok ] Starting mysql (via systemctl): mysql.service.
    

    Let's see the listening ports:
    root@pxc-node-1:~# netstat -putan | grep mysqld
    tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      7972/mysqld     
    tcp        0      0 0.0.0.0:4567            0.0.0.0:*               LISTEN      7972/mysqld
    

    The IP address of that machine is 192.168.154.50:
    root@pxc-node-1:~# ip addr | grep inet
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
        inet 192.168.154.50/24 brd 192.168.154.255 scope global eth0
        inet6 fe80::216:3eff:fe45:7abd/64 scope link
    

    From a different machine I check if those ports are open:
    $ nmap 192.168.154.50 | grep open
    22/tcp   open  ssh
    3306/tcp open  mysql
    4567/tcp open  tram
    

    Now, before restarting the mysql server, let's have a look at the my.cnf file:
    root@pxc-node-1:~# cat /etc/mysql/my.cnf
    [client]
    port        = 3306
    socket        = /var/run/mysqld/mysqld.sock
    
    [mysqld_safe]
    socket        = /var/run/mysqld/mysqld.sock
    nice        = 0
    
    [mysqld]
    user        = mysql
    pid-file    = /var/run/mysqld/mysqld.pid
    socket        = /var/run/mysqld/mysqld.sock
    port        = 3306
    basedir        = /usr
    datadir        = /var/lib/mysql
    tmpdir        = /tmp
    lc-messages-dir    = /usr/share/mysql
    skip-external-locking
    bind-address        = 0.0.0.0
    max_allowed_packet    = 16M
    thread_stack        = 192K
    thread_cache_size       = 8
    query_cache_limit    = 1M
    query_cache_size        = 16M
    log_error = /var/log/mysql/error.log
    expire_logs_days    = 10
    max_binlog_size         = 100M
    wsrep_provider=/usr/lib/libgalera_smm.so
    wsrep_cluster_name=pxc-cluster-1
    wsrep_cluster_address=gcomm://192.168.154.50
    wsrep_node_name=pxc-node-1
    wsrep_node_address=192.168.154.50
    wsrep_sst_method=xtrabackup-v2
    wsrep_sst_auth=sstuser:sstpass
    pxc_strict_mode=ENFORCING
    binlog_format=ROW
    default_storage_engine=InnoDB
    innodb_autoinc_lock_mode=2
    
    [mysqldump]
    quick
    quote-names
    max_allowed_packet    = 16M
    
    [mysql]
    
    [isamchk]
    
    !includedir /etc/mysql/conf.d/
    

    Ok, let's restart it:
    root@pxc-node-1:~# /etc/init.d/mysql restart
    [....] Restarting mysql (via systemctl): mysql.serviceJob for mysql.service failed. See 'systemctl status mysql.service' and 'journalctl -xn' for details.
     failed!
    

    This is the log:
    root@pxc-node-1:~# cat /var/log/mysql/error.log  
    2017-03-24T16:37:23.302908Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
    2017-03-24T16:37:23.303756Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.17-11-57) starting as process 8764 ...
    2017-03-24T16:37:23.306318Z 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
    2017-03-24T16:37:23.306336Z 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
    2017-03-24T16:37:23.309949Z 0 [Note] WSREP: wsrep_load(): Galera 3.20(r7e383f7) by Codership Oy <info@codership.com> loaded successfully.
    2017-03-24T16:37:23.310016Z 0 [Note] WSREP: CRC-32C: using hardware acceleration.
    2017-03-24T16:37:23.310370Z 0 [Note] WSREP: Found saved state: 82016f02-10aa-11e7-a0c8-f3e63a8d05f1:3, safe_to_bootsrap: 1
    2017-03-24T16:37:23.327889Z 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.154.50; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = 1; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 7; socket.checksum = 2; socket.recv_buf_size = 212992;
    2017-03-24T16:37:23.337329Z 0 [Note] WSREP: GCache history reset: old(82016f02-10aa-11e7-a0c8-f3e63a8d05f1:0) -> new(82016f02-10aa-11e7-a0c8-f3e63a8d05f1:3)
    2017-03-24T16:37:23.343982Z 0 [Note] WSREP: Assign initial position for certification: 3, protocol version: -1
    2017-03-24T16:37:23.344014Z 0 [Note] WSREP: wsrep_sst_grab()
    2017-03-24T16:37:23.344024Z 0 [Note] WSREP: Start replication
    2017-03-24T16:37:23.344045Z 0 [Note] WSREP: Setting initial position to 82016f02-10aa-11e7-a0c8-f3e63a8d05f1:3
    2017-03-24T16:37:23.344153Z 0 [Note] WSREP: protonet asio version 0
    2017-03-24T16:37:23.344324Z 0 [Note] WSREP: Using CRC-32C for message checksums.
    2017-03-24T16:37:23.344400Z 0 [Note] WSREP: backend: asio
    2017-03-24T16:37:23.344498Z 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
    2017-03-24T16:37:23.344672Z 0 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
    2017-03-24T16:37:23.344690Z 0 [Note] WSREP: restore pc from disk failed
    2017-03-24T16:37:23.345586Z 0 [Note] WSREP: GMCast version 0
    2017-03-24T16:37:23.345944Z 0 [Note] WSREP: (283cbdd2, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
    2017-03-24T16:37:23.345967Z 0 [Note] WSREP: (283cbdd2, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
    2017-03-24T16:37:23.346540Z 0 [Note] WSREP: EVS version 0
    2017-03-24T16:37:23.346654Z 0 [Note] WSREP: gcomm: connecting to group 'pxc-cluster-1', peer '192.168.154.50:'
    2017-03-24T16:37:23.347540Z 0 [Note] WSREP: (283cbdd2, 'tcp://0.0.0.0:4567') connection established to 283cbdd2 tcp://192.168.154.50:4567
    2017-03-24T16:37:23.347560Z 0 [Warning] WSREP: (283cbdd2, 'tcp://0.0.0.0:4567') address 'tcp://192.168.154.50:4567' points to own listening address, blacklisting
    2017-03-24T16:37:26.347492Z 0 [Note] WSREP: (283cbdd2, 'tcp://0.0.0.0:4567') connection to peer 283cbdd2 with addr tcp://192.168.154.50:4567 timed out, no messages seen in PT3S
    2017-03-24T16:37:26.347762Z 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
    2017-03-24T16:37:26.347797Z 0 [Note] WSREP: view(view_id(NON_PRIM,283cbdd2,1) memb {
        283cbdd2,0
    } joined {
    } left {
    } partitioned {
    })
    2017-03-24T16:37:26.847913Z 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50141S), skipping check
    2017-03-24T16:37:56.352374Z 0 [Note] WSREP: view((empty))
    2017-03-24T16:37:56.352577Z 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():158
    2017-03-24T16:37:56.352998Z 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
    2017-03-24T16:37:56.353076Z 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1437: Failed to open channel 'pxc-cluster-1' at 'gcomm://192.168.154.50': -110 (Connection timed out)
    2017-03-24T16:37:56.353092Z 0 [ERROR] WSREP: gcs connect failed: Connection timed out
    2017-03-24T16:37:56.353099Z 0 [ERROR] WSREP: wsrep::connect(gcomm://192.168.154.50) failed: 7
    2017-03-24T16:37:56.353104Z 0 [ERROR] Aborting
    
    2017-03-24T16:37:56.353111Z 0 [Note] Giving 0 client threads a chance to die gracefully
    2017-03-24T16:37:56.353120Z 0 [Note] WSREP: Service disconnected.
    2017-03-24T16:37:59.353323Z 0 [Note] WSREP: Some threads may fail to exit.
    2017-03-24T16:37:59.353371Z 0 [Note] Binlog end
    2017-03-24T16:37:59.353470Z 0 [Note] /usr/sbin/mysqld: Shutdown complete
    

    Any idea?
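
One reading of this log, offered as a sketch rather than a confirmed diagnosis: the node was started normally, found only its own address in wsrep_cluster_address, and timed out waiting for a primary component that no other node could provide. The helper below (a hypothetical name, not part of any Percona tooling) looks for those two messages in an error log.

```shell
# Classify a WSREP startup failure from an error log by searching for
# two messages that appear in the log above.
diagnose_wsrep_log() {
    local log="$1"
    if grep -q 'points to own listening address' "$log" &&
       grep -q 'failed to reach primary view' "$log"; then
        echo "no primary: node only found itself"
    elif grep -q 'failed to reach primary view' "$log"; then
        echo "no primary: peers unreachable or not primary"
    else
        echo "no known WSREP startup failure found"
    fi
}
```

For the log above, `diagnose_wsrep_log /var/log/mysql/error.log` would report the first case, which points towards bootstrapping the node again or starting another node first.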
  • Vincent (Contributor)
    unixronin wrote: »
    Vincent,
    Pardon me if I'm missing something here, but this is what it appears you are describing doing:

    1. Start node 1 using 'bootstrap-pxc' to create a cluster
    2. Stop node 1 before joining any additional nodes, terminating the cluster
    3. Start node 1 again using 'start' instead of 'bootstrap-pxc' and expect it to join a cluster that doesn't have any running nodes

    Have you tried joining additional nodes to the cluster BEFORE you stop the bootstrapped node? Do they successfully join?

    Let's try it. First, check the IP addresses of nodes 1 and 2:
    root@pxc-node-1:~# ip addr | grep inet
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
        inet 192.168.154.50/24 brd 192.168.154.255 scope global eth0
        inet6 fe80::216:3eff:fe45:7abd/64 scope link
    
    root@pxc-node-2:~# ip addr | grep inet
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host
        inet 192.168.154.53/24 brd 192.168.154.255 scope global eth0
        inet6 fe80::216:3eff:fe78:5d73/64 scope link
    

    This is the my.cnf file:
    root@pxc-node-1:~# cat /etc/mysql/my.cnf
    [client]
    port        = 3306
    socket        = /var/run/mysqld/mysqld.sock
    
    [mysqld_safe]
    socket        = /var/run/mysqld/mysqld.sock
    nice        = 0
    
    [mysqld]
    user        = mysql
    pid-file    = /var/run/mysqld/mysqld.pid
    socket        = /var/run/mysqld/mysqld.sock
    port        = 3306
    basedir        = /usr
    datadir        = /var/lib/mysql
    tmpdir        = /tmp
    lc-messages-dir    = /usr/share/mysql
    skip-external-locking
    bind-address        = 0.0.0.0
    max_allowed_packet    = 16M
    thread_stack        = 192K
    thread_cache_size       = 8
    query_cache_limit    = 1M
    query_cache_size        = 16M
    log_error = /var/log/mysql/error.log
    expire_logs_days    = 10
    max_binlog_size         = 100M
    wsrep_provider=/usr/lib/libgalera_smm.so
    wsrep_cluster_name=pxc-cluster-1
    wsrep_cluster_address=gcomm://192.168.154.50,192.168.154.53
    wsrep_node_name=pxc-node-1
    wsrep_node_address=192.168.154.50
    wsrep_sst_method=xtrabackup-v2
    wsrep_sst_auth=sstuser:sstpass
    pxc_strict_mode=ENFORCING
    binlog_format=ROW
    default_storage_engine=InnoDB
    innodb_autoinc_lock_mode=2
    
    [mysqldump]
    quick
    quote-names
    max_allowed_packet    = 16M
    
    [mysql]
    
    [isamchk]
    
    !includedir /etc/mysql/conf.d/
    

    Now let's bootstrap node-1 and check the cluster is synced and ready:
    root@pxc-node-1:~# /etc/init.d/mysql bootstrap-pxc
    [ ok ] Bootstrapping Percona XtraDB Cluster database server: mysqld ..
    root@pxc-node-1:~# systemctl start mysql
    root@pxc-node-1:~# mysql -u root -p
    Enter password:
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 7
    Server version: 5.7.17-11-57 Percona XtraDB Cluster (GPL), Release rel11, Revision e2a7fdd, WSREP version 27.20, wsrep_27.20
    
    Copyright (c) 2009-2016 Percona LLC and/or its affiliates
    Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql> show status like 'wsrep%';
    +------------------------------+--------------------------------------+
    | Variable_name                | Value                                |
    +------------------------------+--------------------------------------+
    | wsrep_local_state_uuid       | 82016f02-10aa-11e7-a0c8-f3e63a8d05f1 |
    | wsrep_protocol_version       | 7                                    |
    | wsrep_last_committed         | 4                                    |
    | wsrep_replicated             | 0                                    |
    | wsrep_replicated_bytes       | 0                                    |
    | wsrep_repl_keys              | 0                                    |
    | wsrep_repl_keys_bytes        | 0                                    |
    | wsrep_repl_data_bytes        | 0                                    |
    | wsrep_repl_other_bytes       | 0                                    |
    | wsrep_received               | 2                                    |
    | wsrep_received_bytes         | 148                                  |
    | wsrep_local_commits          | 0                                    |
    | wsrep_local_cert_failures    | 0                                    |
    | wsrep_local_replays          | 0                                    |
    | wsrep_local_send_queue       | 0                                    |
    | wsrep_local_send_queue_max   | 1                                    |
    | wsrep_local_send_queue_min   | 0                                    |
    | wsrep_local_send_queue_avg   | 0.000000                             |
    | wsrep_local_recv_queue       | 0                                    |
    | wsrep_local_recv_queue_max   | 1                                    |
    | wsrep_local_recv_queue_min   | 0                                    |
    | wsrep_local_recv_queue_avg   | 0.000000                             |
    | wsrep_local_cached_downto    | 0                                    |
    | wsrep_flow_control_paused_ns | 0                                    |
    | wsrep_flow_control_paused    | 0.000000                             |
    | wsrep_flow_control_sent      | 0                                    |
    | wsrep_flow_control_recv      | 0                                    |
    | wsrep_flow_control_interval  | [ 16, 16 ]                           |
    | wsrep_cert_deps_distance     | 0.000000                             |
    | wsrep_apply_oooe             | 0.000000                             |
    | wsrep_apply_oool             | 0.000000                             |
    | wsrep_apply_window           | 0.000000                             |
    | wsrep_commit_oooe            | 0.000000                             |
    | wsrep_commit_oool            | 0.000000                             |
    | wsrep_commit_window          | 0.000000                             |
    | wsrep_local_state            | 4                                    |
    | wsrep_local_state_comment    | Synced                               |
    | wsrep_cert_index_size        | 0                                    |
    | wsrep_cert_bucket_count      | 22                                   |
    | wsrep_gcache_pool_size       | 1320                                 |
    | wsrep_causal_reads           | 0                                    |
    | wsrep_cert_interval          | 0.000000                             |
    | wsrep_incoming_addresses     | 192.168.154.50:3306                  |
    | wsrep_desync_count           | 0                                    |
    | wsrep_evs_delayed            |                                      |
    | wsrep_evs_evict_list         |                                      |
    | wsrep_evs_repl_latency       | 0/0/0/0/0                            |
    | wsrep_evs_state              | OPERATIONAL                          |
    | wsrep_gcomm_uuid             | 91c7bee0-10b2-11e7-aea7-4f13680ff88c |
    | wsrep_cluster_conf_id        | 1                                    |
    | wsrep_cluster_size           | 1                                    |
    | wsrep_cluster_state_uuid     | 82016f02-10aa-11e7-a0c8-f3e63a8d05f1 |
    | wsrep_cluster_status         | Primary                              |
    | wsrep_connected              | ON                                   |
    | wsrep_local_bf_aborts        | 0                                    |
    | wsrep_local_index            | 0                                    |
    | wsrep_provider_name          | Galera                               |
    | wsrep_provider_vendor        | Codership Oy <info@codership.com>    |
    | wsrep_provider_version       | 3.20(r7e383f7)                       |
    | wsrep_ready                  | ON                                   |
    +------------------------------+--------------------------------------+
    60 rows in set (0.00 sec)
    
    mysql> quit
    Bye
    

    On node-2, the only differences in my.cnf are:
    wsrep_node_name=pxc-node-2
    wsrep_node_address=192.168.154.53
    

    Let's start mysql and see if it joins the cluster:
    root@pxc-node-2:~# /etc/init.d/mysql start
    [....] Starting mysql (via systemctl): mysql.serviceJob for mysql.service failed. See 'systemctl status mysql.service' and 'journalctl -xn' for details.
     failed!
    

    This is the error.log: https://paste.kde.org/polvcg4ew

    And this is the error.log in node-1: https://paste.kde.org/p9m3ledhc
  • jrivera (Percona Support Engineer, Percona Staff)
    Did you create the SST user on the first node before trying to let node2 join the cluster?
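
For reference, a sketch of what creating the SST user might look like, assuming the `wsrep_sst_auth=sstuser:sstpass` value from the my.cnf shown in this thread. The privilege list is what I recall from the PXC 5.7 manual for xtrabackup-v2 and should be checked against the current documentation; the function name is hypothetical.

```shell
# Print the SQL that creates an SST user from a "user:password" string
# in the same format as wsrep_sst_auth. Nothing is executed here.
sst_user_sql() {
    local auth="$1"
    local user="${auth%%:*}"   # text before the first colon
    local pass="${auth#*:}"    # text after the first colon
    printf "CREATE USER '%s'@'localhost' IDENTIFIED BY '%s';\n" "$user" "$pass"
    printf "GRANT RELOAD, LOCK TABLES, PROCESS, REPLICATION CLIENT ON *.* TO '%s'@'localhost';\n" "$user"
    printf 'FLUSH PRIVILEGES;\n'
}
```

The output could be piped into the client on the bootstrapped node, for example `sst_user_sql sstuser:sstpass | mysql -u root -p`.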
  • Vincent (Contributor)
    jrivera wrote: »
    Did you create the SST user on the first node before trying to let node2 join the cluster?

    Yes. I didn't put that information in the post, but I did create the SST user on both nodes.
  • Kenn Takara (Percona Staff)
    Part of the logs on the donor, the innobackup.backup.log, may contain more information about what failed.



    WSREP_SST: [INFO] Evaluating xtrabackup --defaults-file=/etc/mysql/my.cnf --defaults-group=mysqld $tmpopts $INNOEXTRA $keyringbackupopt --backup --galera-info --binlog-info=ON --stream=$sfmt --target-dir=$itmpdir 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:192.168.154.53:4444; RC=( ${PIPESTATUS[@]} ) (20170324 17:23:23.544)

    2017-03-24T17:23:23.615309Z 8 [Note] Aborted connection 8 to db: 'unconnected' user: 'sstuser' host: 'localhost' (Got an error reading communication packets)

    WSREP_SST: [ERROR] xtrabackup finished with error: 1. Check /var/lib/mysql//innobackup.backup.log (20170324 17:23:23.616)
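
A minimal sketch for pulling likely failure lines out of that file on the donor, assuming the data directory is /var/lib/mysql as in the configuration shown in this thread. The function name and the search patterns are hypothetical choices, not something the SST script provides.

```shell
# Show the last few suspicious lines from the donor's xtrabackup log.
# The first argument is the MySQL data directory.
sst_backup_errors() {
    local datadir="${1:-/var/lib/mysql}"
    local log="$datadir/innobackup.backup.log"
    if [ ! -r "$log" ]; then
        echo "cannot read $log"
        return 1
    fi
    # Case-insensitive match on common failure words; keep the last 20 hits.
    grep -iE 'error|failed|denied' "$log" | tail -n 20
}
```

On the donor this would be run as `sst_backup_errors /var/lib/mysql`; if nothing matches, reading the whole file is the safer option.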
  • Vincent (Contributor)
    We are missing the most important point, which is that the mysql service simply fails to restart after bootstrapping, even before continuing with the process of bringing up new nodes.

    For instance, a my.cnf config file containing this (among the rest of the stuff, of course)...
    bind-address = 127.0.0.1
    wsrep_cluster_address = gcomm://127.0.0.1
    wsrep_node_address = 127.0.0.1
    

    ...also fails to restart with the same problem:
    2017-04-03T12:55:55.531777Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
    2017-04-03T12:55:55.532706Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.17-11-57) starting as process 2412 ...
    2017-04-03T12:55:55.535613Z 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
    2017-04-03T12:55:55.535631Z 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
    2017-04-03T12:55:55.539325Z 0 [Note] WSREP: wsrep_load(): Galera 3.20(r7e383f7) by Codership Oy <info@codership.com> loaded successfully.
    2017-04-03T12:55:55.539391Z 0 [Note] WSREP: CRC-32C: using hardware acceleration.
    2017-04-03T12:55:55.539748Z 0 [Note] WSREP: Found saved state: 82016f02-10aa-11e7-a0c8-f3e63a8d05f1:4, safe_to_bootsrap: 1
    2017-04-03T12:55:55.577431Z 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 127.0.0.1; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = 1; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 7; socket.checksum = 2; socket.recv_buf_size = 212992;
    2017-04-03T12:55:55.587279Z 0 [Note] WSREP: GCache history reset: old(82016f02-10aa-11e7-a0c8-f3e63a8d05f1:0) -> new(82016f02-10aa-11e7-a0c8-f3e63a8d05f1:4)
    2017-04-03T12:55:55.593374Z 0 [Note] WSREP: Assign initial position for certification: 4, protocol version: -1
    2017-04-03T12:55:55.593407Z 0 [Note] WSREP: wsrep_sst_grab()
    2017-04-03T12:55:55.593416Z 0 [Note] WSREP: Start replication
    2017-04-03T12:55:55.593439Z 0 [Note] WSREP: Setting initial position to 82016f02-10aa-11e7-a0c8-f3e63a8d05f1:4
    2017-04-03T12:55:55.593518Z 0 [Note] WSREP: protonet asio version 0
    2017-04-03T12:55:55.593635Z 0 [Note] WSREP: Using CRC-32C for message checksums.
    2017-04-03T12:55:55.593675Z 0 [Note] WSREP: backend: asio
    2017-04-03T12:55:55.593739Z 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
    2017-04-03T12:55:55.593846Z 0 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
    2017-04-03T12:55:55.593878Z 0 [Note] WSREP: restore pc from disk failed
    2017-04-03T12:55:55.594422Z 0 [Note] WSREP: GMCast version 0
    2017-04-03T12:55:55.594602Z 0 [Note] WSREP: (e0405f22, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
    2017-04-03T12:55:55.594614Z 0 [Note] WSREP: (e0405f22, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
    2017-04-03T12:55:55.595036Z 0 [Note] WSREP: EVS version 0
    2017-04-03T12:55:55.595156Z 0 [Note] WSREP: gcomm: connecting to group 'pxc-cluster-1', peer '127.0.0.1:'
    2017-04-03T12:55:55.596099Z 0 [Note] WSREP: (e0405f22, 'tcp://0.0.0.0:4567') connection established to e0405f22 tcp://127.0.0.1:4567
    2017-04-03T12:55:55.596119Z 0 [Warning] WSREP: (e0405f22, 'tcp://0.0.0.0:4567') address 'tcp://127.0.0.1:4567' points to own listening address, blacklisting
    2017-04-03T12:55:58.596032Z 0 [Note] WSREP: (e0405f22, 'tcp://0.0.0.0:4567') connection to peer e0405f22 with addr tcp://127.0.0.1:4567 timed out, no messages seen in PT3S
    2017-04-03T12:55:58.596148Z 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
    2017-04-03T12:55:58.596170Z 0 [Note] WSREP: view(view_id(NON_PRIM,e0405f22,1) memb {
        e0405f22,0
    } joined {
    } left {
    } partitioned {
    })
    2017-04-03T12:55:59.096286Z 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50126S), skipping check
    2017-04-03T12:56:28.603257Z 0 [Note] WSREP: view((empty))
    2017-04-03T12:56:28.603414Z 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():158
    2017-04-03T12:56:28.603430Z 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
    2017-04-03T12:56:28.603493Z 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1437: Failed to open channel 'pxc-cluster-1' at 'gcomm://127.0.0.1': -110 (Connection timed out)
    2017-04-03T12:56:28.603506Z 0 [ERROR] WSREP: gcs connect failed: Connection timed out
    2017-04-03T12:56:28.603514Z 0 [ERROR] WSREP: wsrep::connect(gcomm://127.0.0.1) failed: 7
    2017-04-03T12:56:28.603519Z 0 [ERROR] Aborting
    
    2017-04-03T12:56:28.603527Z 0 [Note] Giving 0 client threads a chance to die gracefully
    2017-04-03T12:56:28.603537Z 0 [Note] WSREP: Service disconnected.
    2017-04-03T12:56:31.603624Z 0 [Note] WSREP: Some threads may fail to exit.
    2017-04-03T12:56:31.603682Z 0 [Note] Binlog end
    2017-04-03T12:56:31.603828Z 0 [Note] /usr/sbin/mysqld: Shutdown complete
    

    How can it fail to connect to 127.0.0.1?
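
A hedged pointer: Galera keeps its saved state in a grastate.dat file in the data directory, which is presumably where the "Found saved state" line in the log above comes from. Assuming the usual `key: value` layout of that file, the sketch below reads single fields from it. The function names are hypothetical.

```shell
# Print the value of one field (for example seqno or safe_to_bootstrap)
# from a Galera grastate.dat file.
grastate_field() {
    local file="$1"
    local field="$2"
    sed -n "s/^${field}:[[:space:]]*//p" "$file" | head -n 1
}

# Succeed only if the saved state marks this node as safe to bootstrap.
can_bootstrap() {
    [ "$(grastate_field "$1" safe_to_bootstrap)" = "1" ]
}
```

With the data directory used in this thread, `can_bootstrap /var/lib/mysql/grastate.dat && /etc/init.d/mysql bootstrap-pxc` would bootstrap only when the flag is set. A plain `start` on a lone node has no primary component to join, which would explain the timeout regardless of which address it listens on.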
  • Vincent (Contributor)
    Ok, I've made some progress here.

    After following the manual on the website to the letter, I've managed to get pxc-node-1 (192.168.154.40) and pxc-node-2 (192.168.154.119) synced and exchanging data. The manual I had was for the same version, but I had downloaded it as a PDF and it wasn't up to date. The privileges.

    One of the things I've noticed after bootstrapping is that mysqld is running, yet the service is reported as inactive:
    root@pxc-node-1:~# ps aux | grep mysql
    root      4530  0.0  0.0   4328   756 pts/4    S    14:38   0:00 /bin/sh /usr/bin/mysqld_safe --wsrep-new-cluster
    mysql     4999  0.5  3.7 2033804 222440 pts/4  Sl   14:38   0:11 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --wsrep-provider=/usr/lib/libgalera_smm.so --wsrep-new-cluster --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
    root      5844  0.0  0.0  11120   692 pts/4    S+   15:12   0:00 grep mysql
    
    root@pxc-node-1:~# /etc/init.d/mysql status
    ● mysql.service - LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon
       Loaded: loaded (/etc/init.d/mysql)
       Active: inactive (dead) since Mon 2017-04-03 14:31:08 UTC; 40min ago
      Process: 4320 ExecStop=/etc/init.d/mysql stop (code=exited, status=0/SUCCESS)
      Process: 3905 ExecStart=/etc/init.d/mysql start (code=exited, status=0/SUCCESS)
    

    After getting the two nodes synced and checking that replication works correctly by creating databases and tables and inserting some data from either node, I stop and start pxc-node-2 successfully:
    root@pxc-node-2:~# /etc/init.d/mysql stop
    [ ok ] Stopping mysql (via systemctl): mysql.service.
    root@pxc-node-2:~# /etc/init.d/mysql start
    [ ok ] Starting mysql (via systemctl): mysql.service.
    

    So pxc-node-2 is fine. The problem now is that pxc-node-1 is running, but I cannot stop it in the normal way:
    root@pxc-node-1:~# /etc/init.d/mysql stop
    [ ok ] Stopping mysql (via systemctl): mysql.service.
    root@pxc-node-1:~# ps aux | grep mysql
    root      4530  0.0  0.0   4328   756 pts/4    S    14:38   0:00 /bin/sh /usr/bin/mysqld_safe --wsrep-new-cluster
    mysql     4999  0.4  3.7 2033804 222440 pts/4  Sl   14:38   0:11 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --wsrep-provider=/usr/lib/libgalera_smm.so --wsrep-new-cluster --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
    root      5883  0.0  0.0  11120   692 pts/4    S+   15:18   0:00 grep mysql
    

    As you can see, the processes are still running. Is there a special way to stop a just-bootstrapped first node so it can be started in the normal way?
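
    One possible reading, offered as an assumption rather than a confirmed diagnosis: the `ps` output shows mysqld launched by `mysqld_safe --wsrep-new-cluster` from a terminal (pts/4), while systemd reports mysql.service as inactive, so a stop routed through systemctl may have nothing to act on. The sketch below only detects that situation from `ps`-style output; the function name `is_bootstrap_mysqld` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `ps aux`-style output on stdin and succeed
# if a mysqld server process was started with --wsrep-new-cluster.
is_bootstrap_mysqld() {
    # Match the server binary itself, not the mysqld_safe wrapper or grep.
    grep -E '/mysqld( |$)' | grep -q -- '--wsrep-new-cluster'
}

# Usage against the live process table:
#   ps aux | is_bootstrap_mysqld && echo "bootstrapped instance is running"
```

    If such an instance is running, `mysqladmin -u root -p shutdown` is a standard client command that asks the server itself to shut down, independently of the init system. Whether that is the recommended procedure for this setup is not confirmed in this thread.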

    Instead of killing the processes, I stopped the LXC containers where pxc-node-1 and pxc-node-2 were running, so that everything was shut down gracefully. Now, when I start the container for pxc-node-1, I get the same error I've been reporting since the beginning of this thread:
    2017-04-03T15:33:57.147121Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
    2017-04-03T15:33:57.148445Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.17-11-57) starting as process 585 ...
    2017-04-03T15:33:57.151044Z 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
    2017-04-03T15:33:57.151063Z 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
    2017-04-03T15:33:57.154781Z 0 [Note] WSREP: wsrep_load(): Galera 3.20(r7e383f7) by Codership Oy <info@codership.com> loaded successfully.
    2017-04-03T15:33:57.154846Z 0 [Note] WSREP: CRC-32C: using hardware acceleration.
    2017-04-03T15:33:57.155209Z 0 [Note] WSREP: Found saved state: 38103d13-187b-11e7-b05f-938dd425b3db:9, safe_to_bootsrap: 1
    2017-04-03T15:33:57.172886Z 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.154.40; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = 1; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 7; socket.checksum = 2; socket.recv_buf_size = 212992;
    2017-04-03T15:33:57.182531Z 0 [Note] WSREP: GCache history reset: old(38103d13-187b-11e7-b05f-938dd425b3db:0) -> new(38103d13-187b-11e7-b05f-938dd425b3db:9)
    2017-04-03T15:33:57.188962Z 0 [Note] WSREP: Assign initial position for certification: 9, protocol version: -1
    2017-04-03T15:33:57.188988Z 0 [Note] WSREP: wsrep_sst_grab()
    2017-04-03T15:33:57.188998Z 0 [Note] WSREP: Start replication
    2017-04-03T15:33:57.189012Z 0 [Note] WSREP: Setting initial position to 38103d13-187b-11e7-b05f-938dd425b3db:9
    2017-04-03T15:33:57.189097Z 0 [Note] WSREP: protonet asio version 0
    2017-04-03T15:33:57.189209Z 0 [Note] WSREP: Using CRC-32C for message checksums.
    2017-04-03T15:33:57.189247Z 0 [Note] WSREP: backend: asio
    2017-04-03T15:33:57.189302Z 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
    2017-04-03T15:33:57.189404Z 0 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
    2017-04-03T15:33:57.189415Z 0 [Note] WSREP: restore pc from disk failed
    2017-04-03T15:33:57.189997Z 0 [Note] WSREP: GMCast version 0
    2017-04-03T15:33:57.192786Z 0 [Warning] WSREP: Failed to resolve tcp://192.168.154.119:4567
    2017-04-03T15:33:57.193056Z 0 [Note] WSREP: (f3b908c9, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
    2017-04-03T15:33:57.193077Z 0 [Note] WSREP: (f3b908c9, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
    2017-04-03T15:33:57.193745Z 0 [Note] WSREP: EVS version 0
    2017-04-03T15:33:57.194026Z 0 [Note] WSREP: gcomm: connecting to group 'pxc-cluster-1', peer '192.168.154.40:,192.168.154.119:'
    2017-04-03T15:33:57.195442Z 0 [Note] WSREP: (f3b908c9, 'tcp://0.0.0.0:4567') connection established to f3b908c9 tcp://192.168.154.40:4567
    2017-04-03T15:33:57.195468Z 0 [Warning] WSREP: (f3b908c9, 'tcp://0.0.0.0:4567') address 'tcp://192.168.154.40:4567' points to own listening address, blacklisting
    2017-04-03T15:34:00.195176Z 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
    2017-04-03T15:34:00.195215Z 0 [Note] WSREP: view(view_id(NON_PRIM,f3b908c9,1) memb {
        f3b908c9,0
    } joined {
    } left {
    } partitioned {
    })
    2017-04-03T15:34:00.695109Z 0 [Note] WSREP: (f3b908c9, 'tcp://0.0.0.0:4567') connection to peer f3b908c9 with addr tcp://192.168.154.40:4567 timed out, no messages seen in PT3S
    2017-04-03T15:34:00.695479Z 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50179S), skipping check
    

    Even though I can no longer start pxc-node-1, I decided to start the container where pxc-node-2 lives. The service again tries to start at boot time, and it also fails:
    2017-04-03T15:40:17.475722Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
    2017-04-03T15:40:17.477019Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.17-11-57) starting as process 585 ...
    2017-04-03T15:40:17.479723Z 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
    2017-04-03T15:40:17.479743Z 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
    2017-04-03T15:40:17.483576Z 0 [Note] WSREP: wsrep_load(): Galera 3.20(r7e383f7) by Codership Oy <info@codership.com> loaded successfully.
    2017-04-03T15:40:17.483645Z 0 [Note] WSREP: CRC-32C: using hardware acceleration.
    2017-04-03T15:40:17.484023Z 0 [Note] WSREP: Found saved state: 38103d13-187b-11e7-b05f-938dd425b3db:9, safe_to_bootsrap: 0
    2017-04-03T15:40:17.502731Z 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.154.119; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = 1; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 7; socket.checksum = 2; socket.recv_buf_size = 212992;
    2017-04-03T15:40:17.512730Z 0 [Note] WSREP: GCache history reset: old(38103d13-187b-11e7-b05f-938dd425b3db:0) -> new(38103d13-187b-11e7-b05f-938dd425b3db:9)
    2017-04-03T15:40:17.518921Z 0 [Note] WSREP: Assign initial position for certification: 9, protocol version: -1
    2017-04-03T15:40:17.518947Z 0 [Note] WSREP: wsrep_sst_grab()
    2017-04-03T15:40:17.518955Z 0 [Note] WSREP: Start replication
    2017-04-03T15:40:17.518970Z 0 [Note] WSREP: Setting initial position to 38103d13-187b-11e7-b05f-938dd425b3db:9
    2017-04-03T15:40:17.519057Z 0 [Note] WSREP: protonet asio version 0
    2017-04-03T15:40:17.519174Z 0 [Note] WSREP: Using CRC-32C for message checksums.
    2017-04-03T15:40:17.519214Z 0 [Note] WSREP: backend: asio
    2017-04-03T15:40:17.519276Z 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
    2017-04-03T15:40:17.519383Z 0 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
    2017-04-03T15:40:17.519395Z 0 [Note] WSREP: restore pc from disk failed
    2017-04-03T15:40:17.519970Z 0 [Note] WSREP: GMCast version 0
    2017-04-03T15:40:17.522754Z 0 [Warning] WSREP: Failed to resolve tcp://192.168.154.119:4567
    2017-04-03T15:40:17.522976Z 0 [Note] WSREP: (d66ac8a9, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
    2017-04-03T15:40:17.522999Z 0 [Note] WSREP: (d66ac8a9, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
    2017-04-03T15:40:17.523707Z 0 [Note] WSREP: EVS version 0
    2017-04-03T15:40:17.523919Z 0 [Note] WSREP: gcomm: connecting to group 'pxc-cluster-1', peer '192.168.154.40:,192.168.154.119:'
    2017-04-03T15:40:20.525861Z 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
    2017-04-03T15:40:20.525906Z 0 [Note] WSREP: view(view_id(NON_PRIM,d66ac8a9,1) memb {
        d66ac8a9,0
    } joined {
    } left {
    } partitioned {
    })
    2017-04-03T15:40:21.026013Z 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50235S), skipping check
    2017-04-03T15:40:50.533643Z 0 [Note] WSREP: view((empty))
    2017-04-03T15:40:50.533780Z 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():158
    2017-04-03T15:40:50.533802Z 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
    2017-04-03T15:40:50.533922Z 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1437: Failed to open channel 'pxc-cluster-1' at 'gcomm://192.168.154.40,192.168.154.119': -110 (Connection timed out)
    2017-04-03T15:40:50.533945Z 0 [ERROR] WSREP: gcs connect failed: Connection timed out
    2017-04-03T15:40:50.533958Z 0 [ERROR] WSREP: wsrep::connect(gcomm://192.168.154.40,192.168.154.119) failed: 7
    2017-04-03T15:40:50.533967Z 0 [ERROR] Aborting
    
    2017-04-03T15:40:50.533980Z 0 [Note] Giving 0 client threads a chance to die gracefully
    2017-04-03T15:40:50.533996Z 0 [Note] WSREP: Service disconnected.
    2017-04-03T15:40:53.534149Z 0 [Note] WSREP: Some threads may fail to exit.
    2017-04-03T15:40:53.534200Z 0 [Note] Binlog end
    2017-04-03T15:40:53.534297Z 0 [Note] /usr/sbin/mysqld: Shutdown complete
    

    Is that the expected behaviour? I mean, do I always need to bootstrap one node when all nodes of the cluster have been stopped? I'm asking because if I stop and start pxc-node-1 while pxc-node-2 is still running, it works perfectly.
  • VincentVincent Contributor Inactive User Role Beginner
    So, basically, the problem here was that I was stopping all nodes of the cluster, even when "all nodes" meant "only one". When you stop all the nodes, you need to bootstrap the cluster again, choosing one node to start with.
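
    As a rough sketch of how the node to bootstrap might be chosen: the startup logs above print a saved state with a `safe_to_bootsrap` flag (1 on pxc-node-1, 0 on pxc-node-2). Assuming that flag is read from a `grastate.dat` file in the data directory as a `safe_to_bootstrap: N` line (the file name and layout are assumptions, not shown in this thread), a small helper could report it before deciding where to bootstrap. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the safe_to_bootstrap value (0 or 1) found in
# a Galera state file. The "key: value" layout is an assumption.
safe_to_bootstrap() {
    state_file="${1:-/var/lib/mysql/grastate.dat}"
    # Print the second field of the first matching line, then stop.
    awk '$1 == "safe_to_bootstrap:" { print $2; exit }' "$state_file"
}

# Usage: bootstrap only on the node that reports 1.
#   [ "$(safe_to_bootstrap)" = "1" ] && echo "this node can bootstrap the cluster"
```

    On the chosen node, the server is then started with `--wsrep-new-cluster`, as in the `ps` output earlier in the thread, and the remaining nodes are started normally so they join it.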

    Could an admin/mod edit the title of this thread and add [SOLVED] to it? I'm unable to edit my own post.

    Thanks.