Vagrant+Vbox, first steps to XtraDB. Bootstrap first node shows an error

rubendob
Hi

I've deployed a small test environment with Vagrant and VirtualBox with 3 nodes (Ubuntu LTS).

The IPs are on a private network, one per node:

192.168.33.50
192.168.33.51
192.168.33.52

Each node also has 1 CPU and 1 GB of RAM.
There is no firewall inside Ubuntu.
Each node can ping the others, and all of them know each other's hostnames.

I've read some guides about XtraDB Cluster, and finally chose this one:

http://redcrackle.com/blog/how-set-percona-xtradb-cluster-ubuntu

but when I try to restart the first node with bootstrap, it finishes with an error:

2015-03-16 12:13:59 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.
2015-03-16 12:13:59 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-03-16 12:13:59 10647 [Note] WSREP: Read nil XID from storage engines, skipping position init
2015-03-16 12:13:59 10647 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
2015-03-16 12:13:59 10647 [Note] WSREP: wsrep_load(): Galera 3.9(r93aca2d) by Codership Oy <[email protected]> loaded successfully.
2015-03-16 12:13:59 10647 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
2015-03-16 12:13:59 10647 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
2015-03-16 12:13:59 10647 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.33.50; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false
2015-03-16 12:13:59 10647 [Note] WSREP: Service thread queue flushed.
2015-03-16 12:13:59 10647 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2015-03-16 12:13:59 10647 [Note] WSREP: wsrep_sst_grab()
2015-03-16 12:13:59 10647 [Note] WSREP: Start replication
2015-03-16 12:13:59 10647 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2015-03-16 12:13:59 10647 [Note] WSREP: protonet asio version 0
2015-03-16 12:13:59 10647 [Note] WSREP: Using CRC-32C for message checksums.
2015-03-16 12:13:59 10647 [Note] WSREP: backend: asio
2015-03-16 12:13:59 10647 [Note] WSREP: restore pc from disk successfully
2015-03-16 12:13:59 10647 [Note] WSREP: GMCast version 0
2015-03-16 12:13:59 10647 [Note] WSREP: (673c2adf, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2015-03-16 12:13:59 10647 [Note] WSREP: (673c2adf, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2015-03-16 12:13:59 10647 [Note] WSREP: EVS version 0
2015-03-16 12:13:59 10647 [Note] WSREP: gcomm: bootstrapping new group 'PXC'
2015-03-16 12:13:59 10647 [Note] WSREP: start_prim is enabled, turn off pc_recovery
2015-03-16 12:13:59 10647 [Note] WSREP: Node 673c2adf state prim
2015-03-16 12:13:59 10647 [Note] WSREP: view(view_id(PRIM,673c2adf,3) memb {
673c2adf,0
} joined {
} left {
} partitioned {
})
2015-03-16 12:13:59 10647 [Note] WSREP: save pc into disk
2015-03-16 12:13:59 10647 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.33.50:4567
2015-03-16 12:13:59 10647 [Note] WSREP: discarding pending addr proto entry 0x20e1d60
2015-03-16 12:13:59 10647 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.33.51:4567
2015-03-16 12:13:59 10647 [Note] WSREP: discarding pending addr proto entry 0x20ea4f0
2015-03-16 12:13:59 10647 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.33.52:4567
2015-03-16 12:13:59 10647 [Note] WSREP: discarding pending addr proto entry 0x20f2ce0
2015-03-16 12:13:59 10647 [Note] WSREP: clear restored view
2015-03-16 12:13:59 10647 [ERROR] WSREP: failed to open gcomm backend connection: 11: : 11 (Resource temporarily unavailable)
at gcs/src/gcs_gcomm.cpp:connect():253
2015-03-16 12:13:59 10647 [ERROR] WSREP: gcs/src/gcs_core.cpp:long int gcs_core_open(gcs_core_t*, const char*, const char*, bool)():206: Failed to open backend connection: -11 (Resource temporarily unavailable)
2015-03-16 12:13:59 10647 [ERROR] WSREP: gcs/src/gcs.cpp:long int gcs_open(gcs_conn_t*, const char*, const char*, bool)():1379: Failed to open channel 'PXC' at 'gcomm://192.168.33.50,192.168.33.51,192.168.33.52': -11 (Resource temporarily unavailable)
2015-03-16 12:13:59 10647 [ERROR] WSREP: gcs connect failed: Resource temporarily unavailable
2015-03-16 12:13:59 10647 [ERROR] WSREP: wsrep::connect() failed: 7
2015-03-16 12:13:59 10647 [ERROR] Aborting

2015-03-16 12:13:59 10647 [Note] WSREP: Service disconnected.
2015-03-16 12:14:00 10647 [Note] WSREP: Some threads may fail to exit.
2015-03-16 12:14:00 10647 [Note] Binlog end
2015-03-16 12:14:00 10647 [Note] /usr/sbin/mysqld: Shutdown complete

150316 12:14:00 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended


It's very hard for me to debug what is happening... any ideas?

Thanks!

Comments

  • rubendob
    Hi again

    I was finally able to fix this issue. It was related to available RAM. I set the following parameter in my.cnf and then restarted:

    performance_schema = off

    * Bootstrapping Percona XtraDB Cluster database server mysqld [ OK ]

    Thanks
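
    A minimal sketch of that workaround, written against a scratch file so it can be run safely (the real path is an assumption; on Ubuntu it is typically /etc/mysql/my.cnf). performance_schema pre-allocates a sizable chunk of memory at startup, which can exhaust a 1 GB VM with no swap:

    ```shell
    # Append the workaround under [mysqld]. A scratch copy is used here;
    # point cnf at the real my.cnf (assumed /etc/mysql/my.cnf on Ubuntu).
    cnf=./my.cnf.demo
    printf '[mysqld]\nperformance_schema = off\n' >> "$cnf"
    grep -q '^performance_schema = off' "$cnf" && echo "workaround written"
    # Then re-bootstrap the first node:
    #   sudo service mysql bootstrap-pxc
    ```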
  • wagnerbianchi
    This is an interesting case. I've got a Galera cluster with 3 nodes running on CentOS 6.6. I'm using Vagrant to run all the VMs as well. So, I didn't see the same behaviour here... let's take a look:

    #: enabling the performance_schema on the very first node and bootstrapping the cluster
    [vagrant@sql01 ~]$ my_print_defaults mysqld
    --user=mysql
    --server_id=1
    --datadir=/var/lib/mysql
    --performance_schema=1   # enabling performance_schema
    --log_bin=node01-bin
    --log_slave_updates=true
    --enforce_gtid_consistency=true
    --binlog_format=ROW
    --default_storage_engine=innodb
    --innodb_locks_unsafe_for_binlog=1
    --innodb_autoinc_lock_mode=2
    --innodb_log_group_home_dir=/var/lib/mysql
    --innodb_log_files_in_group=2
    --innodb_log_file_size=48M
    --wsrep_provider=/usr/lib64/libgalera_smm.so
    --wsrep_cluster_name=sqlmag
    --wsrep_node_address=192.168.0.101
    --wsrep_node_name=node01
    --wsrep_cluster_address=gcomm://192.168.0.101:4567,192.168.0.102:4567,192.168.0.103:4567
    --wsrep_slave_threads=2
    --wsrep_sst_method=xtrabackup-v2
    --wsrep_sst_auth=sst:123
    --wsrep_provider_options=pc.ignore_quorum=true
    

    And bootstrapping the cluster...
    [vagrant@sql01 ~]$ sudo service mysql bootstrap-pxc
    Bootstrapping PXC (Percona XtraDB Cluster)Starting MySQL (Percona XtraDB Cluster)....... SUCCESS!
    
    mysql> select @@performance_schema;
    +----------------------+
    | @@performance_schema |
    +----------------------+
    |                    1 |
    +----------------------+
    1 row in set (0.00 sec)
    

    BTW, going ahead to start the second node...
    [vagrant@sql02 ~]$ sudo service mysql start
    Starting MySQL (Percona XtraDB Cluster)....... SUCCESS!
    [vagrant@sql02 ~]$ mysql -u root -e "select @@performance_schema"  # performance_schema enabled
    +----------------------+
    | @@performance_schema |
    +----------------------+
    |                    1 |
    +----------------------+
    

    And finally the third node...
    [vagrant@sql03 ~]$ sudo service mysql start
    Starting MySQL (Percona XtraDB Cluster).............. SUCCESS!
    [vagrant@sql03 ~]$ mysql -u root -e "select @@performance_schema"  # performance_schema enabled
    +----------------------+
    | @@performance_schema |
    +----------------------+
    |                    1 |
    +----------------------+
    

    I don't have an environment where I'm running Galera on Ubuntu, so I'm not sure if it behaves differently due to the packages.

    BTW, the critical part of the logs you posted is this below:
    2015-03-16 12:13:59 10647 [ERROR] WSREP: failed to open gcomm backend connection: 11: : 11 (Resource temporarily unavailable)
    at gcs/src/gcs_gcomm.cpp:connect():253
    2015-03-16 12:13:59 10647 [ERROR] WSREP: gcs/src/gcs_core.cpp:long int gcs_core_open(gcs_core_t*, const char*, const char*, bool)():206: Failed to open backend connection: -11 (Resource temporarily unavailable)
    2015-03-16 12:13:59 10647 [ERROR] WSREP: gcs/src/gcs.cpp:long int gcs_open(gcs_conn_t*, const char*, const char*, bool)():1379: Failed to open channel 'PXC' at 'gcomm://192.168.33.50,192.168.33.51,192.168.33.52': -11 (Resource temporarily unavailable)
    2015-03-16 12:13:59 10647 [ERROR] WSREP: gcs connect failed: Resource temporarily unavailable
    2015-03-16 12:13:59 10647 [ERROR] WSREP: wsrep::connect() failed: 7
    2015-03-16 12:13:59 10647 [ERROR] Aborting
    

    It seems like there is something strange with the gcomm values, I mean, the configured IPs and ports. Have you bootstrapped the cluster on node01 as I showed here?

    Cheers,
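
    Since the error points at the gcomm backend, a quick TCP reachability check of the Galera ports from each node may help narrow it down (a hedged sketch, not from the thread; IPs are from the original post, ports per PXC defaults: 4567 group communication, 4568 IST, 4444 SST):

    ```shell
    # check_port HOST PORT -> reports whether a TCP connect succeeds.
    # Uses bash's /dev/tcp pseudo-device with a 1-second timeout per attempt.
    check_port() {
      timeout 1 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null \
        && echo "$1:$2 open" || echo "$1:$2 closed/filtered"
    }
    for ip in 192.168.33.50 192.168.33.51 192.168.33.52; do
      for port in 4567 4568 4444; do check_port "$ip" "$port"; done
    done
    ```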
  • rubendob
    Hi

    It also depends on how much memory you give to your VMs. In my case, as I only have a MacBook Pro with 8 GB of RAM, I allocated just 1 GB of RAM per node, with no swap.

    Maybe in your environment your VMs have more RAM than mine?

    Cheers

    R
  • wagnerbianchi
    Hmmm, interesting point. Checking the configuration written to my VMs' Vagrantfile, I found this:
    # -*- mode: ruby -*-
    # vi: set ft=ruby :
    
    # Vagrantfile API/syntax version. Don't touch unless you know what you're doing!
    VAGRANTFILE_API_VERSION = "2"
    
    Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
    #  config.vm.provision "shell", path: "setup_env.sh"
      config.vm.define "node01" do |node01|
        node01.vm.box = "centos65-x86_64"
        node01.vm.hostname="sql01"
        node01.vm.network "private_network", virtualbox__intnet: "galera", ip: "192.168.0.101"
        node01.vm.provider "virtualbox" do |v|
          v.memory = 256
          v.cpus   = 1
        end
      end
      config.vm.define "node02" do |node02|
        node02.vm.box = "centos65-x86_64"
        node02.vm.hostname="sql02"
        node02.vm.network "private_network", virtualbox__intnet: "galera", ip: "192.168.0.102" 
        node02.vm.provider "virtualbox" do |v|
          v.memory = 256
          v.cpus   = 1
        end
      end
      config.vm.define "node03" do |node03|
        node03.vm.box="centos65-x86_64"
        node03.vm.hostname="sql03"
        node03.vm.network "private_network", virtualbox__intnet: "galera", ip: "192.168.0.103" 
        node03.vm.provider "virtualbox" do |v|
          v.memory = 256
          v.cpus   = 1
        end
      end
    end
    

    Checking the MemTotal from /proc/meminfo:
    [vagrant@sql03 ~]$ cat /proc/meminfo | grep MemTotal
    MemTotal:         244176 kB
    
    [vagrant@sql02 ~]$ cat /proc/meminfo | grep MemTotal
    MemTotal:         244176 kB
    
    [vagrant@sql03 ~]$ cat /proc/meminfo | grep MemTotal
    MemTotal:         244176 kB
    
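    The same check can be condensed into one line per node, with free memory and swap included, to make the headroom comparison easier (a small sketch, not from the thread; Linux only, since it reads /proc/meminfo):

    ```shell
    # Print total memory, free memory, and swap in MB from /proc/meminfo
    awk '/^MemTotal|^MemFree|^SwapTotal/ {printf "%s %d MB\n", $1, $2/1024}' /proc/meminfo
    ```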

    All of this confirms that we're heading in the right direction: the VMs are each using just 256 MB of RAM on my host, which is also a MacBook Pro with 8 GB of RAM. So, the next action is to start up the Galera cluster...
    [vagrant@sql01 ~]$ sudo service mysql bootstrap-pxc
    Bootstrapping PXC (Percona XtraDB Cluster)Starting MySQL (Percona XtraDB Cluster).................................................. SUCCESS!
    [vagrant@sql01 ~]$ mysql -u root -e "select @@performance_schema"
    +----------------------+
    | @@performance_schema |
    +----------------------+
    |                    1 |
    +----------------------+
    

    It's OK. Now, I'm going to start up the second (SQL02) and third (SQL03) nodes:
    [vagrant@sql02 ~]$ sudo service mysql start
    Starting MySQL (Percona XtraDB Cluster)................ SUCCESS!
    [vagrant@sql02 ~]$ mysql -u root -e "select @@performance_schema"
    +----------------------+
    | @@performance_schema |
    +----------------------+
    |                    1 |
    +----------------------+
    
    [vagrant@sql03 ~]$ sudo service mysql start
    Starting MySQL (Percona XtraDB Cluster)................. SUCCESS!
    [vagrant@sql03 ~]$ mysql -u root -e "select @@performance_schema"
    +----------------------+
    | @@performance_schema |
    +----------------------+
    |                    1 |
    +----------------------+
    

    And finally...
    [vagrant@sql01 ~]$ mysql -u root -e "show global status like 'wsrep_cluster_size'"
    +--------------------+-------+
    | Variable_name      | Value |
    +--------------------+-------+
    | wsrep_cluster_size | 3     |
    +--------------------+-------+
    

    I'm still wondering what resource was temporarily unavailable in terms of gcomm, which is the group communication system used by the Galera API, as per the documentation here: http://galeracluster.com/documentation-webpages/architecture.html . I tried to reproduce the stated error - even after configuring my gcomm with a nonexistent IP address, the node was able to bootstrap with clean startup logs, and the second node joined the cluster view.
    2015-03-18 18:51:01 3024 [Note] WSREP: Quorum results:
        version    = 3,
        component  = PRIMARY,
        conf_id    = 1,
        members    = 2/2 (joined/total),
        act_id     = 10166,
        last_appl. = 0,
        protocols  = 0/6/3 (gcs/repl/appl),
        group UUID = f9cdb70c-8483-11e4-86af-4b70fd8e7a88
    2015-03-18 18:51:01 3024 [Note] WSREP: Flow-control interval: [23, 23]
    