
XtraDB Cluster loads, but no PID file, MySQL won't load.

lostnode (Contributor)
Hi there, I recently set up Percona XtraDB Cluster on CentOS 6.6 following the documentation (http://www.percona.com/doc/percona-xtradb-cluster/5.6/howtos/cenots_howto.html). However, when I reboot the servers (virtual machines with 1 CPU and 512 MB of RAM), they take a long time to boot, and once booted MySQL fails to start or restart with the following error:

# service mysql restart
Shutting down MySQL (Percona XtraDB Cluster) ERROR! MySQL (Percona XtraDB Cluster) PID file could not be found!
ERROR! MySQL (Percona XtraDB Cluster) is running but PID file could not be found
ERROR! Failed to restart server.
[[email protected] ~]# service mysql start
ERROR! MySQL (Percona XtraDB Cluster) is running but PID file could not be found
[[email protected] ~]# service mysql stop
Shutting down MySQL (Percona XtraDB Cluster) ERROR! MySQL (Percona XtraDB Cluster) PID file could not be found!

When I try to connect to the MySQL database, I am told it could not connect via the socket.

That is to be expected because MySQL isn't actually running. But I can't even start it without Percona spitting out the "no PID file" error.
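The init script output above says the server "is running but PID file could not be found", which suggests a mysqld process may exist even though no PID file or socket was written. A minimal sketch for telling those cases apart is below. The function name pid_file_state is hypothetical, and the path passed to it is whatever pid-file your my.cnf sets.

```shell
#!/bin/sh
# Report the state of a PID file:
#   missing - the file does not exist
#   alive   - the file exists and the process it names accepts signal 0
#   stale   - the file exists but no such process can be signalled
# Run as root; kill -0 fails for processes owned by another user.
pid_file_state() {
    pidfile="$1"
    if [ ! -f "$pidfile" ]; then
        echo "missing"
    elif kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "alive"
    else
        echo "stale"
    fi
}
```

If this prints "missing" while pgrep -l mysqld still lists a process, the server was started but never got far enough to write its PID file, which matches the init script's complaint.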

Here is my config file (as per the instructions I followed at the link above):



datadir=/var/lib/mysql
user=mysql

# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so

# Cluster connection URL contains the IPs of node#1, node#2 and node#3
wsrep_cluster_address=gcomm://192.168.5.101,192.168.5.102

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node #1 address
wsrep_node_address=192.168.5.101

# SST method
wsrep_sst_method=xtrabackup-v2

# Cluster name
wsrep_cluster_name=my_centos_cluster

# Authentication for SST method
wsrep_sst_auth="sstuser:s3cret"

I also added the following lines to the bottom of the file, to no avail:

# My Additions
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
socket=/var/lib/mysql/mysql.sock

My error logs show this:

150221 14:52:11 mysqld_safe Skipping wsrep-recover for 2a6bf596-b940-11e4-8a4a-6f644114be5a:2 pair
150221 14:52:11 mysqld_safe Assigning 2a6bf596-b940-11e4-8a4a-6f644114be5a:2 to wsrep_start_position
2015-02-21 14:52:15 0 [Note] WSREP: wsrep_start_position var submitted: '2a6bf596-b940-11e4-8a4a-6f644114be5a:2'
2015-02-21 14:52:15 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-02-21 14:52:16 1544 [Note] WSREP: Read nil XID from storage engines, skipping position init
2015-02-21 14:52:16 1544 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
2015-02-21 14:52:16 1544 [Note] WSREP: wsrep_load(): Galera 3.8(rf6147dd) by Codership Oy <[email protected]> loaded successfully.
2015-02-21 14:52:16 1544 [Note] WSREP: CRC-32C: using hardware acceleration.
2015-02-21 14:52:16 1544 [Note] WSREP: Found saved state: 2a6bf596-b940-11e4-8a4a-6f644114be5a:2
2015-02-21 14:52:16 1544 [Note] WSREP: Passing config to GCS: base_host = 192.168.5.101; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recove
2015-02-21 14:52:17 1544 [Note] WSREP: Service thread queue flushed.
2015-02-21 14:52:17 1544 [Note] WSREP: Assign initial position for certification: 2, protocol version: -1
2015-02-21 14:52:17 1544 [Note] WSREP: wsrep_sst_grab()
2015-02-21 14:52:17 1544 [Note] WSREP: Start replication
2015-02-21 14:52:17 1544 [Note] WSREP: Setting initial position to 2a6bf596-b940-11e4-8a4a-6f644114be5a:2
2015-02-21 14:52:17 1544 [Note] WSREP: protonet asio version 0
2015-02-21 14:52:17 1544 [Note] WSREP: Using CRC-32C for message checksums.
2015-02-21 14:52:17 1544 [Note] WSREP: backend: asio
2015-02-21 14:52:17 1544 [Warning] WSREP: access file(gvwstate.dat) failed(No such file or directory)
2015-02-21 14:52:17 1544 [Note] WSREP: restore pc from disk failed
2015-02-21 14:52:17 1544 [Note] WSREP: GMCast version 0
2015-02-21 14:52:17 1544 [Note] WSREP: (23dfb830, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2015-02-21 14:52:17 1544 [Note] WSREP: (23dfb830, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2015-02-21 14:52:17 1544 [Note] WSREP: EVS version 0
2015-02-21 14:52:17 1544 [Note] WSREP: gcomm: connecting to group 'my_centos_cluster', peer '192.168.5.101:,192.168.5.102:'
2015-02-21 14:52:17 1544 [Warning] WSREP: (23dfb830, 'tcp://0.0.0.0:4567') address 'tcp://192.168.5.101:4567' points to own listening address, blacklisting
2015-02-21 14:52:17 1544 [Note] WSREP: (23dfb830, 'tcp://0.0.0.0:4567') address 'tcp://192.168.5.101:4567' pointing to uuid 23dfb830 is blacklisted, skipping
2015-02-21 14:52:17 1544 [Note] WSREP: (23dfb830, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2015-02-21 14:52:18 1544 [Note] WSREP: declaring 8fbe1659 at tcp://192.168.5.102:4567 stable
2015-02-21 14:52:18 1544 [Warning] WSREP: no nodes coming from prim view, prim not possible
2015-02-21 14:52:18 1544 [Note] WSREP: view(view_id(NON_PRIM,23dfb830,8) memb {
23dfb830,0
8fbe1659,0
} joined {
} left {
} partitioned {
a4d14ed5,0
bc924c22,0
eb831cca,0
})
2015-02-21 14:52:18 1544 [Note] WSREP: gcomm: connected
2015-02-21 14:52:18 1544 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2015-02-21 14:52:18 1544 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2015-02-21 14:52:18 1544 [Note] WSREP: Opened channel 'my_centos_cluster'
2015-02-21 14:52:18 1544 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 2
2015-02-21 14:52:18 1544 [Note] WSREP: Flow-control interval: [23, 23]
2015-02-21 14:52:18 1544 [Note] WSREP: Received NON-PRIMARY.
2015-02-21 14:52:18 1544 [Note] WSREP: Waiting for SST to complete.
2015-02-21 14:52:18 1544 [Note] WSREP: New cluster view: global state: 2a6bf596-b940-11e4-8a4a-6f644114be5a:2, view# -1: non-Primary, number of nodes: 2, my index: 0, protocol version -1
2015-02-21 14:52:18 1544 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2015-02-21 14:52:21 1544 [Note] WSREP: (23dfb830, 'tcp://0.0.0.0:4567') turning message relay requesting off

And it does not create a .pid file or a .sock file.

Any ideas?
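The log above already hints at the cause: the node forms a view marked NON_PRIM, reports "prim not possible", and then waits. A small sketch for pulling those lines out of an error log is below. The function name non_primary_lines is hypothetical, and the patterns are simply the strings that appear in the log above, not an exhaustive list of Galera messages.

```shell
#!/bin/sh
# Print the WSREP log lines that indicate the node ended up in a
# non-primary component. The patterns are taken from the log in this post.
# Usage: non_primary_lines /path/to/error.log
non_primary_lines() {
    grep -E 'NON_PRIM|NON-PRIMARY|non-Primary|prim not possible' "$1"
}
```

With the log-error setting shown earlier, the call would be non_primary_lines /var/log/mysqld.log. Any output means the node found its peer but neither of them could form a primary component.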

Comments

  • lostnode (Contributor)
    Also, SELinux was disabled before the install and never re-enabled, so that doesn't seem to be the issue.
  • lostnode (Contributor)
    Here is a screen capture of the process trying to start during boot. It takes close to 15 minutes before it fails completely and boot continues to the login prompt.
  • mirfan (Database Administrator)
    Did you bootstrap the first node as below?

    node1$ /etc/init.d/mysql bootstrap-pxc

    Also, please provide innobackup.backup.log so we can check what is causing the issue.
  • lostnode (Contributor)
    Hey Mirfan,

    Yes, I did bootstrap the first node. Everything was running great until I shut down the virtual machines to go home. When I tried to boot them the next day, they just crashed after about 25 minutes. I have attached the log file, renamed, as you cannot upload .log files.
  • mirfan (Database Administrator)
    When you start a node, it needs to connect to the primary component. Shutting down all the virtual machines means there is no primary component left.
    So when the nodes start, they have nothing to join, hence the error. If no primary component exists, you need to bootstrap first.
  • lostnode (Contributor)
    Not sure what you mean. The first machine to boot is the primary node, which was bootstrapped upon install. Do I need to bootstrap it again every time I boot it? The log was from the primary node.

    Regards,
    Koster
  • lostnode (Contributor)
    Just tried this on a few cloud servers over at Digital Ocean, and created a 1G swap file on both as the default memory sucks. Still the same issue. Upon reboot, the first node takes forever to boot and then fails, and the second node fails as well. I am running CentOS 6.6 and the latest Percona installed via the yum repositories.

    Attached are the logs and my.cnf files for both nodes (I am only using 2 nodes instead of 3). I am trying to find a viable, easy-to-use, reliable solution for syncing DBs. Percona is definitely easy to use: it takes seconds to install, test, and use, and works great right out of the box. But it is definitely unreliable if I can't reboot my servers. If there is a crash or something, I need to know my servers can come back online. Any help would be much appreciated.

    Note that the second node didn't have an error file, but it had more xtrabackup_* files; the log is included in note2.zip.
    The first node was bootstrapped as per the instructions in the previously mentioned tutorial from your site.

    Regards,
    Koster
  • lostnode (Contributor)
    Note: after killing all MySQL processes and removing the lock file, I can try to start it again, but it fails just like it does at boot. It's not loading something, or something is missing from the config files.
  • lostnode (Contributor)
    OK, I know the issue: it's not bootstrapping on reboot. How do I force MySQL to start with the bootstrap-pxc option at boot?
  • lostnode (Contributor)
    SOLVED! In order to make it boot properly, you need to bootstrap. For this to work in my specific case, as the primary node will be the only node receiving changes, you need to delete the mysql entry (in my case S63mysql) from your rc.d/rc#.d directory (whichever is the default; in CentOS it's rc3.d), then edit /etc/rc.d/rc.local and add /etc/init.d/mysql bootstrap-pxc on a line of its own. Now it boots flawlessly.
  • josgrelin (Entrant)
    Problems
    Starting MySQL (Percona XtraDB Cluster) database server mysqld
    The server quit without updating PID file [fail]
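
The resolution in this thread comes down to bootstrapping a node when no primary component exists. When every node has been shut down, it generally matters which node is bootstrapped: the one with the most recent saved state. A rough sketch for reading that state is below. It assumes the state file is grastate.dat in the datadir (/var/lib/mysql in the config above) and that it contains a line of the form "seqno: N". The function name saved_seqno is hypothetical.

```shell
#!/bin/sh
# Print the seqno recorded in a Galera grastate.dat file.
# A value of -1 means the position was not saved at shutdown and has to be
# recovered first (for example with mysqld_safe --wsrep-recover).
# Usage: saved_seqno /var/lib/mysql/grastate.dat
saved_seqno() {
    awk '$1 == "seqno:" { print $2 }' "$1"
}
```

Run saved_seqno on each node, start the node with the highest value using /etc/init.d/mysql bootstrap-pxc, and start the remaining nodes with service mysql start. Afterwards, mysql -e "SHOW STATUS LIKE 'wsrep_cluster_status'" should report Primary. Note that bootstrapping unconditionally from rc.local, as in the solution above, forms a new primary component on every boot, so it is probably only appropriate when the other node is known to be down or behind.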

MySQL, InnoDB, MariaDB and MongoDB are trademarks of their respective owners.
Copyright ©2005 - 2020 Percona LLC. All rights reserved.