Unable to initialize cluster: "terminate called after throwing an instance of 'gu::NotFound'"

Greetings,

I’m at my wits’ end trying to get this working. I think I’ve tried everything short of downgrading to 5.5.20. Below you’ll find some details about my setup and the other resources I’ve consulted.

Pretty sure there’s just some little config setting I’m missing, but can’t figure it out!

Thanks!

Erik Osterman


Using RPMs downloaded directly from Percona:

Percona-XtraDB-Cluster-client-5.5.23-23.5.333.rhel5
Percona-XtraDB-Cluster-devel-5.5.23-23.5.333.rhel5
Percona-XtraDB-Cluster-galera-2.0-1.109.rhel5
Percona-XtraDB-Cluster-server-5.5.23-23.5.333.rhel5
Percona-XtraDB-Cluster-shared-5.5.23-23.5.333.rhel5

With boost 1.41:

boost141-program-options-1.41.0-2.el5

With the following interface:

ifconfig | grep 10.254.167.5
inet addr:10.254.167.5 Bcast:10.254.167.255 Mask:255.255.254.0

Using the following wsrep.cnf:

[mysqld]
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_address=gcomm://
wsrep_slave_threads=2
wsrep_cluster_name=sentry
wsrep_sst_method=rsync
wsrep_node_name=sentry1
wsrep_node_address=10.254.167.5
wsrep_sst_receive_address=10.254.167.5
wsrep_provider_options="gmcast.listen_addr=10.254.167.5; ist.recv_addr=10.254.167.5"
binlog_format=ROW
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2

Fresh install of CentOS 5.4 (i386).
No iptables.
No selinux.

Found these two related postings, neither of which seems to fix my problems:
https://bugs.launchpad.net/percona-xtradb-cluster/+bug/914976
http://forum.percona.com/index.php?t=msg&goto=8318&

Checked the FAQs:
http://www.codership.com/wiki/doku.php?id=faq
http://www.percona.com/doc/percona-xtradb-cluster/faq.html

This cluster has never been started and no data has even been loaded other than that which gets installed by mysql_install_db.

Error log below:

120605 21:52:39 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
120605 21:52:39 [Note] Flashcache bypass: disabled
120605 21:52:39 [Note] Flashcache setup error is : ioctl failed
120605 21:52:39 [Note] WSREP: Read nil XID from storage engines, skipping position init
120605 21:52:39 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
120605 21:52:39 [Note] WSREP: wsrep_load(): Galera 2.1dev(r109) by Codership Oy <info@codership.com> loaded succesfully.
120605 21:52:39 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
120605 21:52:39 [Note] WSREP: Passing config to GCS: base_host = 10.254.167.5; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 2147483647; gcs.recv_q_soft_limit = 0.25; gmcast.listen_addr = 10.254.167.5; ist.recv_addr = 10.254.167.5; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
120605 21:52:39 [Note] WSREP: wsrep_sst_grab()
120605 21:52:39 [Note] WSREP: Start replication
120605 21:52:39 [Warning] WSREP: state file not found: /var/lib/mysql//grastate.dat
120605 21:52:39 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
120605 21:52:39 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
terminate called after throwing an instance of 'gu::NotFound'
04:52:39 UTC - mysqld got signal 6 ;

Contents of /var/lib/mysql/

ls -al /var/lib/mysql/
total 280600
drwxr-xr-x 5 mysql mysql      4096 2012-06-05 21:52 .
drwxr-xr-x 3 27    27           50 2012-06-04 16:11 ..
-rw------- 1 mysql mysql 134219040 2012-06-05 19:27 galera.cache
-rw-rw---- 1 mysql mysql  18874368 2012-06-05 00:53 ibdata1
-rw-rw---- 1 mysql mysql  67108864 2012-06-05 00:53 ib_logfile0
-rw-rw---- 1 mysql mysql  67108864 2012-06-04 16:11 ib_logfile1
drwx------ 2 mysql mysql      4096 2012-06-04 16:11 mysql
drwx------ 2 mysql mysql      4096 2012-06-05 00:30 performance_schema
-rw-r--r-- 1 root  root        347 2012-06-05 00:30 RPM_UPGRADE_HISTORY
-rw-r--r-- 1 root  root        347 2012-06-05 00:30 RPM_UPGRADE_MARKER-LAST
drwx------ 2 mysql mysql         6 2012-06-04 16:11 test

Ahh, CentOS 5.4! Why 5.4, may I enquire, when there is 5.8 already?

But, besides 5.4 having a buggy libstdc++, you have really over-configured, and made a subtle but fatal configuration error. It throws an exception that can’t be caught on CentOS 5.4. ;)

The error (well, not really an error, but CentOS 5.4 makes it such) is in

gmcast.listen_addr=10.254.167.5

it should be

gmcast.listen_addr=tcp://10.254.167.5

But, unless you have some very specific needs, all you need to set up is wsrep_node_address. It will be used for everything, unless explicitly overridden.
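
To make the scheme point concrete, here is a minimal Python sketch that checks whether the gmcast.listen_addr entry in a wsrep_provider_options string carries a URI scheme such as tcp://. The helper names (parse_provider_options, listen_addr_has_scheme) are made up for illustration and are not part of Galera or Percona; the check only looks at the string and says nothing about whether the node will actually start.

```python
import re


def parse_provider_options(raw):
    """Split a wsrep_provider_options string into a dict of key -> value."""
    options = {}
    for part in raw.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        options[key.strip()] = value.strip()
    return options


def listen_addr_has_scheme(raw):
    """Return True if gmcast.listen_addr is absent or starts with a URI scheme."""
    addr = parse_provider_options(raw).get("gmcast.listen_addr")
    if addr is None:
        # Not set at all: nothing to check, wsrep_node_address is used instead.
        return True
    return re.match(r"^[a-z][a-z0-9+.-]*://", addr) is not None


# The value from the original wsrep.cnf and the corrected form.
original = "gmcast.listen_addr=10.254.167.5; ist.recv_addr=10.254.167.5"
fixed = "gmcast.listen_addr=tcp://10.254.167.5; ist.recv_addr=10.254.167.5"

print(listen_addr_has_scheme(original))  # False
print(listen_addr_has_scheme(fixed))     # True
```

An empty options string also passes the check, which matches the suggestion to drop the provider options entirely and rely on wsrep_node_address.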

Regards,
Alex

Alex,

Thanks so much for the explanation of what’s going wrong. I tried with and without the tcp:// scheme for “gmcast.listen_addr” with the same unfortunate outcome.

I simplified the configuration as you suggested to the one below:

[mysqld]
datadir=/var/lib/mysql
user=mysql
binlog_format=ROW
wsrep_debug=1
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_cluster_address="gcomm://"
wsrep_slave_threads=2
wsrep_node_name=node1
wsrep_node_address=10.254.167.5
wsrep_cluster_name=cluster
wsrep_sst_method=rsync
bind_address=0.0.0.0
default_storage_engine=InnoDB
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
max_connections=10
max_allowed_packet=32M
table_cache=2048
thread_cache_size=32
query_cache_size=256M
innodb_buffer_pool_size=128M
sort_buffer_size=64M
read_rnd_buffer_size=2M
innodb_log_file_size = 64M
innodb_log_buffer_size = 8M
log-bin=/mnt/mysql-binlogs/mysql-bin
log-bin=/mnt/mysql-binlogs/mysql-bin.index
relay-log=/mnt/mysql-binlogs/mysql-relay-bin
relay-log-index=/mnt/mysql-binlogs/mysql-relay-bin.index

It terminated with the same outcome:

/usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql
120607 12:07:56 [Note] Flashcache bypass: disabled
120607 12:07:56 [Note] Flashcache setup error is : ioctl failed
120607 12:07:56 [Note] WSREP: Read nil XID from storage engines, skipping position init
120607 12:07:56 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
120607 12:07:56 [Note] WSREP: wsrep_load(): Galera 2.1dev(r112) by Codership Oy <info@codership.com> loaded succesfully.
120607 12:07:56 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
120607 12:07:56 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
120607 12:07:56 [Note] WSREP: Passing config to GCS: base_host = 10.254.167.5; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 2147483647; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
120607 12:07:57 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
120607 12:07:57 [Note] WSREP: wsrep_sst_grab()
120607 12:07:57 [Note] WSREP: Start replication
120607 12:07:57 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
terminate called after throwing an instance of 'gu::NotFound'
19:07:57 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_threads=10
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 656721 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x33)[0x8409693]
/usr/sbin/mysqld(handle_fatal_signal+0x48c)[0x82d309c]
[0xc5d420]
/lib/i686/nosegneg/libc.so.6(abort+0x101)[0x941a21]
/usr/lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x150)[0x20c4d0]
/usr/lib/libstdc++.so.6[0x209f35]
/usr/lib/libstdc++.so.6[0x209f72]
/usr/lib/libstdc++.so.6[0x20a0aa]
/usr/lib/libgalera_smm.so(_ZNK2gu3URI10get_optionERKSs+0xf6)[0x4c73e6]
/usr/lib/libgalera_smm.so(_ZN9GCommConnC2ERKN2gu3URIERNS0_6ConfigE+0x25d)[0x5b8d2d]
/usr/lib/libgalera_smm.so(gcs_gcomm_create+0xd4)[0x5b41b4]
/usr/lib/libgalera_smm.so(gcs_backend_init+0xa9)[0x5a1cb9]
/usr/lib/libgalera_smm.so(gcs_core_open+0x6f)[0x5a7eaf]
/usr/lib/libgalera_smm.so(gcs_open+0x2c8)[0x5aeb58]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM7connectERKSsS2_S2_+0x296)[0x5f97d6]
/usr/lib/libgalera_smm.so(galera_connect+0xae)[0x6146ae]
/usr/sbin/mysqld(_Z23wsrep_start_replicationv+0x107)[0x8289cb7]
/usr/sbin/mysqld(_Z18wsrep_init_startupb+0x7c)[0x828aa9c]
/usr/sbin/mysqld[0x813b069]
/usr/sbin/mysqld(_Z11mysqld_mainiPPc+0xa22)[0x813d402]
/usr/sbin/mysqld(main+0x27)[0x8130e27]
/lib/i686/nosegneg/libc.so.6(__libc_start_main+0xdc)[0x92ce9c]
/usr/sbin/mysqld[0x8130d41]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

I noticed that it’s always in the “get_option” call that it appears to crash, so I agree with you that there’s some “fatal error in configuration”!
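
For anyone who wants to pull the same observation out of a backtrace, here is a rough Python sketch. It extracts the libgalera_smm.so frames and reads the length-prefixed name components of the innermost mangled symbol. The function names (galera_frames, nested_name) are hypothetical, and nested_name is only a partial decoder for simple nested names, not a real demangler; a tool like c++filt is the proper way to demangle.

```python
import re

# Matches frames of the form: /path/to/module(symbol+0xOFF)[0xADDR]
FRAME = re.compile(
    r"(?P<module>/[^\s(\[]+)\((?P<symbol>[^+)]+)\+0x[0-9a-f]+\)\[0x[0-9a-f]+\]"
)


def galera_frames(backtrace):
    """Return the symbols of all frames that belong to libgalera_smm.so, in order."""
    return [
        m.group("symbol")
        for m in FRAME.finditer(backtrace)
        if m.group("module").endswith("libgalera_smm.so")
    ]


def nested_name(symbol):
    """Read the length-prefixed components of a simple _ZN... nested name."""
    if not symbol.startswith("_ZN"):
        return symbol  # plain C symbol or unsupported form: return unchanged
    rest = symbol[3:].lstrip("rVK")  # skip cv-qualifiers such as K (const)
    parts = []
    while rest and rest[0].isdigit():
        digits = re.match(r"\d+", rest).group()
        length = int(digits)
        start = len(digits)
        parts.append(rest[start:start + length])
        rest = rest[start + length:]
    return "::".join(parts)


# A few frames copied from the backtrace above.
sample = """\
/usr/lib/libstdc++.so.6[0x20a0aa]
/usr/lib/libgalera_smm.so(_ZNK2gu3URI10get_optionERKSs+0xf6)[0x4c73e6]
/usr/lib/libgalera_smm.so(_ZN9GCommConnC2ERKN2gu3URIERNS0_6ConfigE+0x25d)[0x5b8d2d]
/usr/lib/libgalera_smm.so(gcs_gcomm_create+0xd4)[0x5b41b4]
"""

frames = galera_frames(sample)
print(nested_name(frames[0]))  # gu::URI::get_option
```

The innermost Galera frame is the last library code that ran before libstdc++ called terminate, which is why it is the one worth looking at first.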

As for why we’re running CentOS 5.4, it’s just due to a very large infrastructure already built on top of it. With over 26G of RPMs comprising 12K packages, it’s a project we’re pushing off! That said, much of the reason for all the packages is that we’ve upgraded the OS well beyond 5.4 by backporting RPMs. Of course, libstdc++ is not one of them =).

If upgrading to 5.8 might alleviate some of these issues, I can try it out for these particular servers. Also, fwiw, this is on EC2.

-Erik