Not the answer you need?
Register and ask your own question!

Problem initializing XtraDB Cluster 5.6.32-25.17.1 and 5.7.14-26.17.1

unixroninunixronin ContributorCurrent User Role Beginner
We are experiencing a problem with bringing up new XtraDB Cluster installations in both the latest 5.6 and 5.7 releases. The problem looks like this on PXC 5.7:
  • When bootstrapped from a newly created data directory, the first node does not properly enter PRIMARY state.
  • Because it's not primary, you can't execute any statements that change data.
  • You can't force the node primary by setting wsrep_provider_options='pc.bootstrap=true', because you can't do anything until you set a permanent superuser password to replace the temporary password created by MySQL 5.7.
  • You can't set a valid superuser password because with the node non-primary, you can't execute any statements. So you can't even look at global status to see what the state actually is.
So on 5.7.14, you're stuck in a catch-22 situation. You can't set the node primary without first setting a new superuser password, and you can't set a new superuser password without setting the node primary.


On 5.6.32, the problem seems to be related — you still end up with your bootstrapped node non-primary, though you don't have a forced temporary password to reset. You can execute the command to set the node primary, but it doesn't work:

mysql> set global wsrep_provider_options='pc.bootstrap=true';
ERROR 1210 (HY000): Incorrect arguments to SET

Investigating this further, it seems that this is because the pc.bootstrap parameter no longer exists:

2016-11-23T21:47:42.232304Z 2 [Warning] WSREP: Unknown parameter 'pc.bootstrap'
2016-11-23T21:47:42.232418Z 2 [ERROR] WSREP: Set options returned 7

I can find no documentation stating what has replaced it.


It looks as though the following MAY be why the node is not properly coming up primary:

2016-11-23T15:43:15.983090Z 0 [ERROR] Incorrect definition of table performance_schema.replication_connection_status: expected column 'RECEIVED_TRANSACTION_SET' at position 7 to have type longtext, found type text.
2016-11-23T15:43:15.984024Z 0 [ERROR] Incorrect definition of table performance_schema.replication_group_member_stats: expected column 'COUNT_TRANSACTIONS_ROWS_VALIDATING' at position 6, found 'COUNT_TRANSACTIONS_VALIDATING'.

It's not clear to me what can be done by the user to fix this.


Can anyone assist?

Comments

  • krunalbauskarkrunalbauskar Advisor Inactive User Role Novice
    Hunting reason behind why node is unable to start (or enter PRIMARY) state is core issue.
    That would solve both the problem you are facing (5.6 and 5.7 issue).

    a. Is this a fresh installation or you just upgraded (for 5.7 case)

    b. Can you share the complete log file for 5.6 and 5.7. I can take a look to see if there is anything wrong. Also, just ensure there are no other infrastructure issue given that both 5.6 and 5.7 are facing problem.
  • unixroninunixronin Contributor Current User Role Beginner
    Thank you Krunal. With further study we've found the following:
    — PXC 5.6.32 and 5.7.14 clusters updated from 5.6.28 or earlier seem to work fine
    — PXC 5.6.30, 5.6.32, 5.7.14 new installs using my.cnf wsrep configuration copied from the working clusters cannot form a cluster because no node can enter PRIMARY state

    I'm trying to attach log filed but the upload function says the logs are invalid files and won't let me attach them. I'll have to paste inline, and I'm going to have to do it one per reply because of the 10K-char comment limit.

    This log is from a new PXC 5.6.32 install with a brand new clean data directory populated using mysql_install_db:
    161129 20:07:03 mysqld_safe Starting mysqld daemon with databases from /data/mysql
    161129 20:07:03 mysqld_safe Skipping wsrep-recover for empty datadir: /data/mysql
    161129 20:07:03 mysqld_safe Assigning 00000000-0000-0000-0000-000000000000:-1 to wsrep_start_position
    2016-11-29 20:07:04 0 [Note] /usr/sbin/mysqld (mysqld 5.6.32-78.1-56) starting as process 14339 ...
    2016-11-29 20:07:04 14339 [Note] WSREP: Read nil XID from storage engines, skipping position init
    2016-11-29 20:07:04 14339 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
    2016-11-29 20:07:04 14339 [Note] WSREP: wsrep_load(): Galera 3.17(r447d194) by Codership Oy <info&#64;codership.com> loaded successfully.
    2016-11-29 20:07:04 14339 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
    2016-11-29 20:07:04 14339 [Warning] WSREP: Could not open state file for reading: '/data/mysql//grastate.dat'
    2016-11-29 20:07:04 14339 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
    2016-11-29 20:07:04 14339 [Note] WSREP: Passing config to GCS: base_dir = /data/mysql/; base_host = 10.20.10.16; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /data/mysql/; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /data/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 5G; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 128; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://10.20.10.16; gmcast.segment = 0; gmcast.version = 0; ist.recv_addr =
    2016-11-29 20:07:04 14339 [Note] WSREP: Service thread queue flushed.
    2016-11-29 20:07:04 14339 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
    2016-11-29 20:07:04 14339 [Note] Plugin 'FEDERATED' is disabled.
    2016-11-29 20:07:04 14339 [Note] InnoDB: Using atomics to ref count buffer pool pages
    2016-11-29 20:07:04 14339 [Note] InnoDB: The InnoDB memory heap is disabled
    2016-11-29 20:07:04 14339 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
    2016-11-29 20:07:04 14339 [Note] InnoDB: Memory barrier is not used
    2016-11-29 20:07:04 14339 [Note] InnoDB: Compressed tables use zlib 1.2.3
    2016-11-29 20:07:04 14339 [Note] InnoDB: Using Linux native AIO
    2016-11-29 20:07:04 14339 [Note] InnoDB: Not using CPU crc32 instructions
    2016-11-29 20:07:04 14339 [Note] InnoDB: Initializing buffer pool, size = 4.0G
    2016-11-29 20:07:04 14339 [Note] InnoDB: Completed initialization of buffer pool
    2016-11-29 20:07:04 14339 [Note] InnoDB: Highest supported file format is Barracuda.
    2016-11-29 20:07:04 14339 [Note] InnoDB: 128 rollback segment(s) are active.
    2016-11-29 20:07:04 14339 [Note] InnoDB: Waiting for purge to start
    2016-11-29 20:07:04 14339 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.32-78.1 started; log sequence number 1625997
    2016-11-29 20:07:04 14339 [Note] RSA private key file not found: /data/mysql//private_key.pem. Some authentication plugins will not work.
    2016-11-29 20:07:04 14339 [Note] RSA public key file not found: /data/mysql//public_key.pem. Some authentication plugins will not work.
    2016-11-29 20:07:04 14339 [Note] Server hostname (bind-address): '*'; port: 3306
    2016-11-29 20:07:04 14339 [Note] IPv6 is available.
    2016-11-29 20:07:04 14339 [Note]   - '::' resolves to '::';
    2016-11-29 20:07:04 14339 [Note] Server socket created on IP: '::'.
    2016-11-29 20:07:04 14339 [Note] Event Scheduler: Loaded 0 events
    2016-11-29 20:07:04 14339 [Note] /usr/sbin/mysqld: ready for connections.
    Version: '5.6.32-78.1-56'  socket: '/data/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release rel78.1, Revision 979409a, WSREP version 25.17, wsrep_25.17
    
  • unixroninunixronin Contributor Current User Role Beginner
    This next log is from a clean PXC 5.7.14 install allowed to initialize its own data directory upon bootstrap:
    2016-11-29T20:16:34.891939Z 0 [Warning] 'NO_ZERO_DATE', 'NO_ZERO_IN_DATE' and 'ERROR_FOR_DIVISION_BY_ZERO' sql modes should be used with strict mode. They will be merged with strict mode in a future release.
    2016-11-29T20:16:34.892028Z 0 [Warning] WSREP: Node is running in bootstrap/initialize mode. Disabling pxc_strict_mode checks
     100
     100
     100
     100
    2016-11-29T20:16:41.686667Z 0 [Warning] InnoDB: New log files created, LSN=45790
    2016-11-29T20:16:41.991630Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
    2016-11-29T20:16:42.046520Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: bdee098a-b670-11e6-949c-18a9053d3bf4.
    2016-11-29T20:16:42.063087Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
    2016-11-29T20:16:42.632124Z 0 [Warning] CA certificate ca.pem is self signed.
    2016-11-29T20:16:42.693380Z 1 [Note] A temporary password is generated for root&#64;localhost: Ze1a_wQ9%Fy,
    2016-11-29T20:16:53.099353Z mysqld_safe Starting mysqld daemon with databases from /data/mysql
    2016-11-29T20:16:53.107278Z mysqld_safe Skipping wsrep-recover for empty datadir: /data/mysql
    2016-11-29T20:16:53.109622Z mysqld_safe Assigning 00000000-0000-0000-0000-000000000000:-1 to wsrep_start_position
    2016-11-29T20:16:53.362056Z 0 [Warning] 'NO_ZERO_DATE', 'NO_ZERO_IN_DATE' and 'ERROR_FOR_DIVISION_BY_ZERO' sql modes should be used with strict mode. They will be merged with strict mode in a future release.
    2016-11-29T20:16:53.362132Z 0 [Warning] 'NO_AUTO_CREATE_USER' sql mode was not set.
    2016-11-29T20:16:53.364126Z 0 [Note] /usr/sbin/mysqld (mysqld 5.7.14-8-57) starting as process 30434 ...
    2016-11-29T20:16:53.367508Z 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
    2016-11-29T20:16:53.367523Z 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
    2016-11-29T20:16:53.372842Z 0 [Note] WSREP: wsrep_load(): Galera 3.17(r447d194) by Codership Oy <info&#64;codership.com> loaded successfully.
    2016-11-29T20:16:53.372910Z 0 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
    2016-11-29T20:16:53.373260Z 0 [Warning] WSREP: Could not open state file for reading: '/data/mysql//grastate.dat'
    2016-11-29T20:16:53.373323Z 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
    2016-11-29T20:16:53.374702Z 0 [Note] WSREP: Passing config to GCS: base_dir = /data/mysql/; base_host = 10.20.10.17; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /data/mysql/; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /data/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 5G; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 128; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.listen_addr = tcp://10.20.10.17; gmcast.segment = 0; gmcast.version = 0; ist.recv_addr =
    2016-11-29T20:16:53.392154Z 0 [Note] WSREP: Service thread queue flushed.
    2016-11-29T20:16:53.392232Z 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
    2016-11-29T20:16:53.394322Z 0 [Note] InnoDB: PUNCH HOLE support available
    2016-11-29T20:16:53.394344Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
    2016-11-29T20:16:53.394354Z 0 [Note] InnoDB: Uses event mutexes
    2016-11-29T20:16:53.394363Z 0 [Note] InnoDB: GCC builtin __sync_synchronize() is used for memory barrier
    2016-11-29T20:16:53.394372Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.3
    2016-11-29T20:16:53.394382Z 0 [Note] InnoDB: Using Linux native AIO
    2016-11-29T20:16:53.394738Z 0 [Note] InnoDB: Number of pools: 1
    2016-11-29T20:16:53.394877Z 0 [Note] InnoDB: Not using CPU crc32 instructions
    2016-11-29T20:16:53.397024Z 0 [Note] InnoDB: Initializing buffer pool, total size = 4G, instances = 4, chunk size = 128M
    2016-11-29T20:16:53.590496Z 0 [Note] InnoDB: Completed initialization of buffer pool
    2016-11-29T20:16:53.663167Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
    2016-11-29T20:16:53.665472Z 0 [Note] InnoDB: Crash recovery did not find the parallel doublewrite buffer at /data/mysql/xb_doublewrite
    2016-11-29T20:16:53.667106Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
    2016-11-29T20:16:54.027535Z 0 [Note] InnoDB: Created parallel doublewrite buffer at /data/mysql/xb_doublewrite, size 15728640 bytes
    2016-11-29T20:16:54.143416Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
    2016-11-29T20:16:54.143571Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
    2016-11-29T20:16:54.327794Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
    2016-11-29T20:16:54.329814Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
    2016-11-29T20:16:54.339821Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
    2016-11-29T20:16:54.340428Z 0 [Note] InnoDB: Waiting for purge to start
    2016-11-29T20:16:54.390611Z 0 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.7.14-8 started; log sequence number 2491825
    2016-11-29T20:16:54.390991Z 0 [Note] Plugin 'FEDERATED' is disabled.
    2016-11-29T20:16:54.398116Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
    2016-11-29T20:16:54.398160Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
    2016-11-29T20:16:54.410828Z 0 [Warning] CA certificate ca.pem is self signed.
    2016-11-29T20:16:54.410899Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
    2016-11-29T20:16:54.411048Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
    2016-11-29T20:16:54.411161Z 0 [Note] IPv6 is available.
    2016-11-29T20:16:54.411177Z 0 [Note]   - '::' resolves to '::';
    2016-11-29T20:16:54.411221Z 0 [Note] Server socket created on IP: '::'.
    2016-11-29T20:16:54.442001Z 0 [Note] Event Scheduler: Loaded 0 events
    2016-11-29T20:16:54.442265Z 0 [Note] /usr/sbin/mysqld: ready for connections.
    Version: '5.7.14-8-57'  socket: '/data/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release rel8, Revision a3f9d06, WSREP version 26.17, wsrep_26.17
    

    I also have a log from a new 5.6.32 install with a data directory pre-populated via rsync from a standalone Percona Server 5.6.29. It doesn't really offer any additional information but I can provide it as well if needed.

    Both of the 5.6.32 instances are in INIT state and cannot be made to leave it. Trying to force them primary via "set global wsrep_provider_options='pc.bootstrap=true'" results in an error in the mysql client "ERROR 1210 (HY000): Incorrect arguments to SET" and an error in the log file:
    2016-11-29 20:21:05 14339 [Warning] WSREP: Unknown parameter 'pc.bootstrap'

    The 5.7.14 instance will not allow any commands to be executed until you update the temporary superuser password using ALTER USER, but ALTER USER cannot be executed because the node is in INIT state and will not allow any tables to be changed.
  • unixroninunixronin Contributor Current User Role Beginner
    Oh, and I forgot to add ... we are unaware of any infrastructure problem. All of the servers on which we are experiencing problems run Percona Server 5.6.34 or 5.7.16 with no problems whatsoever.
  • unixroninunixronin Contributor Current User Role Beginner
    We found the problem. It was a non-obvious configuration error. All of the correct values and variables were present ... but somehow the values of wsrep_cluster_address and wsrep_sst_receive_address were swapped, and the error got propagated throughout all the other new nodes.

    It is at times like this that I really wish for a robust configuration checker.
  • krunalbauskarkrunalbauskar Advisor Inactive User Role Novice
    Good to know you found the problem.

    Configuration checker may help but in limited fashion as the format looks quite similar except gcomm:// prefix. But yes, we will consider it in due-course.
Sign In or Register to comment.

MySQL, InnoDB, MariaDB and MongoDB are trademarks of their respective owners.
Copyright ©2005 - 2020 Percona LLC. All rights reserved.