Pacemaker + PRM on two nodes. Replication failed.

DeV1L
I have set up a two-node cluster following the Percona and ClusterLabs manuals.
I configured two resources: ClusterIP and a Master/Slave set for MariaDB 10.
I also enabled replication using GTID.

The resources are configured as follows:
[root@centos-web02 percona]# pcs constraint colocation add  master  ms_MySQL with ClusterIP
[root@centos-web02 percona]# pcs resource op add  p_mysql monitor interval="5s" role="Master" OCF_CHECK_LEVEL="1"
[root@centos-web02 percona]# pcs resource op add  p_mysql monitor interval="2s" role="Slave" OCF_CHECK_LEVEL="1"
[root@centos-web02 ~]# pcs status resources ms_MySQL
 Master: ms_MySQL
  Resource: p_mysql (class=ocf provider=percona type=mysql)
   Attributes: config=/etc/my.cnf pid=/var/run/mariadb/mariadb.pid socket=/var/lib/mysql/mysql.sock replication_user=replication replication_passwd=**** max_slave_lag=60 evict_outdated_slaves=false binary=/usr/sbin/mysqld test_user=test_user test_passwd=**** master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true globally-unique=false target-role=Master is-managed=true
   Operations: start interval=0s timeout=120 (p_mysql-start-interval-0s)
               stop interval=0s timeout=120 (p_mysql-stop-interval-0s)
               monitor interval=20 timeout=30 (p_mysql-monitor-interval-20)
               monitor interval=10 role=Master timeout=30 (p_mysql-monitor-interval-10)
               monitor interval=30 role=Slave timeout=30 (p_mysql-monitor-interval-30)
               promote interval=0s timeout=120 (p_mysql-promote-interval-0s)
               demote interval=0s timeout=120 (p_mysql-demote-interval-0s)
               monitor interval=5s role=Master OCF_CHECK_LEVEL=1 (p_mysql-monitor-interval-5s)
               monitor interval=2s role=Slave OCF_CHECK_LEVEL=1 (p_mysql-monitor-interval-2s)
[root@centos-web02 ~]# pcs status resources ms_MySQL [p_mysql]

I tested a switchover with this command:
[root@centos-web02 ~]# pcs resource move ClusterIP centos-web03

The resources moved and replication kept working.
After switching back, everything still worked fine:
[root@centos-web02 ~]# pcs resource move ClusterIP centos-web02

But when I stopped the cluster on centos-web02, which was the Master, and then started it again, something went wrong.
[root@centos-web02 ~]# pcs cluster stop centos-web02
centos-web02: Stopping Cluster (pacemaker)...
centos-web02: Stopping Cluster (corosync)...
[root@centos-web02 ~]# pcs cluster start centos-web02
centos-web02: Starting Cluster...
[root@centos-web02 ~]# pcs resource
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos-web02
 Master/Slave Set: ms_MySQL [p_mysql]
     Masters: [ centos-web02 ]
     Slaves: [ centos-web03 ]

mysqldbcompare now reports "# Database consistency check failed", and replication has stopped with an error.

#On centos-web02
MariaDB [mysql]> SHOW MASTER STATUS\G
*************************** 1. row ***************************
            File: binlog.000015
        Position: 56096
    Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)

#On centos-web03
MariaDB [mysql]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: centos-web02.simfy.arkadium.com
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000015
          Read_Master_Log_Pos: 56096
               Relay_Log_File: mysql-relay-bin.000002
                Relay_Log_Pos: 534
        Relay_Master_Log_File: binlog.000015
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1062
                   Last_Error: Could not execute Write_rows_v1 event on table arkadium.wp_options; Duplicate entry '28980' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log binlog.000015, end_log_pos 557
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 356
              Relay_Log_Space: 56572
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1062
               Last_SQL_Error: Could not execute Write_rows_v1 event on table arkadium.wp_options; Duplicate entry '28980' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log binlog.000015, end_log_pos 557
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos:
      Replicate_Do_Domain_Ids:
  Replicate_Ignore_Domain_Ids:
                Parallel_Mode: conservative
1 row in set (0.00 sec)
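
For anyone reading the slave status above, the fields that matter here can be pulled out mechanically. The following is only a minimal sketch: summarize_slave_status is a hypothetical helper name, not part of PRM or pcs, and it assumes the `SHOW SLAVE STATUS\G` text is piped in on stdin. The unapplied-bytes figure is only meaningful while Master_Log_File and Relay_Master_Log_File name the same binlog, as they do in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of PRM or pcs): summarize the key fields
# of `SHOW SLAVE STATUS\G` output read from stdin.
summarize_slave_status() {
    awk -F': ' '
        # Field names are right-aligned in \G output; strip the padding.
        { gsub(/^[ \t]+/, "", $1) }
        $1 == "Slave_IO_Running"    { io = $2 }
        $1 == "Slave_SQL_Running"   { sql = $2 }
        $1 == "Last_SQL_Errno"      { errno = $2 }
        $1 == "Using_Gtid"          { gtid = $2 }
        $1 == "Read_Master_Log_Pos" { read_pos = $2 }
        $1 == "Exec_Master_Log_Pos" { exec_pos = $2 }
        END {
            # Assumes both positions refer to the same binlog file.
            printf "io=%s sql=%s errno=%s gtid=%s unapplied_bytes=%d\n",
                   io, sql, errno, gtid, read_pos - exec_pos
            # Exit 0 only when both replication threads are running.
            exit !(io == "Yes" && sql == "Yes")
        }
    '
}
```

It could be used as `mysql -e 'SHOW SLAVE STATUS\G' | summarize_slave_status`. Against the output above it would flag that the SQL thread is stopped with error 1062 and that Using_Gtid is No, even though GTID replication was meant to be enabled.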

Do I understand correctly that the cluster should not break in this scenario? Have I done something wrong?
My goal is automatic failover.