heartbeat failover master-master active-passive failover

I’m trying to get an idea of how Heartbeat (http://www.linux-ha.org/Heartbeat) works.
Specifically, I’m thinking of using it to manage master-master replication (active-passive) for handling failover.

I want to know how it recovers, i.e. how it switches the passive master to the active one.
From the description here (http://www.linux-ha.org/GettingStarted) I gather it changes an IP address or something similar.
What manual process must we follow to promote the passive master to active?

  1. Stop writes on the active server.
     • FLUSH TABLES WITH READ LOCK.
     • Kill all client connections.
     • SHOW MASTER STATUS on the active server.
     • Note the binary log coordinates.
  2. Make sure the passive master is in sync with the active master.
  3. SET @@global.read_only := 0 on the passive server, allowing writes.
  4. Reconfigure your applications to write to the newly active server.
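The "make sure the passive master is in sync" step can be sketched in Python as a comparison of the coordinates noted from SHOW MASTER STATUS on the active server with what SHOW SLAVE STATUS reports on the passive one. This is only a minimal sketch: the function names are made up for illustration, and it assumes the two result rows have already been fetched as dictionaries (for example through a dict-style cursor), so no database connection is opened here.

```python
from typing import Any, List, Mapping


def passive_caught_up(master_status: Mapping[str, Any],
                      slave_status: Mapping[str, Any]) -> bool:
    """Return True if the passive server has executed all events up to
    the binary log coordinates noted on the active server.

    master_status: one row of SHOW MASTER STATUS from the active server.
    slave_status:  one row of SHOW SLAVE STATUS from the passive server.
    """
    same_file = master_status["File"] == slave_status["Relay_Master_Log_File"]
    same_pos = int(master_status["Position"]) == int(slave_status["Exec_Master_Log_Pos"])
    return same_file and same_pos


def promotion_statements(master_status: Mapping[str, Any],
                         slave_status: Mapping[str, Any]) -> List[str]:
    """Return the SQL to run on the passive server to allow writes,
    refusing to promote while it is still behind."""
    if not passive_caught_up(master_status, slave_status):
        raise RuntimeError("passive master has not caught up yet")
    return ["SET @@global.read_only := 0"]


# Example rows (hypothetical values, as a dict cursor might return them).
noted_on_active = {"File": "mysql-bin.000042", "Position": 1337}
seen_on_passive = {"Relay_Master_Log_File": "mysql-bin.000042",
                   "Exec_Master_Log_Pos": "1337"}
print(promotion_statements(noted_on_active, seen_on_passive))
```

Comparing both the file name and the position matters, because the position alone restarts in every new binary log file.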

Please correct me if I’m wrong about the above process.

Suppose your servers are 10.0.0.1 and 10.0.0.2. Heartbeat then adds another IP (e.g. 10.0.0.3) to the master. When the master goes down, the slave takes over this IP, so you always connect to 10.0.0.3.
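To make the idea concrete, here is a toy Python model of the floating-IP behaviour. It is not how Heartbeat is implemented and does not touch any network interface; the class name VipCluster and its methods are made up purely for illustration, and the addresses are the ones from the example.

```python
from typing import Dict, List, Optional


class VipCluster:
    """Toy model: one virtual IP that is always held by a live node."""

    def __init__(self, vip: str, nodes: List[str]) -> None:
        self.vip = vip
        # Node order doubles as takeover preference.
        self.alive: Dict[str, bool] = {node: True for node in nodes}
        self.owner: Optional[str] = nodes[0]

    def mark_down(self, node: str) -> None:
        """Record that a node died; move the VIP if that node held it."""
        self.alive[node] = False
        if node == self.owner:
            self._move_vip()

    def _move_vip(self) -> None:
        """Give the VIP to the first node that is still alive, if any."""
        for node, up in self.alive.items():
            if up:
                self.owner = node
                return
        self.owner = None

    def resolve(self, address: str) -> Optional[str]:
        """Return the real server currently answering on the VIP."""
        if address != self.vip:
            raise ValueError("clients are expected to use the virtual IP")
        return self.owner


cluster = VipCluster("10.0.0.3", ["10.0.0.1", "10.0.0.2"])
before = cluster.resolve("10.0.0.3")   # served by 10.0.0.1
cluster.mark_down("10.0.0.1")          # the master goes down
after = cluster.resolve("10.0.0.3")    # now served by 10.0.0.2
print(before, "->", after)
```

The point of the model is that the application never changes its connection address; only the server behind 10.0.0.3 changes.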

How correct are the manual steps? Any suggestions?

Master-Master with Heartbeat is a pretty common pattern. We’ve been using it in production for a while with success.

The way I learned it was to install Heartbeat from my Linux distro's package management system (in my case Debian Stable). Then I set up a test cluster of two machines, played with it, and kept a journal of my own experiments. In Heartbeat, use the CRM if you can, and find the hb_gui program to manage the cluster. Become conversant with the command-line tool and the CRM shell too (if available). The linux-ha.org website is in a state of flux at the moment, as the project name and scope are changing to “Pacemaker” (http://clusterlabs.org/wiki/Main_Page). For Heartbeat 2.1.x consult: http://clusterlabs.org/mediawiki/images/7/7d/Configuration_Explained_0.6.pdf

The Heartbeat developers hang out in #linux-ha on FreeNode IRC; you can ask for more advice there. They will tell you to pick a distro where you’re NOT using v2.1.x, but I’ve found it works.

Due to various issues with replication slaves, though, I’m looking at DRBD rather than replication as a possibly lower-maintenance way to achieve high availability:
http://dev.mysql.com/doc/refman/5.0/en/faqs-mysql-drbd-heartbeat.html
http://www.mysql.com/why-mysql/white-papers/mysql_wp_drbd.php

Looks like Baron doesn’t think so though: http://www.xaprb.com/blog/2008/08/06/how-to-scale-writes-with-master-master-replication-in-mysql/

Hope that helps,
Imran