
Linux Cluster Part 1 – Install Corosync and Pacemaker on CentOS 6

I have been using the Linux Cluster Engine called Corosync and the Linux Cluster Resource Manager called Pacemaker for a while now and must say I am very satisfied with them. Corosync and Pacemaker combined can turn your Linux boxes into a Linux High Availability Cluster.

A Corosync and Pacemaker Linux Cluster of course supports both Active/Passive and Active/Active modes on multiple nodes!

Linux Cluster (source: clusterlabs.org)

This is the first part of my “Linux Cluster” series of posts.

Corosync

Corosync is an open source Cluster Engine. It is actually a Communication System that enables two or more Linux Cluster nodes to transfer information between them. Corosync constantly listens on a configured port number where the Linux Cluster nodes send information. The Corosync Communication System enables all of the nodes to know the exact state of each other at all times. In case one of the Linux Cluster nodes fails, this information is immediately transferred to the other, still existing Linux Cluster nodes.

Pacemaker

Pacemaker is an open source high availability Resource Manager. As the name says, Pacemaker manages resources. Pacemaker enables detection and recovery of application and machine failures. Pacemaker holds the configuration of all the Resources the Linux Cluster will manage, as well as all the relations between the Machines and Resources. In case one of the Linux Cluster nodes fails, Pacemaker will detect this and start the configured Resources on one of the other available Linux Cluster nodes.

Let’s learn how to Install and Configure Linux Cluster!

In the following steps we will configure a two-node Linux Cluster; a multiple-node Linux Cluster is also possible with Corosync and Pacemaker.

1. DNS resolution

Make sure you have successfully set up DNS resolution and NTP time synchronization for both your Linux Cluster nodes.
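
If you do not run a DNS server, entries in “/etc/hosts” on both nodes are enough for a lab setup. A minimal sketch, using the node names that appear later in this post; 192.168.1.101 for the second node is an assumption, so adjust both addresses to your environment:

/etc/hosts

192.168.1.100   foo1.geekpeek.net   foo1
192.168.1.101   foo2.geekpeek.net   foo2

You can quickly verify name resolution with “getent hosts foo2.geekpeek.net” and check time synchronization with “ntpdate -q pool.ntp.org” (a query-only check against a public NTP pool).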

2. Add repository

Add the HA-Clustering Repository from openSUSE on both nodes! You will need this Repository to install the CRM Shell, which is used to manage Pacemaker resources:

/etc/yum.repos.d/ha-clustering.repo

[haclustering]
name=HA Clustering
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/
enabled=1
gpgcheck=0
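
Before installing, you can verify that the repository is active; the repo id should appear in the output on both nodes:

[root@foo1 /]# yum repolist enabled | grep haclustering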

3. Install packages

Install Corosync, Pacemaker and CRM Shell. Run this command on both Linux Cluster nodes:

/usr/bin/yum install pacemaker corosync crmsh -y
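
When the installation finishes, a quick sanity check confirms that all three packages are present on each node (reported versions will vary):

[root@foo1 /]# rpm -q corosync pacemaker crmsh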

4. Create configuration

Create the Corosync configuration file, which must be located in the “/etc/corosync/” folder. You can copy/paste the following configuration and edit “bindnetaddr: 192.168.1.100” to the IP address of your first Linux Cluster node:

/etc/corosync/corosync.conf

compatibility: whitetank

aisexec {
    # Run as root - this is necessary to be able to manage resources with Pacemaker
    user: root
    group: root
}

service {
    # Load the Pacemaker Cluster Resource Manager
    ver: 1
    name: pacemaker
    use_mgmtd: no
    use_logd: no
}

totem {
    version: 2
    #How long before declaring a token lost (ms)
        token: 5000
    # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 10
    # How long to wait for join messages in the membership protocol (ms)
        join: 1000
    # How long to wait for consensus to be achieved before starting a new
    # round of membership configuration (ms)
        consensus: 7500
    # Turn off the virtual synchrony filter
        vsftype: none
    # Number of messages that may be sent by one processor on receipt of the token
        max_messages: 20
    # Stagger sending the node join messages by 1..send_join ms
        send_join: 45
    # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes
    # Disable encryption
        secauth: off
    # How many threads to use for encryption/decryption
        threads: 0
    # Optionally assign a fixed node id (integer)
    # nodeid: 1234

        interface {
            ringnumber: 0
            # The following values need to be set based on your environment
                bindnetaddr: 192.168.1.100
                mcastaddr: 226.94.1.1
                mcastport: 5405
                ttl: 1
        }
    }

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on

    logger_subsys {
        subsys: AMF
        debug: off
    }
}

amf {
    mode: disabled
}

Copy the Corosync configuration file to the second Linux Cluster node and edit “bindnetaddr: 192.168.1.100” to the IP address of your second Linux Cluster node.
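
A quick way to do this, assuming 192.168.1.101 is the IP address of your second node (adjust to your environment):

[root@foo1 /]# scp /etc/corosync/corosync.conf root@foo2.geekpeek.net:/etc/corosync/
[root@foo2 /]# sed -i 's/bindnetaddr: 192.168.1.100/bindnetaddr: 192.168.1.101/' /etc/corosync/corosync.conf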

5. Generate Auth Key

Generate the Corosync Authentication Key by running “corosync-keygen” (this might take some time!). The key is written to the “/etc/corosync” directory, in a file named “authkey”:

[root@foo1 /]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 176).
Press keys on your keyboard to generate entropy (bits = 240).
Press keys on your keyboard to generate entropy (bits = 304).
Press keys on your keyboard to generate entropy (bits = 368).
Press keys on your keyboard to generate entropy (bits = 432).
Press keys on your keyboard to generate entropy (bits = 496).
Press keys on your keyboard to generate entropy (bits = 560).
Press keys on your keyboard to generate entropy (bits = 624).
Press keys on your keyboard to generate entropy (bits = 688).
Press keys on your keyboard to generate entropy (bits = 752).
Press keys on your keyboard to generate entropy (bits = 816).
Press keys on your keyboard to generate entropy (bits = 880).
Press keys on your keyboard to generate entropy (bits = 944).
Press keys on your keyboard to generate entropy (bits = 1008).
Writing corosync key to /etc/corosync/authkey.


Transfer the “/etc/corosync/authkey” file to the second Linux Cluster node.
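
For example, using scp with “-p” to preserve the restrictive permissions that corosync-keygen sets on the key file:

[root@foo1 /]# scp -p /etc/corosync/authkey root@foo2.geekpeek.net:/etc/corosync/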

6. Start Corosync service on both nodes:

[root@foo1 /]# service corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@foo2 /]# service corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]

7. Start Pacemaker service on both nodes:

[root@foo1 /]# service pacemaker start
Starting Pacemaker Cluster Manager:                        [  OK  ]
[root@foo2 ~]# service pacemaker start
Starting Pacemaker Cluster Manager:                        [  OK  ]
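
Optionally, enable both services at boot on both nodes so the cluster comes back up after a reboot:

[root@foo1 /]# chkconfig corosync on
[root@foo1 /]# chkconfig pacemaker on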

8. Check cluster status

After a few seconds you can check your Linux Cluster status with the “crm status” command:

[root@foo1 /]# crm status
Last updated: Thu Sep 19 15:28:49 2013
Last change: Thu Sep 19 15:11:57 2013 via crmd on foo2.geekpeek.net
Stack: classic openais (with plugin)
Current DC: foo1.geekpeek.net - partition with quorum
Version: 1.1.9-2.2-2db99f1
2 Nodes configured, 2 expected votes
0 Resources configured.

Online: [ foo1.geekpeek.net foo2.geekpeek.net ]


As we can see, the status says 2 nodes are configured in this Linux Cluster – foo1.geekpeek.net and foo2.geekpeek.net. Both nodes are online, and the current DC is foo1.geekpeek.net.
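
If the nodes do not see each other (each node lists only itself as online), a firewall blocking Corosync traffic is the most likely cause, as HeikoW points out in the comments below. Opening the Corosync UDP ports on both nodes might look like this; remember to save the rules so they survive a restart:

[root@foo1 /]# iptables -I INPUT -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT
[root@foo1 /]# /etc/init.d/iptables save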

NEXT STEP is to configure Pacemaker resources (applications, IP addresses, mount points) in the cluster. I will be covering Pacemaker resource configuration in the next post. Stay tuned!

Here’s my latest book about High Availability on CentOS Linux

  • Pingback: Linux Cluster Part 2 - Adding and Deleting Cluster Resources | GeekPeek.Net

    • Amr

      I really appreciate your efforts
      thank you

      • Mitch

        Thanks Amr!

  • HeikoW

    Good article.
    Maybe also mention that the firewall should be off, or that multicast traffic must be allowed, otherwise the nodes will not see each other.

    Allow multicast through the firewall on both nodes:
    # iptables -I INPUT -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT

    • Mitch

      Hey HeikoW!
      Thanks for your comment! I can read German but can't write German well enough, so I will respond to your comment in English 🙂

      I am really happy you reminded me about the possible Firewall issues. It seems I totally forgot to put it in my post even though I wanted to, my bad! I will update the post and add additional information on Firewall configuration!

      Thanks again for your help and input!

      Regards,
      Mitch

      • prakash chawla

        Hey,
        first of all, thank you so much for the PXE Boot configuration.
        I am currently going to be working on clustering. I am actually kind of a dummy, but the information you provided is very helpful.
        Please create a single part covering only the configuration file for clustering.
        Thank you,
        Prakash Chawla,
        University of Rajasthan, India

  • HeikoW

    Sorry for my German comment, now in English.
    Good article.
    Maybe mention that the firewall should be off, or that you must allow multicast traffic, otherwise you can not see the other node.
    To enable multicast on both nodes:
    # iptables -I INPUT -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT
    Don't forget to save the firewall rule. If you don't, the rules get lost after a restart.
    # /etc/init.d/iptables save
    Regards, HeikoW

  • Pingback: Linux Cluster Part 3 - Manage Cluster Nodes and Resources | GeekPeek.Net

  • phyo

    Hi, thanks for the great tutorial. Just want to confirm with you:

    at “bindnetaddr:” do we have to put the IP address of the first Linux server or the network address?

    • Mitch

      Hi Phyo! As bindnetaddr you have to put the IP address of the Linux Cluster node on which you are editing the configuration file. If editing corosync.conf on the first cluster node, put in the IP address of this (first) server, and if editing corosync.conf on the second cluster node, put in the IP address of the second server.

      Regards,
      Mitch

      • Can we not use the network address, xxx.xxx.xxx.0?

  • Pingback: How to Install DHCP Server on CentOS 6 | GeekPeek.Net

  • Mauricio

    Excelente aporte gracias


  • Chris Paul

    When trying to start pacemaker, I get this error in the log: “error: find_corosync_variant: Corosync is running, but Pacemaker could not find the CMAN or Pacemaker plugin loaded” Any ideas?

  • Courtney Campbell

    Not sure why you are generating a key when you have secauth: off. Also, it looks like crm is gone in RHEL 6.4 and up.

  • sluge

    RHEL 6.6 doesn’t have crm utility anymore. Now it has crm_mon, etc.

  • Nicholas Allard

    Packages skipped because of dependency problems:

    clusterlib-3.0.12.1-73.el6_7.2.x86_64 from updates
    libesmtp-1.0.4-15.el6.x86_64 from base
    pacemaker-1.1.12+git20140723.483f48a-1.1.x86_64 from haclustering
    pacemaker-cli-1.1.12+git20140723.483f48a-1.1.x86_64 from haclustering
    pacemaker-libs-1.1.12+git20140723.483f48a-1.1.x86_64 from haclustering
    1:perl-TimeDate-1.16-13.el6.noarch from base

    Unable to install pacemaker on a fresh install of CentOS 6.7?

    • Volodymyr Ivanets

      Hello Nicholas,

      I have recently started learning the Corosync/Pacemaker/CRMSH software. I decided to use my favorite CentOS 6 and faced the same problems as you. From what I have experienced so far, I had to compile Corosync and Pacemaker manually. Although the CRMSH web site states “We try to build Red Hat / Fedora-compatible RPM packages on the OBS (see above)” and the provided package installs successfully, I had problems down the road on both CentOS 6 and 7. Here is the link: https://github.com/ClusterLabs/crmsh/issues/130

  • Shreyash Saurabh

    While trying to download Corosync on Ubuntu, I am coming across an error which says libknet not found. Does anyone know how to resolve this problem?

  • raisharadha

    While I start Pacemaker I get FAILED. Any idea why this could happen?

    ~]# service pacemaker start
    Starting Pacemaker Cluster Manager [FAILED]