
RabbitMQ and RedHat Cluster

April 6, 2013

I’m learning all about my life
By looking through her eyes

(Dream Theater – Through her eyes)

There are a couple of ways to achieve RabbitMQ high availability: you can set up highly available queues, or high availability of the RabbitMQ daemon itself, and you can combine both approaches. The official documentation explains how to set up HA of a single RabbitMQ instance with Pacemaker and DRBD, but since I'm mainly a RedHat Cluster Suite user, that documentation doesn't help much.

To achieve high availability of any service, we must understand how the service works: where it stores its data, and what has to be transferred between machines for the service to continue to function properly.

RabbitMQ consists of configuration files, a server database and a log directory. Configuration files are generally read-only and only matter when the service starts, so they can be distributed among RedHat Cluster members via configuration management like Puppet or a node syncing tool like Csync2. Log files are not essential for the functioning of the service; they are mainly used for debugging and reporting of RabbitMQ actions, so each cluster node can have its own directory for RabbitMQ logs. That leaves us with the server database. It is the only part of a RabbitMQ installation which has to be shared between cluster nodes somehow, and we can use either shared storage or DRBD to achieve this. A cluster-aware filesystem like GFS or OCFS2 is not needed, because the filesystem has to be mounted on the active node only. RabbitMQ sets its nodename to ‘rabbit@`hostname`’ when it starts, and derives MNESIA_DIR from that nodename. Because an HA RabbitMQ instance has to be able to run on multiple machines with exactly the same settings, we have to set NODENAME explicitly too.
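The nodename-to-directory logic is easy to see in a few lines of shell. This is purely illustrative: the MNESIA_BASE value below is the usual package default location, not the shared directory this setup will use.

# Illustration of the derivation described above: the node name defaults
# to rabbit@<short hostname>, and the Mnesia directory is built from it.
NODENAME="rabbit@$(hostname -s 2>/dev/null || echo localhost)"
MNESIA_BASE="/var/lib/rabbitmq/mnesia"   # stock default, before we move it
MNESIA_DIR="${MNESIA_BASE}/${NODENAME}"
echo "Mnesia data would live in: ${MNESIA_DIR}"

Because the directory name embeds the hostname, two cluster members with different hostnames would, by default, look for their data in different places; pinning NODENAME avoids exactly that.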
So, rabbitmq-env.conf looks like this (the IP address is the service's floating IP, left out here):

# Bind to one network interface only
NODE_IP_ADDRESS=

# Nodename should be unique per erlang-node-and-machine combination
NODENAME=rabbit@localhost

# The Mnesia database files directory
MNESIA_BASE=/ha_cluster/rabbitmq
The node IP address has to be set, otherwise RabbitMQ would listen on *:5672. In this case, I've used “rabbit@localhost” as the node name. What is important is that the hostname you choose has a reverse DNS entry in place, otherwise RabbitMQ won't start (at least version 2.6.1, which I use). The directory “/ha_cluster/rabbitmq” (or “/ha_cluster”) is the mountpoint for the HA filesystem, and should be owned by the same user the RabbitMQ service runs as. That's the directory where the Mnesia data will be stored. To make it clearer, here is the snippet from cluster.conf:

<resources>
        <ip address="" monitor_link="1"/>
        <drbd name="drbd_rabbit" resource="drbd_rabbit"/>
        <fs device="/dev/drbd0" fstype="ext4" mountpoint="/ha_cluster/rabbitmq" name="fs_rabbitmq"/>
        <script file="/etc/init.d/rabbitmq" name="rabbitmq"/>
</resources>
<service autostart="1" domain="foobar" name="ha_rabbitmq" recovery="relocate">
        <ip ref=""/>
        <drbd ref="drbd_rabbit">
                <fs ref="fs_rabbitmq">
                        <script ref="rabbitmq"/>
                </fs>
        </drbd>
</service>

In this example, I used DRBD. If you have shared storage at your disposal, that's an option too. In the resources section, I've defined ip, drbd, fs and script resources, and tied them together under a service. Because RedHat Cluster Suite doesn't offer a native cluster agent for RabbitMQ, I just use the standard init script. If you want to run multiple RabbitMQ instances on the same machine, this approach won't work; you would have to have multiple scripts that point to different configuration files, or simply write your own cluster agent and use it as a native resource type.
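For the multiple-instances case, rgmanager only cares that the script's start/stop/status actions return 0 on success and non-zero on failure, so a per-instance wrapper is enough. A rough sketch, where the function name, node name and port are hypothetical examples rather than anything from this setup:

#!/bin/sh
# Hypothetical wrapper for a second RabbitMQ instance on the same box.
# RABBITMQ_NODENAME/RABBITMQ_NODE_PORT give the instance its own identity,
# and rabbitmqctl is pointed at it with -n.
rabbitmq_instance2() {
  case "$1" in
    start)  RABBITMQ_NODENAME=rabbit2@localhost RABBITMQ_NODE_PORT=5673 \
            /usr/sbin/rabbitmq-server -detached ;;
    stop)   /usr/sbin/rabbitmqctl -n rabbit2@localhost stop ;;
    status) /usr/sbin/rabbitmqctl -n rabbit2@localhost status >/dev/null 2>&1 ;;
    *)      echo "Usage: rabbitmq_instance2 {start|stop|status}" >&2
            return 2 ;;
  esac
}

Each such script then becomes its own <script> resource in cluster.conf, so the instances can fail over independently.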

Note that with this setup, when a node fails, both durable queues and the persistent messages within them can be recovered by a different node, but it does impose a delay at failover/failback: RabbitMQ won't be available for a couple of seconds, until the service relocates to another cluster node. If that is not acceptable, you also have to set up an active-active configuration involving multiple RabbitMQ instances.
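To see the failover delay for yourself, you can relocate the service by hand with rgmanager's clusvcadm and time how long the broker is unreachable. The member name 'node2' is a hypothetical example; 'ha_rabbitmq' is the service name from the cluster.conf snippet above. The snippet is guarded so it is a no-op on machines without the cluster tools installed.

# Manually relocate the HA service to another member and time the move.
if command -v clusvcadm >/dev/null 2>&1; then
  time clusvcadm -r ha_rabbitmq -m node2
  clustat            # check where the service ended up
else
  echo "rgmanager tools not installed; run this on a cluster node"
fi
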
