Thursday, July 31, 2014

Rabbit MQ vs Apache QPID - picking your AMQP broker

Two of the top AMQP brokers are RabbitMQ and QPID. Which one is for you? Here are some of the considerations:

Both have the same programming model as you can connect with any AMQP compatible client, so how do you pick?

First off you need to decide if you're interested in the Java QPID broker or the C++ QPID broker. They have the same name but they are quite different. This article will compare the C++ QPID broker.


RabbitMQ has been developed in Erlang, and thus you need to install Erlang before you can run it. Think of this as the equivalent of needing a JVM for running something like Tomcat. Erlang is in most Linux distros packaging system so this is easy.

QPID C++ broker doesn't have a storage dependency like its Java equivalent. To enable any form of HA QPID depends on rgmanager.


RabbitMQ uses the built in Erlang database: Mnesia. A Mnesia database can be configured to either be in RAM or disk, which allows RabbitMQ to offer in memory / disk based queues very easily. It also means the developers of RabbitMQ haven't had all the complexities of developing a file based storage mechanism and don't rely on a database.

QPID stores its queues in memory or database. Without looking at the code that is all we're going to learn. If you see QPID referencing databases then you are looking at the QPID Java broker.

HA Queues and Failover

RabbitMQ runs in an active-passive mode for HA queues. Each queue has a master and writes are replicated to slaves. By default, RabbitMQ doesn't try and solve network partition within clusters. It expects a reliable LAN and should never be used across a WAN. However it has two optional ways of dealing with network partitions: pause_minority and autoheal. If you enable pause minority a rabbit in a cluster will disable its self by disconnecting clients if it can only communicate with a minority of nodes. The idea here is that clients will reconnect to the majority part of the cluster. Autoheal lets the nodes carry on as normal and when the network partition ends resets the nodes in the side that had the fewest client connections, how brutal!

QPID has overwhelming number of options when it comes to HA. Lets stick to in LAN HA. There you have active-passive messaging clusters. Here queues are replicated to a passive QPID node, then when the master fails you can either move a virtual IP or configure the client with both the master and the salve and it will failover. The big caveat is that QPID won't handle failover its self, you need a cluster manager like Rgmanager. The rational is that they want to avoid split brain, I'd rather have the functionality built in and not use it if I need to worry about split brain.

Cross DC? WAN compatible?

Who runs in a single DC these days? Not me! Most queueing solutions shy away from the WAN connection. Most have a clustering mode that will only work in a reliable LAN. When it comes to getting your data from one datacenter to another it is usually a bolt on.

RabbitMQ has two plugins to handle this. Lets take them individually. The Shovel plugin allows you to take messages from a queue in one cluster and put it on an exchange in another, perfect! It can be configured to create the queues/exchanges and handles connection failures. You can set properties like the ack mode depending on how resilient you want the putting of messages to be. You can also list multiple brokers so you can have it failover to another broker if you lose connection to the original. A great feature of shovels is that you can define them on every node of your cluster and it will run on a single node, if that node fails then it will start on another node.

RabbitMQ also has the federation plugin. This plugin allows you to receive messages from exchanges and queues from other brokers. It achieves the same goal as the shovel plugin but is slightly less configurable.

QPID has a very similar feature to shovels called broker federation where you define a source queue on one broker and a destination exchange on a broker running in a different exchange. Like RabbitMQ these "routes" can be replicated on a cluster of brokers and if the broker that is executing the route goes down another in the cluster can take its place.

It is a draw! However I would always consider a simpler option for cross WAN queueing. That is get the message in a queue in the datacenter is is produces but use a consumer from an application running in the other datacenter. After all it is your application that needs the message and is best places to handle failures.


RabbitMQ is far simpler to get going with, primarily due to fantastic documentation and very easy install process. I also like the the fact RabbitMQ uses Erlang's in built database and internode communication mechanism. The two modes for network partitions are very nice also. So RabbitMQ wins this tight race.

However, before you go and use RabbitMQ, checkout Apache Kafka as the wildcard!

Thursday, July 10, 2014

Installing Cassandra and Spark with Ansible

I am currently doing a proof of concept with Spark and Cassandra. I quickly need to be able to create and start Cassandra and Spark clusters. Ansible to the rescue!

I split this my ansible playbook into three roles:
  • Cassandra
  • Ops center
  • Spark
My main playbook is very simple:

I have some hosts defined in a separate hosts file called m25-cassandra. I've decided to install htop, I could have out this in a general server role.

I also define a few variables, these of course course could be defined else where per role:
  • cluster_name - this will replace the cluser name in each of the hosts cassandra.yaml
  • seeds - as above
So lets take a look at each role.


Here are the tasks:

This is doing the following:
  • Installing a JRE
  • Adding the Apache Cassandra debian repository
  • Adding the keys for the debian repository
  • Installing the latest version of Cassandra
  • Replacing the cassandra.yaml (details later)
  • Ensuring Cassandra is started
The template cassandra.yaml uses the following variables:
  • cluster_name: '{{ cluster_name }}' - So we can rename the cluster
  • - seeds: "{{ seeds }}" - So when we add a new node it connects to the cluster
  • listen_address: {{ inventory_hostname }} - Listen on the nodes external IP so other nodes can communicate with it
  • rpc_address: {{ inventory_hostname }} - So we can connect ops center and cqlsh to the nodes
Magic! Now adding new hosts to my hosts file with the tag m25_cassandra will get Cassandra installed, connected to the cluster and started.

Ops Center

The tasks file for ops center:

This is doing the following:
  • Adding the Datastax community debian repository
  • Adding the key for the repo
  • Installing Ops Center
  • Starting Ops Center
No templates here as all the default configuration is fine.


The spark maven build can build a debian package but I didn't find a public debian repo with it in so the following just downloads and unzips the Spark package:

I start the workers using the script from my local master do don't need to start anything on the nodes that have Cassandra on.


Ansible makes it very easy to install distributed systems like Cassandra. The thought of doing it manually fills me with pain! This is just got a PoC, I don't suggest downloading Spark from the public internet or always installing the latest version of Cassandra for your production systems. The full souce including templates and directory structure is here.