Monday, December 22, 2014

Getting started: Cassandra + Spark with Vagrant

I play with a lot of different technologies and I like to keep my workstations clean. I do this by having a lot of Vagrant VMs. My latest is Apache Spark with Apache Cassandra. We're going to install a working setup of Cassandra/Spark using Vagrant and Ansible. The Vagrant/Ansible setup is on GitHub here.

To get going you'll need:
If you haven't used Ansible before, ignore the paid-for Ansible Tower and install Ansible with your favourite package manager, e.g. Homebrew or apt.
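Before going further it can help to confirm the tools are actually on your PATH. The snippet below is only a sketch: the tool list (vagrant, VBoxManage, ansible-playbook) is my assumption, based on using VirtualBox as the Vagrant provider, and missing_tools is a hypothetical helper name rather than anything from the repository.

```shell
# Print the name of each listed command that is not on PATH.
# (missing_tools is a hypothetical helper, not part of the repository.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed tool list; adjust it to match what the repository really needs.
missing=$(missing_tools vagrant VBoxManage ansible-playbook)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All prerequisite tools found."
fi
```

If anything is reported as missing, install it with your package manager and run the check again.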

Once that's installed, check out the Vagrantfile.

Then launch the VM with vagrant up. This can take some time as it actually installs:
  • Java
  • Cassandra
  • Spark
  • Spark Cassandra connector
I could have baked a VirtualBox image with all this in, but the Ansible also documents how you install all of these (for you, and for me once I've forgotten). As well as being slow, this approach has the disadvantage that it downloads Cassandra/Spark, so if their repositories are down it won't work.
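If provisioning fails part-way because a download was temporarily unavailable, re-running the provisioner with vagrant provision is often enough. Here is a minimal sketch of a retry wrapper for that; the retry function and the RETRY_DELAY variable are hypothetical names of mine, not part of the repository.

```shell
# retry N CMD...: run CMD up to N times, stopping at the first success.
# Waits RETRY_DELAY seconds (default 30) between attempts.
retry() {
  attempts=$1
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    echo "attempt $i of $attempts failed: $*" >&2
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "${RETRY_DELAY:-30}"
    fi
  done
  return 1
}

# Usage, from the directory containing the Vagrantfile:
#   vagrant up || retry 3 vagrant provision
```

This only helps with transient failures; if a repository is down for longer you'll have to wait or point the Ansible at a different mirror.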

The VM runs on port Your Spark master should be up and running on

You'll also have OpsCenter installed at:

To add the cluster simply click "Add existing cluster.." then enter the IP

If you want to use cqlsh, simply "vagrant ssh" in and then run "cqlsh".

To get Spark shell up and running, just "vagrant ssh" in and then run the spark-shell command:

Spark shell has been aliased to include the Spark Cassandra connector, so you can start using Cassandra-backed RDDs right away!

Any questions or problems just ping me on twitter: @chbatey


Luis said...

Thank you for all your work!

I was wondering if this method was still available. When I try to run the vagrant up command I get "host not found".
