Saturday, May 24, 2014

Introducing Scassandra: Testing Cassandra with ease

Scassandra (Stubbed Cassandra) is a new open source tool for testing applications that interact with Cassandra.

It is intended to be used for:
  • Unit testing DAOs that interact with Cassandra
  • Acceptance testing applications that interact with Cassandra
The first release is primarily aimed at Java developers. Subsequent releases will be aimed at people writing black box acceptance tests.

It works by implementing the server side of the CQL binary protocol, meaning that Cassandra drivers, such as the DataStax Java Driver, believe they are talking to a real Cassandra.

The original motivation for developing Scassandra was to test edge case / failure scenarios. However, it quickly became apparent that it is just as useful for regular happy path unit tests.

So how does it work?


When you start Scassandra it opens two ports:
  • Binary port: this is the port you'll configure your application to connect to, in place of the binary port of a real Cassandra.
  • Admin port: this is for Scassandra's REST API, which is used for priming and for retrieving the list of executed queries.
By default, when your application connects to Scassandra and executes queries, Scassandra will respond with a result containing zero rows.

Then, via the Scassandra Java client (and, in later releases, the REST API directly), you can prime Scassandra to return rows, specifying the column types and values, or to respond with read request timeouts, write request timeouts and unavailable exceptions.

After you've run your tests you can verify the queries and prepared statements that your application has executed, including the consistency level each was executed with. So if you are required to execute certain writes at a high consistency while other queries can run at a lower one, this can be tested via black box acceptance tests.
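
To make the prime, query and verify cycle concrete, here is a small in-memory model in plain Java. It is only a sketch of the behaviour described above, not Scassandra's API: StubServer, prime, execute, executedQueries and clear are hypothetical names invented for this example, and the real tool does all of this over the CQL binary protocol and its REST admin port.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the stub's behaviour; not the Scassandra API.
class StubServer {
    // Primed responses, keyed by the exact query text.
    private final Map<String, List<Map<String, Object>>> primes = new HashMap<>();
    // Every query the "application" has executed, in order.
    private final List<String> executed = new ArrayList<>();

    // Prime: when this query text is executed, return these rows.
    void prime(String query, List<Map<String, Object>> rows) {
        primes.put(query, new ArrayList<>(rows));
    }

    // Execute: record the query, then return the primed rows,
    // or zero rows if nothing has been primed for it.
    List<Map<String, Object>> execute(String query) {
        executed.add(query);
        List<Map<String, Object>> rows = primes.get(query);
        return rows == null ? Collections.<Map<String, Object>>emptyList() : rows;
    }

    // Verify: expose what was executed so a test can assert on it.
    List<String> executedQueries() {
        return Collections.unmodifiableList(executed);
    }

    // Reset between tests so primes and activity do not leak across them.
    void clear() {
        primes.clear();
        executed.clear();
    }
}
```

An unprimed query returns an empty result, a primed one returns the configured rows, and both show up in the recorded activity. Those are the three behaviours a test relies on.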

The benefits of Scassandra over testing against a real Cassandra instance:
  • Test failures deterministically. Against a real Cassandra you would need a multi-node cluster and the ability to bring nodes down.
  • Test the consistency of queries. This came up at my workplace, where the requirement was that most queries could be downgraded to a lower consistency when there were failures, but certain important writes had to be executed at QUORUM.
  • Have fast running DAO tests that don't require mocking out driver classes or a real Cassandra running.

So how do I use Scassandra?


The first release of Scassandra is aimed at Java developers. Scassandra comes in two parts:

  • Scassandra server. This is a Scala application that has been published to Maven Central with a POM that will bring in its transitive dependencies.
  • Scassandra Java client. A thin wrapper written in Java to make using Scassandra from Java tests easy. This has methods to start/stop Scassandra and classes that prime / retrieve the list of executed queries.

For the first release it is expected that Scassandra will only be used via the Java client, and that no one will use it as a standalone executable or interact with the REST API directly.

To get started with the Scassandra Java client, go here, or check out the example project here.

If you aren't using Java, or a language that can easily use the Java client, then the next release will be for you. We'll build a standalone executable, and from that release on we'll keep the REST API backward compatible, as we expect people to use it directly.

It is all open source on GitHub and you can find the Scassandra server here.

The Java client is here.

And all the details of the REST API, e.g. how to prime, are on the Scassandra server website here.

Using Stubbed Cassandra: Unit testing Java applications

My first article on Scassandra introduced what it is and why I've made it.

This article describes how to use Scassandra to help unit test a Java class that stores and retrieves data from Cassandra.

It assumes you're using a tool that can download dependencies from Maven Central, e.g. Maven, Gradle or SBT.

First add Scassandra as a dependency. It is in Maven Central, so you can add it to your pom with the following XML:

<dependency>
  <groupId>org.scassandra</groupId>
  <artifactId>java-client</artifactId>
  <version>0.2.1</version>
</dependency>

Or the following entry in your build.gradle:

dependencies {
    compile('org.scassandra:java-client:0.2.1')
}

There are four important classes you'll deal with from Java:
  • ScassandraFactory - used to create instances of Scassandra
  • Scassandra - interface for starting/stopping Scassandra and getting hold of a PrimingClient and an ActivityClient
  • PrimingClient - sends priming requests to Scassandra's RESTful admin interface
  • ActivityClient - retrieves all the recorded queries and prepared statements from the Scassandra RESTful admin interface

The PrimingClient and ActivityClient have been created to ease integration for Java developers. Otherwise you would need to construct JSON and send it over HTTP to Scassandra.

You can start one Scassandra instance per unit test class and clear all primes and recorded activity between tests.

To start Scassandra before your tests run, add a @BeforeClass method that creates and starts it.

You can also add an @AfterClass method to shut Scassandra down.
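
As a rough sketch of that lifecycle, the block below declares a local stand-in for the Scassandra interface so that it compiles without the java-client artifact or JUnit on the classpath. The method names start() and stop(), the FakeScassandra class and the hook method names are assumptions made for illustration. In a real test the instance would come from ScassandraFactory and the two hooks would be no-argument methods annotated with JUnit's @BeforeClass and @AfterClass.

```java
// Local stand-in for the library's Scassandra interface so the sketch compiles
// on its own. start() and stop() are assumed names, not confirmed signatures.
interface Scassandra {
    void start();
    void stop();
}

// Trivial fake used here in place of whatever ScassandraFactory would return.
class FakeScassandra implements Scassandra {
    boolean running;

    public void start() { running = true; }

    public void stop() { running = false; }
}

class PersonDaoTest {
    private static Scassandra scassandra;

    // With JUnit 4 this would be a no-argument @BeforeClass method:
    // it runs once, before any test in the class.
    static void startScassandraServerStub(Scassandra server) {
        scassandra = server;
        scassandra.start();
    }

    // With JUnit 4 this would be annotated @AfterClass:
    // it runs once, after every test in the class has finished.
    static void shutdown() {
        if (scassandra != null) {
            scassandra.stop();
        }
    }
}
```

Starting the stub once per class keeps the tests fast. The cost is that each test then has to clear primes and recorded activity itself so that it does not see state left behind by another test.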

Now that you have Scassandra running, let's write a test. Perhaps you want to test a simple Java DAO that connects to Cassandra and executes a query.

And you have a backing table like:

CREATE TABLE person (
  id int,
  first_name text,
  PRIMARY KEY (id)
)

Let's TDD the DAO using Scassandra, starting with a test for our connect method.

Let's look at what the test is doing:
  • Line 4: Tells the activity client to clear all recorded connections. This stops connections made by other tests from interfering with this one.
  • Line 6: We call connect on our PersonDao.
  • Line 8: We call retrieveConnections on the activity client and expect there to be at least one connection. The DataStax Java driver makes multiple connections on startup, so you can't assert that there is exactly one.
This fails with the following message:

java.lang.AssertionError: Expected at least one connection to Cassandra on connect
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at com.batey.examples.scassandra.PersonDaoTest.shouldConnectToCassandraWhenConnectCalled(PersonDaoTest.java:94)

Now let's write some code to make it pass, by having connect open a connection to Scassandra's binary port.

Now let's test that the retrieveNames method gets all the first_name values out of the person table.

The test primes Scassandra to return a single row with the column first_name set to the value Chris. We expect our DAO to turn that into a List of strings containing Chris. To make this pass we need to execute a query and convert the ResultSet into a list of names.
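
As a rough sketch of the conversion step only, the block below walks a sequence of rows and collects the first_name column. Row here is a local stand-in declared so the example compiles without the driver; it mirrors the getString(String) accessor on the DataStax driver's Row, and the driver's ResultSet can be iterated row by row in the same way. NameExtractor and firstNames are hypothetical names, and executing the query itself is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for a driver row; only the accessor this sketch needs.
interface Row {
    String getString(String columnName);
}

class NameExtractor {
    // Collects the first_name column of every row, preserving row order.
    static List<String> firstNames(Iterable<Row> rows) {
        List<String> names = new ArrayList<>();
        for (Row row : rows) {
            names.add(row.getString("first_name"));
        }
        return names;
    }
}
```

With the single primed row from the test, this produces a list containing only Chris. With Scassandra's default of zero rows, it produces an empty list.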

Next, let's say you have the requirement that you really must not get an out-of-date list of names, so you want to test that the query is executed at consistency QUORUM. You can test it like this:

Let's look at what each line is doing:
  • Line 4 builds the expected query; note that the consistency is also set. If you build a Query without a consistency it defaults to ONE.
  • Line 7 clears all the recorded activity so that another test does not interfere with this one. It also clears the queries that were executed as part of connect (the DataStax Java driver issues quite a few queries on the system keyspace on startup).
  • Line 11 retrieves all the queries your application has executed.
  • Line 12 verifies that the expected query built on Line 4 has been executed.
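
The verification step works because the expected query is compared on both its text and its consistency. The sketch below models that with a hypothetical value class. RecordedQuery and ConsistencyCheck are names invented for this example and are not the library's Query class; the only behaviour taken from the steps above is that a query built without a consistency defaults to ONE.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical value class modelling a recorded query; not the library's Query.
final class RecordedQuery {
    final String query;
    final String consistency;

    // Mirrors the default described above: no consistency given means ONE.
    RecordedQuery(String query) {
        this(query, "ONE");
    }

    RecordedQuery(String query, String consistency) {
        this.query = query;
        this.consistency = consistency;
    }

    // Equality covers both fields, so the same CQL at a different
    // consistency level does not satisfy the expectation.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof RecordedQuery)) {
            return false;
        }
        RecordedQuery that = (RecordedQuery) other;
        return query.equals(that.query) && consistency.equals(that.consistency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(query, consistency);
    }
}

class ConsistencyCheck {
    // True if the expected query, at the expected consistency, was recorded.
    static boolean wasExecuted(List<RecordedQuery> recorded, RecordedQuery expected) {
        return recorded.contains(expected);
    }
}
```

A DAO that issues the right CQL at the driver's default consistency is recorded at ONE, so an expectation built with QUORUM does not match it.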

This will fail with an error message like this:

java.lang.AssertionError: Expected query with consistency QUORUM, found following queries: [ {Query{query='select * from people', consistency='ONE'}]

We can make this pass by setting the consistency on our query to QUORUM.

And we're done!

This has been a brief introduction to Scassandra, but hopefully the above gives you an idea of how it can be used to test your Java applications that use Cassandra. We've covered:
  • Priming basic queries
  • Verifying queries
  • Verifying connections
Future blog posts will show you how to:
  • Prime prepared statements
  • Prime different column types in responses
  • Prime error cases
Scassandra has only just been released. The future road map includes:
  • JUnit rule so you don't need to handle starting/stopping and clearing recorded activity
  • More generic priming e.g any query on this table
  • Support for more drivers
All the code for this example can be found in full here.