Monday, May 28, 2012

Dropwizard and Spring (sort of)

We're big Dropwizard fans.  If I ever have to, it's going to be tough for me to go back to Tomcat and war files.  Dropwizard is lean and mean.

But in all of Dropwizard's beautiful simplicity, sometimes you need a little bit more.  You might need a JMS connection, JPA access to a database, or a scheduled job.  Dropwizard has simple facilities for some of the same functions (e.g. JDBI), but if you need capabilities beyond that, it is sometimes best to fall back on tried and true (and incredibly popular) frameworks like Spring.

For JMS, Spring JMS is simple and rock solid.
For JPA, Spring + Hibernate takes the pain out of ORM (or at least most of it).
For scheduling, Spring + Quartz is great.
For workflow, I've even become a fan of Activiti, which also has Spring integration.

OK -- so we're motivated to build the bridge between these lovely frameworks, but what does that even mean?

The heaviest point of intersection between the two is configuration.  Dropwizard loves YAML.  Spring loves XML.  And both want to control the load process.  Although we debated (and continue to debate) it, in the end we decided it was best to let Spring drive, because we believe it's necessary to take full advantage of Spring's ability to autowire an application.

Thus, in our implementation of com.yammer.dropwizard.Service.initialize(), we load our ApplicationContext off of the classpath:

protected void initialize(CirrusServiceConfiguration conf, Environment env) throws Exception {
   applicationContext = new ClassPathXmlApplicationContext("beans.xml");
   RestResource restResource = (RestResource) applicationContext.getBean("restResource");
   env.addResource(restResource);
}

We declare each of our Dropwizard JAX-RS resources as a Spring component (with @Component), which is then picked up during a component scan.  As shown above, we can then pull them out of the application context and register them with Dropwizard.

Now, that's fairly clean.  Dropwizard controls the main(), while Spring controls the configuration load.  The challenge comes if you want to maintain a single environment-specific configuration file.  Dropwizard will want that environment configuration in YAML, and will parse the config using Dropwizard conventions.  But without further work, you'd have to maintain another set of XML or properties files to keep Spring happy.

We found an "acceptable" solution by integrating SnakeYAML into the Spring config.  Fortunately, Mea came up with a pretty nice piece of code for this.  Follow that blog entry, pointing the placeholder at the Dropwizard configuration.

Doing that allowed us to use properties defined in the Dropwizard YAML in our Spring configuration.  For example, you could refer to ${fooDb.user} in your Spring config if you had a hierarchical property like this in your Dropwizard YAML:

  fooDb:
    user: ora_foo_readonly
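On the Spring side, the bean definition can then reference those dotted names directly.  This is just a sketch; the bean id, class, and the other fooDb.* property names are illustrative, not from our actual config:

```xml
<bean id="fooDataSource" class="org.apache.commons.dbcp.BasicDataSource">
    <!-- Placeholders resolved from the flattened Dropwizard YAML -->
    <property name="url" value="${fooDb.url}"/>
    <property name="username" value="${fooDb.user}"/>
    <property name="password" value="${fooDb.password}"/>
</bean>
```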

This works and centralizes the configuration, but the keen eye will notice that we are parsing the configuration twice: first Dropwizard parses it, then we parse it again with SnakeYAML for the Spring-specific properties.  This doesn't necessarily make for a happy environment, because Dropwizard will encounter properties it doesn't understand.  And it will complain, sometimes loudly, when values in the config have no corresponding properties and classes.  Alas, presently that is the price you must pay.
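Under the hood, the placeholder trick amounts to flattening the parsed YAML tree into dotted property names that Spring's placeholder resolver can look up.  Here is a minimal, dependency-free sketch of that flattening; the SnakeYAML parse step is stubbed out as a nested Map, and the class name is ours, not from Mea's code:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class YamlFlattener {

    // Recursively flatten a nested map (the shape SnakeYAML produces) into
    // dotted keys, e.g. {fooDb={user=ora_foo_readonly}} -> fooDb.user
    static Properties flatten(Map<String, Object> tree) {
        Properties props = new Properties();
        flattenInto("", tree, props);
        return props;
    }

    private static void flattenInto(String prefix, Map<String, Object> tree, Properties props) {
        for (Map.Entry<String, Object> e : tree.entrySet()) {
            String key = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) e.getValue();
                flattenInto(key, child, props);
            } else {
                props.setProperty(key, String.valueOf(e.getValue()));
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the parsed Dropwizard YAML: fooDb: { user: ora_foo_readonly }
        Map<String, Object> fooDb = new LinkedHashMap<>();
        fooDb.put("user", "ora_foo_readonly");
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("fooDb", fooDb);
        System.out.println(flatten(root).getProperty("fooDb.user")); // ora_foo_readonly
    }
}
```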

I know others are working on this problem, and a solution may be forthcoming, but for now this allows us to move forward with both Spring and Dropwizard in a single application with a single centralized configuration.

Please let us know if you have alternative solutions for integrating these stellar frameworks.

Wednesday, May 2, 2012

Dumping Data from Cassandra (like mysqldump, to create a seed script)

To manage our database migrations for Cassandra, we've taken to creating scripts that create/modify the schema and populate configuration data. In my previous post, I showed how to dump and load the schema (keyspaces and column families).

Here is a code snippet that lets you dump the data into set statements that you can then load via the CLI.

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.ColumnSlice;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.RangeSlicesQuery;

public void dumpColumnFamily() {
    String columnFamily = "YOUR_COLUMN_FAMILY";
    Cluster cluster = HFactory.getOrCreateCluster("dev", "localhost:9160");
    Keyspace keyspace = HFactory.createKeyspace("YOUR_KEYSPACE", cluster);
    RangeSlicesQuery<String, String, String> rangeSlicesQuery = HFactory
            .createRangeSlicesQuery(keyspace, StringSerializer.get(),
                    StringSerializer.get(), StringSerializer.get());
    rangeSlicesQuery.setKeys("", "");               // empty start/end keys: scan all rows
    rangeSlicesQuery.setRange("", "", false, 2000); // MAX_COLUMNS per row
    rangeSlicesQuery.setRowCount(2000);             // MAX_ROWS
    QueryResult<OrderedRows<String, String, String>> result = rangeSlicesQuery.execute();
    OrderedRows<String, String, String> orderedRows = result.get();
    for (Row<String, String, String> r : orderedRows) {
        ColumnSlice<String, String> slice = r.getColumnSlice();
        for (HColumn<String, String> column : slice.getColumns()) {
            System.out.println("set " + columnFamily + "['" + r.getKey()
                    + "']['" + column.getName() + "'] = '"
                    + column.getValue() + "';");
        }
    }
}
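The inner loop is just string formatting into CLI set statements, and it can be isolated as a small pure helper (a sketch with hypothetical example values; note it does no quoting, so values containing single quotes would need extra handling before the script is loaded):

```java
public class SetStatementFormatter {

    // Build a cassandra-cli style set statement for one column of one row.
    static String toSetStatement(String columnFamily, String rowKey,
                                 String columnName, String columnValue) {
        return "set " + columnFamily + "['" + rowKey + "']['"
                + columnName + "'] = '" + columnValue + "';";
    }

    public static void main(String[] args) {
        // Hypothetical example values
        System.out.println(toSetStatement("Users", "u123", "email", "foo@bar.com"));
        // prints: set Users['u123']['email'] = 'foo@bar.com';
    }
}
```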