Tuesday, 14 May 2013

Continuous delivery using MongoDB

Introduction


Last weekend I organized, managed and supported our production release for the first time. Our current primary data store is SQL Server and the release follows roughly this order:
  • Shut down the application
  • Back up the databases
  • Run data definition deployment scripts and data scripts
  • Deploy the application
  • Start up the application
  • Test the application

From shutdown to start-up there are a good few hours of downtime for the application, which is why releases usually happen on the weekend. It would be nice, however, to be a bit more flexible, and the team would appreciate being able to deploy after hours during the week. To do this, though, we would need to cut down the deployment time.

One way of doing this is to use a schemaless database like MongoDB. Since MongoDB doesn't enforce a schema, there are no data definition scripts to run on the database side. Instead, the schema is enforced at the application level: the application is responsible for writing data in a safe way and for providing read methods that can retrieve the stored data again. By deploying the new version of the application we would therefore also implicitly roll out the new version of the schema.

Data migration in MongoDB

After deploying the new version of the application, existing data needs to be migrated. Let's take, for example, the common case of adding a new field to a collection. In an RDBMS we would add the column using a DDL statement and either set a default value or run a batch update to populate the column.

In MongoDB there is an incremental way of achieving the same thing. I am currently reading the book "NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence" by Pramod J. Sadalage and Martin Fowler, which introduces an interesting pattern for incremental database deployment.

  


During the transition phase the application needs to be able to read both the old version of a document and the new version. Whenever a document is saved, however, the application writes only the new version. That way the entire data set gets migrated over time. To implement this, the book suggests adding a field schema_version to each document. This schema version could correspond to the version of the application. Based on this field the application can decide whether a document has already been migrated. If a loaded document is of an older version, the application runs dedicated code that migrates it to the next version and then saves it.
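The read-then-upgrade step described above could be sketched roughly as follows. This is only a minimal illustration, not code from the book: a plain Python dict stands in for the MongoDB collection, and the names CURRENT_SCHEMA_VERSION, migrate_v1_to_v2, the email_verified field and the rule that documents without a schema_version count as version 1 are all assumptions made for the example.

```python
# Hypothetical: the schema version written by the currently deployed release.
CURRENT_SCHEMA_VERSION = 2


def migrate_v1_to_v2(doc):
    """Hypothetical migration step: add a new field with a default value."""
    migrated = dict(doc)  # copy, so the caller's document is not mutated
    migrated.setdefault("email_verified", False)
    migrated["schema_version"] = 2
    return migrated


# One migration function per old version, keyed by the version it upgrades from.
MIGRATIONS = {1: migrate_v1_to_v2}


def upgrade(doc):
    """Apply migration steps until the document is at the current version."""
    # Assumption: documents written before the field existed are version 1.
    version = doc.get("schema_version", 1)
    while version < CURRENT_SCHEMA_VERSION:
        doc = MIGRATIONS[version](doc)
        version = doc["schema_version"]
    return doc


def load(store, doc_id):
    """Read a document, migrate it if it is old, and save the new version."""
    doc = store[doc_id]
    if doc.get("schema_version", 1) < CURRENT_SCHEMA_VERSION:
        doc = upgrade(doc)
        store[doc_id] = doc  # with a real collection this would be a write
    return doc
```

With a store such as {"a": {"_id": "a", "name": "Ann"}}, calling load(store, "a") returns the document with email_verified and schema_version added and leaves the upgraded copy in the store, so the next read no longer needs the migration code.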

Once all documents are migrated the migration code in the application can safely be removed in the next release.
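Before removing the migration code it would be worth confirming that no old documents remain. The sketch below is one possible check, under the assumption that documents written before the field was introduced simply lack schema_version. The helper names are made up for this example. The filter relies on MongoDB's $ne operator, which also matches documents that do not contain the field; with PyMongo it could be passed to a collection's count_documents method.

```python
def unmigrated_filter(current_version):
    """Build a MongoDB query filter for documents not at current_version.

    $ne also matches documents that lack the field entirely, so documents
    written before schema_version existed are included.
    """
    return {"schema_version": {"$ne": current_version}}


def count_unmigrated(docs, current_version):
    """Pure-Python equivalent of the filter, for an iterable of dicts."""
    return sum(1 for doc in docs if doc.get("schema_version") != current_version)
```

If the count is zero, every document has been written at least once by the new release and the old read path is no longer needed.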


