Solr as a NoSQL Database
First, let me say that I have spent most of my professional career working with SQL databases. In fact, I have made a career out of being good with data and SQL. So when I first heard about NoSQL, I dismissed it as crazy talk. What self-respecting web application developer would build a modern web app without a relational database? Me. I did. And I have to say that NoSQL is powerful. It is not a replacement for SQL, but it is a tool that is better suited to many web applications than a SQL database. Am I throwing away all of my SQL books? No, but SQL is no longer the only tool I have for powering dynamic web applications.
What is NoSQL?
NoSQL is usually read as "Not Only SQL". Nobody is proposing that we throw SQL in the trash. Rather, the idea is that other storage mechanisms may be more appropriate for some jobs. Now that we know what NoSQL does not mean, let's discuss what it does mean. Imagine building a web application with the Google search engine as your interface to data. Imagine a repository of XML documents with Google search as your mechanism for searching them. You could do things with it that would be awkward or impractical in SQL. I use Google every day to search for information that I need. Why not empower my apps to do the same?
So I did it. I decided to write a meal planner application for my wife. She and I have been raw food eaters for a few years and the thing with raw foods is that recipes can take a few days to prepare because you usually have to sprout or dehydrate something. Existing meal planners would not work for us.
Why Solr?
I chose Solr as my NoSQL backend. Why? Because it is the closest thing to Google that I could find. Solr is essentially an HTTP interface that sits on top of Lucene, and Lucene is an extremely powerful open source search library. What Lucene is particularly good at is indexing and searching documents made up of fields, and Solr lets you submit those documents as XML.
So the basic concept is this: objects are persisted as XML documents and can then be searched, updated, or deleted by the application. This concept is an extremely powerful one. Why? No more table definitions, no more stored procedures, no more data access layer. Best of all, Solr speaks JSON. Now if I want an entity, I just search for that entity's ID and get back a JSON response that is immediately ready for me to use in JavaScript. On this project, over 70% of my time was focused on user interface development in jQuery. And that is why I am sold on NoSQL. (To be fair, LINQ and OData could offer similar time savings while keeping the SQL backend.)
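As a rough illustration of what "search for an entity by ID and get JSON back" looks like on the wire, here is a minimal Python sketch that only builds the requests and does not send them. The host, the core name `recipes`, and the sample ID are assumptions made for the example, not details from my application. The JSON update body shown is the form accepted by recent Solr versions; older releases may expect a different update path or envelope.

```python
import json
from urllib.parse import urlencode

# Assumed location of a Solr core named "recipes"; adjust for your install.
SOLR_BASE = "http://localhost:8983/solr/recipes"


def select_by_id_url(doc_id: str) -> str:
    """Build a select URL that asks for one document by ID, returned as JSON.

    Assumes the ID contains no double quotes (no escaping is done here).
    """
    params = {"q": f'id:"{doc_id}"', "wt": "json"}
    return f"{SOLR_BASE}/select?{urlencode(params)}"


def add_or_replace_body(doc: dict) -> str:
    """Build the JSON body to POST to /update.

    Posting a document whose ID already exists overwrites the old one,
    so the same call covers both create and update.
    """
    return json.dumps([doc])


def delete_by_id_body(doc_id: str) -> str:
    """Build the JSON body to POST to /update to delete one document."""
    return json.dumps({"delete": {"id": doc_id}})


if __name__ == "__main__":
    print(select_by_id_url("r-42"))
    print(add_or_replace_body({"id": "r-42", "title": "Sprouted bread"}))
    print(delete_by_id_body("r-42"))
```

The bodies would be sent with a Content-Type of application/json to the core's /update path, followed by a commit so that searches can see the change.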
JavaScript is an extremely dynamic language. I can create and delete properties on JavaScript objects on the fly. Because the objects are not bound by a database schema, the application has the freedom to behave in ways that would not be possible or practical with a traditional SQL backend, where objects and properties are persisted in tables and columns. SQL requires that entities have a known structure. But what if you don't know what the entity will look like until run time?
The XML documents do not need to adhere to one rigid structure. (Solr does keep a schema of field types, but dynamic fields let documents carry fields that were not declared one by one.) No more massive in-place upgrades of databases or, worse, trying to convert data from version 1 of a database to version 2. There are downsides to not having structured data: your application has to be smart enough to handle mutating versions of an object. But with many highly dynamic sites we are coding this way anyway. NoSQL just gives you more options.
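One way an application might handle mutating versions of an object is to normalize each document as it is read, filling in properties that older versions lack and dropping ones that are no longer used. The sketch below is in Python rather than the JavaScript my app uses, and the field names (`prep_days`, `tags`, `legacy_notes`) are hypothetical, chosen only to show the pattern.

```python
def upgrade_recipe(recipe: dict) -> dict:
    """Return a copy of a stored recipe brought up to the current shape.

    Field names here are hypothetical examples, not a real schema.
    """
    upgraded = dict(recipe)  # shallow copy so the caller's dict is untouched

    # Properties added in later versions: create them with a default
    # only when an older document does not have them yet.
    upgraded.setdefault("prep_days", 0)
    upgraded.setdefault("tags", [])

    # Property retired in later versions: drop it if it is still present.
    upgraded.pop("legacy_notes", None)

    return upgraded


if __name__ == "__main__":
    old = {"id": "r-1", "title": "Sprouted bread", "legacy_notes": "soak first"}
    print(upgrade_recipe(old))
```

Because the function leaves an already-current document unchanged, it can run on every read, and the upgraded copy can be written back the next time the entity is saved.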
Solr was able to handle everything I threw at it. CRUD operations were almost too easy. Range queries were simple, and paging was easier than what I was used to writing in SQL. I also found that it was pretty easy to deal with "mutating objects". In the first pass at my recipe object, I failed to include a few important properties. (Like that never happens to you.) In subsequent versions of the object, I check whether certain properties exist and create or remove them as necessary. So the "old" recipe entities can be selected, upgraded, and persisted as part of the application's normal behavior. No more database upgrade scripts!
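For the range queries and paging mentioned above, Solr's select handler takes a `start` offset and a `rows` count, and a range is written as `field:[low TO high]`. The Python sketch below only assembles the query parameters. The field name `planned_date`, the zero-based page numbering, and the page size are assumptions for the example.

```python
from urllib.parse import urlencode


def range_filter(field: str, low: str, high: str) -> str:
    """Build an inclusive Solr range clause, e.g. field:[low TO high]."""
    return f"{field}:[{low} TO {high}]"


def paged_query_params(query: str, filter_query: str,
                       page: int, page_size: int) -> dict:
    """Build select parameters for one page of results.

    `page` is zero-based (an assumption of this sketch): page 0 starts
    at offset 0, page 1 at offset page_size, and so on.
    """
    return {
        "q": query,
        "fq": filter_query,         # filter query narrows the result set
        "start": page * page_size,  # offset of the first row to return
        "rows": page_size,          # number of rows to return
        "wt": "json",
    }


if __name__ == "__main__":
    # Hypothetical date field; Solr dates use ISO 8601 with a trailing Z.
    week = range_filter("planned_date",
                        "2011-01-01T00:00:00Z", "2011-01-07T23:59:59Z")
    params = paged_query_params("*:*", week, page=2, page_size=10)
    print(urlencode(params))
```

The response reports the total number of matches alongside the requested rows, so the UI can work out how many pages exist without a second count query.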
Would I Use It in Production?
Honestly, I am not sure. I would definitely use it to prototype applications. And I would probably use it on small to medium-sized projects if it were a strong fit. But I am not 100% sure about large enterprise applications. The biggest reason for my hesitation is that I am not sure downstream reporting systems would know what to do with it. How could they? One day an entity has one schema, the next day it has a slightly different one. How would BI systems deal with that kind of dynamism? Not very well, I can assure you.