Using NoSQL Instead of RDBMS

kylesaid
Champ in-the-making
I know this wouldn't be trivial, but how feasible would it be to rip out the RDBMS and replace it with a NoSQL database? Graph databases like Neo4j, specifically, work very well for ECM use cases. Any thoughts?
8 REPLIES

kylesaid
Champ in-the-making
**Bump** No Interest?

Here's a link to an intro to Neo4j. It's an 8-page intro that describes a high-level architectural view and a bit of the history behind the product.
http://dist.neo4j.org/neo-technology-introduction.pdf

It's interesting that Neo4j's first incarnation originated from the need for a high-performance, highly scalable engine for an ECM implementation.

mrogers
Star Contributor
Yours is a big question, so please don't take a lack of answer to mean there's no interest.  The problem is where to start …

There's clearly some overlap between the functionality provided by the Alfresco repository and a NoSQL implementation like Neo4j.
On the one hand, it's very difficult to "rip out the database" for all of Alfresco's use cases.

On the other hand, adding a NoSQL store to Alfresco could make a lot of sense for some intermediate use cases. In particular, some WCM scenarios we have worked through use NoSQL databases as a "delivery" tier.

kylesaid
Champ in-the-making
I'd most likely want to co-locate a Neo4j database alongside the RDBMS for starters. Content, folders, users, and security roles are all nodes, and CRUD operations are the relationships between those nodes. Using Spring AOP advice, I don't see it taking a huge amount of effort to create the graph network. I just wouldn't know which classes or methods to intercept with that advice.
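
To make the idea concrete, here is a minimal sketch in Python rather than Java/Spring, purely for brevity. Everything in it is an assumption for illustration: `AuditGraph` is an in-memory stand-in for a graph database (not the Neo4j API), the `records` decorator plays the role of an after-returning AOP advice, and `create_document` / `read_document` are made-up service methods, not Alfresco classes.

```python
from collections import defaultdict
from functools import wraps


class AuditGraph:
    """In-memory stand-in for a graph database (hypothetical, not the Neo4j API)."""

    def __init__(self):
        self.nodes = {}                 # node id -> {"type": ...}
        self.edges = defaultdict(list)  # source id -> [(relationship, target id)]

    def add_node(self, node_id, node_type):
        # Keep the first definition of a node; later calls are no-ops.
        self.nodes.setdefault(node_id, {"type": node_type})

    def relate(self, source, relationship, target):
        self.edges[source].append((relationship, target))


graph = AuditGraph()


def records(relationship):
    """Decorator acting like an after-returning advice: once the wrapped
    call succeeds, store user -[relationship]-> content in the graph."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, content_id, *args, **kwargs):
            result = func(user, content_id, *args, **kwargs)
            graph.add_node(user, "user")
            graph.add_node(content_id, "content")
            graph.relate(user, relationship, content_id)
            return result
        return wrapper
    return decorator


# Hypothetical service methods standing in for the real persistence layer.
@records("CREATED")
def create_document(user, content_id):
    return f"{content_id} created"


@records("READ")
def read_document(user, content_id):
    return f"{content_id} read"


create_document("kyle", "doc-1")
read_document("zoe", "doc-1")
print(graph.edges["kyle"])
print(graph.edges["zoe"])
```

The service methods stay unaware of the graph, which is the property an AOP-based bolt-on would be after. Which real methods to wrap is still the open question.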

Rendering all of this in Share is a whole different beast, but I would at the very least like to get the backend basics down: a graph network of content and folder nodes and the relationships between them.

Can anyone narrow down what packages I should look at for the existing persistence layer?

huima
Champ in-the-making
This is actually a really interesting question, and it's funny that there isn't more discussion about Neo4j in the Alfresco forums 😄

I just stumbled upon Neo4j while designing a system where Alfresco would be used as document storage and additional data would come from multiple relational databases and systems. Modeling everything in a relational database with ORM tools could be slow and cumbersome compared to the models offered by Neo4j and Alfresco. As traversing Alfresco's graph can be a problem (associations are not indexed in Lucene), building and maintaining a search graph in Neo4j could be a good idea in some cases.
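
As a rough illustration of the kind of traversal meant here, the sketch below walks associations breadth-first in a side index. It is plain Python with no Neo4j or Alfresco calls; the `associations` dictionary and its node names are invented sample data, and `reachable` is a hypothetical helper.

```python
from collections import deque

# Hypothetical association data: node -> nodes it is associated with.
associations = {
    "contract-1": ["invoice-7", "customer-3"],
    "invoice-7": ["payment-9"],
    "customer-3": ["contract-2"],
    "payment-9": [],
    "contract-2": [],
}


def reachable(start, max_depth):
    """Breadth-first traversal: map every node reachable from `start`
    within `max_depth` hops to its hop distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = seen[node]
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in associations.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = depth + 1
                queue.append(neighbour)
    del seen[start]  # the start node is not part of the result
    return seen


print(reachable("contract-1", 2))
```

A graph database would maintain and index this adjacency structure itself; the point of the sketch is only that following associations is a lookup per hop rather than a join per hop.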

I'd definitely be interested to hear about experiences if anyone has done anything with Alfresco and Neo4j.

kylesaid
Champ in-the-making
I've decided that I'm going to start simple and create a bolt-on for a Neo4j-Alfresco Auditing GraphDB. All of the logic for creating the nodes and relationships is pretty straightforward. Adding the Alfresco nodes and relationships to the Neo4j Lucene indexes should be pretty trivial as well. I'm already seeing EXTREMELY positive results from queries that would traditionally have poor performance characteristics, where a data set of 100s/millions of rows with lots of joins would take minutes. Neo4j is traversing and returning results from the same data set in a few seconds. Once I have something crude put together, I'll share in more detail.

This really solves a whole lot of issues that I can see.
- Working with semi-structured or unstructured data
- Object-relational impedance mismatch. No need for ORM tools as you already mentioned.
- Working with large volume data sets
- Working with varied and complex data sets. 100s of joins needed? No Problem!
- Ease of handling evolving schemas. We all know that business requirements never remain static, so this is VERY important.

One of my colleagues asked me why not use a NoSQL document DB or a key-value store. My answer was simple: a graph database offers the richest data model and gives me the added benefit of traversing (or joining) by relationship. Also, while k-v stores can handle a larger data set at the moment, Neo4j can still handle 10s of billions of nodes within a single JVM. Lastly, k-v stores are absolutely horrible for developer productivity, whereas working with Neo4j feels very natural. That also includes column-family DBs like Apache Cassandra.

kylesaid
Champ in-the-making
Here's a good presentation btw….

http://www.mefeedia.com/watch/31077078

zoe
Champ in-the-making
I've decided that I'm going to start simple and create a bolt-on for a Neo4j-Alfresco Auditing GraphDB. … Once I have something crude put together, I'll share in more detail.

I'm interested in hearing about this - are you still planning on sharing the details?  Would love to see it!
Zoe

lucasalberto
Champ in-the-making
Is there any news on this topic (NoSQL + Alfresco)?