cancel
Showing results for 
Search instead for 
Did you mean: 

Alfresco Database Design

davidp
Champ in-the-making
Champ in-the-making
Does Alfresco forsee any bottlenecks or potential performance problems due to the non-relational structure of the Alfresco database?  I am concerned about the potential issues with large complex reporting queries and such on the various data sets in the Alfresco database.  Is there a plan to provide any advanced reporting capabilities where the queries might, in a relational environment, include many complex table joins?

Any information that can alleviate our concerns is greatly appreciated.

Regards,
David Paolicelli
8 REPLIES 8

killinchi
Champ in-the-making
Champ in-the-making
I have reached a point where I may have to use a different Content repository because i see no easy way to extract the data from alf_node_properties, this is because of serializable values and the issues of inner joining for each propertiy you want to report on.  In my opinion the developers though that this generic structure was a great way to store data but they didn't they about how to get it out in an efficient manner.

deebaprithy
Champ in-the-making
Champ in-the-making
Hello,
I need a database diagram of Alfresco. Is there any softcopy available for me to download. This is because i need to more understand the Alfresco Database structure.


thanks
Deepa Preethy.M

rdanner
Champ in-the-making
Champ in-the-making
Hello,
I need a database diagram of Alfresco. Is there any softcopy available for me to download. This is because i need to more understand the Alfresco Database structure.


thanks
Deepa Preethy.M

If you cant get one here, download a copy of DBDesigner.  Its a MSQL tool and it will create a decent ERD for you by reverse engineering the database.

stk137
Champ in-the-making
Champ in-the-making
Any information that can alleviate our concerns is greatly appreciated.

(my comments are related to the first couple of posts in this thread started over 1.5 years ago. )

It was interesting to see this old thread pop up.  It's too bad that no one from Alfresco responded.  I have only been using Alfresco (1.4, 2.0, 2.1) for a few months but the database design and query performance are issues for me.   I am curious at the future design of Alfresco in this regard.  Should I give up hope for a better database design?   Will the database always be so generic as to not take advantage of using a RDBMS?  Will we ever see proper columns for major fields (title, modified, modified_by, etc)?  Will users and groups ever get their own tables?  Does trying to fulfill the JCR spec itself make this impossible?

When I initially heard about the development of the JCR spec, I was optimistic that this would lead to things like Plone but in Java and much more performant - replacing ZopeDB with enterprise SQL.

However, it seems that Alfresco, and maybe all JCRs, have recreated the limitations of Plone/ZopeDB (while at the same time missing some of the strengths of Plone's presentation layer).

Queries and reports are highly important to a general purpose content management system.  Much of what you see on a Plone based site is driven by queries - latest headlines, recent docs, etc.  But Plone suffered from poor query performance.  Without a true relational model and enterprise RDBMS, Plone can only perform acceptably well through thorough and smart caching.  Searching (queries, maybe not free-text), sorting, filtering, grouping are what RDBMS excel at and I'd like that power in any CMS I use.

Although, Alfresco can use Oracle it might as well be using ZopeDB in some respects.  So far anything I have tried in Alfresco that involves a query performs terribly.  Even just trying to search for a user to add to a group is far too slow.  Currently I am trying to come up with an acceptable way to present a sorted list of the top 10 most recently modified documents and can't seem to do this with any query (XPath or Lucene) - and this is a feature that a CMS should have out of the box - IMHO.  If this were a true relational model, this could be quite zippy.  It could just be some joins/where/order by all on indexed columns.  But with the current design even standard, commonly queried fields like "modified" don't exist as columns, so they aren't even indexable, and with users and groups buried in the generic model, you can't join up for security, causing queries to make another pass through the data to filter for security.(http://forums.alfresco.com/viewtopic.php?t=2164&sid=0ae441f8b824b23eda11fc335bb3431d)


And so on.

borledge
Champ in-the-making
Champ in-the-making
It would be nice if someone from Alfresco could provide a ERD or at least a UML model.

dbevacqua
Champ in-the-making
Champ in-the-making
I feel for you!

We've encountered similar problems. One of the biggest is relating to joins between content types. Our approach involves these things, among others:

- an interceptor on nodeService to expand serializable properties (e.g. mltext) into custom tables
- an interceptor to maintain (custom) ancestor associations among nodes

- create model as POJOs
- alfresco-specific annotations for the pojos
- annotation processor to generate custom alfresco content model
- annotation processor to generate views of the alfresco db for each content type and for each association type, including ancestry
- materialise the views
- map POJOs to materialised views

and hey presto you can query your alfresco repository with HQL and get POJOs back!

bit dirty, but orders of magnitude quicker for a lot of queries.

Dominic

hbf
Champ on-the-rise
Champ on-the-rise
I can only confirm the experiences made by some in this thread: Query performance, in particular category query performance, is – I am sorry to say – terrible. So sad for such a great content repository! But without efficient queries, it simply cannot be the repo for sophisticated web apps, without a smart caching/custom interceptors at least.

Are there plans to improve the DB design? I have heard about a SQL-like query language but no news for quite some time…

jck
Champ in-the-making
Champ in-the-making
We developed a module which permits to use SQL on a real DB design to extract meta data from Alfresco. More information can be found on this wiki :
* http://wiki.bluexml.com/index.php/Alfresco_SQL_views

Like Dominic says it, materialized views would be agreat solution, but unfortunately reserved to Oracle users.

We are currently workgin and documenting another solution based on Alfresco policies.

HTH,

Jean-Christophe