Hyland Connect

kgeis · ‎01-29-2007

I was thinking about content models for things like forums, tasks, and calendars. It made me wonder, where do you draw the line between something that fits a more traditional sense of managed content (like, for instance, a bunch of PDFs) and something that is usually handled in database-backed applications? In systems like Alfresco, properties are meant for metadata, but they could easily be considered data as well. A CMS like Alfresco might just make application design a lot easier if you map domain objects to content models instead of to database tables. And certainly relational databases solved a performance problem for a long time, but now we're starting to have enough computing power and we need higher levels of abstraction to build solutions faster.

I'm just wondering if anyone has thought about this and if there are any sort of current practices for architecture concerning this divide. How are people out there mixing Alfresco with database-backed applications?

fselendic · ‎01-30-2007

We have thought about that A LOT. There's no easy answer. Performance is actually just one of the possible problems (and Alfresco itself technically sits on a database).

First of all, databases are much more advanced when it comes to tool support, modeling, recovery, backup, OLAP, data mining, and so on. We even wrote our own Eclipse plugin to automatically create Alfresco types from a Hibernate model. Much more tooling like that would have to be available for a repository to compete with traditional databases.
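To make the idea of generating content types from a domain model concrete, here is a minimal sketch in Python. It is not the Eclipse plugin mentioned above and it does not call any Alfresco or Hibernate API. The `Task` class, the `my` prefix, the `TYPE_MAP` table, and the `content_type_for` function are all hypothetical names chosen for illustration. It only shows the general shape of the mapping: each field of a domain class becomes a typed property on a content type.

```python
from dataclasses import dataclass, fields
from datetime import date

# Assumed mapping from Python field types to repository property types.
# The "d:..." names are used here purely as illustrative labels.
TYPE_MAP = {str: "d:text", int: "d:int", bool: "d:boolean", date: "d:date"}


@dataclass
class Task:
    """Hypothetical domain object that would normally map to a table."""
    title: str
    due: date
    priority: int
    done: bool


def content_type_for(cls, prefix="my", parent="cm:content"):
    """Describe a dataclass as a content type: one property per field."""
    properties = {
        f"{prefix}:{field.name}": TYPE_MAP[field.type]
        for field in fields(cls)
    }
    return {
        "name": f"{prefix}:{cls.__name__.lower()}",
        "parent": parent,
        "properties": properties,
    }


if __name__ == "__main__":
    # Print the derived type description for the example domain class.
    print(content_type_for(Task))
```

A real generator would also have to deal with associations, constraints, and inheritance, which is where most of the effort goes.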

On the other hand, having all of your data in one place, with all of it treated as content, is very appealing, especially given the powerful framework and services that Alfresco offers (indexing, searching, versioning, scripting, templating, dashboards, easy data mashups, workflow, various other aspects, categorising, classifying, clustering, federated search in the future, etc.). Working with content/data that way becomes very easy and flexible, and ideas just keep popping up.

Of course, you can get all of that with a more traditional, database-based approach, but then you end up writing some of the same functionality that Alfresco already offers. And if you do that, why the hell did you pick Alfresco in the first place? Just for DM? And when you combine Alfresco with a database, you essentially get two data sources, with all of the problems that such an approach brings. How will you sync between the two? How will you scale? Cluster? When will you read data from the database and when through Alfresco? Now drop in other data sources like LDAP (needed if you want to glue together several open source products, such as a portal, a messaging server, or some groupware) and the problems just keep growing.

We would definitely need some benchmarks and a best practices document. Alfresco has blurred the lines a bit too much.

It is really a repository vs. database problem. We are experimenting now (so far we have the party pattern in Alfresco, call detail records, and some iCal stuff over CalDAV). Some guys went through that iteration before us, building groupware on JSR 170 (Jackrabbit), but they dropped that approach because of performance problems and went for the classical O/R mapper/database approach. We are not sure whether the performance problems were just the fault of the Jackrabbit implementation, or whether "let's keep all of our data in the repository" is simply not going to work performance-wise.

We are also very interested in this subject, but unfortunately we don't have the manpower or time to dig deeper for now.

where to draw the line (CMS vs. database application)