Hyland Connect

pco · ‎09-29-2014

Hi, folks,

I am trying to create a small groovy app with OpenCMIS and Alfresco Community 5.0.a as the backend (so CMIS 1.1 🙂
I experienced quick progress during the first weeks with Alfresco (Content Modelling, customizations) and OpenCMIS APIs. Having arrived on the fine-tuning level some questions arise where I am not sure how to proceed.

Scenario:
1. I want to provide a tabular view of all cmis:document objects of a folder (like detail view in Windows explorer for example)
2. Some columns refer to properties of aspects (e.g. number of pages inside PDF file)
3. Not all documents necessarily have all aspects (e.g. jpeg's don't have "number of pages" aspect, would not make any sense)

What I have done so far: I Started with Session.query method for creating the query. The tabular view renders the List of QueryResult-items. That means: Fast access to the repo, only one AtomPub-Request per query (verified with tcp sniffer) - quite snappy (OperationsContext is tuned appropriately). I can display basic data, like name, contentstream size, content type, thumbnail etc.

Example for OpenCMIS workbench (simplified):


def folder = session.getObjectByPath('<some folder>');
// hint: select * just for simplicity of example, don't do this in production, if performance matters
def result = session.query("select * from cmis:document where IN_FOLDER('${folder.id}')", false)

result.each {
    println it.getPropertyValueById('cmis:name')
    println it.getPropertyValueById('pdfp:numberOfPages')
    println '—————'
}
‍‍‍‍‍‍‍‍‍‍‍

Problem: The optional Aspects. In order to fetch the "numberOfPages" for example, I would have to do an outer join, which is not yet supported by alfresco. This is how it should look like (note the "LEFT"):


def folder = session.getObjectByPath('<some folder>');
def result = session.query("select * from cmis:document d LEFT join pdfp:pages p on d.cmis:objectId=p.cmis:objectId where IN_FOLDER(d,'${folder.id}')", false)

result.each {
    println it.getPropertyValueById('cmis:name')
    println it.getPropertyValueById('pdfp:numberOfPages')
    println '—————'
}
‍‍‍‍‍‍‍‍‍‍

If I omit the "LEFT" I am exactly getting what I want, but of course only for documents which have the aspect - too bad.

So I started to investigate workarounds. Some came to my mind:
1. After the main query, do a subsequent query for each optional aspect with "cmis

bjectId in (<firstquery>.ids), then join in java
2. Load full-blown CmisObject after the query to get the info (n+1 problem, very bad idea)
3. Use Session.queryObjects to get full-blown CmisObjects right away (I tried this, but it turned out that aspect's properties are not selected, you need to call object.refresh() to get them, so again n+1)
4. #2 but with ehcache in front, so the repo call happens only on rare occasion
5. Do not add the data to the tabular view but only in the detail view

At the moment I am thinking about #4 or #5 because I wanted to keep my query code as simple as possible and rather wait for outer join to be implemented.

So: How are you guys coping with this? I can't imagine I am the only one who fell over this limitation.
Btw: Hope to see aggregation implemented in the future as well (Select distinct, count(), group by, all these would be very welcome)

Regards

iv0id · ‎09-30-2014

Hi pco,

i've also encountered this case and for now i'm using the second solution (load the hole CmisObject) since the search results in my case don't exceed 10 items. I hope that the next versions will implement the outer joins

pco · ‎10-02-2014

Hi, iv0id,

thanks for your reply, then it seems to be the right way to do it at the moment. Let's hope for outer JOIN. Probably I'll try to explore the cmis server code once I have some spare time.

In fact, I also had a little "weird" Idea yesterday: Create one aspect that contains all Fields of all my optional aspects as non-mandatory and protected (AspectsCarbonCopyAspect) and then basically write a custom Behaviour that updates the CC aspect's fields upon property changes on the original aspects. The query then is an inner join of cmis:document and the CC aspect.
This would resolve the problem but I do not feel very comfortable polluting the entire repository with redundant information just because of a "lack of flexibility". So I'll also stick with #2 (or #4 to be precise, ehcache does not hurt).

Have a great day

Hyland Connect

Efficient CMIS query for nodes with optional aspects