cancel
Showing results for 
Search instead for 
Did you mean: 

return last object in a folder from a query?

aweber1nj
Champ in-the-making
Champ in-the-making
So, from my early-learnings of CMIS and query-language, the lack of functions and joins is a bit of a drag…

In order to return the last/latest object in a folder, I know I can use the in_folder( ) expression, but I don't have a way to choose max( ) on a column, right?  (As an aside, I would have liked to have simply select max(cmisSmiley SurprisedbjectId), but I don't think that's "safe" with Alfresco).

So what are my options here?  Select the objectId and lastModifiedDate from all the objects in the folder and keep track of which is the latest manually as I iterate through the entire result-set?  That's really pretty slow and sub-optimal when a database or index can perform the actual query on the server and return one result over the network instead of potentially thousands of results (and paging, etc.).

I hope I missed something that makes this a lot easier and more network-optimized than that.  Any ideas?

-AJ
4 REPLIES 4

jpotts
World-Class Innovator
World-Class Innovator
Probably the easiest thing to do is to order by cmis:lastModificationDate descending. And you can make it more efficient by setting maxItems to 1.
>>> query = "select cmis:objectId, cmis:name from cmis:document where cmis:name like 'test-plain%' order by cmis:lastModificationDate desc"
>>> rs = repo.query(query, maxItems="1")
>>> rs[0].name
u'test-plain-3.txt'
That file, test-plain-3.txt, is the most recently modified file matching the query.

If you need to get the most recently modified doc in a folder, use the in_folder(folderId) clause, like this:
>>> folder = repo.getObjectByPath('/Sites/test-site-1/documentLibrary')
>>> query = "select cmis:objectId, cmis:name from cmis:document where in_folder('%s') order by cmis:lastModificationDate desc" % folder.id
>>> query
"select cmis:objectId, cmis:name from cmis:document where in_folder('workspace://SpacesStore/388e0fed-9104-4507-a702-30c4768e119b') order by cmis:lastModificationDate desc"
>>> rs = repo.query(query, maxItems="1")
>>> rs[0].name
u'test-plain-3.txt'
Yes, it sucks that I have to retrieve the folder first to get its ID just to build the in_folder clause.

Jeff

aweber1nj
Champ in-the-making
Champ in-the-making
Probably the easiest thing to do is to order by cmis:lastModificationDate descending. And you can make it more efficient by setting maxItems to 1.
>>> query = "select cmis:objectId, cmis:name from cmis:document where cmis:name like 'test-plain%' order by cmis:lastModificationDate desc"
>>> rs = repo.query(query, maxItems="1")
>>> rs[0].name
u'test-plain-3.txt'

Does that capability to set maxItems exist with OpenCMIS/Chemistry java classes?  I don't see a session.query( ) overload that would necessarily work like that.

Yes, it sucks that I have to retrieve the folder first to get its ID just to build the in_folder clause.
Remember you mentioning this in your webinar.  Agree with your assessment. Smiley Happy

Thanks again,
AJ

jpotts
World-Class Innovator
World-Class Innovator
Sorry, I was trying to lay my Python trip on you, man. Smiley Happy

If you look at the JavaDoc for OpenCMIS you'll see that there is a version of the query method that takes an OperationContext. OperationContext has a method called setMaxItemsPerPage.
String query = "select * from cmis:document";
OperationContext oc = new OperationContextImpl();
oc.setMaxItemsPerPage(5);
ItemIterable<QueryResult> q = getSession().query(query, false, oc);
So grab an OperationContext, call that method, then pass it in when you do the query. Voila. Now you're down with the OC.

Jeff

aweber1nj
Champ in-the-making
Champ in-the-making
Good call on using the OperationContext to save network traffic.  Don't know if the CMIS server-side will translate that into a better-performing query (like adding a LIMIT to the end of the select), but it certainly will help on the network side.

When you start to scale to hundreds-of-thousands or millions of objects, passing those "hints" back to the underlying DB engine is a huge win.  Of course, in this case you really can't "translate" the per-page request into a LIMIT, as it's a little different.  We need a way to pass something like that back though…whether it's in an OC setMaxItems (total), or allowing the query to be extended with optimization hints like LIMIT.

Thanks again,
AJ