cancel
Showing results for 
Search instead for 
Did you mean: 

Querying the Alfresco repository (~DQL)

bboeri
Champ in-the-making
Champ in-the-making
I understand that Alfresco has no document query language similar to Documentum's DQL.  Can anyone suggest a way to generate reports with the following kinds of information (assuming requisite permissions):

1) Get a complete count of all documents of a specific type
2) Get a complete count of all documents modified between two dates
3) The trickiest one: Get a report of all documents submitted to a workflow within a specific timeframe (e.g., via Activiti), so I could get a list of when each began and completed the workflow, so I could then generate stats such as "average time in the workflow" etc.

I suspect one possibility for 1and 2 would be appropriate Lucene queries, and then looking at a total count of results returned.  In Documentum, however, that wouldn't work because the search system (FAST, and maybe even Verity) returned a maximum result count of from 200-500 results, which in many cases would be far fewer than the possible search results.  Thus, at least for Documentum, the use of DQL.

TIA!
Bob Boeri
9 REPLIES 9

jpotts
World-Class Innovator
World-Class Innovator
You might be interested in the querying capability that the CMIS Query Language provides. And it is only one letter different from DQL! Smiley Happy

Seriously, though, it does give you a query capability closer to what you are probably used to with DQL.

Here's a Python example that uses cmislib to count all documents of a specific type, sc:whitepaper:
>>> query = 'select cmis:objectId from sc:whitepaper'
>>> rs = repo.query(query)
>>> len(rs)
3
Date ranges are similarly easy. Here's the query for a date range that also joins on a custom aspect, sc:webable, to find objects with the published date in the given range:
queryString = """select d.cmis:objectId, w.sc:published
                         from sc:whitepaper as d
                         join sc:webable as w on d.cmis:objectId = w.cmis:objectId
                         where w.sc:published > TIMESTAMP '2006-01-01T00:00:00.000-05:00'
                         and w.sc:published < TIMESTAMP '2007-06-02T00:00:00.000-05:00'"""

If you want to run these on your own using Java instead of Python, check out this custom content types tutorial that includes a section with several sample queries.

Your third request is indeed the trickiest. The easiest thing to do is to set a start date and an end date on the content from within the workflow. Then you can use CMIS Query Language to query for those properties. If you don't like that idea you can instead write data to a "workflow log" object (or whatever you want to call it–it is up to you to define) and then query against those objects.

Activiti may have some built-in reporting that could help you here but I haven't looked into it. This kind of workflow reporting is a requirement we hear a lot.

Hope this has gotten you a little closer to your goal.

Jeff

PS. I recognize your name. Are you the same Bob Boeri that used to edit EContent Mag?

bboeri
Champ in-the-making
Champ in-the-making
Yes, Jeff, I am he (that is, I don't edit eContent mag, but I continue to write columns for them (in my 18th year, if you count the name the magazine originally had, eMedia).  I also sit in on the editorial board to arm wrestle with the other editors regarding who gets nominated for the annual eContent 100 issue (Alfresco made the cut last year).

Bob

jpotts
World-Class Innovator
World-Class Innovator
Cool! Nice to meet you, virtually. Did those queries help you out?

Jeff

bboeri
Champ in-the-making
Champ in-the-making
The examples look helpful, but as an ECM business analyst (rather than a developer) I'll need some help from my local developer-colleagues.  At least I know that the capabilities are there (and CQL will probably be most useful in a Java environment).  I also need more detail about CQL itself but expect I'll find that elsewhere, perhaps  @ OASIS. This looks like it is at least a partial answer: http://docs.oasis-open.org/search-ws/searchRetrieve/v1.0/csprd01/part5-cql/searchRetrieve-v1.0-csprd....

Thanks again.
Bob

alexr
Champ in-the-making
Champ in-the-making
I was thinking of starting to develop a small admin tool for querying Alfresco through CMIS queries.

However one of the things we often use is a query to retrieve the number of documents of a certain type where metadata has a certain value:

select count(*) from … where …


Then I ran into an old Jira entry:

https://issues.alfresco.com/jira/browse/ALF-4935?page=com.tuncaysenturk.jira.plugins.jira-enhancer-p...

I understand that in a programming/scripting language you can do a len(resultset) . However one would expect the CMIS query language to have implemented the count(*) as part of the syntax, which should return a number as a result of the query.

1) Is this possible through CMIS queries in general and against Alfresco specific?
2) If not will the "select count(*)" be added to the CMIS query language?

Kind regards,

Alex

jpotts
World-Class Innovator
World-Class Innovator
There are a number of things you can do in SQL that you cannot do in CMIS Query Language. I do not think COUNT is part of the 1.0 spec. You could check with OASIS to see if support for COUNT is planned for 1.1. If not, you could raise an issue with the CMIS Technical Committee.

Jeff

paul_price
Champ in-the-making
Champ in-the-making
Mentioned in this comment ("There are a number of things you can do in SQL"), this is what I am looking for. I have a huge collection of docs (~.5 Million) that I would like to get counts by date and State_Code (part of my content model). I desperately want to use SQL rather than SOLR because I can access the production databases much easier than other machines in that environment (plus I'm a DBA, so SQL rules).

So can you (or someone else) point me to some hints/wiki for doing this type of query?

My environment: Alfresco Enterprise 4.1.8 on Linux, using SQL Server 2012 database. I will be moving to 4.2 in the near future, so the ultimate solution needs to run in 4.2.

Thanks for any help.

alexr
Champ in-the-making
Champ in-the-making
Thanks Jeff for the quick response. I will check the cmis 1.1 specs @Oasis.

Kind regards,

Alex

andy
Champ on-the-rise
Champ on-the-rise
Hi

Some of your use cases will be supported when we expose SOLR faceting.

Andy