Slow Search Response with multiple categories

stevewickii
Champ in-the-making
We are performing a pretty simple search.

We have a web application on a separate server that uses the Alfresco web service client API to query Alfresco on another physical server. We want all documents in a folder that match any of a number of categories.

Here's my query:
+PATH:"/app:company_home/cm:FMC/cm:Content/cm:Links/cm:associations/cm:National/*" +(PATH:"/cm:generalclassifiable/cm:Audience/cm:Consultants/cm:ConsultantCrops/cm:ConsultantCrops_x0020_Corn/member" OR PATH:"/cm:generalclassifiable/cm:Audience/cm:Consultants/cm:ConsultantCrops/cm:ConsultantCrops_x0020_Alfalfa/member" OR PATH:"/cm:generalclassifiable/cm:Audience/cm:Consultants/cm:ConsultantCrops/cm:ConsultantCrops_x0020_Cotton/member" OR ( PATH:"/cm:generalclassifiable/cm:Regions/cm:USA/member" OR PATH:"/cm:generalclassifiable/cm:Regions/cm:USA/cm:MO/member" ) OR PATH:"/cm:generalclassifiable/cm:Professional_x0020_Designations/cm:Certified_x0020_Crop_x0020_Advisor/member" OR PATH:"/cm:generalclassifiable/cm:Audience/cm:Consultants/member")

The query takes 35 seconds.
No improvement is realized by using +PARENT instead of +PATH for the folder being queried.
Reducing the number of OR clauses reduces the query time, so query time depends on the user's configuration.
The query with no categories takes less than 1 second.
The query with 1 category (Audience/Consultants/member) takes 5 seconds.
There are 7 categories in the statement above, and it takes 35 seconds to execute, so the query time appears to be roughly 5 seconds times the number of category clauses in the query.

Can we do anything to speed this up?

Several pages in our website perform two or more of these queries to load. That means some of our pages can take anywhere between 70 seconds and 2 minutes to load!
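
As a rough illustration of the arithmetic above, the timings suggest a cost that grows linearly with the number of category clauses. The sketch below only encodes that observation; the 5-second constant comes from the single-category measurement, the function names are hypothetical, and the sub-second base cost of the folder-only query is ignored.

```python
# Observed in the measurements above: one category clause cost about 5 seconds.
SECONDS_PER_CATEGORY_CLAUSE = 5.0


def estimate_query_seconds(category_clauses: int) -> float:
    """Estimate one query's duration, assuming cost is linear in the clause count."""
    return SECONDS_PER_CATEGORY_CLAUSE * category_clauses


def estimate_page_seconds(queries_per_page: int, category_clauses: int) -> float:
    """Estimate page load time when the page runs several such queries in sequence."""
    return queries_per_page * estimate_query_seconds(category_clauses)


if __name__ == "__main__":
    print(estimate_query_seconds(7))    # seven category clauses, one query
    print(estimate_page_seconds(2, 7))  # a page that runs two such queries
```

With 7 clauses this reproduces the 35-second query, and two such queries on one page give the 70-second lower bound mentioned above.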
2 REPLIES

hbf
Champ on-the-rise
Have you found out a way to speed up your category searches?

I am performing simpler queries, but I need the results for AJAX responses, so they really need to be fast. My queries currently take around 0.5 seconds, and I would like something much faster.

Maybe an Alfresco engineer can help us with a trick? Maybe an appropriate ehcache or Lucene configuration will help?

Thanks in advance,
Kaspar

stevewickii
Champ in-the-making
Well, Alfresco 2.9.0C_dev is available now, and I still don't see an improvement querying by Categories.

What we did to improve speed was convert each category to a NodeRef, and then query for Document Nodes where the categories property contains the NodeRef we're looking for.

+PARENT:"workspace://SpacesStore/6f9c8d63-7069-11dc-9c51-a36d884e565a" 
+(@cm\:categories:"workspace://SpacesStore/bd7e3f09-3616-11dc-a7d0-67d533e7244b"
OR @cm\:categories:"workspace://SpacesStore/a39d7d01-7103-11dc-9a14-ed64e29c2348")
-TYPE:"cm:folder"

Here the PARENT NodeRef is for a Folder Node, and each categories NodeRef is for a Category Node. This query yields results much faster than a query by PATH. There is one issue: after we upgraded Alfresco from v2.1.0 to 2.9.0B, querying by the categories property stopped working. To resolve this, we had to delete and restore all of the documents in Alfresco.
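
For anyone assembling this kind of query programmatically, here is a minimal sketch of building the string from a folder NodeRef and a list of category NodeRefs. It is plain string handling, not part of the Alfresco API: the function names are hypothetical, and the escaping (backslash and double quote inside a quoted phrase) is an assumption based on standard Lucene query syntax. Looking up the NodeRefs and executing the query is left to whatever client you already use.

```python
from typing import Sequence


def escape_lucene_phrase(value: str) -> str:
    """Escape backslashes and double quotes so the value is safe inside a quoted phrase."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_category_query(
    parent_ref: str,
    category_refs: Sequence[str],
    exclude_folders: bool = True,
) -> str:
    """Build a Lucene query for children of parent_ref tagged with any of category_refs."""
    if not category_refs:
        raise ValueError("at least one category NodeRef is required")

    # One clause per category NodeRef, OR'ed together inside a required group.
    clauses = [
        '@cm\\:categories:"' + escape_lucene_phrase(ref) + '"'
        for ref in category_refs
    ]
    parts = [
        '+PARENT:"' + escape_lucene_phrase(parent_ref) + '"',
        "+(" + " OR ".join(clauses) + ")",
    ]
    if exclude_folders:
        # Mirrors the -TYPE clause in the query above.
        parts.append('-TYPE:"cm:folder"')
    return " ".join(parts)


if __name__ == "__main__":
    print(build_category_query(
        "workspace://SpacesStore/6f9c8d63-7069-11dc-9c51-a36d884e565a",
        [
            "workspace://SpacesStore/bd7e3f09-3616-11dc-a7d0-67d533e7244b",
            "workspace://SpacesStore/a39d7d01-7103-11dc-9a14-ed64e29c2348",
        ],
    ))
```

Called with the NodeRefs from the query above, this produces the same query on a single line.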