Search performance issues using searchService

lista
Star Contributor
Hi all,

I have about 70k nodes in a structure like this:

- main space
-- subspace based on the node's first character
--- node
--- node

Searching for a specific node is practically instant through Alfresco Explorer, but when using the API it takes about 7 seconds (just to show the difference).
Here's the code at fault:

SearchParameters parameters = new SearchParameters();
parameters.addStore(STORE_REF);
parameters.setLanguage(SearchService.LANGUAGE_LUCENE);
parameters.addSort("@" + sortQName, sortIsAsc);
parameters.setQuery(query);
parameters.addLocale(locale);

results = searchService.query(parameters);

The Lucene query is basic, nothing complicated, and the Alfresco version in play is 3.4.

The interesting part comes now; when I remove the sort part, it works almost as fast as through Alfresco Explorer, and it definitely stops being a performance issue.
How do you explain this, and does anyone have any concrete advice?
3 REPLIES

afaust
Legendary Innovator
Hello,

This is related to how sorting works in Lucene. When you sort by a field, all existing values of that field are read from Lucene into memory to be pre-sorted. The actual results are then sorted by looking up the position of each result's value in the pre-sorted list. If the field has a lot of different values, you can end up with a large overhead, depending on how fast Lucene can read the value range into memory (I/O).
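To make the mechanism above concrete, here is a minimal Java sketch of the idea. It is not Lucene's actual implementation; the class name, the method names and the map standing in for the index are all hypothetical. It only illustrates why the up-front cost depends on the number of values in the whole index rather than on the number of hits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration only; not Lucene or Alfresco code.
class SortOrdinalSketch {

    // Step 1: read EVERY value of the sort field (for all documents in the
    // index, not just the hits) and give each distinct value its position
    // in sorted order. This is the expensive part for 70k nodes.
    static Map<String, Integer> buildOrdinals(Map<Integer, String> fieldValuesByDoc) {
        TreeSet<String> distinct = new TreeSet<>(fieldValuesByDoc.values());
        Map<String, Integer> ordinals = new HashMap<>();
        int next = 0;
        for (String value : distinct) {
            ordinals.put(value, next++);
        }
        return ordinals;
    }

    // Step 2: sort only the hits, comparing the precomputed positions.
    static List<Integer> sortHits(List<Integer> hits,
                                  Map<Integer, String> fieldValuesByDoc,
                                  Map<String, Integer> ordinals) {
        List<Integer> sorted = new ArrayList<>(hits);
        sorted.sort((Integer a, Integer b) -> Integer.compare(
                ordinals.get(fieldValuesByDoc.get(a)),
                ordinals.get(fieldValuesByDoc.get(b))));
        return sorted;
    }

    public static void main(String[] args) {
        // Stand-in for the index: document id -> value of the sort field.
        Map<Integer, String> index = new HashMap<>();
        index.put(1, "delta");
        index.put(2, "alpha");
        index.put(3, "bravo");
        index.put(4, "charlie");

        Map<String, Integer> ordinals = buildOrdinals(index);
        List<Integer> hits = Arrays.asList(1, 3, 4);
        System.out.println(sortHits(hits, index, ordinals)); // prints [3, 4, 1]
    }
}
```

Even though only three documents match in this example, step 1 still touches all four values. Scaled up, that is the overhead that disappears when the sort is removed from the query.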

The alternatives:
a) Don't sort by fields that have a very large number of distinct values
b) Use alternative sort fields that are applicable only to a specific context (e.g. when you search over a 10k subset of your 70k nodes, a field that is only used in that specific subset).
c) Consider upgrading to 4.0 and using the alternatives available there (CannedQuery / SOLR)

Regards
Axel

andy
Champ on-the-rise
Hi

Later enterprise releases of 3.4 do the ordering in memory and fall back to Lucene if there are too many docs in the results (the threshold is configurable).
I am not sure which community release this is in, or destined to be in.

Look for the properties:

lucene.indexer.useInMemorySort=true
lucene.indexer.maxRawResultSetSizeForInMemorySort=1000
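As a rough illustration of what these two settings describe, the following Java sketch picks between an in-memory sort and the index-based sort depending on the raw result count. It is not Alfresco's code: the class, enum and method names are hypothetical, and whether the real threshold check is inclusive is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not Alfresco code.
class InMemorySortSketch {

    enum Strategy { IN_MEMORY, INDEX_SORT }

    // Mirrors the two properties: a switch plus a maximum raw result set size.
    // Assumption: a result set exactly at the limit is still sorted in memory.
    static Strategy chooseStrategy(boolean useInMemorySort,
                                   int rawResultCount,
                                   int maxRawResultSetSizeForInMemorySort) {
        if (useInMemorySort && rawResultCount <= maxRawResultSetSizeForInMemorySort) {
            return Strategy.IN_MEMORY;
        }
        return Strategy.INDEX_SORT;
    }

    // In-memory path: only the values of the hits are touched,
    // not every value of the field in the index.
    static List<String> sortInMemory(List<String> hitValues, boolean ascending) {
        List<String> copy = new ArrayList<>(hitValues);
        Comparator<String> order = Comparator.naturalOrder();
        copy.sort(ascending ? order : order.reversed());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(chooseStrategy(true, 500, 1000));
        System.out.println(chooseStrategy(true, 5000, 1000));
        System.out.println(sortInMemory(Arrays.asList("b", "a", "c"), true));
    }
}
```

With the values shown in the properties, a query returning a handful of nodes would take the in-memory path, while a query matching several thousand nodes would fall back to the index-based sort.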

Andy

lista
Star Contributor
Thank you both for your answers.
For now, we decided to limit the sorting options in order to make our life easier. We'll get back to this in the near future, and I'll be sure to share our results/knowledge here.