Incorrect and partial total result from Lucene query
‎02-27-2009 07:03 AM
We are running the following Lucene query from an action:
query = "PATH:\"/app:company_home/cm:Test/cm

We would like to get the total number of PDF files under the "company_home/Test/Document" space.
We ran into the following issue:
– The first result of the query was not complete; the total number was not correct.
– So we ran the query again until we got the correct total number of PDF files.
The query seems to return the correct total only after we run it more than once (how many times depends on the number of PDF files in the space).
For example, if we have about 5,000 files:
the first query shows total=1,500 files
the second one shows total=2,410 files
………
until the last one shows total=5,000 files.
After that the queries become stable and the result is 5,000.
Are there any configuration parameters for the Lucene query that avoid this problem?
We would like to get the correct total number on the first query.
Thanks for your help
‎02-27-2009 07:25 AM
#
# Properties to limit resources spent on individual searches
#
# The maximum time spent pruning results
system.acl.maxPermissionCheckTimeMillis=10000
# The maximum number of results to perform permission checks against
system.acl.maxPermissionChecks=1000
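The two properties above cap how much work a single search spends on permission checks. As a rough illustration of how such caps can turn into a partial total, here is a toy model in plain Java. It is not Alfresco's implementation: the class `PermissionBudgetSketch`, the method `countWithBudget`, and the `canRead` predicate are all hypothetical names invented for this sketch.

```java
import java.util.function.IntPredicate;

// Toy model only: NOT Alfresco code. All names here are hypothetical.
class PermissionBudgetSketch {

    /**
     * Counts the hits the caller may read, but stops examining hits once
     * either the check budget or the time budget has been used up.
     */
    public static int countWithBudget(int rawHits, int maxChecks,
                                      long maxMillis, IntPredicate canRead) {
        long deadline = System.currentTimeMillis() + maxMillis;
        int counted = 0;
        for (int i = 0; i < rawHits; i++) {
            if (i >= maxChecks || System.currentTimeMillis() > deadline) {
                break; // budget spent: the remaining hits are never examined
            }
            if (canRead.test(i)) {
                counted++;
            }
        }
        return counted;
    }

    public static void main(String[] args) {
        // 5000 raw hits, but only 1000 permission checks allowed.
        System.out.println(countWithBudget(5000, 1000, 10_000L, i -> true)); // prints 1000
        // Same hits with a budget large enough to check all of them.
        System.out.println(countWithBudget(5000, 5000, 10_000L, i -> true)); // prints 5000
    }
}
```

In this model the reported total is bounded by whichever budget runs out first, so raising only one of the two limits may not be enough to get a complete count.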
‎02-27-2009 08:45 AM
The problem was that when we increased the values, the wait for the query result became too long (about 60 seconds).
That is too long for interactive web use.
At the moment we have a total of 10,000 to 15,000 PDF documents to search.
I think that is not a critical number for Alfresco.
What happens if we have to manage 100,000 documents?
Do you have any suggestions on how to set the variables correctly to manage
10,000 to 15,000 documents?
The machine is:
CPU: Intel dual core, 2.66 GHz
Memory: 2 GB RAM
# The maximum time spent pruning results
system.acl.maxPermissionCheckTimeMillis=??????
# The maximum number of results to perform permission checks against
system.acl.maxPermissionChecks=????
Thanks in advance for your help
‎02-27-2009 10:58 AM
If you have code that does the search, then you can use the search parameters class:
org.alfresco.service.cmr.search.SearchParameters
and set the limit and limitBy properties on a per-query basis. You can also bypass security checks in your code by running as the system user, or by using the searchService bean (lowercase) instead of the SearchService bean.
Leave the default for user-driven searches as it is to prevent users from overloading the system with queries for thousands of documents.
‎03-02-2009 05:10 AM
The full-text search philosophy is now clearer to me.
Basically, I can never know the total number of content items (documents) in Alfresco that match a query criterion.
Is that correct?
This could be a problem for the end user.
Do you have any suggestions for providing the end user with this number?
Alfresco could be a good framework for developing vertical applications, but often you must know in advance
how many objects you have to manage (for instance, to show the number in the web interface).
Are there any internal Alfresco counter variables that contain the total number of content items (documents) in one "Space"?
Thanks in advance
‎03-02-2009 06:40 AM
We don't currently have space-related quotas built into the system. We're laying some groundwork in 3.2 that will make these types of calculations more efficient.
Regards

‎04-20-2009 07:27 PM
Hi,
> We don't currently have space-related quotas built into the system. We're laying some groundwork in 3.2 that will make these types of calculations more efficient.
> Regards
Derek, as it is now, what is the fastest way to find the count of the number of nodes for a query? For example, the number of nodes in a particular category or parent category?
Is there a way to do this, say in a webscript, without Lucene? (The total number might be larger than maxPermissionChecks.)
I see that in this post they talk about storing totals in another node, updated daily by a workflow, which seems pretty kludgy: http://forums.alfresco.com/en/viewtopic.php?f=4&t=3677&p=11875&hilit=total+count#p11875
Thanks!
‎04-21-2009 04:25 AM
> Derek, as it is now, what is the fastest way to find the count of the number of nodes for a query? For example, the number of nodes in a particular category or parent category?
dwilson,
The only way to efficiently find nodes in a category is by using a Lucene search, I'm afraid.

‎04-21-2009 01:59 PM
> > Derek, as it is now, what is the fastest way to find the count of the number of nodes for a query? For example, the number of nodes in a particular category or parent category?
> dwilson,
> The only way to efficiently find nodes in a category is by using a Lucene search, I'm afraid.
That is unfortunate. So, in order to display this kind of information on our website, is the only option that doesn't take forever to cache the totals?
e.g.:
Pet Category Links: (Totals)
- Dogs (8,123)
- Cats (12,328)
- Birds (7,230)
- Rabbits (418)
- Hamsters (3,132)
- … etc.
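A minimal sketch of what caching such totals could look like, in plain Java with no Alfresco dependencies. Everything here is an assumption made for illustration: the class `CategoryTotalCache`, its `total` methods, and the `counter` function (which stands in for whatever slow query actually produces the count) are hypothetical names, not part of any Alfresco API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToLongFunction;

// Hypothetical helper: caches per-category totals for a fixed time to live.
class CategoryTotalCache {

    private static final class Entry {
        final long total;
        final long loadedAt;

        Entry(long total, long loadedAt) {
            this.total = total;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final ToLongFunction<String> counter; // stand-in for the slow count query
    private final long ttlMillis;

    public CategoryTotalCache(ToLongFunction<String> counter, long ttlMillis) {
        this.counter = counter;
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached total, recounting if the entry is missing or expired. */
    public long total(String category) {
        return total(category, System.currentTimeMillis());
    }

    /** Same as above with an explicit clock value, which keeps the logic testable. */
    public long total(String category, long nowMillis) {
        Entry e = entries.get(category);
        if (e == null || nowMillis - e.loadedAt >= ttlMillis) {
            e = new Entry(counter.applyAsLong(category), nowMillis);
            entries.put(category, e);
        }
        return e.total;
    }
}
```

With a time to live of a few minutes, the page that lists the category links pays for the slow count at most once per category per interval, at the price of showing totals that can be slightly stale.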
‎10-28-2009 04:58 AM
> You can also bypass security checks in your code by running as the system …
Is this possible using web services?
Thanks
