
Lucene query problem (too many results and missing results)

michalwrobel
Champ on-the-rise
My goal is to schedule an action for every file in a given directory (recursively).
I started with a simple action that just prints the file name to the log, and configured the cron expression and Lucene query in XML as described here: http://wiki.alfresco.com/wiki/Scheduled_Actions

Problem instance:

I have a 'testDir' in Company Home containing one file, 'testFile'.
But my action returns 3 filenames: 'testFile', 'doclib', 'webpreview'.

I tried other directories, and 'doclib' and 'webpreview' were always there. Sometimes GUID-like filenames were also printed, and in one case one of the files was omitted.

Important parts of my implementation:

Scheduling config:

<property name="queryTemplate">
    <value>+PATH:"/app:company_home/cm:testDir//*"</value>
</property>
<property name="cronExpression">
    <value>0 0/1 * * * ?</value>
</property>

Filename logging in Action:

String fileName = (String) nodeService.getProperty(actionedUponNodeRef, ContentModel.PROP_NAME);
logger.debug("filename: " + fileName);

So the questions are:

- Is this Lucene query somehow incomplete? (In one case, one of the existing files was omitted.) EDIT: could this be caused by 'stale' Lucene indexes?
- I guess there are some 'special hidden files' that the Share client does not show. How should I write the Lucene query to get only the 'human relevant' files (without doclib, webpreview, and the GUID-named files)?
2 REPLIES

jpotts
World-Class Innovator
Your Lucene query is asking for the members of a folder and any of its descendants. It is not filtering on type, aspect, or anything else, so you're getting exactly what you asked for. The thumbnails that are generated for things like the document library and the preview (technically these are called "renditions") are stored as children of the objects they are thumbnails of.

If you don't want to see those, be more specific. Maybe there is a specific type you are interested in or some types or aspects you could exclude. For example, you could exclude the thumbnails by doing something like:

PATH:"/app:company_home/cm:testDir//*" AND -TYPE:"cm:thumbnail"
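
If several types need to be excluded, the query string can be assembled programmatically. The following is only a minimal sketch in plain Java: the class `ScheduledQueryBuilder` and its method `buildQuery` are hypothetical helpers, not part of the Alfresco API, and the sketch uses the `+`/`-` prefix form of Lucene syntax rather than `AND -TYPE`. The resulting string is what would go into the `queryTemplate` value.

```java
import java.util.List;

// Hypothetical helper (not an Alfresco class): composes a Lucene query string
// that matches every descendant of a folder and prohibits the given types.
final class ScheduledQueryBuilder {

    private ScheduledQueryBuilder() {
        // static helper, no instances
    }

    static String buildQuery(String folderPath, List<String> excludedTypes) {
        StringBuilder query = new StringBuilder();
        // "//*" selects all descendants of the folder, at any depth
        query.append("+PATH:\"").append(folderPath).append("//*\"");
        // each "-TYPE" clause prohibits nodes of that type from the result
        for (String type : excludedTypes) {
            query.append(" -TYPE:\"").append(type).append("\"");
        }
        return query.toString();
    }
}
```

Calling `ScheduledQueryBuilder.buildQuery("/app:company_home/cm:testDir", List.of("cm:thumbnail"))` yields `+PATH:"/app:company_home/cm:testDir//*" -TYPE:"cm:thumbnail"`. Whether excluding `cm:thumbnail` alone also removes the GUID-named nodes depends on what type those nodes have, which this thread does not establish.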

Jeff

michalwrobel
Champ on-the-rise
Thank you very much, that solved the problem!