Searching in large text files (Lucene limited?)

jzaruba
Champ in-the-making
Hello

I have PDF documents that contain only image data. Each one is accompanied by a text file holding the text extracted from the images. I can link the two so that the PDF is listed in search results.

The problem is that when the text content is too large, words from the beginning are found but words from the end are not. (The text content I tested is 1.6 MB.)

I was under the impression that there is a limit of 65k characters for text properties, so I linked the text content via a property of type d:content that I add to the PDF document. Apparently that does not help. :(

Any hints, please?
3 REPLIES

gyro_gearless
Champ in-the-making
Try setting

lucene.indexer.maxFieldLength=1000000
in alfresco-global.properties. The default is only 10000, so large documents are only partially indexed by default 😞
I consider this to be one of Alfresco's most annoying "features"...
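
To illustrate why a field length limit produces exactly the symptom described above (words near the start are found, words near the end are not), here is a minimal toy sketch in Python. It is not Lucene or Alfresco code: `tokenize`, `build_index`, and `search` are hypothetical helpers, and the only assumption carried over from Lucene is that the limit counts terms per field rather than characters.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a stand-in for a real analyzer)."""
    return re.findall(r"\w+", text.lower())


def build_index(docs, max_field_length):
    """Build a toy inverted index, keeping only the first
    max_field_length tokens of each document."""
    index = {}
    for name, text in docs.items():
        # Tokens past the limit are silently dropped, so they can never match.
        for term in tokenize(text)[:max_field_length]:
            index.setdefault(term, set()).add(name)
    return index


def search(index, word):
    """Return the sorted names of documents whose indexed tokens contain word."""
    return sorted(index.get(word.lower(), set()))


# One large document: a word at the start, 20000 filler words, a word at the end.
docs = {"scan.pdf": "start " + "filler " * 20000 + "finish"}

small = build_index(docs, 10_000)      # limit comparable to the default
large = build_index(docs, 1_000_000)   # limit comparable to the raised value

print(search(small, "start"))    # word near the beginning
print(search(small, "finish"))   # word near the end, beyond the small limit
print(search(large, "finish"))   # same word with the larger limit
```

With the small limit the last word falls outside the indexed range, so only the larger limit finds it. In a real installation, content indexed before the setting was changed would presumably need to be reindexed before the missing words become searchable.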

HTH
Gyro

jzaruba
Champ in-the-making
You're my hero, sir! :)
Thanks a lot!

mjuarez
Champ in-the-making
jzaruba, could you explain how you store the text in a "d:content" property?

I can save to a "d:content" property, but when I run a search or an advanced search it does not work: 0 results are found.

Could you help me?

Thanks!