Hyland Connect

jayjayecl · ‎08-29-2008

Hello,

First, I should say that I have read a lot about the indexing process, the searching process, and the effect of languages on both.
I read the following topics, and a few others:

http://forums.alfresco.com/en/viewtopic.php?f=4&t=10114&hilit=search+and+locale
and
http://forums.alfresco.com/en/viewtopic.php?f=9&t=9524&hilit=search+and+locale

My problem is that, with users who can be from different countries/languages and who mix CIFS and web client usage (for file uploading or file searching), the results of any search are inefficient. I mean: they like the way CIFS/Windows search works.
Indeed, they have a lot of trouble getting the right results using the web client or a search portlet (via web service), because of all the stemming/analyzing procedures that are carried out during the indexing process.

So, I'd like to configure a simple indexing analysis that would just strip accents (for French and Spanish words) but keep the words unstemmed.
If the users look for "procedure", they want to find files containing "procédure" or even "ProCéDUre", whatever the locale of their web client, the locale of the document, or the way the file was uploaded.

I was wondering if it was as simple as
- declaring the same LuceneCustomAnalyzer in the DataTypeAnalyzers_locale.properties
- creating this LuceneCustomAnalyzer from the French one, removing the call to FrenchStemmer, and customizing it to strip accents.
Am I on the right track?

Is there anything I forgot (like the fact that, by doing so, a search for "procedureS" (plural) will not return files containing "procedure" (singular))?
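To make the intended behaviour concrete, here is a minimal sketch in plain Java of the normalization I have in mind: strip accents, lowercase, no stemming. It does not use the Lucene or Alfresco classes at all; `AccentFolding`, `fold` and `matches` are hypothetical names of mine, and a real analyzer would apply the same idea per token inside its filter chain.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Alfresco or Lucene.
final class AccentFolding {

    // Decompose accented characters (NFD), drop the combining marks, then lowercase.
    static String fold(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        return withoutMarks.toLowerCase(Locale.ROOT);
    }

    // Two terms match when their folded forms are equal; no stemming is applied.
    static boolean matches(String query, String indexed) {
        return fold(query).equals(fold(indexed));
    }

    public static void main(String[] args) {
        // \u00e9 is "é", written as an escape so the source encoding does not matter.
        System.out.println(fold("ProC\u00e9DUre"));                  // procedure
        System.out.println(matches("procedure", "proc\u00e9dure"));  // true
        System.out.println(matches("procedures", "procedure"));      // false
    }
}
```

The last line shows the trade-off from my question: without a stemmer, the plural and the singular are different terms. Also, characters that have no canonical decomposition, such as "œ", are left unchanged by this approach.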

Thank you all

jayjayecl · ‎08-29-2008

Hmm, I'm wondering whether to use FrenchAnalyzer (without FrenchStemmer) and IsoLatin1Filter, or AlfrescoStandardAnalyzer.

One last question.
When I'm done with the config changes, will a full reindex (index.recovery.mode=FULL) rebuild the indexes taking these analyzer changes into account?

Thank you for any reply

andy · ‎09-11-2008

Hi

Yes, a full index rebuild will apply any analyzer changes.

Andy


[Index & Search] Removing effects of language / locale