Search underscore

janmulder
Champ in-the-making
Hello

I would like to search on underscore.
I would like to disable the underscore as word separator.

I looked at:
data-model/source/java/org/alfresco/repo/search/impl/lucene/analysis/AlfrescoStandardAnalyser.java
public class AlfrescoStandardAnalyser extends Analyzer

I could try to change the Alfresco java code, and put:
public class AlfrescoStandardAnalyser extends WhitespaceAnalyzer
I would then rebuild the lucene index.

Would that be a good approach?
Are there other things I would have to take care of?
Are there better approaches?

Thanks.
3 REPLIES

andy
Champ on-the-rise
Hi

You can configure where the analyser-to-type mapping is loaded from: by default, per model, per class, or per type.

e.g.

<model name="d:dictionary" xmlns="http://www.alfresco.org/model/dictionary/1.0">

   <description>Alfresco Dictionary Model</description>
   <author>Alfresco</author>
   <published>2005-09-29</published>
   <version>1.0</version>
   <analyserResourceBundleName>alfresco/model/dataTypeAnalyzers</analyserResourceBundleName>




You can also change the contents of the file this setting points to.

If you change the configuration you need to reindex.

Andy

janmulder
Champ in-the-making
Hi

Thanks for the hints.

I have looked at the files.
I could not see a simple list of token separators which I could configure,
so for example: "-" splits tokens, but "_" should not split tokens.

I saw another suggestion to copy the Lucene source code of StandardTokenizer
and then program whatever you need.
I am fine with that approach, but I see 2 different Lucene versions in the Alfresco tree.
Which Lucene source code version should I download and adapt?

Thanks.

andy
Champ on-the-rise
Hi

The Lucene subsystem uses Lucene 2.4.1 and SOLR uses Lucene 2.9.3.
Wire up the 2.4.1 classes in the repo.

Depending on your use case, treating the field as an identifier may help.

I will add some simple configuration to the backlog (stop words, separators).
In the end, the most flexible way is to hook up your own analysis.

Andy
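
For readers following this thread, the separator rule asked for above ("-" splits tokens, "_" does not) can be sketched in plain Java without any Lucene dependency. This is only an illustration of the rule, not Alfresco's or Lucene's implementation: the class name UnderscoreTokenSketch and its methods are hypothetical, and lower-casing the tokens is an assumption. In Lucene 2.4.x the comparable hook is, to the best of my knowledge, CharTokenizer.isTokenChar(char); verify that against the 2.4.1 source before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, dependency-free sketch of a tokenization rule in which
// '_' is part of a token while '-', whitespace and punctuation separate tokens.
public class UnderscoreTokenSketch {

    // Assumed rule: letters, digits and '_' belong to a token.
    static boolean isTokenChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // Splits text into lower-cased tokens using isTokenChar as the boundary test.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A separator ends the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [report_2011, final, draft_v2, txt]
        System.out.println(tokenize("Report_2011-final draft_v2.txt"));
    }
}
```

If this predicate is moved into a custom tokenizer and wired into the repository as suggested above, the index has to be rebuilt so that existing content is re-tokenized with the new rule.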