Exact search with case insensitive

spilby
Confirmed Champ
Hello everyone!

I'm using this query to search on a custom string property in my tree:


String query = "PATH: \"/" + myPath + "/cm:" + folder + "//.\"" + " AND TYPE:\"" + myType + "\""
             + " AND =@mod\\:" + myCustomProperty + ":\"" + searchValue + "\"";

SearchParameters sp = new SearchParameters();
sp.addStore(StoreRef.STORE_REF_WORKSPACE_SPACESSTORE);
sp.setLanguage(SearchService.LANGUAGE_FTS_ALFRESCO);
sp.setQuery(query);
ResultSet results = getSearchService().query(sp);


And I have this in my model.xml:


   <namespace uri="myCustomNS" prefix="mod" />

   <property name="mod:myCustomProperty"  editVisible="true" createVisible="true" editable="true" inheritable="false" calculable="true">
      <title>My custom property</title>
      <type>d:text</type>
      <mandatory>true</mandatory>
      <index enabled="true">
         <atomic>true</atomic>
         <stored>false</stored>
         <tokenised>both</tokenised>
      </index>
      <constraints>
         <constraint ref="desal:stringLength100" />
      </constraints>
   </property>   


I use LANGUAGE_FTS_ALFRESCO instead of LANGUAGE_LUCENE because of the "=" operator in the search.

I need to match the exact phrase stored in the property value.

The problem is that I don't want a case-sensitive search. I want the results regardless of upper or lower case.

If I change the query and use @mod\\: instead of =@mod\\:, the search is case-insensitive, as I want, but it no longer matches the exact phrase. I need the "=" operator, but I also need the search to be case-insensitive.

How can I do this search?

Thank you very much!
4 REPLIES

afaust
Legendary Innovator
Hello,

you don't need the =@ operator at all - you just need to handle the phrase vs. term distinction properly. In Alfresco FTS, you can search for mod:propertyName:value, and value is a term that should match without tokenization / stemming. If you do a mod:propertyName:"value" query, you are doing a phrase query and tokenization / stemming will be applied. Of course, if your value is a more complex value with whitespace, colons and what-not, you need to escape that value properly before you can use it in a term query - or you risk generating a completely different or invalid query.
Escaping can be tricky in JavaScript but can be done using normal String / regex replacement mechanisms. In Java, the QueryParser class provides static utilities to escape strings for use in Lucene / FTS queries without further manual processing.
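To make the escaping step concrete, here is a minimal stand-alone sketch of backslash-escaping a value before it goes into a query string. The class name QueryEscapeSketch and its character list are assumptions for illustration only: the list is meant to approximate what the classic Lucene QueryParser treats as query syntax, and it has not been checked against Alfresco FTS. In real code you would call the QueryParser utility mentioned above rather than this helper.

```java
public final class QueryEscapeSketch {

    // Characters treated as query syntax here. ASSUMPTION: this list
    // approximates the classic Lucene QueryParser set; verify it for your version.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    private QueryEscapeSketch() {
    }

    /** Prefixes every special character with a backslash; whitespace is left untouched. */
    public static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("ABC 1"));
        System.out.println(escape("a:b (c)"));
    }
}
```

Whitespace is not in the list, so in this sketch a value such as "ABC 1" comes back unchanged and still needs separate handling.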

A side note: Alfresco FTS or CMIS SQL are the only search languages anyone should ever use in custom code. Yes, Lucene and other languages are also there - some for backwards compatibility reasons and some for very, very special use cases. It is best to stick to Alfresco FTS or CMIS SQL because those languages receive primary attention when improvements / new features are added. E.g. in Lucene, you don't benefit from the metadata query feature introduced in Alfresco 4.2.

Regards
Axel

spilby
Confirmed Champ
Hello, and sorry for the delay; I only got back to this project yesterday.

I tried building the query with the QueryParser from Lucene 2.4.1. I do this:


QueryParser.escape(value)


where value is the property value, including its whitespace. But it doesn't work.

If I use the @ operator without the =, and I have, for example, the value "ABC 1", then searching for "ABC" finds it, which is wrong.

Without the quotes the query doesn't work and finds nothing. It seems I need the quotes because my value contains whitespace.

I also tried replacing the whitespace in the query manually:


value.replaceAll(" ", "\\_x0020_");


or


value.replaceAll(" ", "\\u00A0");


but that doesn't work either: searching for "ABC" still matches "ABC 1".

Where is the problem?

Many thanks!

Best regards

afaust
Legendary Innovator
Hello,

since I had to deal with some search issues myself over the weekend I also tried your use case.
It turns out your use case can't be covered by any search query syntax feature. Either you want an exact match (including case) or an approximate match (case-insensitive but including tokenisation) - there are no other combinations available here.
The only two options I see here are:
- use a duplicated property to store values in a defined case (either upper or lower) and use that for your lookups (requires a policy to copy/modify your duplicate property based on user changes to the primary property)
- do a case-insensitive, tokenised search and perform the necessary second-stage filtering yourself
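Both options can be sketched in plain Java, assuming the candidate property values have already been read from the search results. The class ExactIgnoreCaseSketch and its method names are hypothetical, and the Alfresco calls for running the query, reading the property, and wiring up the policy are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public final class ExactIgnoreCaseSketch {

    private ExactIgnoreCaseSketch() {
    }

    /** Option 1: the value to store in the (hypothetical) duplicate property, in a fixed case. */
    public static String normalise(String value) {
        return value == null ? null : value.toLowerCase(Locale.ROOT);
    }

    /** Option 2: second-stage filter that keeps only whole-value matches, ignoring case. */
    public static List<String> filterExact(List<String> candidates, String searchValue) {
        List<String> matches = new ArrayList<>();
        for (String candidate : candidates) {
            if (candidate != null && candidate.equalsIgnoreCase(searchValue)) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Stand-in for property values returned by a case-insensitive, tokenised search.
        List<String> candidates = Arrays.asList("ABC 1", "abc 1", "ABC", "ABC 12");
        System.out.println(filterExact(candidates, "abc 1")); // prints [ABC 1, abc 1]
        System.out.println(normalise("ABC 1"));               // prints abc 1
    }
}
```

With the first option, the search value would be normalised the same way before querying the duplicate property, so an exact-match query compares like with like.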

Regards
Axel

spilby
Confirmed Champ
Hello,

thanks for your help, Axel.

I'm considering alternatives that involve changing my model.xml…

1) If I change my model.xml and put
<tokenised>false</tokenised>
, and use the mod:propertyName:"different values" syntax, might that work without tokenising the value? Or does the query still tokenise the value even with false in the tokenised XML tag?

2) Is there a tag I can set on my custom property in model.xml to disable case sensitivity? Something to set to true or false in…


<type name="cm:content">
        <title>Content</title>
        <parent>cm:cmobject</parent>
        <properties>
           <property name="cm:content">
              <type>d:content</type>
              <mandatory>false</mandatory>
              <index enabled="true">
                 <atomic>false</atomic>
                 <stored>false</stored>
                 <tokenised>true</tokenised>
              </index>
           </property>
        </properties>
     </type>


Best regards