Search Engine (lucene) problem with Greek intonated vowels

aeon
Champ in-the-making
Greetings friends,

There is a peculiar problem when using the Greek language pack: lowercase search strings that contain words with accented (intonated) vowels return no results.

For example, the correct spelling of the word "test" in Greek is "δοκιμή".

The last letter is a vowel that carries an accent mark ("ή").

If someone searches for anything in Alfresco (item, space, etc.) using a string that contains accented words, the search returns nothing, even if such words do exist! The problem is that *most* Greek words contain accented vowels, which means that in *most* cases the search will return nothing!

The affected letters are the following (words that contain them do not appear in the search results): ά, ή, ό, έ, ί, ϊ, ΐ, ύ, ϋ, ΰ
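
To illustrate what an accent-insensitive comparison of these letters looks like, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how Alfresco or Lucene actually analyse Greek text: `fold_greek` is a hypothetical helper name, and the approach (Unicode decomposition, then dropping combining marks) is just one common way to fold accents, using the standard `unicodedata` module.

```python
import unicodedata


def fold_greek(text: str) -> str:
    """Return a lowercase, accent-free form of text for comparison only.

    Hypothetical helper for illustration; not part of Alfresco or Lucene.
    """
    # NFD splits each accented vowel into a base letter plus combining marks,
    # e.g. "ή" becomes "η" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (tonos, dialytika) and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose and lowercase so the comparison ignores case as well.
    return unicodedata.normalize("NFC", stripped).lower()


if __name__ == "__main__":
    title = "Δοκιμή"
    query = "δοκιμή"
    # Both sides are folded the same way before comparing.
    print(fold_greek(title) == fold_greek(query))
    print(fold_greek("άήόέίϊΐύϋΰ"))
```

If the indexed titles and the query are folded differently (for example, one side keeps the accent and the other loses it during lowercasing), a comparison like this would fail, which would match the symptom described above. Whether that is what happens inside the search engine here is an assumption I have not verified.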

Please let me know what can be done to amend this situation.



P.S: If the moderators believe this post belongs to another category, feel free to move it accordingly.
3 REPLIES

andy
Champ on-the-rise
Hi

You probably want to configure the repository to use a Greek tokeniser.
Have you done this?
See the admin guide.

There will still be issues, as we do not correctly support stemming and wildcard queries throughout the stack for non-standard tokenisers. Quoted search strings will work correctly. This is on the list of things to do.

BTW, if you quote your search string in the UI I would expect it to work.

Regards

Andy

aeon
Champ in-the-making
Thank you Andy for your prompt reply.

I have to make a small correction, based on something I discovered in the meantime, which might help locate what's wrong:

The problem (no results returned) occurs only when the search string appears in titles. It works fine when the string is contained inside content (e.g. in a text file).

Btw, I tried enclosing the search string in both single and double quotes, but unfortunately the problem remains!

Is the tokenizer the answer or should I look for something else?

andy
Champ on-the-rise
Hi

This should be sorted in 2.0. Greek should be supported out of the box.
Tokenisation should be done for you, according to the locale selected from the UI. Again, only items in quotes are forced to be tokenised.

I have added some simple tests. If there are any Greek specific ones you would like to see then let me know.

Regards

Andy