Lucene search fails

grosisimo
Champ in-the-making
Hi

I'm working with Alfresco 2.9 (I'm trying to upgrade to 3.2, but I haven't finished yet). It seems that after the system has been running for some time, Lucene search starts failing.
For example, I created a couple of users and searched for them by email address with this query:
@cm\:email:some?user.com
I had to replace the @ with ? because the @ character caused problems (not a big deal).
Some time later this query started returning 0 results, and the query that returned the right results was:
@cm\:email:some\?user.com
That is, I replaced ? with \?.
What's more, I had to use "\" to escape all the other wildcards, such as *, too.
After a full reindex it started working fine again.
Is it possible that Lucene indexes degrade over time in version 2.9? I'm working on the upgrade to 3.2, but for now I'll have to stay on 2.9 for some time, so I need to know this. Until I find out, I'll set up a cron job to reindex periodically as a workaround, but it's not a good solution.
Does this work properly in version 3.2?

Some time ago I faced a similar problem. When I searched a text property that contained long numeric strings (for example "49723849789474"), the search returned strings that were similar but not equal (for example "49723849789535"), whereas I needed an exact match. Reindexing did not fix that problem, so I solved it externally. Does anyone know why Lucene did that? Is there a solution?

Thanks in advance
2 REPLIES

mrogers
Star Contributor
It's unlikely that the index "degrades".

For the first part of your post: yes, you will need to escape the wildcard characters.

For the second part of your problem: when you search for your long strings in the text you are matching tokens, not doing numeric comparisons. The tokeniser is probably returning the same token for your similar but not identical numbers, which is why a match is reported. The solution is to compare the numbers numerically rather than textually. If you can get your number into a property, that may be a good start.

grosisimo
Champ in-the-making
No, wait.
What I meant by "escape wildcards" was that, in order to use wildcard characters as wildcards (not as the literal characters), I needed to escape them.
For example, if I wanted to search for all the users whose email started with "mymail", I had to write
@cm\:email:mymail\*
but it should be
@cm\:email:mymail*
At first it worked as in the first example, but now, after a full reindex, it works as in the second example. This is how Lucene searches are supposed to work, isn't it?
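
To make the difference between the two forms concrete, here is a minimal sketch in Python of how a client could build these query strings. It assumes the standard Lucene query syntax, where a backslash makes a special character match literally and an unescaped * or ? acts as a wildcard. The names `escape_term` and `property_query` are hypothetical helpers, not part of Alfresco or Lucene, and the sketch does not cover the @ character inside a value.

```python
# Characters treated as syntax by Lucene's query parser (standard syntax assumed).
LUCENE_SPECIAL = set('\\+-!():^[]"{}~*?|&')


def escape_term(text: str) -> str:
    """Backslash-escape every special character so it is matched literally."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


def property_query(prefix: str, local_name: str, value: str,
                   starts_with: bool = False) -> str:
    """Build a property query string; hypothetical helper for illustration.

    The value is always escaped. A trailing, unescaped * is appended only
    when a prefix (starts-with) search is requested.
    """
    field = "@" + escape_term(prefix + ":" + local_name)
    term = escape_term(value) + ("*" if starts_with else "")
    return field + ":" + term


# Wildcard search: every email starting with "mymail".
print(property_query("cm", "email", "mymail", starts_with=True))
# prints: @cm\:email:mymail*

# Literal search: the value itself contains an asterisk.
print(property_query("cm", "email", "mymail*"))
# prints: @cm\:email:mymail\*
```

With this convention the escaped form only ever means "the literal character", which matches the behaviour seen after the full reindex.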