RC 1 : search : strange search behaviour

jbaton
Champ in-the-making
Hi all,

I created a very simple document with the web client's editor.
Some words were underlined.
Searching for those words did not retrieve the document.

I use RC1 on w2k
4 REPLIES

davidc
Star Contributor
Hi,

Thanks for reporting the issue.  We're attempting to reproduce but could do with some more detail.

Could you provide:
a) the text you've entered into the in-line editor
b) the search criteria

Also, if you could provide an export file (.acp) of your data, that would help too, as we'll be able to see the structure of your spaces etc.

Thanks.

jbaton
Champ in-the-making
Hi,

My space contains relatively confidential company documents, so I'm afraid I can't export it right now. If you can't reproduce the bug, I'll try to remove them first.

My document is a two-line HTML document of French spoonerisms. Below is a copy and paste of the source. I can also send it by email if necessary.

<html><head></head><body><p><font size="2">La <u>Ch</u>ine se dresse à la vue des ni<u>pp</u>ons</font></p><p><font size="2"><u>T</u>aisez-vous, en <u>b</u>as</font></p><p><font size="2"></font></p></body></html>

When rendered, the file looks like this:

La Chine se dresse à la vue des nippons
Taisez-vous, en bas

It is located in the company home of my repository.

The unsuccessful search was for "chine"; the document contains "Chine". Searches for "nippons", "taisez" and "bas", the other words containing underlined letters, were unsuccessful too.

Looking for "vue" or "dresse" retrieves the document.


HIH

Jerome BATON

jbaton
Champ in-the-making
Hi David,

Well, I was a bit puzzled by the behaviour described above.
You will notice that the search produces this kind of result for words whose typography is not the same on all letters.

Thanks to some morning inspiration, I can tell you that

trafficjam causes the same results as the previous post's spoonerisms.

HIH

Can you reproduce the 'bug'?

I don't know if you are the Lucene guy in your team, but I think it is worth looking at the Lucene bug list for the HTML document indexer.


Jerome

davidc
Star Contributor
Jerome,

We can now reproduce the bug as you've reported.  The underlying html text extraction is at fault, so we'll be able to resolve this in a build soon.
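
For anyone following along, here is one plausible way an HTML text extraction fault could produce exactly these symptoms. This is only a sketch in Python, not Alfresco's or Lucene's actual extraction code: the names `INLINE_TAGS`, `TextExtractor`, `extract_tokens` and `split_on_inline_tags` are hypothetical, and the assumption is that the extractor puts whitespace at every tag boundary, so that `<u>Ch</u>ine` is indexed as "ch" and "ine" rather than "chine".

```python
import re
from html.parser import HTMLParser

# Hypothetical list of inline tags that should NOT break a word.
INLINE_TAGS = {"u", "b", "i", "em", "strong", "font", "span"}

# The document from the report above.
SAMPLE = (
    '<html><head></head><body><p><font size="2">La <u>Ch</u>ine se dresse '
    'à la vue des ni<u>pp</u>ons</font></p><p><font size="2"><u>T</u>aisez-vous, '
    'en <u>b</u>as</font></p><p><font size="2"></font></p></body></html>'
)


class TextExtractor(HTMLParser):
    """Collects text nodes, emitting a space at tag boundaries."""

    def __init__(self, split_on_inline_tags):
        super().__init__()
        self.split_on_inline_tags = split_on_inline_tags
        self.parts = []

    def _tag_boundary(self, tag):
        # Block-level tags always separate words; inline tags only do so
        # in the faulty mode (split_on_inline_tags=True).
        if self.split_on_inline_tags or tag not in INLINE_TAGS:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        self._tag_boundary(tag)

    def handle_endtag(self, tag):
        self._tag_boundary(tag)

    def handle_data(self, data):
        self.parts.append(data)


def extract_tokens(html, split_on_inline_tags):
    """Return the lowercase word tokens an indexer would see."""
    parser = TextExtractor(split_on_inline_tags)
    parser.feed(html)
    parser.close()
    return re.findall(r"\w+", "".join(parser.parts).lower())


faulty = extract_tokens(SAMPLE, split_on_inline_tags=True)
fixed = extract_tokens(SAMPLE, split_on_inline_tags=False)
```

Under that assumption, `faulty` contains fragments such as "ch" and "ine" but never "chine", while untouched words like "vue" and "dresse" survive, which matches the reported searches. Treating inline tags as transparent, as in `fixed`, keeps the words whole.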

Use http://www.alfresco.org/jira/browse/AR-163 to track progress of the fix.

Thanks.