Lucene Indexing Problem using WebScripts

kumbach
Champ in-the-making
We're running into a frustrating indexing issue that I'm trying to solve.

I have a space named Company Home/Additions. Users with write permission can create documents in there. Simple, and works well.

In the background, I have a scheduled task that runs a WebScript that checks for new documents in Company Home/Additions every minute. If it sees a new document in there, it simply moves it to a different space that the person who created the document doesn't have write access to. Since the script runs in the background as the Admin user, this all works well. Except…

Quite often, the Lucene indexes are getting out of sync with the repository. What I see happening is that the Lucene search that the script uses to look for new documents in the Additions space finds documents that are no longer there. Then when the script tries to move the document to a different space, it throws an error saying the document doesn't exist.

What's interesting is that if I go into the node browser and do a Lucene search, the problem document shows up as being in the Additions folder. But if I descend through the child nodes from Company Home to Additions, the document does not show up in there.  And, if I browse to the folder that the script initially tried to move the document to, it shows up there. This is how I determined that the Lucene index is being corrupted.

The only way I have found to fix this is to do a full Lucene index rebuild. But this is not practical because it may take several hours to do that and we can't afford that system downtime.

Although I discovered this problem initially through WebScripts, I have recoded the process using the Foundation Services and have run into the same problem.

Does anyone have insight into what might be causing this problem? Has anyone run into this situation before? I'd really appreciate hearing from anyone if they have.

Thanks!

Kevin
5 REPLIES

pmonks
Star Contributor
Note that indexing is asynchronous in some cases, so your code needs to gracefully handle the case where a search result is no longer in the original folder.  That said, this should be a rare occurrence, so it sounds like there's something else going on here.
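As a rough illustration of that defensive check, here is a minimal sketch in Java. The `Repo` interface and `InMemoryRepo` class are hypothetical stand-ins written for this example, not the Alfresco API; in real code the same checks would go through the repository's node service (Alfresco's `NodeService` has `exists` and `getPrimaryParent` methods that serve a similar purpose). The idea is to treat every search hit as a hint and confirm it against the repository before moving anything.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StaleResultMover {

    /** Hypothetical stand-in for the repository; NOT the Alfresco API. */
    interface Repo {
        boolean exists(String nodeId);
        String parentOf(String nodeId);
        void move(String nodeId, String targetFolder);
    }

    /**
     * Moves each search hit only if the repository itself still reports the
     * node as a child of sourceFolder. Hits that are missing or already moved
     * are skipped instead of causing an error. Returns the skipped ids.
     */
    static List<String> moveFreshHits(Repo repo, List<String> searchHits,
                                      String sourceFolder, String targetFolder) {
        List<String> skipped = new ArrayList<>();
        for (String id : searchHits) {
            // The index may lag behind the repository, so re-check here.
            if (!repo.exists(id) || !sourceFolder.equals(repo.parentOf(id))) {
                skipped.add(id);
                continue;
            }
            repo.move(id, targetFolder);
        }
        return skipped;
    }

    /** Tiny in-memory repository used only to exercise the sketch. */
    static class InMemoryRepo implements Repo {
        final Map<String, String> parents = new HashMap<>();

        public boolean exists(String nodeId) {
            return parents.containsKey(nodeId);
        }

        public String parentOf(String nodeId) {
            return parents.get(nodeId);
        }

        public void move(String nodeId, String targetFolder) {
            if (!parents.containsKey(nodeId)) {
                throw new IllegalStateException("No such node: " + nodeId);
            }
            parents.put(nodeId, targetFolder);
        }
    }

    public static void main(String[] args) {
        InMemoryRepo repo = new InMemoryRepo();
        repo.parents.put("doc-1", "Additions");
        repo.parents.put("doc-2", "Archive"); // already moved, but still in the stale index

        // Simulated search result: doc-3 no longer exists in the repository at all.
        List<String> hits = Arrays.asList("doc-1", "doc-2", "doc-3");

        List<String> skipped = moveFreshHits(repo, hits, "Additions", "Archive");
        System.out.println("Skipped stale hits: " + skipped);
    }
}
```

Logging the skipped ids rather than silently dropping them would also give you a record of how often the index and the repository actually disagree.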

Which Alfresco version and edition (Enterprise / Community) are you seeing this behaviour with?  Is the installation a cluster or a single server installation?

Cheers,
Peter

kumbach
Champ in-the-making
Which Alfresco version and edition (Enterprise / Community) are you seeing this behaviour with?  Is the installation a cluster or a single server installation?

We're running 2.1.1 Enterprise on a single server installation (Sun Solaris).

ebell
Champ in-the-making
In the Actions Tutorial from Jeff Potts, he notes a potential explanation of why this happens. 
See page 6 in http://www.ecmarchitect.com/images/articles/alfresco-actions/actions-article.pdf

andy
Champ on-the-rise
Hi

This sounds like a bug.
If you have a reproducible test case, can you raise a bug and attach the code, or follow this up via support?

Are you storing your index on local disk?
Is it possible there is more than one instance of Alfresco running with the same configuration?

Is there any additional information in the log files?

Andy

kumbach
Champ in-the-making
If you have a reproducible test case, can you raise a bug and attach the code, or follow this up via support?
I'll work on that if the test mentioned below fails.

Are you storing your index on local disk?
Yes

Is it possible there is more than one instance of Alfresco running with the same configuration?
I don't think so. We have two instances running, but they are configured as independent servers.

Is there any additional information in the log files?
Nothing that I found useful.

We've recently upgraded from 2.1.1 to 2.2.0 and did a brief multi-user test and couldn't reproduce the problem. It was easily reproducible on 2.1.1.  We're doing a larger multi-user test soon and that will flush out the problem if it's still in there…