Strange and serious re-indexing behaviour

atariq
Champ in-the-making
We've recently been having strange issues re-indexing our repository, and it has caused severe problems for the team.

Our repository is stored on another fileserver, while the Lucene indexes are stored on the local Alfresco server. In an attempt to rebuild the index, I stopped Alfresco, removed the current lucene-indexes folder, set index.recovery.mode=FULL in repository.properties, and restarted the service.

In catalina.out I can see that the reindexing process completes at 100% without any errors. However, Alfresco then refuses to start, with the following errors in the logs:

09:54:19,319 INFO  [repo.admin.ConfigurationChecker] The Alfresco root data directory ('dir.root') is: /net/hal/export/alfresco/alf_data
09:54:19,511 ERROR [repo.admin.ConfigurationChecker] CONTENT INTEGRITY ERROR: Indexes not found for 5 stores.
09:54:19,511 INFO  [repo.admin.ConfigurationChecker] You may set 'index.recovery.mode=FULL' if you need to rebuild the indexes.
09:54:19,512 ERROR [repo.admin.ConfigurationChecker] Ensure that the 'dir.root' property is pointing to the correct data location.
09:54:19,524 ERROR [web.context.ContextLoader] Context initialization failed
org.alfresco.error.AlfrescoRuntimeException: Ensure that the 'dir.root' property is pointing to the correct data location.
        at org.alfresco.repo.admin.ConfigurationChecker.check(ConfigurationChecker.java:312)
        at org.alfresco.repo.admin.ConfigurationChecker.access$000(ConfigurationChecker.java:72)
        at org.alfresco.repo.admin.ConfigurationChecker$1.execute(ConfigurationChecker.java:178)
        at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:225)
        at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:155)
        at org.alfresco.repo.admin.ConfigurationChecker.onBootstrap(ConfigurationChecker.java:182)
        at org.alfresco.util.AbstractLifecycleBean.onApplicationEvent(AbstractLifecycleBean.java:62)
        at org.springframework.context.event.SimpleApplicationEventMulticaster$1.run(SimpleApplicationEventMulticaster.java:77)
        at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:49)
        at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:75)

Since I had just rebuilt the index, I was surprised to see an integrity error here, so I checked what my lucene-indexes directory looked like. Previously it was a hefty 2 GB+ directory tree; now it is only 172 KB in size, and under lucene-indexes/workspace/SpacesStore I can see only the files IndexInfo and IndexInfoBackup rather than the whole index.
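
For anyone who wants to repeat this check across all stores without eyeballing each directory, here is a minimal Python sketch. It assumes that a store directory is any folder that directly contains an IndexInfo file, and that a store holding nothing but IndexInfo and IndexInfoBackup has no real index data. Both assumptions come from what I observed above, not from Alfresco documentation, and the function name is made up for illustration.

```python
import os

# File names taken from the directory listing described above. Treating
# "only these two files present" as an empty index is an assumption.
BOOKKEEPING_FILES = {"IndexInfo", "IndexInfoBackup"}


def summarize_index_dirs(index_root):
    """Summarize every store directory found under index_root.

    Returns {relative_path: (file_count, total_bytes, looks_empty)} where a
    store directory is any directory that directly contains 'IndexInfo'.
    """
    summary = {}
    for dirpath, dirnames, filenames in os.walk(index_root):
        if "IndexInfo" not in filenames:
            continue
        file_count = 0
        total_bytes = 0
        has_index_data = False
        # Count everything below the store directory, including subfolders.
        for sub_path, _, sub_files in os.walk(dirpath):
            for name in sub_files:
                file_count += 1
                total_bytes += os.path.getsize(os.path.join(sub_path, name))
                if sub_path != dirpath or name not in BOOKKEEPING_FILES:
                    has_index_data = True
        rel = os.path.relpath(dirpath, index_root)
        summary[rel] = (file_count, total_bytes, not has_index_data)
        dirnames[:] = []  # already counted above, so do not descend again
    return summary
```

Calling `summarize_index_dirs` with the path to your own lucene-indexes folder and printing the entries whose third value is `True` should list the stores that look like mine did after the rebuild.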

It's quite serious now as many people can't get to their documents. Help is much appreciated once again!

By the way, for your information, we are running Alfresco 2.1 Community on a Linux box that is connected to our own MySQL server.

Thanks!
4 REPLIES

atariq
Champ in-the-making
Ok, the situation is under control now. I opted to restore an older index of the repository and it thankfully worked, albeit with the loss of a few files that I was able to retrieve via the Manage Deleted Items feature in the web client. As long as the department is able to access the documents, everyone is happy!

That was a bit of a scare, though. I'm still not sure why the lucene-indexes folder was pretty much empty after apparently "successfully" rebuilding the index.

In addition, I'm seeing a strange discrepancy between CIFS and the web client. All the up-to-date files can be viewed and downloaded via the web client; however, if I go through CIFS, the directories all seem to be at least a couple of months old. Any ideas why this would be happening?

rdanner
Champ in-the-making
I'm glad you were able to recover. It's this kind of thing that makes me feel a bit better about having a support contract. I'd like to know more about what happened and what really went wrong here. The CIFS projection issue is a bit troubling. Please keep us up to date as you discover anything. I hope that anyone who has seen similar behavior will chime in here.

atariq
Champ in-the-making
Yup, Russ, it's quite alarming. I've rerun my tests and can reproduce both issues, with no solution other than restoring an older backup of the Lucene indexes. So my questions are twofold:

1) Why aren't the indexes being rebuilt fully and properly after I set the index.recovery.mode flag to FULL? I see no errors in the logs; the process ends with:

14:53:37,704 INFO  [node.index.FullIndexRecoveryComponent]      20 % complete.
14:56:30,202 INFO  [node.index.FullIndexRecoveryComponent]      30 % complete.
14:59:22,320 INFO  [node.index.FullIndexRecoveryComponent]      40 % complete.
15:11:03,774 INFO  [node.index.FullIndexRecoveryComponent]      50 % complete.
15:40:00,134 INFO  [node.index.FullIndexRecoveryComponent]      60 % complete.
16:23:45,650 INFO  [node.index.FullIndexRecoveryComponent]      70 % complete.
17:24:10,075 INFO  [node.index.FullIndexRecoveryComponent]      80 % complete.
18:41:00,758 INFO  [node.index.FullIndexRecoveryComponent]      90 % complete.
20:04:15,408 INFO  [node.index.FullIndexRecoveryComponent]      100 % complete.
20:04:15,428 INFO  [node.index.FullIndexRecoveryComponent] Index recovery completed.

yet the lucene-indexes folder that gets created only has the directory tree structure and lacks any files. Why would this be happening?

and

2) Why would going through CIFS in Windows Explorer display files dated months ago while the web client is showing all the up-to-date data items? As such, CIFS is rendered useless for our users and in fact poses a danger to our data integrity.

I'm just trying to make sense of the situation.

andy
Champ on-the-rise
Hi

The index rebuild may not have indexed all the content. If the transformation to text is expected to take more than 20 ms (the default), it is done in the background, and may happen after the rebuild says it is done. When the rebuild reports completion, all the atomically indexed properties have been indexed, but the file content may still be being indexed in the background.

It is clearly faster to do an index catch-up from a backup, as there is less to do.

The property that controls this threshold, along with its default, is in repository.properties. You can increase its value and effectively make all content indexed atomically.
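
If you want to see which indexing-related values your installation is actually using, a small Python sketch like the one below can list them. It is a simplified reader and not a full Java properties parser: it ignores line continuations and `:` separators. The function names are made up for illustration, and the `index.` and `lucene.` prefixes are an assumption about how the relevant keys are named. I believe the threshold key is lucene.maxAtomicTransformationTime, but that name is from memory rather than from this thread, so check it against your own file.

```python
def read_properties(path):
    """Read simple key=value pairs from a Java-style properties file.

    Simplified on purpose: comment lines (# or !) and blank lines are
    skipped, and only '=' is accepted as a separator.
    """
    props = {}
    # Java properties files are traditionally ISO-8859-1 encoded.
    with open(path, encoding="latin-1") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith(("#", "!")):
                continue
            key, sep, value = line.partition("=")
            if sep:
                props[key.strip()] = value.strip()
    return props


def indexing_settings(props):
    """Keep only keys that look index-related (prefixes are an assumption)."""
    return {
        key: value
        for key, value in props.items()
        if key.startswith(("index.", "lucene."))
    }
```

Running `indexing_settings(read_properties(...))` against your repository.properties should show index.recovery.mode together with whatever background-indexing threshold your version defines.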

The content index is a big chunk of the index size.

There should be files in the index structure.

Andy