Indexing Modes

jjf
Champ in-the-making
For some reason when I restart the server and have index.recovery.mode set to "NONE", "VALIDATE", or "AUTO", the application won't authenticate users or show any files in the spaces.  When I set it to "FULL", everything works fine after it rebuilds.  The problem with FULL is that it takes a long time to rebuild the indexes and during that time some files do not show up (because they haven't been indexed).  Is something wrong with NONE, VALIDATE, and AUTO?  Is there a trick to getting them to work correctly?

Edit 1: I first reported locking issues but that doesn't seem to be the case.

Edit 2: I enabled debugging using log4j.logger.org.alfresco.repo.node.index=DEBUG.  The log reports that VALIDATE and AUTO are working correctly because the last txn in the index matches.  That shouldn't be the case, since none of the files or users show up.
8 REPLIES

pmonks
Star Contributor
It's a total guess on my part, but I wonder if the shutdown process isn't closing the Lucene index files properly?  Try setting the log level for the org.alfresco.repo.search.impl.lucene package to DEBUG and see if anything unexpected happens during shutdown.

Cheers,
Peter

jjf
Champ in-the-making
Unfortunately, enabling the DEBUG mode for indexing makes the logs extremely chatty.  I was hoping to avoid this.

jjf
Champ in-the-making
Also, when I say that it takes a long time to index when set to FULL, currently it takes 12hrs to index 10GB of content.  When indexing occurs, is it also indexing users?  If so, that might explain why it's taking so long.  We have 15,000+ users in the user store.  I am looking to reduce that number by removing the user nodes for some users in SQL.

jjf
Champ in-the-making
In looking at the source code it seems the indexing relies on transactions.  We currently have 2,011,897 transactions in total.  A FULL recovery now results in an Out of Memory error on the server after it reaches 80%-90% indexing progress.  The memory on the instance (Weblogic) is set to 768mb, so that shouldn't be an issue.

dinger
Champ in-the-making
jjf wrote: "In looking at the source code it seems the indexing relies on transactions.  The current total transactions we have is 2,011,897.  Now a FULL recovery results in an Out of Memory error on the server after it reaches 80%-90% indexing progress.  The memory on the instance (Weblogic) is set to 768mb, so that shouldn't be an issue."
It will be an issue. If you can, raise it to 1024 MB. We ran out of memory after carrying out a re-index.

That might solve your other problem?

Rob

jjf
Champ in-the-making
Ok. I'm going to give it a try in QA.  What's alarming is that 2 weeks ago we had only 1,000,000 transactions in Alfresco.  Now we have 2,000,000.  There have been very few changes to the application in those 2 weeks and no new files have been uploaded.  I guess I'm wondering what a "transaction" is?

I might try deleting the 1,000,000 most recent transactions from the SQL table and then do a full rebuild.  Hopefully this takes us back to our state 2 weeks ago.
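
To put the growth described above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the figures quoted in this thread (about 1,000,000 transactions two weeks ago, 2,011,897 now, and roughly 15,000 users); the starting count and the user count are approximations, and `daily_growth` is just a helper name made up for this example.

```python
def daily_growth(before: int, after: int, days: int) -> float:
    """Average number of new transactions per day over the period."""
    return (after - before) / days


# Figures quoted in the thread; the starting count is approximate.
per_day = daily_growth(1_000_000, 2_011_897, 14)

# Assumed user count ("15,000+" in the thread), used only for scale.
per_user_per_day = per_day / 15_000

print(f"{per_day:,.0f} new transactions per day")
print(f"{per_user_per_day:.1f} new transactions per user per day")
```

A rate of a few transactions per user per day, with no new uploads, would be consistent with a scheduled job touching every user record, though that is only a guess from the numbers.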

jjf
Champ in-the-making
It looks like a transaction is stored any time an action occurs in Alfresco (new, edit, delete, open, etc.)?  That might explain the 2M transactions, since we run a batch every night to update the Alfresco users/groups.

jjf
Champ in-the-making
Still looking for some guidance on what fills up the alf_transaction table.  Any ideas?
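
One way to narrow down what fills the table is to bucket the transaction commit times by hour of day and see whether they cluster around the nightly batch window. The Python sketch below assumes the commit times can be exported as epoch milliseconds (for example from a column such as `commit_time_ms` in `alf_transaction`; that column name is an assumption and should be checked against your schema). The helper `commits_per_hour` and the sample values are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable

HOUR_MS = 3_600_000  # milliseconds in one hour


def commits_per_hour(commit_times_ms: Iterable[int]) -> Counter:
    """Count transactions per UTC hour of day (0-23)."""
    counts: Counter = Counter()
    for ms in commit_times_ms:
        hour = datetime.fromtimestamp(ms / 1000, tz=timezone.utc).hour
        counts[hour] += 1
    return counts


# Hypothetical sample: three commits around 02:00 UTC, one at 14:00 UTC.
sample = [
    2 * HOUR_MS,
    2 * HOUR_MS + 5_000,
    26 * HOUR_MS,   # 02:00 UTC on the following day
    14 * HOUR_MS,
]

histogram = commits_per_hour(sample)
busiest_hour, busiest_count = histogram.most_common(1)[0]
print(f"busiest hour (UTC): {busiest_hour:02d}:00 with {busiest_count} commits")
```

If most commits land in the hour when the user/group sync runs, the batch job is the likely source; an even spread would point elsewhere.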