Alfresco Lucene Indexing

alfsender
Champ in-the-making
Hi All,

We have an Alfresco repository with more than 2 million records. The records are organized in the folder structure below.

Main_Folder
   Sub_folder_1
       folder_1 (5,000 Docs)
       folder_2 (5,000 Docs)
       folder_3 (5,000 Docs)
       …
       …
       folder_1000 (5,000 Docs)
       …

The problem we are facing is that we restarted the server for a new code deployment and it got stuck with the error below.

2013-11-21 05:49:28,789 INFO  [STDOUT] 2012-09-11 05:49:28,789  WARN  [cache.node.propertiesTransactionalCache] [indexTrackerThread1] Transactional update cache 'org.alfresco.cache.node.propertiesTransactionalCache' is full (65000).

2013-11-21 05:49:28,821 INFO  [STDOUT] 2012-09-11 05:49:28,821  WARN  [cache.node.aspectsTransactionalCache] [indexTrackerThread1] Transactional update cache 'org.alfresco.cache.node.aspectsTransactionalCache' is full (65000).

2013-11-21 05:49:28,822 INFO  [STDOUT] 2012-09-11 05:49:28,821  WARN  [cache.node.parentAssocsTransactionalCache] [indexTrackerThread1] Transactional update cache 'org.alfresco.cache.node.parentAssocsTransactionalCache' is full (65000).

2013-11-21 05:49:28,823 INFO  [STDOUT] 2012-09-11 05:49:28,823  WARN  [cache.node.nodesTransactionalCache] [indexTrackerThread1] Transactional update cache 'org.alfresco.cache.node.nodesTransactionalCache' is full (125000).

2013-11-21 05:49:29,816 INFO  [STDOUT] 2012-09-11 05:49:29,816  WARN  [alfresco.cache.contentDataTransactionalCache] [indexTrackerThread1] Transactional update cache 'org.alfresco.cache.contentDataTransactionalCache' is full (65000).


Before restarting the server, we tried to apply group access to Sub_folder_1 and got the same error; the permission was not applied either. We were able to apply the permission to its parent folder, Main_Folder, but that is not what we want, because user access will differ by folder. Later we tried increasing the cache sizes from 65000 to 150000 and from 125000 to 200000, but we still get the same error.

Please suggest how we can resolve this issue, as we cannot do a FULL reindex of this many documents.

Thank you.
3 REPLIES

rjohnson
Star Contributor
I remember having this issue myself when doing an index rebuild on a 4.0a system with only 150,000 documents. I found a very good note on it in the forums (somewhere). Now, I did not keep a copy of it verbatim, nor the link, but looking in my alfresco-global.properties I think the solution for me was to:-

Stop Alfresco.
Add the line lucene.indexer.batchSize=1000 to alfresco-global.properties
Set index.recovery.mode=FULL
Restart Alfresco
Let the indexes build
Stop Alfresco
Comment out lucene.indexer.batchSize=1000
Set index.recovery.mode=AUTO
Restart Alfresco.
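
The two configuration states in the steps above could be scripted roughly as follows. This is only a sketch: the helper names set_property and comment_out are made up for illustration, and it works on the properties text as a string rather than on a real alfresco-global.properties file. The two property names are the ones quoted in the steps; nothing here is taken from Alfresco's own tooling.

```python
import re


def set_property(text: str, key: str, value: str) -> str:
    """Set key=value, reusing an existing (possibly commented-out) line, else appending."""
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE
    )
    line = f"{key}={value}"
    if pattern.search(text):
        # A function replacement avoids backslash handling in the value.
        return pattern.sub(lambda m: line, text, count=1)
    sep = "" if text == "" or text.endswith("\n") else "\n"
    return text + sep + line + "\n"


def comment_out(text: str, key: str) -> str:
    """Prefix every active 'key=...' line with '#'."""
    pattern = re.compile(
        r"^[ \t]*(" + re.escape(key) + r"[ \t]*=.*)$", re.MULTILINE
    )
    return pattern.sub(r"#\1", text)


# Hypothetical starting content of alfresco-global.properties.
original = "index.recovery.mode=AUTO\n"

# State for the rebuild: batch size added, recovery mode FULL.
rebuild = set_property(original, "lucene.indexer.batchSize", "1000")
rebuild = set_property(rebuild, "index.recovery.mode", "FULL")

# State for normal running afterwards: batch size commented out, mode AUTO.
normal = comment_out(rebuild, "lucene.indexer.batchSize")
normal = set_property(normal, "index.recovery.mode", "AUTO")

print(rebuild)
print(normal)
```

Alfresco still has to be stopped before each change and restarted after it, as in the steps; the script only covers the file edits.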

If you want to avoid a full rebuild you could try stopping Alfresco, just adding the

lucene.indexer.batchSize=1000

restarting Alfresco and letting it run until it has caught up. That may or may not work, because the post was specific about setting it for a rebuild and then "unsetting" it for normal running.

Please let me know how you get on.

alfsender
Champ in-the-making
Thank you, Sir, for your response.

We actually have two servers in a cluster, so we applied these settings on one node and restarted that server for FULL indexing.

The reason this is happening in our case is that when we add groups (through Alfresco Explorer) to the space "sub_folder_1", it creates a single transaction to add the groups to all sub-documents, and as there are millions of records, that single transaction covers 14 million documents. When we restart, it tries to index the last transaction, i.e. the one with 14 million records, and after some time it gives the cache-full error above.

Is there a way to tell Alfresco to index all transactions except the one with 14 million records? Or can we just delete that transaction's entry from the transaction table in Alfresco? If we do that, which other tables might be affected, and where else might it have an impact?

Thank you.

Hi,
The "cache is full" message is a warning rather than an error.
I did a reindex the other day on an almost empty system and got the same message.
The full reindex took about half an hour to run but everything went fine.
Avoid modifying the database directly. As mentioned here http://forums.alfresco.com/forum/developer-discussions/repository-services/increase-cache-contextxml... the default transactional cache values are fine.
If you have a cluster configuration, remember that you can always perform a full hot reindex: http://docs.alfresco.com/4.2/index.jsp?topic=%2Fcom.alfresco.enterprise.doc%2Ftasks%2Fhot-reindex.ht...
Also, here http://alch3mi5t.blogspot.it/2013/02/lucene-performance-during-index.html you can find some tips on Lucene performance optimization.

Regards,
Andrea