Relationship between alf_transaction and index recovery

iblanco
Confirmed Champ
I have a repository with around 25,000 documents; we have just uploaded all of them and executed some actions that set metadata on them. The alf_node table is around 145,000 rows and alf_transaction around 275,000. In our setup we have two Alfresco Community 4.0.d machines in a cluster, using EHCache and index tracking for synchronization. We are not using Solr, just plain old Lucene.

Much of the content is still not indexed, but as far as I know that is a background process driven by the FTSSTATUS fields in the Lucene indexes. The problem is that after some trouble we stopped both machines, and now when I start them with index recovery set to AUTO the process starts but seems to take far too long: over 7 hours now and still going. I understand that full-text indexing can take a long time, but "the other indexing", the part that is done in the foreground, should be quite fast.
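
For reference, this is the relevant setting in our alfresco-global.properties; if I remember the values correctly, the property also accepts NONE, VALIDATE and FULL:

index.recovery.mode=AUTO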

I suspect it might be related to the fact that the recovery process is trying to "recover" all the transactions, and there seem to be too many of them.
Does the recovery process consider all the transactions in alf_transaction, or just the last N? Is it time based? I think alf_transaction is emptied by a scheduled job, but when is it safe to delete a transaction entry from this table?

I know these are quite a lot of questions, but if someone could shed some light on how the transaction table and the index recovery process are related, that would be really helpful.

Thank you very much.
10 REPLIES

mitpatoliya
Star Collaborator
Well, in my opinion you should not touch the transaction table.
You should go for a full recovery instead.
Once your transaction table is messed up, you will never be able to create the indexes again.
Whenever you set the recovery mode to FULL it recreates all the indexes, so that would be the safer option.

iblanco
Confirmed Champ
The problem is that full reindexing does not solve the issue. One of the transactions touches around 118,000 nodes, and after reindexing it for over an hour a MySQL exception fires:

com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during commit(). Transaction resolution unknown.
   at sun.reflect.GeneratedConstructorAccessor447.newInstance(Unknown Source)
   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
   at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
   at com.mysql.jdbc.Util.getInstance(Util.java:384)
   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1015)
   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:984)
   at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:929)
   at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1663)
   at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)
   at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)
   at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.commit(PoolingDataSource.java:211)
   at org.hibernate.transaction.JDBCTransaction.commitAndResetAutoCommit(JDBCTransaction.java:139)
   at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:115)
   at org.springframework.orm.hibernate3.HibernateTransactionManager.doCommit(HibernateTransactionManager.java:656)
   at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:754)
   at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:723)
   at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:393)
   at org.alfresco.util.transaction.SpringAwareUserTransaction.commit(SpringAwareUserTransaction.java:472)
   at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:410)
   at org.alfresco.repo.node.index.AbstractReindexComponent$ReindexWorkerRunnable.run(AbstractReindexComponent.java:1008)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

I suspect this is related to the commit exceeding the 50 seconds set in "innodb_lock_wait_timeout", but in our MySQL 5.1 server this parameter can only be applied as a server-wide setting, and I don't feel comfortable increasing it that broadly.
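
Spotting the oversized transactions is easy enough directly in the database; something like this works (a sketch, assuming the 4.0 schema where alf_node carries a transaction_id foreign key):

SELECT t.id, COUNT(n.id) AS node_count
FROM alf_transaction t
JOIN alf_node n ON n.transaction_id = t.id
GROUP BY t.id
ORDER BY node_count DESC
LIMIT 10;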

We are developing a process that will execute as an action in the repository and will generate new transactions that touch the nodes in batches of 50. That way the old transaction will no longer be related to the 118,000 nodes; instead we will have many smaller transactions. It will be something like going file by file "changing something", but without changing anything the user can see (applying a hidden aspect or something like that). Since this is something Alfresco itself manages, I guess the transaction table is not corrupted in any way.

Hope this works. I'll let you know about it.

iblanco
Confirmed Champ
It worked as expected: many new transactions were created and the huge one disappeared. After that, the index rebuilding worked fine.

thijslemmens
Champ in-the-making
Hello iblanco,

We also have problems indexing big transactions on an Alfresco 3.1 system. Are you willing to share your code?

Kind regards

iblanco
Confirmed Champ
We configure this class as an action:


import java.util.ArrayList;
import java.util.List;

import org.alfresco.model.ContentModel;
import org.alfresco.repo.action.executer.ActionExecuterAbstractBase;
import org.alfresco.repo.policy.BehaviourFilter;
import org.alfresco.repo.transaction.RetryingTransactionHelper.RetryingTransactionCallback;
import org.alfresco.service.ServiceRegistry;
import org.alfresco.service.cmr.action.Action;
import org.alfresco.service.cmr.action.ParameterDefinition;
import org.alfresco.service.cmr.dictionary.DictionaryService;
import org.alfresco.service.cmr.repository.ChildAssociationRef;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.cmr.transaction.TransactionService;
import org.alfresco.service.namespace.QName;
import org.apache.log4j.Logger;

public class ForceReindexNodes extends ActionExecuterAbstractBase {

   public static final Logger LOG = Logger.getLogger(ForceReindexNodes.class);
   public static final String NAME = "reindex-node";

   private ServiceRegistry serviceRegistry;
   private BehaviourFilter behaviourFilter;
   private int totalBatchNodes = 50;
   private boolean removeAspect = true;

   public void setServiceRegistry(ServiceRegistry serviceRegistry) {
      this.serviceRegistry = serviceRegistry;
   }

   public void setBehaviourFilter(BehaviourFilter behaviourFilter) {
      this.behaviourFilter = behaviourFilter;
   }

   public void setTotalBatchNodes(int totalBatchNodes) {
      this.totalBatchNodes = totalBatchNodes;
   }

   public void setRemoveAspect(boolean removeAspect) {
      this.removeAspect = removeAspect;
   }

   @Override
   protected void executeImpl(Action action, NodeRef actionedUponNodeRef) {
      iterateChildren(actionedUponNodeRef);
   }

   protected void iterateChildren(NodeRef nodeRef) {

      NodeService nodeService = serviceRegistry.getNodeService();
      DictionaryService dictionaryService = serviceRegistry.getDictionaryService();

      // Touch the children in transactions of at most totalBatchNodes nodes.
      List<ChildAssociationRef> childAssocs = nodeService.getChildAssocs(nodeRef);
      List<NodeRef> batch = new ArrayList<NodeRef>();

      for (ChildAssociationRef childAssoc : childAssocs) {
         NodeRef node = childAssoc.getChildRef();
         QName type = nodeService.getType(node);

         // Recurse into folders first so their children get batched too.
         if (dictionaryService.isSubClass(type, ContentModel.TYPE_FOLDER)) {
            iterateChildren(node);
         }

         batch.add(node);
         if (batch.size() == totalBatchNodes) {
            addAspect(batch);
            batch = new ArrayList<NodeRef>();
         }
      }

      // Flush the last, possibly partial, batch.
      if (!batch.isEmpty()) {
         addAspect(batch);
      }
   }

   private void addAspect(final List<NodeRef> list) {

      RetryingTransactionCallback<Void> txnFile = new RetryingTransactionCallback<Void>() {

         @Override
         public Void execute() throws Throwable {

            // Keep cm:created/cm:modified untouched while we add the aspect.
            behaviourFilter.disableBehaviour(ContentModel.ASPECT_AUDITABLE);

            NodeService nodeService = serviceRegistry.getNodeService();

            LOG.info("Executing the process on " + list.size() + " nodes");

            for (NodeRef nodeRef : list) {
               // ASPECT_REINDEX_NODE is an empty marker aspect from our custom content model.
               nodeService.addAspect(nodeRef, MyModel.ASPECT_REINDEX_NODE, null);
               if (removeAspect) {
                  nodeService.removeAspect(nodeRef, MyModel.ASPECT_REINDEX_NODE);
               }
            }

            return null;
         }
      };

      // Each batch runs in its own read-write, non-propagating transaction.
      TransactionService transactionService = serviceRegistry.getTransactionService();
      transactionService.getRetryingTransactionHelper().doInTransaction(txnFile, false, true);
   }

   @Override
   protected void addParameterDefinitions(List<ParameterDefinition> paramList) {
      // No parameters needed.
   }

}


Just start the repository with index recovery set to "NONE" and execute this action against the folder that contains the problematic content. It basically adds and removes an empty aspect on each node under the folder, batching them in transactions no bigger than 50 nodes. We disable the AUDITABLE aspect to avoid changing the creation and modification dates.
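
For completeness, this is roughly how we wire the class up as an action bean; the bean id and package below are placeholders, adapt them to your project (ServiceRegistry and policyBehaviourFilter are the standard Alfresco bean names):

<!-- bean id and class package are placeholders -->
<bean id="reindex-node" class="com.example.action.ForceReindexNodes" parent="action-executer">
   <property name="serviceRegistry" ref="ServiceRegistry"/>
   <property name="behaviourFilter" ref="policyBehaviourFilter"/>
   <property name="totalBatchNodes" value="50"/>
   <property name="removeAspect" value="true"/>
</bean>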

USE AT YOUR OWN RISK!!!! If an Alfresco engineer sees this he/she might cry!!!! For us it solved the problem, but it definitely looks like a horribly awful hack.

thijslemmens
Champ in-the-making
Thank you. I have an issue on a system I'm trying to upgrade: a test system that is a clone of production. It is 3.1, so it is old, and I'll probably need to change your code. We have 6,000,000 nodes in the system and our biggest transaction contains 179,000 nodes. I wonder if it works at that scale. I'll let you know.
Thanks in advance.

iblanco
Confirmed Champ
For us, touching just one folder and letting the action group all the nodes under it into small transactions worked well, and it resolved a transaction of 118,000 nodes (after a couple of hours of work). If you only touch the related nodes (around 170,000 in your case) it should work.

If your nodes are not all under a single path, consider adapting the code so that, instead of traversing the tree, it takes the list of nodes from a text file in the repository (one nodeRef per line). You can generate the TXT file with a SQL query, upload it to the repo and execute the action against it. That is much cleaner, and it won't touch nodes that are not involved.
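
Something along these lines should produce that file (a sketch assuming nodes in the workspace://SpacesStore store and the 4.0 schema; <big_txn_id> is a placeholder for the id of the offending transaction):

-- One nodeRef per line for every node in the oversized transaction.
SELECT CONCAT('workspace://SpacesStore/', n.uuid)
FROM alf_node n
WHERE n.transaction_id = <big_txn_id>;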

If you try this approach please share the code, it might be quite useful for us too.

Thanks.

thijslemmens
Champ in-the-making
I adapted the code and put it in our git repository. It needs the dummyModel.xml in the config folder, and you can find an example configuration of the action there too.
You can run the action on a node in the repository; the node's content should have one nodeRef per line.
You can also use the action from a script and supply the batch size and/or the list of NodeRef objects. I have not tested that option yet; I use it on a node in the repository. On 3.1 the disableBehaviour did not seem to work, but that is not important for me there. On 4.0 Community it did work, so I suppose it was a bug in that version.

Here is the git repository:
https://bitbucket.org/xenit/transactionsplitter

all the best.

iblanco
Confirmed Champ
thijslemmens, great contribution! That's the kind of collaboration that makes open source rule!
Thank you very much. I don't need it right now, but I'll keep a copy in case we need it again.

All the best.