
Add an aspect to large number of nodes

kuliado
Champ on-the-rise

Hello,

I have a small issue that I don't know how to tackle properly, and it concerns performance.

I need to process ~200k documents that are missing an aspect. First, I identified these documents with a MySQL query to evaluate how many there are. These nodes have aspect X but are missing aspect Y, so I need to add aspect Y to all of them.

I'm exploring the BatchProcessor approach right now, and I am basing my code on the class org.alfresco.repo.node.db.NodeStringLengthWorker. Is this the correct way to do this?

A few questions arose while writing my code:

- What's the best way to get the nodes in the WorkProvider: searchService (Solr/Lucene), nodeService, or nodeDAO.getNodesWithAspects (in which case I would have to check each node for the missing aspect)?

- How do I stagger this search? I see that NodeStringLengthWorker uses minNodeId and maxNodeId; is this a way to reduce the search load?
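On the staggering question, one possible reading of the minNodeId/maxNodeId idea is to walk the node-id space in fixed-size windows, so that each lookup only touches a bounded slice. The following is a minimal plain-Java sketch of that windowing only. The class name NodeIdWindows and the method split are hypothetical and not part of Alfresco, and the query that would fetch candidate nodes inside each window is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not an Alfresco class): splits an id interval into windows. */
class NodeIdWindows {

    /**
     * Returns half-open windows {minNodeId, maxNodeId} covering
     * [firstId, lastIdExclusive), each at most windowSize ids wide.
     */
    static List<long[]> split(long firstId, long lastIdExclusive, long windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        List<long[]> windows = new ArrayList<>();
        for (long min = firstId; min < lastIdExclusive; min += windowSize) {
            // The last window is clamped so it never runs past the upper bound.
            long max = Math.min(min + windowSize, lastIdExclusive);
            windows.add(new long[] { min, max });
        }
        return windows;
    }

    public static void main(String[] args) {
        // Each window would be handed to the candidate query (not shown here).
        for (long[] w : split(1L, 25L, 10L)) {
            System.out.println("minNodeId=" + w[0] + " maxNodeId=" + w[1]);
        }
    }
}
```

A smaller window means more round trips but a shorter, lighter query each time, which may matter on a fragile production system.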

I saw there was a thread about this subject (https://community.alfresco.com/thread/214163-deleting-over-1000-nodes#comment-716898), but the link no longer works.

Our production environment is very frail, so this process should really produce as little load as possible.

Thanks in advance, and if you need any more info I will gladly provide it.

1 ACCEPTED ANSWER

I've created a sample project available at:

GitHub - aborroy/auditable-aspect-disable: Alfresco Repository module to disable AUDITABLE ASPECT be... 

Include the QName of the aspect you are adding in alfresco-global.properties to test the module.

Let me know if this works for you.

Hyland Developer Evangelist


10 REPLIES

angelborroy
Community Manager

IMO the best approach is to use the REST API with pagination, so you can patch documents in blocks.

Hyland Developer Evangelist

kuliado
Champ on-the-rise

Thanks for your answer Angel.

Could you be more specific about your approach? How would you get these nodes: with the /search endpoint? And how would you patch them: with a PUT on /nodes/{nodeid}?

I need to avoid changing the modifier/modified properties while doing this, and disabling all behaviours for performance could be a good idea.

That is the main idea: get the nodes page by page with "/search" and add the aspect with "/nodes/{nodeid}".

You can disable the AUDITABLE behaviour to avoid changing the modification properties.
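As a rough illustration of the two calls, here is a minimal sketch that only builds the JSON bodies; sending them (for example with java.net.http.HttpClient) and authentication are left out. The endpoint paths, the AFTS query and the aspect names my:aspectX / my:aspectY are assumptions to check against your own repository and model. It also assumes that a PUT with "aspectNames" replaces the node's whole aspect list, which is why the node's current aspects (read beforehand with a GET on the node) are merged in.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.StringJoiner;

/** Hypothetical helper: builds request bodies only, performs no HTTP calls. */
class AspectPatchRequests {

    // Assumed paths of the public REST API; verify them for your Alfresco version.
    static final String SEARCH_PATH = "/alfresco/api/-default-/public/search/versions/1/search";
    static final String NODES_PATH = "/alfresco/api/-default-/public/alfresco/versions/1/nodes/";

    /** Body for POST SEARCH_PATH: one page of nodes that have one aspect but not the other. */
    static String searchBody(String hasAspect, String missingAspect, int skipCount, int maxItems) {
        String query = "ASPECT:'" + hasAspect + "' AND NOT ASPECT:'" + missingAspect + "'";
        return "{\"query\":{\"query\":\"" + query + "\",\"language\":\"afts\"},"
             + "\"paging\":{\"skipCount\":" + skipCount + ",\"maxItems\":" + maxItems + "}}";
    }

    /** Body for PUT NODES_PATH + nodeId: existing aspects plus the new one, no duplicates. */
    static String updateBody(List<String> currentAspects, String aspectToAdd) {
        Set<String> aspects = new LinkedHashSet<>(currentAspects); // keeps original order
        aspects.add(aspectToAdd);
        StringJoiner json = new StringJoiner(",", "{\"aspectNames\":[", "]}");
        for (String aspect : aspects) {
            json.add("\"" + aspect + "\"");
        }
        return json.toString();
    }

    public static void main(String[] args) {
        System.out.println("POST " + SEARCH_PATH);
        System.out.println(searchBody("my:aspectX", "my:aspectY", 0, 100));
        System.out.println("PUT " + NODES_PATH + "{nodeid}");
        System.out.println(updateBody(List.of("cm:auditable", "my:aspectX"), "my:aspectY"));
    }
}
```

One thing to check when paging: once patched nodes are re-indexed they no longer match the query, so advancing skipCount may skip nodes that still need fixing; re-running the query from skipCount 0 until it returns nothing may be the safer loop.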

Hyland Developer Evangelist

Hello,

Thanks again for your help. My solution is working now; I just have to decide on the number of nodes to fix per loop, performance-wise. It takes around 7s for the transaction that fixes 20 nodes to finish.
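To put those numbers in perspective, here is a back-of-the-envelope estimate of the total run time. It assumes the batches run one after another on a single thread and that every batch takes the same time, neither of which is guaranteed; the class and method names are only illustrative.

```java
/** Illustrative estimate only: sequential batches with a constant duration are assumed. */
class BatchEstimate {

    static long estimateSeconds(long totalNodes, int batchSize, double secondsPerBatch) {
        long batches = (totalNodes + batchSize - 1) / batchSize; // ceiling division
        return Math.round(batches * secondsPerBatch);
    }

    public static void main(String[] args) {
        long seconds = estimateSeconds(200_000L, 20, 7.0);
        // 10,000 batches x 7 s = 70,000 s, roughly 19.4 hours
        System.out.println(seconds + " s, about " + (seconds / 3600.0) + " h");
    }
}
```

Under these assumptions the whole run is about 19 to 20 hours, so the batch size mostly trades transaction length against the total number of transactions.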

One thing I can't prevent, and which is fundamental to getting my code approved, is the change in the modification properties. In my transaction I call behaviourFilter.disableBehaviour(); and then re-enable at the end, but the nodes still show "Modified by System".

I see in the logs that as soon as I re-enable the behaviours, a few other custom behaviours trigger on these nodes and modify them. I suppose that if they use the current user context, this must be why modified and modifier change to System?

Any idea how to solve this, if my assumption is right?

angelborroy
Community Manager

You have to use a behaviour to control that.

Hope this helps: Change values of properties included in ‘cm:auditable’ aspect: cm:creator, cm:modifier, cm:created, ... 

Hyland Developer Evangelist

Thank you,

I implemented that behaviour and bound onUpdateProperties to EVERY_EVENT to catch any change where the new modifier property is "System". The catch works, and I put the old values back, all of this inside disableBehaviour too, to be sure.

Unfortunately the end result is still modifier = System. It seems a behaviour bound to onUpdateProperties with TRANSACTION_COMMIT is always the last one to be executed and modifies the node. I know which one it is since it shows in the logs. Is there no way to order behaviour execution?

angelborroy
Community Manager

I don't know how you are adding the aspect, but this pseudo-code should work...

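// Assumes policyBehaviourFilter (org.alfresco.repo.policy.BehaviourFilter) and nodeService
// are injected, and that this runs inside a transaction.
// Stop cm:auditable from updating cm:modifier / cm:modified on this node.
policyBehaviourFilter.disableBehaviour(nodeRef, ContentModel.ASPECT_AUDITABLE);
try
{
    nodeService.addAspect(nodeRef, YOUR_NEW_ASPECT, props);
}
finally
{
    // Re-enable auditing even if addAspect throws.
    policyBehaviourFilter.enableBehaviour(nodeRef, ContentModel.ASPECT_AUDITABLE);
}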
Hyland Developer Evangelist

Thanks Angel,

Yes, that's more or less how I'm adding the aspect. The problems are:

- I'm doing all that in a transaction.

- As soon as I enable behaviours again during this transaction, multiple behaviours trigger and modify the node. I don't know how to enable AUDITABLE AFTER these behaviours have finished making their changes in the same transaction. (Maybe your suggestion of disabling only auditable and letting the other behaviours trigger would fix this? I don't know when the line policyBehaviourFilter.enableBehaviour(nodeRef,ContentModel.ASPECT_AUDITABLE); is executed relative to the behaviours that modify the node.)

- Again, when the transaction is committed, another behaviour comes in and modifies the node, changing auditable again.

This behaviour is bound to TRANSACTION_COMMIT, so I should enable AUDITABLE after this behaviour has done its thing.
