01-31-2019 07:59 AM
Hi,
We are using Alfresco Community Edition 5.0d. Unfortunately, best practices were not followed from the beginning, so all the documents are stored in the repository root folder. This folder now holds 800,000 records, which is causing performance issues in the application.
After looking at several recommendations for keeping fewer files per folder, we want to move all the existing documents into year-wise folders. What is the recommended way to move the documents?
02-01-2019 09:50 AM
A job is probably your safest bet, since it lets you control the number of documents re-classified within a single transaction.
02-01-2019 12:50 PM
Hi Anupam Gupta,
If you don't want to create or modify an AMP/JAR module to add a job, as Angel Borroy has recommended, I suggest writing a JavaScript script that classifies the nodes according to some criterion, such as the year/month/day of their creation.
In this script, you can run a Lucene query to search for the nodes. By default, each Lucene query returns a maximum of 1000 nodes, so you can classify 1000 nodes per execution. If you choose this option, you need to install the js-console add-on in your Alfresco to develop and execute the JavaScript code. Even if you discard my recommendation, I suggest you install this add-on anyway, because it is useful for running maintenance operations in your Alfresco.
An example of JavaScript pseudo-code for re-classifying nodes could be the following:
var nodes = search.luceneSearch("PATH:\"{PATH TO FOLDER WITH 800000 NODES}/*\"");
for each (var node in nodes) {
    var date = node.properties["cm:created"];
    var year = date.getFullYear();
    var month = date.getMonth() + 1; // getMonth() is zero-based
    var day = date.getDate(); // day of the month; getDay() would return the weekday
    // check whether the folder {PATH TO FOLDER WITH 800000 NODES}/year/month/day exists
    // if this folder doesn't exist, you must create it
    // finally, move the node to that folder (move() expects the destination folder node, not a path string)
    node.move(dayFolder); // dayFolder: placeholder for the folder node found or created above
}
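The date handling in a script like this is easy to get subtly wrong, because getMonth() is zero-based and getDay() returns the weekday rather than the day of the month. Here is a minimal sketch of a helper that turns a creation date into the three folder names. The function name dateFolderNames and the zero-padding are my own assumptions, not part of the Alfresco API; padding is optional and only keeps the folders sorted.

```javascript
// Hypothetical helper (not an Alfresco API): builds the year/month/day
// folder names for a JavaScript Date.
function dateFolderNames(date) {
    // Zero-pad single digits so that "02" sorts before "10".
    function pad(n) {
        return (n < 10 ? "0" : "") + n;
    }
    return [
        String(date.getFullYear()),
        pad(date.getMonth() + 1), // getMonth() is zero-based (January is 0)
        pad(date.getDate())       // getDate() is the day of the month
    ];
}

// Example: 31 January 2019 gives the path "2019/01/31".
var exampleParts = dateFolderNames(new Date(2019, 0, 31));
var examplePath = exampleParts.join("/");
```

In the script above, the value passed in would be node.properties["cm:created"].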
Regards,
Sergio.
02-02-2019 01:04 AM
Hi Sergio/Anupam,
You can also write a Lucene query that returns more than 1000 records.
See the example below.
var queryString = "TYPE:\"cm:content\"";
var paging =
{
    maxItems: 100000000,
    skipCount: 0
};
var def =
{
    query: queryString,
    page: paging
};
var nodes = search.query(def);
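Instead of one very large maxItems, the same definition object can be built per page. This is only a sketch: buildPagedQuery is a hypothetical helper name, and the only fields it uses are query and page (with maxItems and skipCount), as in the example above.

```javascript
// Hypothetical helper: returns a search.query() definition for one page.
// pageIndex is zero-based, so page 2 with 1000 items skips the first 2000.
function buildPagedQuery(queryString, pageSize, pageIndex) {
    return {
        query: queryString,
        page: {
            maxItems: pageSize,
            skipCount: pageSize * pageIndex
        }
    };
}

// In the Alfresco JavaScript console this would be used as:
//   var nodes = search.query(buildPagedQuery("TYPE:\"cm:content\"", 1000, 0));
var pagedDef = buildPagedQuery("TYPE:\"cm:content\"", 1000, 2);
```

One caveat: if the script moves nodes out of the folder that the query matches, the processed nodes should eventually drop out of the results, so asking for page 0 on every run may be safer than advancing skipCount.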
Best Regards
Mohit Rathi
+91 9028860467
02-04-2019 11:08 AM
Thanks Sergio and Mohit.
I enabled the js-console and ran the script for 1000 documents. It took 20 seconds. However, when I ran it for 200,000 documents, it hung and kept running for 2 hours. At 20 seconds per 1000 documents, I expected it to finish in roughly 67 minutes. Maybe I made a mistake in the JavaScript code. I will debug it and post my code here for review.
02-04-2019 11:30 AM
You cannot expect to run a single transaction involving 200,000 documents.
Use CMIS to move the documents one by one if you don't want to code a job.
02-04-2019 12:16 PM
Hi Angel Borroy
Can you please point me to a link where I can learn about coding a job for this kind of document movement?
So far I have tried something like the following:
function move() {
    var query = "PATH:\"/app:company_home/*\" AND TYPE:\"XXXX:doc\"";
    var nodes = search.luceneSearch(query);
    // keep going until the query returns no more documents
    while (nodes.length != 0) {
        for each (var node in nodes) {
            var current = node.properties["cm:created"];
            // folder names as strings; getMonth() is zero-based, hence the + 1
            var year = String(current.getFullYear());
            var month = String(current.getMonth() + 1);
            var day = String(current.getDate());
            var yearSpace = space.childByNamePath(year);
            if (yearSpace == null) {
                yearSpace = space.createFolder(year);
            }
            var monthSpace = yearSpace.childByNamePath(month);
            if (monthSpace == null) {
                monthSpace = yearSpace.createFolder(month);
            }
            var daySpace = monthSpace.childByNamePath(day);
            if (daySpace == null) {
                daySpace = monthSpace.createFolder(day);
            }
            // then move the document
            node.move(daySpace);
        }
        nodes = search.luceneSearch(query);
    }
}
move();
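The three repeated look-up-or-create blocks could be folded into one helper. This is a rough sketch, assuming only that the starting folder exposes childByNamePath(name) and createFolder(name) as used in the code above; ensureFolderPath is a name I made up.

```javascript
// Hypothetical helper: walks down from `root`, creating each folder in
// `names` if it is missing, and returns the deepest folder.
function ensureFolderPath(root, names) {
    var current = root;
    for (var i = 0; i < names.length; i++) {
        var name = String(names[i]); // accept numbers such as 2019 as well
        var child = current.childByNamePath(name);
        if (child == null) {
            child = current.createFolder(name);
        }
        current = child;
    }
    return current;
}

// Intended use inside the loop (with `space` as in the code above):
//   node.move(ensureFolderPath(space, [year, month, day]));
```

Calling it twice with the same names should return the same folder, so running it once per document does not create duplicates.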
02-05-2019 01:41 AM
https://docs.alfresco.com/community/references/dev-extension-points-scheduled-jobs.html
02-05-2019 02:10 AM
Hi Anupam,
Don't try to put all the documents in one transaction.
Split them into batches of 20k to 25k documents and then try again.
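A minimal sketch of the batching idea in plain JavaScript, in case it helps. splitIntoBatches is a hypothetical helper, and the batch size is whatever you pick (20000 to 25000 as suggested here).

```javascript
// Hypothetical helper: splits an array into consecutive batches of at most
// batchSize items. The last batch may be smaller.
function splitIntoBatches(items, batchSize) {
    if (batchSize < 1) {
        throw new Error("batchSize must be at least 1");
    }
    var batches = [];
    for (var start = 0; start < items.length; start += batchSize) {
        batches.push(items.slice(start, start + batchSize));
    }
    return batches;
}

// Example: 7 items in batches of 3 gives batch sizes 3, 3 and 1.
var exampleBatches = splitIntoBatches([1, 2, 3, 4, 5, 6, 7], 3);
```

Note that splitting an array does not by itself split the transaction. As far as I know, a script run from the js-console executes in a single transaction, so each batch would need its own run or its own job execution.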
Best Regards
Mohit Rathi
+91 9028860467