Committing successful individual operation before Rollback

ashokharnal
Champ in-the-making
I am trying to transform all the PDF files in a folder to plain text, extract some metadata
from each plain-text file and, as it is extracted, fill in the Alfresco metadata
fields of the corresponding PDF file with those values.
However, if even a single file fails to transform, the whole transaction
is rolled back. Even the log file is not saved.

For example, the following simple script transforms all the PDF files in a folder
to plain text. If any one transformation fails, not a single PDF file is converted,
and even the log file is not saved. If the offending file is deleted, all the remaining PDF files are
transformed to plain text. (Alfresco version: 3.4.d)

This is very frustrating, because one bad file prevents me from processing any of the others. Please advise
how I can get every file processed except the one that causes the error.


// Location of the log file report
var dest = companyhome.childByNamePath("Sites/myoffice/documentLibrary/AURC Space/DEO/Reports") ;
// Create a new log file and add a heading
var logFile = dest.createFile("log.txt") ;
logFile.content = "Record of which files failed transformation" + "\r\n" ;

// Iterate over all files in the Drafts folder and transform each, one by one
var draftfolder = companyhome.childByNamePath("Sites/myoffice/documentLibrary/AURC Space/DEO/Drafts") ;
var childfiles = draftfolder.children ;
var filenumb = childfiles.length ;
var count = 0 ;
// Go through each file, one by one
for ( var i = 0 ; i < filenumb ; i++ )
{
   var docu = childfiles[i] ;
   var newtxtNode ;
   try
   {
      // Transform a copy of the document to plain text
      newtxtNode = docu.transformDocument("text/plain") ;
      newtxtNode.save() ;
   }
   catch (e)
   {
      // Record the failure, one entry per line
      logFile.content += ++count + " Error in transformation: " + docu.name + "\r\n" ;
   }
} // end of for loop
logFile.save() ;

I will be grateful for a reply.
4 REPLIES

mrogers
Star Contributor
I'd write the action in Java and have each transformation in a separate transaction. 

AFAIK you can't control transactions through the script API.

Another option could be to queue actions to do the transformations asynchronously. That's possibly a much better approach.

ashokharnal
Champ in-the-making
Dear mrogers,

Kindly consider the following part of your reply:

Or another option could be to queue actions to do the transformations asynchronously. That's possibly a much better approach.

Could you please elaborate? How do we queue actions? Does it mean that I write a JavaScript that operates on one file and then define an inbound rule, so that whenever a PDF file enters the folder, the script runs in the
background on that file?

I am uploading around 300 GB of data, held in about 1600 files, through smbclient. The upload is being
done in batches and will take around a week. The same operation is being carried out in several offices (around 300 GB for each office). Uploading data through SMB is itself a time-consuming process, and a background process running on each PDF file may slow the upload down further. Is there some way to control that?

Ashok Kumar Harnal

ashokharnal
Champ in-the-making
Dear mrogers,

I am experimenting with a scheduled action with 'transactionmode' set to ISOLATED_TRANSACTIONS,
and building a query template that will select all nodes (PDF files) in the Drafts folder.

Is it possible to change the query and scheduler definitions without restarting Alfresco? If so, where
should I place these definitions?

I will be grateful for help.

Ashok Kumar Harnal

ashokharnal
Champ in-the-making
Well, I was finally successful. Instead of building a loop in JavaScript that iterates over all the files
and works on them, I wrote JavaScript that works on only one document. To make this code run
on all the desired nodes, I provided a query template in the scheduled action XML file and
set the transaction mode to ISOLATED_TRANSACTIONS. Even if the code fails on
any one PDF file, the action continues and moves on to the next PDF file.
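
For anyone landing on this thread later, here is a minimal sketch of what such a single-document script might look like. It is not the exact script used above. The helper name transformOne and the shape of its result object are made up for illustration, and the sketch assumes that the script action exposes the current node as `document` and that `logger` is available. Note that the isolation comes from the scheduled action's ISOLATED_TRANSACTIONS mode; the try/catch only records what went wrong for the current file.

```javascript
// Hypothetical helper (name and result shape are illustrative only):
// transform one node to plain text and report the outcome.
// "node" is expected to behave like an Alfresco ScriptNode,
// i.e. to offer a name property and transformDocument(mimetype).
function transformOne(node) {
    var result = { name: node.name, ok: false, message: "" };
    try {
        var textNode = node.transformDocument("text/plain");
        if (!textNode) {
            // Assumption: a missing result means the transformation failed.
            result.message = "no text produced";
        } else {
            result.ok = true;
            result.textNode = textNode;
        }
    } catch (e) {
        // Keep the error text so the caller can log it.
        result.message = String(e);
    }
    return result;
}

// Assumption: when run by a script action, the current node is "document".
// The guard lets the file load outside the repository as well.
if (typeof document !== "undefined") {
    var outcome = transformOne(document);
    if (!outcome.ok && typeof logger !== "undefined") {
        logger.log("Transformation failed for " + outcome.name + ": " + outcome.message);
    }
}
```

Because the script touches only one node, the query template decides which PDF files it runs on, and a failure on one of them should not affect the others.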