Retry failed job and duedate/lockExpirationDate

mapor
Champ in-the-making
Hello,

I'm trying to write a class that retries a failed job after a delay. My command uses the following code:


<java>
public Object execute(CommandContext commandContext) {
  if (this.jobId == null) {
    LOGGER.error("no job id given");
    throw new IllegalArgumentException();
  }

  JobEntity job = Context
      .getCommandContext()
      .getJobEntityManager()
      .findJobById(jobId);

  // Release the lock, then push the lock expiration and the due date 10 minutes ahead.
  job.setLockOwner(null);
  Date date = Calendar.getInstance().getTime();
  date.setTime(date.getTime() + TEN_MN_IN_MS);

  job.setLockExpirationTime(date);
  job.setDuedate(date);
  if (LOGGER.isDebugEnabled()) {
    LOGGER.debug("Job " + jobId + " has been delayed to " + date.getTime());
  }

  // Keep the failure details on the job so they can be inspected later.
  if (exception != null) {
    job.setExceptionMessage(exception.getMessage());
    job.setExceptionStacktrace(getExceptionStacktrace());
  }
  return null;
}
</java>


However, I can still see a value in the lock owner of jobs that should no longer have one, and the due date is only updated for timer entities, not for message entities.

The goal is to retry any kind of job after a delay such as 10 minutes or 1 hour, as many times as required (because the external systems involved are not very friendly…).

I know about a JIRA issue that solved this case by patching Activiti, but I'm not allowed to patch Activiti. As stated in that JIRA issue, replacing the FailedJobCommandFactory should be enough to do this, but it doesn't work.

Any ideas about what is going wrong, or other ways to handle this, are welcome.
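
For illustration, the "current time + delay" computation described above can be isolated from the engine. This is only a minimal sketch: `RetryDelay`, `dueDateAfter` and the two constants are hypothetical names, not part of Activiti. It returns a fresh `Date` instead of mutating an existing one, so the same instance is never shared between the lock expiration and the due date by accident.

```java
import java.util.Date;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not an Activiti class): computes when a failed job should run next. */
final class RetryDelay {
    static final long TEN_MINUTES_MS = TimeUnit.MINUTES.toMillis(10);
    static final long ONE_HOUR_MS = TimeUnit.HOURS.toMillis(1);

    private RetryDelay() {
    }

    /** Returns a new Date that lies delayMs after 'from'; 'from' itself is not modified. */
    static Date dueDateAfter(Date from, long delayMs) {
        if (delayMs < 0) {
            throw new IllegalArgumentException("delay must not be negative");
        }
        return new Date(from.getTime() + delayMs);
    }
}
```

Under these assumptions, a command like the one above could call `RetryDelay.dueDateAfter(new Date(), RetryDelay.TEN_MINUTES_MS)` and pass the result to `job.setDuedate(...)`.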

mapor
Champ in-the-making
I tried, and something interesting showed up: when my job fails, my custom failed job executor is called, but an exception is still thrown: "job xx failed".

So with my current code, the error from my job is rethrown after my RetryJobExecutor runs, instead of the job being scheduled for later. And since an exception means that the transaction will roll back…
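
The consequence described here (changes made before a rethrown exception are lost) can be pictured with a toy model. This is a sketch only: `ToyTxRunner` is a hypothetical class, and nothing here reflects how Activiti's command interceptors or transaction handling are actually implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Toy transactional runner. NOT how Activiti's command interceptors are implemented. */
final class ToyTxRunner {
    // Committed state: job id -> due date in epoch milliseconds.
    private final Map<String, Long> committed = new HashMap<>();

    /** Runs 'work' against a private copy and publishes it only if 'work' returns normally. */
    void inTransaction(Consumer<Map<String, Long>> work) {
        Map<String, Long> working = new HashMap<>(committed);
        work.accept(working); // an exception thrown here skips the "commit" below
        committed.clear();
        committed.putAll(working);
    }

    /** Returns the committed due date for the job, or null if none was ever committed. */
    Long committedDueDate(String jobId) {
        return committed.get(jobId);
    }
}
```

In this model, a unit of work that sets a due date for job "26" and then throws leaves `committedDueDate("26")` at `null`, which matches the symptom of a delay that never seems to be saved.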

It seems I forgot something that prevents Activiti from rethrowing the exception.
For my JUnit test I just use manageService.executeJob, and my delegate always throws.

If I look at DecrementJobRetriesCmd, I can see this piece of code:
<java>
JobExecutor jobExecutor = Context.getProcessEngineConfiguration().getJobExecutor();
MessageAddedNotification messageAddedNotification = new MessageAddedNotification(jobExecutor);
TransactionContext transactionContext = commandContext.getTransactionContext();
transactionContext.addTransactionListener(TransactionState.COMMITTED, messageAddedNotification);
</java>

From what I understood when I read about this, that code is what makes the job retry immediately, so I removed it. I tried putting it back, but the exception is still rethrown.

Edit: I inspected the code more closely, and in fact I get an NPE at job.setLockOwner(null); in my RetryJobExecutor. The variable job is null, but the variable jobId is "26". So I have the ID of a job that is no longer present in the database. That NPE does not appear in the stack trace I get; I only see the first exception, the one thrown by my delegate.

NB: my jobs are asynchronous service tasks with a delegate expression, if that helps.

jbarrez
Star Contributor
The code you posted is using a transaction listener registered for TransactionState.COMMITTED - hence the code will be executed once the transaction has actually committed to the database.

> It seems I forgot something that prevents Activiti from rethrowing the exception.

Sounds like it indeed. However, I'm not sure whether that's actually pluggable … it would need some investigation.

mapor
Champ in-the-making
As I said in my edit, I get an NPE in my job executor: the variable job is null.

The job ID that Activiti gave me is not found. I don't know why, because when I run it from my application the job is found.

I use an in-memory HSQL database for my JUnit tests.

jbarrez
Star Contributor
> I use an in-memory HSQL database for my JUnit tests.

The database shouldn't normally make a difference. Could you share the unit test that is failing? It's hard to follow without knowing everything that is happening.

mapor
Champ in-the-making
Here: http://www.fichier-zip.com/2014/06/02/activiti-unit-test-template-master/

I used two BPMN models, just in case.

Note that the code:

<java>
JobEntity job = Context
    .getCommandContext()
    .getJobEntityManager()
    .findJobById(jobId);
</java>

still throws an NPE if I use the variable commandContext.

jbarrez
Star Contributor
Thanks for the unit test.

It took me quite some puzzling to see what's going on.

First of all, you should disable the jobExecutor in the Activiti cfg. It is interfering with your regular executeJobs() method.

Secondly, the real issue here is that the test case is marked with @Transactional.
The whole test will be executed in one single transaction.
The .findJobById(jobId); will always return null: the data has never been committed to the database.
Once I removed the @Transactional, the NPE was gone.
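
One way to picture the explanation above is a toy store in which a transaction sees its own uncommitted rows, while every other transaction sees only committed ones. This is a sketch under a stated assumption: it supposes the failing lookup ran in a different transaction than the one that created the job, which the thread does not spell out. `ToyStore` and `Tx` are hypothetical names, not Activiti or Spring classes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of transaction visibility. NOT Activiti, Spring, or HSQLDB code. */
final class ToyStore {
    // Rows visible to every transaction.
    private final Map<String, String> committed = new HashMap<>();

    final class Tx {
        // Rows written by this transaction and not yet committed.
        private final Map<String, String> pending = new HashMap<>();

        void insert(String id, String value) {
            pending.put(id, value);
        }

        /** A transaction sees its own pending rows first, then the committed rows. */
        String find(String id) {
            return pending.containsKey(id) ? pending.get(id) : committed.get(id);
        }

        void commit() {
            committed.putAll(pending);
            pending.clear();
        }
    }

    Tx begin() {
        return new Tx();
    }
}
```

Here a second transaction calling `find("26")` gets `null` until the first one calls `commit()`. A test that never commits keeps every other reader in that state for its whole duration.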

mapor
Champ in-the-making
Thanks for the NPE fix.

But now I have the main problem that occurs in my application (I checked, and I didn't have the NPE in the application): if I put a breakpoint within this code in WorkflowFailTest.java:
<java>
for (Job job : list) {
  StringBuilder builder = new StringBuilder();
  builder.append(job.getId()).append("/")
      .append(job.getRetries()).append("/")
      .append(job.getExceptionMessage());
  LOGGER.info(builder.toString());
}
</java>

I can see that the dueDate is null, although it shouldn't be, since I set one in my RetryJobExecutor. And while I'm waiting with the breakpoint active, the jobs keep executing endlessly, since I'm not decreasing the number of retries.

jbarrez
Star Contributor
> I can see that the dueDate is null, although it shouldn't be, since I set one in my RetryJobExecutor. And while I'm waiting with the breakpoint active, the jobs keep executing endlessly, since I'm not decreasing the number of retries.

I'm sorry, I tried very hard … but I can't understand what you are trying to say. Breakpointing in a multi-threaded environment is hard, and will give you different results.

mapor
Champ in-the-making
Well, since I use the manager to execute all jobs synchronously in the JUnit test, the whole thing should run in a single thread, no? So reading the jobs after executing them should give up-to-date results (and therefore a due date set to the current time + 10 minutes).

What I'm trying to say is that the setDuedate() call in my RetryJobExecutor is not persisted.

Or maybe there is some hidden rule, such as only timer activities being allowed to have a dueDate, so that it can't be used to schedule the next try for all kinds of jobs?

In the end, I'm just looking for a way to prevent a failed job from being retried immediately and to control when it will be executed again.

jbarrez
Star Contributor
> Or maybe there is some hidden rule, such as only timer activities being allowed to have a dueDate, so that it can't be used to schedule the next try for all kinds of jobs?

No, any JobEntity can have it.

The last resort you can try: is the JobEntity part of the DbSqlSession.flush()? Setting the due date should mark the entity as dirty, and make it flush at the end of the transaction.
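
The "mark as dirty, then flush" idea can be sketched as a snapshot comparison: the session remembers each entity's state when it is loaded and, at flush time, updates only the entities whose state differs. This is a toy illustration; `ToyJob` and `ToySession` are hypothetical, and whether the engine's session detects changes in exactly this way is an assumption, not something the thread confirms.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy job entity. Field names mirror the thread, but this is not Activiti's JobEntity. */
final class ToyJob {
    final String id;
    Date duedate;
    String lockOwner;

    ToyJob(String id, Date duedate, String lockOwner) {
        this.id = id;
        this.duedate = duedate;
        this.lockOwner = lockOwner;
    }

    /** The state that is compared at flush time to detect changes. */
    Map<String, Object> persistentState() {
        Map<String, Object> state = new HashMap<>();
        state.put("duedate", duedate);
        state.put("lockOwner", lockOwner);
        return state;
    }
}

/** Toy session: remembers a snapshot per tracked entity and reports changed ones on flush. */
final class ToySession {
    private final Map<ToyJob, Map<String, Object>> snapshots = new HashMap<>();

    void track(ToyJob job) {
        snapshots.put(job, job.persistentState());
    }

    /** Returns the ids of the entities that changed since they were tracked or last flushed. */
    List<String> flush() {
        List<String> updatedIds = new ArrayList<>();
        for (Map.Entry<ToyJob, Map<String, Object>> entry : snapshots.entrySet()) {
            Map<String, Object> current = entry.getKey().persistentState();
            if (!current.equals(entry.getValue())) {
                updatedIds.add(entry.getKey().id);
                entry.setValue(current); // the flushed state becomes the new baseline
            }
        }
        return updatedIds;
    }
}
```

In this model, an entity that was never tracked by the session, or a flush that never happens because the transaction rolled back, would both leave the new due date unsaved. Those are the two things the question above suggests checking.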