Jobexecutor does not cope with simultaneousness

sazzadul
Champ in-the-making
Hi,

I wanted to share an experience with you and hopefully find a solution for it too.

It seems like the job executor simply goes into hibernation when it gets too much to do.

Here is the scenario…
I have a process with a few async service tasks and a user task with a timer boundary event whose due date is, say, 15 minutes ahead.
Now when I start a number of process instances (say 100), they all enter the user task and wait for the due date to expire.
So far so good, but as soon as the due date expires the job executor kicks in and begins to work. I can see in the ACT_RU_JOB table that it updates the lock column for a few rows, but that's basically it. After that the job executor doesn't do a single thing; it stops completely. What is even worse, it stops processing new process instances as well.

I have tried increasing maxPoolSize and waitTimeInMillis without any luck. I have even tried a custom job executor (as suggested in http://forums.activiti.org/en/viewtopic.php?f=6&t=3523&hilit=jobexecutor) which uses commonj, but the result is the same.

I am using Jetty with Oracle. I have also tried PostgreSQL and MS SQL Server 2008, and the result is the same.
No exception whatsoever is thrown from Activiti, so I have no clue where the problem lies.

Is there a bug in the job executor? I hope I have made myself clear enough for you to reproduce this error.

Thanks.
Sazzadul

Jobexecutor does not seem to tackle simultaneousness at all.
27 REPLIES

trademak
Star Contributor
Hi,

The job executor is a thread pool and polls for jobs to be processed.
When a job fails it is placed back in the jobs table and the retry count is increased by 1.
When the job executor polls for jobs again, the failed job will be processed again until it has been retried 3 times; then the job is placed in deadlock mode and it will not be retried again.
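
The retry behavior described above can be pictured with a minimal, self-contained sketch. It is not Activiti code: the `RetrySketch` and `Job` classes, the `process` method and the `DEFAULT_RETRIES` constant are hypothetical names made up for illustration, and the sketch assumes a counter of remaining retries that starts at 3 and counts down on each failure.

```java
// Illustrative sketch only: models "retry a failed job up to 3 times, then stop".
// None of these names are Activiti classes; they are assumptions for this example.
class RetrySketch {

    static final int DEFAULT_RETRIES = 3; // assumed default, taken from the post above

    static final class Job {
        final String id;
        int retriesLeft = DEFAULT_RETRIES;
        int attempts = 0;

        Job(String id) {
            this.id = id;
        }
    }

    /** Runs the job until it succeeds or no retries remain; returns the number of attempts. */
    static int process(Job job, Runnable work) {
        while (job.retriesLeft > 0) {
            job.attempts++;
            try {
                work.run();
                return job.attempts; // success: the job is done
            } catch (RuntimeException e) {
                job.retriesLeft--;   // failure: put the job back with one retry fewer
            }
        }
        return job.attempts;         // retries exhausted: the job is never picked up again
    }

    public static void main(String[] args) {
        Job failing = new Job("always-fails");
        process(failing, () -> {
            throw new IllegalStateException("web service unavailable");
        });
        System.out.println(failing.attempts + " attempts, retries left: " + failing.retriesLeft);
    }
}
```

A job whose work always throws is attempted three times and then left alone, which is why a queue full of permanently failing jobs can look like an idle executor.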

What's the exact behavior you are experiencing? How many jobs are there in the ACT_RU_JOB table and what are the job executor threads doing?

Best regards,

stevennguyen
Champ in-the-making
Hi Tijs,

We have several job executors. When a job executor hangs, we restart it. After it picks up a few jobs, it hangs again. All job executors hang within a few minutes. The number of jobs ranged from 200 to 20,000 when we had the issue. We took a thread dump, but were not able to find the job executor thread.


I've separated out the different types of jobs by truncating ACT_RU_JOB, manually inserting the jobs back, and running only the job executors to process them. For certain types of job, the job executors can process 20K within 10 minutes without an issue. We found that the type of job that causes the job executor to hang requires an external web service call and takes longer to process.

The last printout in the log shows a server error. Could this cause the job executor to crash or hang?
Last printout from the job executor:
Aug 16, 2013 8:20:47 PM org.activiti.engine.impl.jobexecutor.AcquireJobsRunnable run
INFO: JobExecutor[org.activiti.engine.impl.jobexecutor.DefaultJobExecutor] starting to acquire jobs
INFO: Error while closing command context
org.activiti.engine.JobNotFoundException: No job found with id '94593698'.
        at org.activiti.engine.impl.cmd.ExecuteJobsCmd.execute(ExecuteJobsCmd.java:59)
        at org.activiti.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:24)
        at org.activiti.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:60)
        at org.activiti.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:32)
        at org.activiti.engine.impl.jobexecutor.ExecuteJobsRunnable.run(ExecuteJobsRunnable.java:46)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)
Aug 16, 2013 8:21:56 PM org.activiti.engine.impl.interceptor.CommandContext close
SEVERE: Error while closing command context
org.activiti.engine.ActivitiOptimisticLockingException: HistoricVariableInstanceEntity[103316416] was updated by another transaction concurrently
        at org.activiti.engine.impl.db.DbSqlSession.flushUpdates(DbSqlSession.java:652)
        at org.activiti.engine.impl.db.DbSqlSession.flush(DbSqlSession.java:460)
        at org.activiti.engine.impl.interceptor.CommandContext.flushSessions(CommandContext.java:167)
        at org.activiti.engine.impl.interceptor.CommandContext.close(CommandContext.java:114)
        at org.activiti.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:69)
        at org.activiti.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:32)
        at org.activiti.engine.impl.jobexecutor.ExecuteJobsRunnable.run(ExecuteJobsRunnable.java:46)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)


Last few lines in the log:

Aug 16, 2013 8:25:53 PM weblogic.wsee.jaxws.framework.policy.advertisementimpl.AdvertisementHelperImpl registerExtension
WARNING: Registering oracle.j2ee.ws.wsdl.extensions.addressing.AddressingExtensionRegistry extension failed; java.lang.ClassNotFoundException: oracle.j2ee.ws.wsdl.extensions.addressing.AddressingExtensionRegistry
Aug 16, 2013 8:25:53 PM weblogic.wsee.jaxws.spi.WLSProvider createServiceDelegate
WARNING: Could not read WSDL Definition from URL wsdlDocumentLocation: 2 counts of InaccessibleWSDLException.

Aug 16, 2013 8:25:53 PM weblogic.wsee.jaxws.framework.policy.advertisementimpl.AdvertisementHelperImpl registerExtension
WARNING: Registering oracle.j2ee.ws.wsdl.extensions.addressing.AddressingExtensionRegistry extension failed; java.lang.ClassNotFoundException: oracle.j2ee.ws.wsdl.extensions.addressing.AddressingExtensionRegistry
Aug 16, 2013 8:25:53 PM weblogic.wsee.jaxws.spi.WLSServiceDelegate addWsdlDefinitionFeature
SEVERE: Failed to create WsdlDefinitionFeature for wsdl location: zip:/apps/opt/weblogic/config/BPS/servers/fduscpw8_bps01/tmp/_WL_user/CMPRealtime/shqoub/APP-INF/lib/ITWBAgent.jar!/META-INF/wsdl/heartbeat.wsdl, error: com.sun.xml.ws.wsdl.parser.InaccessibleWSDLException, message: 2 counts of InaccessibleWSDLException.

trademak
Star Contributor
Hi,

From the last part of your log I see error messages from executing the web service call. This would mean that the job is retried 3 times and fails 3 times. After that the job will not be processed again. If this happens for every job, eventually the job executor will stop because all jobs are in a failed state. Could that be the case?

Best regards,

Tijs,

Found the root of this issue. We're using WebLogic 11g, which ships an older version of the joda jar than the one Activiti uses. In some scenarios we get an "ERROR" that is not caught, and this causes AcquireJobsRunnable to die. Below is the Error that I caught and printed in AcquireJobsRunnable.

org.joda.time.DateTime.parse(Ljava/lang/String;)Lorg/joda/time/DateTime;
Sep 2, 2013 9:32:54 PM org.activiti.engine.impl.jobexecutor.AcquireJobsRunnable run
INFO: 232513==>org.activiti.engine.impl.calendar.DurationHelper.parseDate(DurationHelper.java:119)
org.activiti.engine.impl.calendar.DurationHelper.<init>(DurationHelper.java:64)
org.activiti.engine.impl.calendar.CycleBusinessCalendar.resolveDuedate(CycleBusinessCalendar.java:30)
org.activiti.engine.impl.persistence.entity.TimerEntity.calculateRepeat(TimerEntity.java:90)
org.activiti.engine.impl.persistence.entity.TimerEntity.execute(TimerEntity.java:72)
org.activiti.engine.impl.cmd.ExecuteJobsCmd.execute(ExecuteJobsCmd.java:71)
org.activiti.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:24)
org.activiti.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:60)
org.activiti.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:38)
org.activiti.engine.impl.jobexecutor.ExecuteJobsRunnable.run(ExecuteJobsRunnable.java:50)
org.activiti.engine.impl.jobexecutor.CallerRunsRejectedJobsHandler.jobsRejected(CallerRunsRejectedJobsHandler.java:26)
org.activiti.engine.impl.jobexecutor.DefaultJobExecutor.executeJobs(DefaultJobExecutor.java:82)
org.activiti.engine.impl.jobexecutor.AcquireJobsRunnable.run(AcquireJobsRunnable.java:67)
java.lang.Thread.run(Thread.java:662)
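
The frames `CallerRunsRejectedJobsHandler.jobsRejected` and `AcquireJobsRunnable.run` in the trace above suggest the job was executed on the acquisition thread itself, so anything escaping the job would escape into the acquisition loop. The sketch below is a simplified model of why a `java.lang.Error` can end such a loop while ordinary exceptions do not. The class and method names are hypothetical, `NoSuchMethodError` is an assumption about the kind of Error involved (the usual result of a library version mismatch), and catching `Throwable` is shown only as one possible guard, not as what the actual fix does.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a loop that runs jobs on its own thread, as a
// caller-runs rejection policy would. These names are not Activiti classes.
class AcquireLoopSketch {

    /** Guards each job with catch (Exception); returns how many jobs were started. */
    static int runCatchingException(List<Runnable> jobs) {
        int started = 0;
        try {
            for (Runnable job : jobs) {
                started++;
                try {
                    job.run();
                } catch (Exception e) {
                    // Ordinary failures are handled here and the loop continues.
                }
            }
        } catch (Error fatal) {
            // An Error is not an Exception, so it skips the inner catch and ends the loop.
            // In a real thread nothing would catch it and the thread would terminate;
            // it is caught here only so the sketch can report the count.
        }
        return started;
    }

    /** Guards each job with catch (Throwable); returns how many jobs were started. */
    static int runCatchingThrowable(List<Runnable> jobs) {
        int started = 0;
        for (Runnable job : jobs) {
            started++;
            try {
                job.run();
            } catch (Throwable t) {
                // Errors are contained too, so later jobs still get their turn.
            }
        }
        return started;
    }

    static List<Runnable> sampleJobs() {
        List<Runnable> jobs = new ArrayList<>();
        jobs.add(() -> { });
        jobs.add(() -> { throw new NoSuchMethodError("org.joda.time.DateTime.parse"); });
        jobs.add(() -> { });
        return jobs;
    }

    public static void main(String[] args) {
        System.out.println(runCatchingException(sampleJobs())); // prints 2
        System.out.println(runCatchingThrowable(sampleJobs())); // prints 3
    }
}
```

With the narrower guard the third job is never started, which matches the symptom reported in this thread: new jobs exist but nothing picks them up.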


trademak
Star Contributor
Okay, but are you saying that there are new jobs for the job executor to process and it will not execute the new ones anymore, because of this error?

Best regards,

mmaker1234
Champ in-the-making
Hello Tijs,

At least in our case "…there are new jobs for the job executor to process and it will not execute the new ones anymore, because of this error" is exactly what we observed. I suppose Steven Nguyen observes the same.

trademak
Star Contributor
Okay thanks, we'll try to reproduce it.

Best regards,

pollop
Champ in-the-making
- Wrong post -

frederikherema1
Star Contributor
This has been fixed in commit b10dfc9ab6170641030beb9e84151ed4a949ef03 on master and will be part of the 5.14 release.

mmaker1234
Champ in-the-making
Thank you very much, Frederik! 🙂

I'll check when 5.14 is released.