cancel
Showing results for 
Search instead for 
Did you mean: 

DB2 and async deadlocks

acoliver
Champ in-the-making
Champ in-the-making
Getting deadlocks in many places on DB2 whenever there is anything async.  DB2 implemented optimistic locking, or something resembling it, a few releases ago but it works differently than Oracle/PostgreSQL/MSSQL(in Snapshot Isolation mode).  Can anyone confirm that the DB2 backend does not use this feature.  Do you suspect that if I alter it to use this feature that Activiti will work better? http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.db2.luw.admin.dbobj.doc%2Fd... and http://www.zinox.com/node/84

Has anyone else experienced this issue?
6 REPLIES 6

acoliver
Champ in-the-making
Champ in-the-making
In particular it pukes here:


select       JOB.*           from ACT_RU_JOB JOB          LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = JOB.PROCESS_INSTANCE_ID_      LEFT OUTER JOIN ACT_RE_PROCDEF PD ON PD.ID_ = PI.PROC_DEF_ID_         where (JOB.RETRIES_ > 0)       and (JOB.DUEDATE_ is null or JOB.DUEDATE_ < ?)       and (JOB.LOCK_OWNER_ is null or JOB.LOCK_EXP_TIME_ < ?)    and (         (JOB.EXECUTION_ID_ is null)      or ((PI.SUSPENSION_STATE_ = 1)               and (PD.SUSPENSION_STATE_ = 1))       )

acoliver
Champ in-the-making
Champ in-the-making
Starting to think it may be simpler than isolation problems and instead be a flat problem with that query despite the error message saying it is a timeout.

trademak
Star Contributor
Star Contributor
We have our full test suite running against DB2 and we use jobs quite a lot there, so this is strange.
Does DB2 give you any pointers on why the query is so unresponsive?

Best regards,

acoliver
Champ in-the-making
Champ in-the-making
So it only happens with non-exclusive async tasks.

I'm not really a DB2 expert beyond its "unique characteristics".  This is what shows up in the log:


org.activiti.engine.impl.jobexecutor.AcquireJobsRunnable
run
SEVERE: exception during job acquisition:
### Error updating database.  Cause: com.ibm.db2.jcc.c.SqlException: DB2 SQL err
or: SQLCODE: -911, SQLSTATE: 40001, SQLERRMC: 68
### The error may involve org.activiti.engine.impl.persistence.entity.JobEntity.
updateMessage-Inline
### The error occurred while setting parameters
### SQL: update ACT_RU_JOB      SET REV_ =  ?,        LOCK_EXP_TIME_ = ?,
LOCK_OWNER_ = ?,        RETRIES_ = ?,        EXCEPTION_STACK_ID_ = ?,        EX
CEPTION_MSG_ = ?      where ID_= ?       and REV_ = ?
### Cause: com.ibm.db2.jcc.c.SqlException: DB2 SQL error: SQLCODE: -911, SQLSTAT
E: 40001, SQLERRMC: 68
org.apache.ibatis.exceptions.PersistenceException:
### Error updating database.  Cause: com.ibm.db2.jcc.c.SqlException: DB2 SQL err
or: SQLCODE: -911, SQLSTATE: 40001, SQLERRMC: 68
### The error may involve org.activiti.engine.impl.persistence.entity.JobEntity.
updateMessage-Inline
### The error occurred while setting parameters
### SQL: update ACT_RU_JOB      SET REV_ =  ?,        LOCK_EXP_TIME_ = ?,
LOCK_OWNER_ = ?,        RETRIES_ = ?,        EXCEPTION_STACK_ID_ = ?,        EX
CEPTION_MSG_ = ?      where ID_= ?       and REV_ = ?
### Cause: com.ibm.db2.jcc.c.SqlException: DB2 SQL error: SQLCODE: -911, SQLSTAT
E: 40001, SQLERRMC: 68
        at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(Exception
Factory.java:23)
        at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSq
lSession.java:147)
        at org.activiti.engine.impl.db.DbSqlSession.flushUpdates(DbSqlSession.ja
va:472)
        at org.activiti.engine.impl.db.DbSqlSession.flush(DbSqlSession.java:370)

        at org.activiti.engine.impl.interceptor.CommandContext.flushSessions(Com
mandContext.java:157)
        at org.activiti.engine.impl.interceptor.CommandContext.close(CommandCont
ext.java:109)
        at org.activiti.engine.impl.interceptor.CommandContextInterceptor.execut
e(CommandContextInterceptor.java:49)
        at org.activiti.spring.SpringTransactionInterceptor$1.doInTransaction(Sp
ringTransactionInterceptor.java:42)
        at org.springframework.transaction.support.TransactionTemplate.execute(T
ransactionTemplate.java:130)
        at org.activiti.spring.SpringTransactionInterceptor.execute(SpringTransa
ctionInterceptor.java:40)
        at org.activiti.engine.impl.interceptor.LogInterceptor.execute(LogInterc
eptor.java:33)
        at org.activiti.engine.impl.jobexecutor.AcquireJobsRunnable.run(AcquireJ
obsRunnable.java:57)
        at java.lang.Thread.run(Thread.java:619)
Caused by: com.ibm.db2.jcc.c.SqlException: DB2 SQL error: SQLCODE: -911, SQLSTAT
E: 40001, SQLERRMC: 68
        at com.ibm.db2.jcc.c.fg.e(fg.java:1596)
        at com.ibm.db2.jcc.b.gb.s(gb.java:859)
        at com.ibm.db2.jcc.b.gb.k(gb.java:387)
        at com.ibm.db2.jcc.b.gb.a(gb.java:60)
        at com.ibm.db2.jcc.b.w.a(w.java:52)

also


[3/5/13 15:48:33:172 CST] 00000004 ThreadMonitor W   WSVR0605W: Thread "Default : 3" (00000019) has been active for 697632 milliseconds and may be hung.  There is/are 1 thread(s) in total in the server that may be hung.
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:196)
at java.lang.Thread.join(Thread.java:616)
at org.activiti.engine.impl.jobexecutor.JobExecutor.stopJobAcquisitionThread(JobExecutor.java:182)
at org.activiti.engine.impl.jobexecutor.DefaultJobExecutor.stopExecutingJobs(DefaultJobExecutor.java:60)
at org.activiti.engine.impl.jobexecutor.JobExecutor.shutdown(JobExecutor.java:74)
at org.activiti.engine.impl.ProcessEngineImpl.close(ProcessEngineImpl.java:95)
at org.activiti.spring.ProcessEngineFactoryBean.destroy(ProcessEngineFactoryBean.java:40)
at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:211)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:498)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:474)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:442)
at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1066)
at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1040)
at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:988)
at org.springframework.context.support.AbstractApplicationContext.destroy(AbstractApplicationContext.java:976)
at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:211)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:498)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:474)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:442)
at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1066)
at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1040)
at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:988)
at org.springframework.context.access.ContextSingletonBeanFactoryLocator.destroyDefinition(ContextSingletonBeanFactoryLocator.java:158)
at org.springframework.beans.factory.access.SingletonBeanFactoryLocator$CountingBeanFactoryReference.release(SingletonBeanFactoryLocator.java:517)
*bleep*
at org.springframework.web.context.ContextLoaderListener.contextDestroyed(ContextLoaderListener.java:142)
at com.ibm.ws.webcontainer.webapp.WebApp.notifyServletContextDestroyed(WebApp.java:1796)
at com.ibm.ws.webcontainer.webapp.WebApp.destroy(WebApp.java:3126)
at com.ibm.ws.webcontainer.webapp.WebAppImpl.destroy(WebAppImpl.java:805)
at com.ibm.ws.container.AbstractContainer.destroy(AbstractContainer.java:74)
at com.ibm.ws.webcontainer.webapp.WebGroup.destroy(WebGroup.java:203)
at com.ibm.ws.webcontainer.webapp.WebGroup.removeWebApplication(WebGroup.java:241)
at com.ibm.ws.webcontainer.VirtualHost.removeWebApplication(VirtualHost.java:309)
at com.ibm.ws.webcontainer.VirtualHostImpl.removeWebApplication(VirtualHostImpl.java:208)
at com.ibm.ws.webcontainer.WSWebContainer.removeWebApplication(WSWebContainer.java:803)
at com.ibm.ws.webcontainer.component.WebContainerImpl.uninstall(WebContainerImpl.java:408)
at com.ibm.ws.webcontainer.component.WebContainerImpl.stop(WebContainerImpl.java:679)
at com.ibm.ws.runtime.component.ApplicationMgrImpl.stop(ApplicationMgrImpl.java:1150)
at com.ibm.ws.runtime.component.DeployedApplicationImpl.fireDeployedObjectStop(DeployedApplicationImpl.java:1324)
at com.ibm.ws.runtime.component.DeployedModuleImpl.stop(DeployedModuleImpl.java:645)
at com.ibm.ws.runtime.component.DeployedApplicationImpl.stop(DeployedApplicationImpl.java:1098)
at com.ibm.ws.runtime.component.ApplicationMgrImpl.stopApplication(ApplicationMgrImpl.java:921)
at com.ibm.ws.runtime.component.ApplicationMgrImpl.access$100(ApplicationMgrImpl.java:205)
at com.ibm.ws.runtime.component.ApplicationMgrImpl$2.run(ApplicationMgrImpl.java:1377)
at com.ibm.ws.security.auth.ContextManagerImpl.runAs(ContextManagerImpl.java:5353)
at com.ibm.ws.security.auth.ContextManagerImpl.runAsSystem(ContextManagerImpl.java:5441)
at com.ibm.ws.security.core.SecurityContext.runAsSystem(SecurityContext.java:255)
at com.ibm.ws.runtime.component.ApplicationMgrImpl.stopApplicationImpl(ApplicationMgrImpl.java:1383)
at com.ibm.ws.runtime.component.ApplicationMgrImpl.stop(ApplicationMgrImpl.java:2090)
at com.ibm.ws.runtime.component.CompositionUnitMgrImpl.stop(CompositionUnitMgrImpl.java:442)
at com.ibm.ws.runtime.component.CompositionUnitImpl.stop(CompositionUnitImpl.java:141)
at com.ibm.ws.runtime.component.CompositionUnitMgrImpl.stopCompositionUnit(CompositionUnitMgrImpl.java:723)
at com.ibm.ws.runtime.component.CompositionUnitMgrImpl.stopCompositionUnit(CompositionUnitMgrImpl.java:667)
at com.ibm.ws.runtime.component.ApplicationMgrImpl.stopApplication(ApplicationMgrImpl.java:1353)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:48)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:600)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:48)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:600)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:244)
at javax.management.modelmbean.RequiredModelMBean.invokeMethod(RequiredModelMBean.java:1074)
at javax.management.modelmbean.RequiredModelMBean.invoke(RequiredModelMBean.java:955)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
at com.ibm.ws.management.AdminServiceImpl$1.run(AdminServiceImpl.java:1331)
at com.ibm.ws.security.util.AccessController.doPrivileged(AccessController.java:118)
at com.ibm.ws.management.AdminServiceImpl.invoke(AdminServiceImpl.java:1224)
at com.ibm.ws.management.connector.AdminServiceDelegator.invoke(AdminServiceDelegator.java:181)
at com.ibm.ws.management.connector.ipc.CallRouter.route(CallRouter.java:242)
at com.ibm.ws.management.connector.ipc.IPCConnectorInboundLink.doWork(IPCConnectorInboundLink.java:353)
at com.ibm.ws.management.connector.ipc.IPCConnectorInboundLink$IPCConnectorReadCallback.complete(IPCConnectorInboundLink.java:595)
at com.ibm.ws.ssl.channel.impl.SSLReadServiceContext$SSLReadCompletedCallback.complete(SSLReadServiceContext.java:1784)
at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:138)
at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:204)
at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:775)
at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:905)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1613)
[3/5/13 15:53:55:362 CST] 00000010 SystemOut     O WARN  *bleep* - after execute() [time=1719643ms, result=true] sql=select
     JOB.*     
    from ACT_RU_JOB JOB   
     LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = JOB.PROCESS_INSTANCE_ID_
     LEFT OUTER JOIN ACT_RE_PROCDEF PD ON PD.ID_ = PI.PROC_DEF_ID_   
    where (JOB.RETRIES_ > 0)
      and (JOB.DUEDATE_ is null or JOB.DUEDATE_ < '2013-03-05-15.25.15.717')
      and (JOB.LOCK_OWNER_ is null or JOB.LOCK_EXP_TIME_ < '2013-03-05-15.25.15.717')
   and (
        (JOB.EXECUTION_ID_ is null)
     or ((PI.SUSPENSION_STATE_ = 1)    
         and (PD.SUSPENSION_STATE_ = 1))
      )
[3/5/13 15:53:55:363 CST] 00000010 SystemOut     O WARN  *bleep* - after execute() [time=1719643ms, result=true] sql=select
     JOB.*     
    from ACT_RU_JOB JOB   
     LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = JOB.PROCESS_INSTANCE_ID_
     LEFT OUTER JOIN ACT_RE_PROCDEF PD ON PD.ID_ = PI.PROC_DEF_ID_   
    where (JOB.RETRIES_ > 0)
      and (JOB.DUEDATE_ is null or JOB.DUEDATE_ < '2013-03-05-15.25.15.717')
      and (JOB.LOCK_OWNER_ is null or JOB.LOCK_EXP_TIME_ < '2013-03-05-15.25.15.717')
   and (
        (JOB.EXECUTION_ID_ is null)
     or ((PI.SUSPENSION_STATE_ = 1)    
         and (PD.SUSPENSION_STATE_ = 1))
      )
[3/5/13 15:53:55:381 CST] 00000010 DMAdapter     I com.ibm.ws.ffdc.impl.DMAdapter getAnalysisEngine FFDC1009I: Analysis Engine using data base: *bleep*
[3/5/13 15:53:55:397 CST] 00000010 FfdcProvider  W com.ibm.ws.ffdc.impl.FfdcProvider logIncident FFDC1003I: FFDC Incident emitted on *bleep* com.ibm.ws.rsadapter.jdbc.WSJdbcResultSet.next 2624
[3/5/13 15:53:55:400 CST] 00000010 CommandContex E   Error while closing command context
                                 org.apache.ibatis.exceptions.PersistenceException:
### Error querying database.  Cause: com.ibm.db2.jcc.am.SqlTransactionRollbackException: The current transaction has been rolled back because of a deadlock or timeout.  Reason code "2".. SQLCODE=-911, SQLSTATE=40001, DRIVER=3.64.82
### The error may exist in org/activiti/db/mapping/entity/Job.xml
### The error may involve org.activiti.engine.impl.persistence.entity.JobEntity.selectNextJobsToExecute-Inline
### The error occurred while setting parameters
### SQL: select       JOB.*           from ACT_RU_JOB JOB          LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = JOB.PROCESS_INSTANCE_ID_      LEFT OUTER JOIN ACT_RE_PROCDEF PD ON PD.ID_ = PI.PROC_DEF_ID_         where (JOB.RETRIES_ > 0)       and (JOB.DUEDATE_ is null or JOB.DUEDATE_ < ?)       and (JOB.LOCK_OWNER_ is null or JOB.LOCK_EXP_TIME_ < ?)    and (         (JOB.EXECUTION_ID_ is null)      or ((PI.SUSPENSION_STATE_ = 1)               and (PD.SUSPENSION_STATE_ = 1))       )
### Cause: com.ibm.db2.jcc.am.SqlTransactionRollbackException: The current transaction has been rolled back because of a deadlock or timeout.  Reason code "2".. SQLCODE=-911, SQLSTATE=40001, DRIVER=3.64.82
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:23)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:104)
at org.activiti.engine.impl.db.DbSqlSession.selectList(DbSqlSession.java:213)
at org.activiti.engine.impl.db.DbSqlSession.selectList(DbSqlSession.java:200)
at org.activiti.engine.impl.db.DbSqlSession.selectList(DbSqlSession.java:187)
at org.activiti.engine.impl.persistence.entity.JobManager.findNextJobsToExecute(JobManager.java:105)
at org.activiti.engine.impl.cmd.AcquireJobsCmd.execute(AcquireJobsCmd.java:51)
at org.activiti.engine.impl.cmd.AcquireJobsCmd.execute(AcquireJobsCmd.java:33)
at org.activiti.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:24)
at org.activiti.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:42)
at org.activiti.spring.SpringTransactionInterceptor$1.doInTransaction(SpringTransactionInterceptor.java:42)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
at org.activiti.spring.SpringTransactionInterceptor.execute(SpringTransactionInterceptor.java:40)
at org.activiti.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:33)
at org.activiti.engine.impl.jobexecutor.AcquireJobsRunnable.run(AcquireJobsRunnable.java:57)
at java.lang.Thread.run(Thread.java:736)

Does the testsuite test async non-exclusive tasks against DB2?

sebastian_s
Champ in-the-making
Champ in-the-making
From the logs you posted I can tell you are running Activiti on the WebSphere Application Server. I suppose you have configured a DataSource to be used with Activiti in the Administration Console.

If yes, how have you configured the data source in respect to the webSphereDefaultIsolationLevel property? If no value is set (the default when creating new data sources) WebSphere falls back to its default which means 4 (TRANSACTION_REPEATABLE_READ) for DB2 data sources.

Have you tried setting the webSphereDefaultIsolationLevel property to 2 (TRANSACTION_READ_COMMITTED)?

acoliver
Champ in-the-making
Champ in-the-making
That is a good point.  I did not realize DB2 now supported READ_COMMITTED.  ATM it looks like switching to UUIDs seems to solve the problem.  Once this is actually tested I will report back.