cancel
Showing results for 
Search instead for 
Did you mean: 

Deadlock in cluster during AcquireJobsCmd

tomasz_lewandow
Champ in-the-making
Champ in-the-making
I have a cluster of 2 Activiti 5.14 instances, sharing the same PostgreSQL 9.1 database. I have a process with timers.
From time to time I experience a deadlock during AcquireJobsCmd execution:


2013-12-17 12:00:01,083 Thread-96            ERROR [                   impl.interceptor.CommandContext] [                         close] [140] : Error while closing command context: org.apache.ibatis.exceptions.PersistenceException:
### Error updating database.  Cause: org.postgresql.util.PSQLException: BŁĄD: deadlock detected
  Szczegóły: Proces 17382 oczekuje na ShareLock na transakcja 1398139; zablokowany przez 21060.
Proces 21060 oczekuje na ShareLock na transakcja 1398140; zablokowany przez 17382.
  Wskazówka: Przejrzyj dziennik serwera by znaleźć szczegóły zapytania.
### The error may involve org.activiti.engine.impl.persistence.entity.JobEntity.updateTimer-Inline
### The error occurred while setting parameters
### SQL: update ACT_RU_JOB      SET REV_ =  ?,        LOCK_EXP_TIME_ = ?,        LOCK_OWNER_ = ?,        RETRIES_ = ?,        EXCEPTION_STACK_ID_ = ?,        EXCEPTION_MSG_ = ?,        DUEDATE_ = ?      where ID_= ?       and REV_ = ?
### Cause: org.postgresql.util.PSQLException: BŁĄD: wykryto zakleszczenie
  Szczegóły: Proces 17382 oczekuje na ShareLock na transakcja 1398139; zablokowany przez 21060.
Proces 21060 oczekuje na ShareLock na transakcja 1398140; zablokowany przez 17382.
  Wskazówka: Przejrzyj dziennik serwera by znaleźć szczegóły zapytania.
   at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:23)
   at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:147)
   at org.activiti.engine.impl.db.DbSqlSession.flushUpdates(DbSqlSession.java:560)
   at org.activiti.engine.impl.db.DbSqlSession.flush(DbSqlSession.java:444)
   at org.activiti.engine.impl.interceptor.CommandContext.flushSessions(CommandContext.java:170)
   at org.activiti.engine.impl.interceptor.CommandContext.close(CommandContext.java:117)
   at org.activiti.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:66)
   at org.activiti.spring.SpringTransactionInterceptor$1.doInTransaction(SpringTransactionInterceptor.java:47)
   at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:130)
   at org.activiti.spring.SpringTransactionInterceptor.execute(SpringTransactionInterceptor.java:45)
   at org.activiti.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:31)
   at org.activiti.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:40)
   at org.activiti.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:35)
   at org.activiti.engine.impl.jobexecutor.AcquireJobsRunnable.run(AcquireJobsRunnable.java:59)
   at java.lang.Thread.run(Thread.java:662)
Caused by: org.postgresql.util.PSQLException: BŁĄD: wykryto zakleszczenie
  Szczegóły: Proces 17382 oczekuje na ShareLock na transakcja 1398139; zablokowany przez 21060.
Proces 21060 oczekuje na ShareLock na transakcja 1398140; zablokowany przez 17382.
  Wskazówka: Przejrzyj dziennik serwera by znaleźć szczegóły zapytania.
   at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2103)
   at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1836)
   at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
   at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:512)
   at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
   at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:381)
   at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.postgresql.ds.jdbc23.AbstractJdbc23PooledConnection$StatementHandler.invoke(AbstractJdbc23PooledConnection.java:455)
   at com.sun.proxy.$Proxy57.execute(Unknown Source)
   at org.jboss.jca.adapters.jdbc.CachedPreparedStatement.execute(CachedPreparedStatement.java:297)
   at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.execute(WrappedPreparedStatement.java:404)
   at org.apache.ibatis.executor.statement.PreparedStatementHandler.update(PreparedStatementHandler.java:41)
   at org.apache.ibatis.executor.statement.RoutingStatementHandler.update(RoutingStatementHandler.java:66)
   at org.apache.ibatis.executor.SimpleExecutor.doUpdate(SimpleExecutor.java:45)
   at org.apache.ibatis.executor.BaseExecutor.update(BaseExecutor.java:108)
   at org.apache.ibatis.executor.CachingExecutor.update(CachingExecutor.java:75)
   at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:145)
   … 13 more


From PostgreSQL log I can see statements  that caused this deadlock:

        Proces 17382: update ACT_RU_JOB
             SET REV_ =  $1,
               LOCK_EXP_TIME_ = $2,
               LOCK_OWNER_ = $3,
               RETRIES_ = $4,
               EXCEPTION_STACK_ID_ = $5,
               EXCEPTION_MSG_ = $6,
               DUEDATE_ = $7
            where ID_= $8
              and REV_ = $9
        Proces 21060: update ACT_RU_JOB
             SET REV_ =  $1,
               LOCK_EXP_TIME_ = $2,
               LOCK_OWNER_ = $3,
               RETRIES_ = $4,
               EXCEPTION_STACK_ID_ = $5,
               EXCEPTION_MSG_ = $6,
               DUEDATE_ = $7
            where ID_= $8
              and REV_ = $9

(though I cant see the history of these transactions)

I suspect that there is a flaw in AcquireJobsCmd class. It selects a set of jobs ready to run and then updates one by one these jobs. But the select doesnt specify "order by" so the order of rows and the order of updates is not guarantied, so i.e. if we have 2 jobs: A and B, then first instance of activiti may get rows in order A,B and the second instance B,A. First instance updates (locks) row A, second instance updates row B. Then first instance tries to update row A and the second instance tries to update row B. And there is a deadlock.

Is this a bug or I am missing something?

1 REPLY 1

jbarrez
Star Contributor
Star Contributor
Thanks for posting this. We had a team call on it and found out that you indeed found a bug in the job executor.

The corresponding Jira issue is http://jira.codehaus.org/browse/ACT-1879#comment-337677
It's fixed at https://github.com/Activiti/Activiti/commit/1ea692de8bd2626fd2836ed31d160a036c3f4aa8

This will be part of the next release. However, you can apply the same simple fix as in the git commit (set the maxJobsPerAcquisition to one).

Thank you very much!
Getting started

Tags


Find what you came for

We want to make your experience in Hyland Connect as valuable as possible, so we put together some helpful links.