cancel
Showing results for 
Search instead for 
Did you mean: 

Optimizing ACT_RU_JOB Table

anandagarwaal
Champ in-the-making
Champ in-the-making
Hi Team,

Kudos on making such a great workflow framework.

We're using Activiti version 5.15.1 with timers in our workflow which sends reminders to users who are in the workflow. Problem is that we're creating jobs in ACT_RU_JOB table and even though there are only 45k odd rows in ACT_RU_JOB table, we're constantly getting slow query alert from the queries running on ACT_RU_JOB table.

A sample slow query alert from database looks like this:


### 253 Queries
### Taking 1.001277 to 1.567178 seconds to complete

SET timestamp=XXX;
select RES.*
    from ACT_RU_JOB RES
    LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = RES.PROCESS_INSTANCE_ID_
    where (RES.TYPE_ = 'XXX')
      and (RES.DUEDATE_ is not null)
      and (RES.DUEDATE_ < 'XXX')
      and (RES.LOCK_OWNER_ is null or RES.LOCK_EXP_TIME_ < 'XXX')
      and (RES.RETRIES_  > XXX)
      and (
        (RES.EXECUTION_ID_ is null)
        or
        (PI.SUSPENSION_STATE_ = XXX)   
      )
    order by DUEDATE_;


### 225 Queries
### Taking 1.000308 to 1.255745 seconds to complete

SET timestamp=XXX;
select RES.* 
    from ACT_RU_JOB RES
      LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = RES.PROCESS_INSTANCE_ID_
    where (RETRIES_ > XXX)
      and (DUEDATE_ is null or DUEDATE_ <= 'XXX')
      and (LOCK_OWNER_ is null or LOCK_EXP_TIME_ <= 'XXX')
      and (
          (RES.EXECUTION_ID_ is null)
        or
        (PI.SUSPENSION_STATE_ = XXX)    
      ) 
    LIMIT XXX OFFSET XXX;

The actual query looks something like this:


select RES.*
    from ACT_RU_JOB RES
    LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = RES.PROCESS_INSTANCE_ID_
    where (RES.TYPE_ = 'timer')
      and (RES.DUEDATE_ is not null)
      and (RES.DUEDATE_ < '2015-07-15 10:42:47.464')
      and (RES.LOCK_OWNER_ is null or RES.LOCK_EXP_TIME_ < '2015-07-15 10:42:47.464')
      and (RES.RETRIES_  > 0)
      and (
        (RES.EXECUTION_ID_ is null)
        or
        (PI.SUSPENSION_STATE_ = 1)   
      )
    order by DUEDATE_;

select RES.*
from ACT_RU_JOB RES
  LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = RES.PROCESS_INSTANCE_ID_
where (RETRIES_ > 0)
  and (DUEDATE_ is null or DUEDATE_ <= '2015-04-17 14:59:09.982')
  and (LOCK_OWNER_ is null or LOCK_EXP_TIME_ <= '2015-04-17 14:59:09.982')
  and (
      (RES.EXECUTION_ID_ is null)
    or
    (PI.SUSPENSION_STATE_ = 1)
  )
LIMIT 1 OFFSET 0;

select RES.*
    from ACT_RU_JOB RES
    LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = RES.PROCESS_INSTANCE_ID_
    where (RES.TYPE_ = 'timer')
      and (RES.DUEDATE_ is not null)
      and (RES.DUEDATE_ < '2015-04-20 09:12:13.551')
      and (RES.LOCK_OWNER_ is null or RES.LOCK_EXP_TIME_ < '2015-04-20 09:12:13.551')
      and (RES.RETRIES_  > 0)
      and (
        (RES.EXECUTION_ID_ is null)
        or
        (PI.SUSPENSION_STATE_ = 1)
      )
    order by DUEDATE_;


We've added two composite indices on ACT_RU_JOB (TYPE_, EXECUTION_ID_, LOCK_OWNER_, RETRIES_, LOCK_EXP_TIME_, DUEDATE_) and (RETRIES_,EXECUTION_ID_, LOCK_EXP_TIME_,DUEDATE_) but that doesn't seem to help our cause. Any pointers on how to resolve this would be much appreciated.

12 REPLIES 12

vasile_dirla
Star Contributor
Star Contributor
Hi,
I think that limiting the number of elements involved in join operation will improve the performance of the query.

Please run this query in your environment and post here the new execution time you got.

<code>
select RES.*
from (select * from ACT_RU_JOB t1
where (t1.TYPE_ = 'timer')
      and (t1.DUEDATE_ is not null)
      and (t1.DUEDATE_ < '2015-07-15 10:42:47.464')
      and (t1.LOCK_OWNER_ is null or t1.LOCK_EXP_TIME_ < '2015-07-15 10:42:47.464')
      and (t1.RETRIES_  > 0)) as RES

  LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = RES.PROCESS_INSTANCE_ID_
where (
        (RES.EXECUTION_ID_ is null)
        or
        (PI.SUSPENSION_STATE_ = 1)
      )
order by DUEDATE_;
</code>

and also will be nice to have a statistic about the execution time for this query:
<code>
select * from ACT_RU_JOB t1
where (t1.TYPE_ = 'timer')
      and (t1.DUEDATE_ is not null)
      and (t1.DUEDATE_ < '2015-07-15 10:42:47.464')
      and (t1.LOCK_OWNER_ is null or t1.LOCK_EXP_TIME_ < '2015-07-15 10:42:47.464')
      and (t1.RETRIES_  > 0)
</code>

anandagarwaal
Champ in-the-making
Champ in-the-making
Hi Vasile,
Thanks for the response. Please find below response times, but is it possible to change these queries in Activiti (5.15.1) ? I thought that only way to optimize these queries is to add indices.
The below query took 1.92s
<code>
select sql_no_cache RES.*
from (select * from ACT_RU_JOB t1
where (t1.TYPE_ = 'timer')
      and (t1.DUEDATE_ is not null)
      and (t1.DUEDATE_ < '2015-07-15 10:42:47.464')
      and (t1.LOCK_OWNER_ is null or t1.LOCK_EXP_TIME_ < '2015-07-15 10:42:47.464')
      and (t1.RETRIES_  > 0)) as RES

  LEFT OUTER JOIN ACT_RU_EXECUTION PI ON PI.ID_ = RES.PROCESS_INSTANCE_ID_
where (
        (RES.EXECUTION_ID_ is null)
        or
        (PI.SUSPENSION_STATE_ = 1)
      )
order by DUEDATE_;
</code>

And this query took 1.58s

<code>
select * from ACT_RU_JOB t1
where (t1.TYPE_ = 'timer')
      and (t1.DUEDATE_ is not null)
      and (t1.DUEDATE_ < '2015-07-15 10:42:47.464')
      and (t1.LOCK_OWNER_ is null or t1.LOCK_EXP_TIME_ < '2015-07-15 10:42:47.464')
      and (t1.RETRIES_  > 0)
</code>

vasile_dirla
Star Contributor
Star Contributor
since it's open source you could modify it as you wish.
have a look into Job.xml (selectNextTimerJobsToExecute)
you could change this query if you think you could construct a more performant one (but you know.. it needs some testing to be sure will give you the correct results) Smiley Happy

if you run the original query how log time will take (is there any improvement if the where clause is done before JOIN?)
it seems that the most time is spent during the job table query.
how many records you have in the job table ?

anandagarwaal
Champ in-the-making
Champ in-the-making
The original query execution time ranges from 200ms to 1.95s during multiple runs. Currently we've 49,539 rows in ACT_RU_JOB.

Is it something to do with locks on table etc?

vasile_dirla
Star Contributor
Star Contributor
What happens if you remove the second index you added on Job table and just disable caching ?
<code>query_cache_size = 0</code>
If this table is updated too often then the cache is invalidated for each insert. (you could try to disable it and check the results)
have a look also here:
http://dev.mysql.com/doc/refman/5.6/en/optimizing-innodb-queries.html

anandagarwaal
Champ in-the-making
Champ in-the-making
Everyday we've around 2000-3000 inserts in this table. We've to ask DBA to disable cache on the table and we will able to do it on Monday. Thanks for the help Vasile.

anandagarwaal
Champ in-the-making
Champ in-the-making
We tried disabling the cache and dropping the second index (RETRIES_,EXECUTION_ID_, LOCK_EXP_TIME_,DUEDATE_) but it doesn't seem to have any effect.

anandagarwaal
Champ in-the-making
Champ in-the-making
Any more ideas/tips on how to resolve this?

jbarrez
Star Contributor
Star Contributor
Not really. It seems it's doing a table scan … which probably could be solved by more indices .. but this is trial and error and database specific.