cancel
Showing results for 
Search instead for 
Did you mean: 

parent_task_id_ in ACT_RU_TASK degrading performance

cduvvuri
Champ in-the-making
Champ in-the-making
We are using activiti 5.9
In our case we do not have scenario of populating parent_task_id_ column in ACT_RU_TASK. After having considerable volume in the activiti tables.

For any new signal/workflow change following query is fired before completing the task.

select * from ACT_RU_TASK where PARENT_TASK_ID_ = ?

As this column is null in all the records, query is taking unusally huge time doing full table scan.

In ACT_RU_TASK we have 1375k records with PARENT_TASK_ID_ as null.

On an avg. query is taking 4s
10 REPLIES 10

cduvvuri
Champ in-the-making
Champ in-the-making
Hi cduvvuri,

I also found same problem while doing performance testing of my application using activiti.
Anyone one have answer about same please post it, it will be quit helpful.

Thanks,
Prakash

martin_grofcik
Confirmed Champ
Confirmed Champ
Hi,

Which DB engine do you use?

(There is similar problem with oracle http://stackoverflow.com/questions/1017239/how-do-null-values-affect-performance-in-a-database-searc...)

Regards
Martin

paksboda
Champ in-the-making
Champ in-the-making
Hi martin.grofcik,

I used to have PARENT_TASK_ID_ null in all rows in ACT_RU_TASK table and I was facing same issue with large data in this table.

I have resolved it by one hack. I have inserted TASK_ID_ which is primary key's value into PARENT_TASK_ID_ with the help of trigger so PARENT_TASK_ID_ column will have unique value now and it will not create problem for oracle while searching.

Anyway we are not using PARENT_TASK_ID_  column anywhere so it will not adversely effect our code.
Even if someone is using PARENT_TASK_ID_  column then they can have check if TASK_ID_ and PARENT_TASK_ID_ are same then it will be root element.

This improves query performance significantly and now it is taking only 400-500 ms for fetching record compare to 4s earlier.
I hope this helps.

Thanks,
Prakash

jbarrez
Star Contributor
Star Contributor
Hmmm, interesting that it would have such an impact on performance. Of course, we are talking about a big number of tasks here.
Im not sure if there is a solution here, except for a workaround if you are not using the parent tasks… The only thing that might work is storing a flag on the task to indicate it has a parent, rather tham doing the query all the times.

paksboda
Champ in-the-making
Champ in-the-making
Hi jbarrez,

Thanks for your answer.

I have implemented below solution.
"I have inserted TASK_ID_ which is primary key's value into PARENT_TASK_ID_ with the help of trigger so PARENT_TASK_ID_ column will have unique value now and it will not create problem for oracle while searching.

Anyway we are not using PARENT_TASK_ID_ column anywhere so it will not adversely effect our code.
Even if someone is using PARENT_TASK_ID_ column then they can have check if TASK_ID_ and PARENT_TASK_ID_ are same then it will be root element."

So far it works fine. Do you see any challenges with this approach in future?

Thanks,
Prakash

jbarrez
Star Contributor
Star Contributor
As long as you are not using the parent task id, I would guess it's fine

smirzai
Champ on-the-rise
Champ on-the-rise
Even in that case, that my be fixed by a hint.
Is it possible to have database specific hints ?

paksboda
Champ in-the-making
Champ in-the-making
Thanks Joram.

solanki
Champ in-the-making
Champ in-the-making
Hi,

We need to use PARENT_TASK_ID_ in our logic to get predecessor of the current task. Currently it is always null.. what can I do to populate the actual PARENT_TASK_ID_ in this table ? And what is the reason behind activiti inserting its value as null, shouldn't it have parent task id by default ?

Also our real requirement is to find the previous user task in a workflow which i can do by querying on history table (sort by start time )…But i have no means to find out if 2 tasks are executing in parallel (as in a subprocess) or in a serial fashion. As in both cases one task will happen before the other. I am hoping by checking that the 2 tasks have same parent task id I can differentiate between parallel/serial tasks