cancel
Showing results for 
Search instead for 
Did you mean: 

Activiti SQL query performance

shailendra1
Champ in-the-making
Champ in-the-making
During one of our performance tuning exercise, we observed that while querying for all variables using

Activity was using following query defined in the xml

${limitBefore}
select distinct RES.*,
VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_,
VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_,
VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_,
VAR.TEXT_ as VAR_TEXT_, VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
${limitBetween}

${orderBy}
${limitAfter}

The actual query fired was

select distinct RES.*,
VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_,
VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_,
VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_,
VAR.TEXT_ as VAR_TEXT_, VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
from ACT_RU_TASK RES
left outer join ACT_RU_VARIABLE VAR ON RES.PROC_INST_ID_ = VAR.EXECUTION_ID_ and VAR.TASK_ID_ is null;

Having a look at the query it seems that the distinct part is unnecessary as using distinct in left outer join with select including values also from table OTHER THAN Base table does not make sense in my opinion. The rows of resultset of this query without distinct would be distinct anyway.
Because of distinct, the database has to work extra. For e.g., using explain plan on this query for MySQL, it shows that this uses a temporary table to copy first all the results to temp area and then process distinct which is un necessary extra work. If the act_ru_variable table size is large, then this involves huge processing.

Removing the distinct returns the same number of rows only this time the processing done by database is not required.
Of course if the query contains only columns from base tables then using distinct makes sense.

Have I overlooked something here or not aware of the reason why distinct is being used ?
3 REPLIES 3

shailendra1
Champ in-the-making
Champ in-the-making
I would appreciate if somebody can help me in this !

frederikherema1
Star Contributor
Star Contributor
Thanks for reporting this. We are looking into this and have made adjustments to the code. The reason this was introduced was an issue in mssql and row-counts for this (https://github.com/Activiti/Activiti/commit/58993f3e22811ea2c00504697174e8d6effc84aa).

In some circumstances, the distinct was really needed. But after some further investigation and testing (https://github.com/Activiti/Activiti/commit/efdcdf99549456dd9b982b3824940b747d96d4ed - https://github.com/Activiti/Activiti/commit/41108dbb5468562ba942e656694ff173f879e10b) we are pretty sure we can remove the distinct, as the mssql issue is adressed by a db-specific SQL-statement instead (selectTaskWithVariablesByQueryCriteria_mssql_or_db2).

This will be part of the next release (5.15).

shailendra1
Champ in-the-making
Champ in-the-making
Thanks Fred,
Glad to contribute !