Lots of timer tasks, slow lock acquisition times

06-23-2014 05:29 PM
I'd like to know if this approach is feasible, and I'll try to keep this first post simple.
Requirements call for recurring processes that wait and trigger at configurable intervals (weekly, monthly, yearly, etc.). 10M+ active processes exist, with 50-100k becoming candidates on any given day, each having the same ISO timestamp (timestamps in the legacy system currently have day precision).
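To make the shape concrete: each process is essentially a service task that computes the next due date plus a timer event that waits for it. A rough sketch of the date calculation (not the exact test code; class and variable names are made up, and a weekly interval is assumed):

    import java.text.SimpleDateFormat;
    import java.util.Calendar;
    import org.activiti.engine.delegate.DelegateExecution;
    import org.activiti.engine.delegate.JavaDelegate;

    // Computes the next day-precision trigger date and stores it in the
    // "nextDate" process variable, which a timer event can reference as
    // ${nextDate}.
    public class CalculateNextDateDelegate implements JavaDelegate {
        public void execute(DelegateExecution execution) {
            Calendar cal = Calendar.getInstance();
            // Weekly interval assumed here; the real interval is configurable
            // (weekly, monthly, yearly, ...).
            cal.add(Calendar.WEEK_OF_YEAR, 1);
            // Day precision only, so every candidate on a given day ends up
            // with the same ISO timestamp.
            String iso = new SimpleDateFormat("yyyy-MM-dd").format(cal.getTime());
            execution.setVariable("nextDate", iso);
        }
    }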
I had some questions about load, so I wrote a couple of simple tests, which yielded the results below and surprised me. This was done on a quad-core i7 with 16GB RAM and MySQL 5.6, using a simple process with a timer task that calculates the next date, then waits, then loops, and so on. The image quality bites, but it was the first hosting site I found:
http://postimg.org/image/vj748msh5/
    Candidates   Total Records   Lock Acquisition Time
    50k          100k            ~330ms
    50k          1M              ~4s
    50k          10M             ~2m
Candidates are process instances whose trigger time is earlier than the current time. Total records include the active candidates plus process instances with trigger times in the future. I wrapped the AcquireJobsCmd execution with a timer to obtain these numbers.
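The wrapper was something along these lines (a sketch rather than the exact test code, since AcquireJobsCmd and the command executor are internal Activiti 5 APIs and the accessor names vary a bit between 5.x releases):

    import org.activiti.engine.ProcessEngine;
    import org.activiti.engine.impl.ProcessEngineImpl;
    import org.activiti.engine.impl.cmd.AcquireJobsCmd;
    import org.activiti.engine.impl.interceptor.CommandExecutor;
    import org.activiti.engine.impl.jobexecutor.JobExecutor;

    public class AcquisitionTimer {
        // Times one job-acquisition cycle; test-only, since it reaches
        // into internal engine APIs.
        public static long timeAcquisitionMs(ProcessEngine engine) {
            ProcessEngineImpl impl = (ProcessEngineImpl) engine;
            JobExecutor jobExecutor =
                    impl.getProcessEngineConfiguration().getJobExecutor();
            CommandExecutor commandExecutor =
                    impl.getProcessEngineConfiguration().getCommandExecutor();
            long start = System.nanoTime();
            commandExecutor.execute(new AcquireJobsCmd(jobExecutor));
            return (System.nanoTime() - start) / 1000000L;
        }
    }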
I noticed that no indexes exist on the ACT_RU_JOB table. By adding indexes on DUEDATE_ and LOCK_EXP_TIME_, I was able to speed the query up dramatically, getting down to ~14s for a 10M-record table. But this may have other undesirable side effects, and it's still much slower than I would expect for acquiring a lock to execute a process… so this was just an experiment.
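The experiment itself was just DDL applied directly against the schema, roughly like this (the index names are made up, and since Activiti manages its own schema this is strictly an experiment, not something I'd ship):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class AddJobIndexes {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details.
            Connection con = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/activiti", "user", "password");
            try {
                Statement st = con.createStatement();
                // Index the two columns the acquisition query filters on.
                st.execute("CREATE INDEX IDX_JOB_DUEDATE ON ACT_RU_JOB (DUEDATE_)");
                st.execute("CREATE INDEX IDX_JOB_LOCK_EXP ON ACT_RU_JOB (LOCK_EXP_TIME_)");
                st.close();
            } finally {
                con.close();
            }
        }
    }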
In theory, should Activiti be able to handle this better? Or is the approach flawed in the first place? Other options exist, but it's very tempting to have one product do all the queuing and locking work for us 😉, especially since it seems to be integral to the process flow.
Thanks in advance for any input!
3 REPLIES
07-04-2014 05:05 AM
Interesting experiment.
I'm wondering, though: where is the bulk of the time being spent? Is it in the job executor? Is it the execution of the sub-process?
How do you start this process? Is it just one? Do you spawn one for each of your entries?
The numbers do seem high, so I'm sure there is optimization possible here.

07-09-2014 12:45 PM
Hey, thanks for the reply.
The bulk of the time is being spent in the job executor when trying to acquire new jobs, within the queries "selectNextJobsToExecute_mysql" and "selectExclusiveJobsToExecute_mysql". Extracting the actual queries and parameters via debugging and running them in a MySQL client yields very similar results, give or take a second.
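To isolate the database side, a simplified standalone version of the query can be timed directly over JDBC. This is an approximation of what the MyBatis mapping does, not the literal statement (the real one carries more predicates):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Timestamp;

    public class JobQueryTimer {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details.
            Connection con = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/activiti", "user", "password");
            // Simplified stand-in for selectNextJobsToExecute_mysql:
            // due jobs that are not currently locked by another executor.
            PreparedStatement ps = con.prepareStatement(
                    "SELECT ID_ FROM ACT_RU_JOB"
                  + " WHERE (DUEDATE_ IS NULL OR DUEDATE_ <= ?)"
                  + " AND (LOCK_OWNER_ IS NULL OR LOCK_EXP_TIME_ < ?)"
                  + " LIMIT 100");
            Timestamp now = new Timestamp(System.currentTimeMillis());
            ps.setTimestamp(1, now);
            ps.setTimestamp(2, now);
            long start = System.nanoTime();
            ResultSet rs = ps.executeQuery();
            int n = 0;
            while (rs.next()) {
                n++;
            }
            System.out.println(n + " candidates in "
                    + (System.nanoTime() - start) / 1000000L + " ms");
            con.close();
        }
    }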
The included sub-process is a no-op that was added just for testing; it's just a simple log statement. Changing it to a script task and/or removing it yields the same overall result.
For brevity I've simplified the code below, but the processes are started by essentially the following inside a Java class (a standalone main and/or embedded in a webapp):
    // Start numProcesses instances of the test process: the loop index doubles
    // as the business key, and every instance gets the same day-precision date.
    for (int i = 0; i < numProcesses; i++) {
        final Map<String, Object> variableMap = new HashMap<String, Object>();
        variableMap.put("ID", String.valueOf(i));
        variableMap.put("nextDate", isoDateString);
        runtimeService.startProcessInstanceByKey("testProcess", String.valueOf(i), variableMap);
    }
I also noticed that when starting up a new node in a cluster and deploying processes, a "selectJobsByConfiguration" query gets executed, which also results in a very long startup time for the process being deployed in a new JVM.
07-22-2014 02:43 AM
Is there some way you could share your test code for this? I would like to see and interpret the results for myself.
