Hyland Connect

mamue · ‎06-19-2012

Hi,

we provide a list of open tasks to our users.
The same user can display this list in different web sessions.
He or she may work on the same task from different sessions at nearly the same time.
This results in multiple calls to taskService.complete(taskId) with the same taskId.
I expect the second and all further calls of taskService.complete(taskId) to fail, but they do not, at least while the first call is still being executed.

How can I prevent multiple executions of the same task?

Kind regards,
Markus Mueller

frederikherema1 · ‎06-22-2012

If a flush is needed before the delegate is called, maybe using activiti:async="true" can help. This way, at least, one of the 2 competing consumer-threads will get the OptimisticLockException (also in current codebase without issue resolved) and fail before the time-consuming operation starts. Also, any NEW threads completing the task, will get a "task not found"…

mamue · ‎06-25-2012

Frederik wrote:
> … maybe using activiti:async="true" can help …

Ok, this solves the problem that the failure is time-consuming.
If two users want to complete the same task at the same time, one of them will succeed, the other will fail immediately.

But now there is another problem:
The following asynchronous tasks are executed by the Activiti Job Executor using another thread from a thread pool.
This thread knows
- neither our SessionContext containing the ID of the currently processed tenant
- nor the UI session data of our GUI framework
- nor any data based on Thread Local Storage.

Concerning our concrete use case (production of an ID card on client side) the attribute activiti:async="true" does not help, sorry.

Kind regards,
MaMü

frederikherema1 · ‎06-26-2012

All variables relevant to the process can be stored as process variables. Everything else that needs to be referenced (services, temporal stuff, …) can be resolved through your own lookup mechanism. Alternatively, following JPAVariableType's example, you can create process-variable types that do, e.g., a lookup in your context for a certain value instead of just storing a string or an int.

But it seems that your process doesn't have an async nature: you make the call and expect that the process has finished when the thread returns. Then the only option is to create the mutex mechanism to prevent 2 tasks from starting. But if this is a DB-mutex, again this can create the same issues (unless a full-table lock is done)…

mamue · ‎06-26-2012

Hi Frederik,

you wrote:
> But it seems that your process doesn't have an async nature …

Ok, I think so too.

> … the only option is to create the mutex mechanism to prevent 2 tasks from starting.
> But if this is a DB-mutex, again this can create the same issues (unless a full-table lock is done)…

Look at the following figure:
[attachment=0]lock.png[/attachment]

Locking the task completion has to take place in its own, isolated database transaction before taskService.complete() is called.
As soon as taskService.complete() returns (or throws an exception) the task must be unlocked. This, too, has to take place in its own isolated transaction.

We want to lock because the service task may contain non-transactional operations (sending emails, producing cards, …) which can't be rolled back and which must be prevented from being executed several times.
Locking must be done via database (instead of java.util.HashMap) because there might be several JVMs involved in case of load balancing etc.
The locking transactions must be separate and isolated because we want to fail fast (i.e. before the first complete call has finished executing).

Locking/unlocking can be achieved by inserting/deleting a new record in an additional database table with the taskId as key.
This way only one of the simultaneously working users succeeds in completing the task, all other calling users will fail immediately.
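
The insert/delete locking described above could be sketched roughly as follows. This is only an illustration under assumptions: the table TASK_LOCK, its column TASK_ID_ and the class TaskRowLock are hypothetical names, TASK_ID_ is assumed to be the primary key, and the DataSource is assumed to hand out connections that are not enlisted in the transaction of taskService.complete(). The sketch relies on the primary key rejecting a second insert instead of doing a select first.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

// Sketch only: TASK_LOCK and TASK_ID_ are hypothetical names, and TASK_ID_
// is assumed to be the primary key so that a second insert is rejected.
final class TaskRowLock {
    private final DataSource dataSource;

    TaskRowLock(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // SQLState class "23" is the standard class for integrity constraint violations.
    static boolean isDuplicateKey(SQLException e) {
        String state = e.getSQLState();
        return state != null && state.startsWith("23");
    }

    // Inserts the lock row on a separate auto-commit connection.
    // Returns false if another caller already holds the lock for this task.
    boolean tryLock(String taskId) throws SQLException {
        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(true); // the single statement is its own transaction
            try (PreparedStatement ps =
                     con.prepareStatement("INSERT INTO TASK_LOCK (TASK_ID_) VALUES (?)")) {
                ps.setString(1, taskId);
                ps.executeUpdate();
                return true;
            } catch (SQLException e) {
                if (isDuplicateKey(e)) {
                    return false;
                }
                throw e; // any other failure is a real error
            }
        }
    }

    // Deletes the lock row, again on a separate auto-commit connection.
    void unlock(String taskId) throws SQLException {
        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(true);
            try (PreparedStatement ps =
                     con.prepareStatement("DELETE FROM TASK_LOCK WHERE TASK_ID_ = ?")) {
                ps.setString(1, taskId);
                ps.executeUpdate();
            }
        }
    }
}
```

A caller would invoke tryLock(taskId) first, call taskService.complete(taskId) only if it returned true, and call unlock(taskId) in a finally block.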

I assume that this approach will reliably work also across process boundaries and hosts - provided that the system does not crash.

If a machine crashes and the unlocking is skipped, a locked taskId will remain in the additional database table.
Years ago, in our previous product we implemented a database based mutex with an additional thread which did a "heartbeat" on every active locking record:
Every few seconds the locking record was updated with a current timestamp and an individual ID by the thread which held the lock.
A competing locking thread in another JVM was allowed to replace the record by its own ID and current timestamp if and only if it detected a foreign old timestamp (e.g. older than one minute).
In this situation the first thread probably has died unexpectedly.
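
The takeover rule described above can be written down as a small predicate. This is a minimal sketch, not the implementation of the previous product: LockRecord, TakeoverRule and their fields are hypothetical names, and the one-minute limit is passed in as a parameter.

```java
import java.time.Duration;
import java.time.Instant;

// Sketch only: all names here are hypothetical.
final class LockRecord {
    final String ownerId;   // individual ID of the thread that holds the lock
    final Instant lastBeat; // time of the holder's last heartbeat

    LockRecord(String ownerId, Instant lastBeat) {
        this.ownerId = ownerId;
        this.lastBeat = lastBeat;
    }
}

final class TakeoverRule {
    // A competitor may replace the record only if it belongs to someone else
    // and its last heartbeat is older than the allowed latency.
    static boolean mayTakeOver(LockRecord record, String myId, Instant now, Duration latency) {
        boolean foreign = !record.ownerId.equals(myId);
        boolean stale = Duration.between(record.lastBeat, now).compareTo(latency) > 0;
        return foreign && stale;
    }
}
```

The check on its own is not atomic. The actual replacement of the record would presumably have to be a conditional update on the old ID and timestamp, so that two competitors cannot both take over the same stale lock.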
Some more years ago we used a simpler database based locking mechanism which needed user interaction after the system had crashed.

Probably the simple solution which needs user interaction after system crash is not state of the art.
So we must implement the heartbeat based mutex in our current product.

On the other hand the requirement to lock taskIds is not specific to our product.
I assume that most users of Activiti will place some non-transactional operations in their processes following user tasks.
Therefore it would be a good idea to implement this feature inside Activiti, wouldn't it?

Kind regards,
Markus Müller

frederikherema1 · ‎06-26-2012

I see it more as a "special type of activity" or perhaps execution-listener that can be plugged on a service-task (listeners at start, end) that do this locking and unlocking rather than something inside the engine itself, but I can be mistaken. In my opinion this is outside the scope of the engine itself.

Nevertheless, it's a valid additional feature for activiti and could be more like a mechanism "on top".

mamue · ‎06-26-2012

Hi Frederik,

I'm afraid that if we use execution listeners, locking will take place too often, as shown in the following figure:
[attachment=0]lock2.png[/attachment]
Actually we need only the blue lock/unlock, not the multiple green locks/unlocks.
What happens if a competing call to taskService.complete() comes just after the unlock of task A but before the lock of task B?
Where does the taskId in the start and end listeners come from?

Another question is what happens if exceptions are thrown.
Are end listeners called in case of exceptions which are thrown while the service tasks are executed?

What about an additional attribute activiti:threadsafe=none/local/global which can be added to a <process> element?

- none: no locking
- local: locking using java.util.Map or something similar
- global: database based locking

"none" may be used by single user applications.
"local" may be used by web applications, multiuser- and multithreaded-applications if there is at most a single JVM active.
"global" must be used if the application runs on several machines in parallel.

This way the time consuming database based locking can be avoided if it is not needed.
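
For the proposed "local" mode, a lock that only has to work inside a single JVM can be very small. The following is a sketch under that assumption; LocalTaskLock is a hypothetical name, and it uses a concurrent set instead of a plain java.util.Map so that the check and the insert happen in one atomic step.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: LocalTaskLock is a hypothetical name. Valid for one JVM only.
final class LocalTaskLock {
    private final Set<String> lockedTaskIds = ConcurrentHashMap.newKeySet();

    // Returns true if the task was free and is now locked by the caller,
    // false if another thread is already completing it. Never waits.
    boolean tryLock(String taskId) {
        return lockedTaskIds.add(taskId);
    }

    // Releases the lock so that a later call may lock the task again.
    void unlock(String taskId) {
        lockedTaskIds.remove(taskId);
    }
}
```

As with the database variant, the caller would complete the task only after a successful tryLock and release the lock in a finally block.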

Kind regards,
Markus Müller

frederikherema1 · ‎06-27-2012

Good point about the exceptions… listeners aren't notified about this.

I'm not the only one to decide about this feature but it seems that this is very useful when dealing with non-transactional resources. Indeed, process-level declaration and defaulting to "none" would be the best approach.

It seems you have a clear idea on what needs to be implemented and know a thing or two about the Activiti internals. It would certainly help if you could contribute a patch, if you decide to implement it.

heymjo · ‎06-27-2012

I'm a bit confused here as it seems several problems are being described, but with regards to the task.complete() locking i tend to agree that this should be handled outside of the engine in your application layer. Whatever Activiti would implement to handle this i doubt it would suffice for all possible cases and the added complexity might just not be worth it there.

From what i understand about the problem, would it not be easier to just wrap your task.complete() with a hazelcast-lock on the task id ? http://www.hazelcast.com/documentation.jsp#Lock ?

mamue · ‎06-27-2012

hazelcast.com - very cool library I didn't know until now. Good hint, thank you very much, Jorg.

We could use this library to encapsulate the taskService.complete call:


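// One cluster-wide lock per task, identified by a name derived from the taskId
Lock lock = Hazelcast.getLock("taskId" + taskId);
lock.lock(); // waits until no other caller holds the lock for this task
try {
    taskService.complete(taskId);
} finally {
    lock.unlock(); // always release, even if complete() throws
}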
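
Note that lock.lock() waits for the first caller to finish instead of failing immediately. If failing fast is wanted, a variant based on tryLock() may fit better. The sketch below is written against the standard java.util.concurrent.locks.Lock interface only, on the assumption (suggested by the assignment in the snippet above) that the object returned by Hazelcast.getLock implements it. FailFastLock and runIfFree are hypothetical names.

```java
import java.util.concurrent.locks.Lock;

// Sketch only: FailFastLock and runIfFree are hypothetical names.
final class FailFastLock {
    // Runs the action only if the lock is free right now; never waits.
    // Returns false if another caller currently holds the lock.
    static boolean runIfFree(Lock lock, Runnable action) {
        if (!lock.tryLock()) {
            return false; // someone else is already completing this task
        }
        try {
            action.run();
            return true;
        } finally {
            lock.unlock(); // release even if the action throws
        }
    }
}
```

The action passed in would be the call to taskService.complete(taskId); a return value of false can then be reported to the user at once.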

As far as I know, Hazelcast by default uses IP multicast to find other servers.
This raises additional problems:
- What if there are other servers in the LAN which are working on other databases or even other applications?
- What if there are firewalls active?
- What if there is no IP at all?

It is possible to configure concrete TCP/IP details of all cooperating servers.
But this will be extra work and perhaps tricky for an administrator who does not know the internals.

We already have a common base - the database.
Neither our application nor Activiti runs without a common database.
It is obvious to an administrator that she must configure firewalls etc. so that each server can reach the single database via the JDBC URL etc.

A transparent way to do the locking is therefore to use this single database.
The only drawback is that we do not know the concrete type of RDBMS, so the locking mechanism used must be a very basic one (e.g. select+insert+commit and delete+commit).
This way the whole TCP/IP configuration stuff may be avoided.
The remaining configuration options probably will be the time span after which a server is considered dead after its last heartbeat, and the time span between the heartbeats.
I expect that our old default values will also work very well in this situation: The heart of an active lock beats every six seconds and 60 seconds after the last heartbeat a server may be considered as dead by competing servers.
With the additional <process> attribute activiti:threadsafe=none/local/global the overhead of the heartbeats is restricted to distributed deployments. And in any case, as long as no mutex is active, there are no heartbeats either.
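
The periodic heartbeat itself could be driven by a scheduled executor, as in the following sketch. Heartbeat is a hypothetical name, and the Runnable stands for whatever refreshes the timestamp of the lock record; the interval is a parameter, so the six seconds mentioned above are just one possible value.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Sketch only: Heartbeat is a hypothetical name.
final class Heartbeat implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final ScheduledFuture<?> beats;

    // Starts beating immediately and then once per interval.
    // If refreshTimestamp throws, scheduleAtFixedRate stops all further beats,
    // so the runnable should handle its own errors.
    Heartbeat(Runnable refreshTimestamp, long interval, TimeUnit unit) {
        this.beats = scheduler.scheduleAtFixedRate(refreshTimestamp, 0, interval, unit);
    }

    // Stops the heartbeat; meant to be called when the mutex is released.
    @Override
    public void close() {
        beats.cancel(false);
        scheduler.shutdownNow();
    }
}
```

Because the heartbeat only exists while a Heartbeat object is open, no background work takes place as long as no mutex is held.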

The following screenshot shows how it may look in the database:
[attachment=0]mutex.png[/attachment]
The table ACT_RU_MUTEX may be used by Activiti for any mutex requirements.
Here the NAME_ is composed of a "class name" and the taskId itself.
The TIME_STAMP_ holds the time of the last heartbeat. The first record seems to belong to a dead server and may be evicted.
The GUID_ is a random value which is used by the servers to distinguish their own records from foreign records. Each server generates the GUID_ itself.
The BEAT_ stores the configured number of seconds between heartbeats.
The LATENCY_ stores the configured time after which a heartbeat may be considered the last one of a dead server.

The Mutex implementation may be done apart from any concrete use case so it may be used throughout the whole project.

Kind regards,
Markus Müller

heymjo · ‎06-28-2012

> A transparent way to do the locking is therefore to use this single database.

I understand your reasoning but i don't agree with it. For me simplicity goes a very long way, and the 5 lines of java to obtain a solid global lock - even across different applications on different clusters - outweigh any database based mechanism which will always be local to one application.

But I guess it also depends on the technology that is trusted the most in your organisation, that's definitely a factor as well. Which is exactly why activiti should not embed this in the core.

How can I prevent a process execution is run multiple times?