Hyland Connect

lichtin · ‎08-05-2011

Hi
I'm thinking about how workflows can be executed in a distributed system where multiple engines will be running.
A message that starts or resumes a workflow process can arrive anywhere in the system.

So for example when a message M2 arrives, the nearest engine E2 cannot just retrieve the workflow process state
from its DB tables and continue, as another engine E1 might currently be executing the flow that was started earlier
by message M1. Some coordination seems necessary.

One approach could be to transfer the state. As soon as engine E1 determines a wait state, the updated process state
would be transferred to the DB tables of engine E2 and E2 signalled to handle message M2. But then messages would
be serialized and could not serve as interrupting events (if such a thing is actually supported).

Alternatively, each workflow process instance could be assigned an engine and messages would always be rerouted
to this engine.

Anyway, just started thinking about it.

What I'm interested in is, has someone had to handle a similar situation?

Martin

ronald_van_kuij · ‎08-06-2011

If there are two concurrent paths that have no relation at all, not even writing variables, there is no reason at all that two parts of the same process instances could not be executed by different engines… Simply because the processinstances is not one big serialized object, each step can be separately executed. Great isn't it?

If another engine is doing something with the same instance that 'collides'(!!!), you will get an optimistic lock exception and should retry. Often a few hundred miliseconds later it will succeed.

Now what is realy cool is that , in 5.7 a retry interceptor will be introduced that does this for you…

lichtin · ‎08-08-2011

Great indeed!
So do I understand you correctly that more than one engine can work with the same DB tables?

Simply because the processinstances is not one big serialized object, each step can be separately executed.

Can you explain "each step"? [Excuse my ignorance, I'm really very new to Activiti..]

If another engine is doing something with the same instance that 'collides'(!!!), you will get an optimistic lock exception and should retry.

Interesting, so when will the engine notice the collision? At the time it wants to write back the process state?
What is the granularity of the optimistic lock? Is perhaps the engine only updating the modified variables?

Thank you for all the insight. And please feel free to point me to some code if necessary 🙂

ronald_van_kuij · ‎08-08-2011

1: yes, same db tables (you SHOULD in fact)
2: step = task, sorry for the wrong choice of words
3: yes, when writing to the db. Records have a version number that changes if something else updates afte you read it. So you have version 3 and something else updates, that will be version 4. This can be detected since you have still version 3.

lichtin · ‎08-10-2011

3: yes, when writing to the db. Records have a version number that changes if something else updates afte you read it. So you have version 3 and something else updates, that will be version 4. This can be detected since you have still version 3.

Sounds good.
So is the granularity of this versioning on the overall process state (coarse)? or is it more fine-grained?
(For example, if the only effect of an engine running a process instance is the change of one workflow variable,
will another engine, running the same process instance concurrently, fail to later update the process state even
though it did not update that variable?)

processinstances is not one big serialized object

Would be interested how the process state is saved..

Another issue I'm concerned with when engines are simultaneously executing the same processes in different JVMs
is that of how the "timers" work. Assuming jobExecutorActivate=true for all engines, won't this fire the timers
in every JVM? Guess I need to understand how this job executor really works.

ronald_van_kuij · ‎08-19-2011

3: yes, when writing to the db. Records have a version number that changes if something else updates afte you read it. So you have version 3 and something else updates, that will be version 4. This can be detected since you have still version 3.
Sounds good.
So is the granularity of this versioning on the overall process state (coarse)?

No

or is it more fine-grained?

Yes

(For example, if the only effect of an engine running a process instance is the change of one workflow variable,
will another engine, running the same process instance concurrently, fail to later update the process state even
though it did not update that variable?)

If A reads, B reads, A updates, B updates it fails
If A reads, A updates, B reads, B updates it succeeds

processinstances is not one big serialized object
Would be interested how the process state is saved..

Look at the DB

Another issue I'm concerned with when engines are simultaneously executing the same processes in different JVMs
is that of how the "timers" work. Assuming jobExecutorActivate=true for all engines, won't this fire the timers
in every JVM?

No, since timers are read and locked before executing (look at the database table for jobs)

Guess I need to understand how this job executor really works.

It's not that difficult, but I have some optimizations in the pipeline that make it perform waaaaayyyy faster. Maybe for 5.9 (or 5.8 if the releases will be every two months from now on)

Hyland Connect

Distributed Workflow Execution