Hyland Connect

romanoff · ‎05-04-2011

Hi,

I'm playing a bit with the PVM APIs and have discovered that



PvmProcessDefinition processDefinition = ProcessDefinitionBuilder …;

ExecutionImpl processInstance = 
     (ExecutionImpl) processDefinition.createProcessInstance();

System.out.println(processInstance.getId());

would always print null.

This is because ExecutionImpl.getId() is defined to always return null, and there is no setter method for assigning a custom id. The comment says that subclasses may override it, but the ProcessDefinitionImpl.newProcessInstance() method is defined like this:

protected InterpretableExecution newProcessInstance() {
  return new ExecutionImpl();
}

This means that the vanilla ExecutionImpl class is always used to create instances.

Is it possible to set the id of the processInstance, or to make the PVM generate it for you? Maybe a setId() method could be added to the API?

jbarrez · ‎05-05-2011

In the PVM API, the ids are indeed null, because the PVM is a non-persistent environment. So ids have little or no use there.
You'll see that the real entities used by the Activiti BPMN 2.0 layer do override the id methods (look for PersistentObject).

romanoff · ‎05-05-2011

Yes, this is true. But I was experimenting with my own custom engine/language on top of the PVM API. I do not need persistence (yet) and want to use the PVM as my in-memory process model. But I want to be able to distinguish between process instances in order to implement some features (e.g. correlating a process instance with an external event or process) and for tracing purposes. This is why I asked this question.

So, to repeat: Is there any cheap way to set the id of a process instance? Would it be a problem for you to allow a setId() method on ExecutionImpl instances, even though the current Activiti code does not use it yet?

jbarrez · ‎05-09-2011

If you are already adding your own language on top of the PVM, why not simply extend the execution classes with your own stuff? That is how we do it for the BPMN 2.0 layer, after all.
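A minimal sketch of this suggestion, using stand-in classes rather than the real Activiti types. SketchExecution and SketchProcessDefinition only mimic the shape of ExecutionImpl and ProcessDefinitionImpl as described in this thread; IdentifiedExecution, IdentifiedProcessDefinition and the UUID-based id are hypothetical. Whether the real ProcessDefinitionBuilder can be made to produce such a custom definition subclass is not confirmed here.

```java
import java.util.UUID;

// Stand-in for the base execution class: it has no id, as described above.
class SketchExecution {
    public String getId() {
        return null;
    }
}

// Stand-in for the process definition with an overridable factory method.
class SketchProcessDefinition {
    protected SketchExecution newProcessInstance() {
        return new SketchExecution();
    }

    public SketchExecution createProcessInstance() {
        return newProcessInstance();
    }
}

// Hypothetical custom execution that carries an id fixed at construction.
class IdentifiedExecution extends SketchExecution {
    private final String id;

    IdentifiedExecution(String id) {
        this.id = id;
    }

    @Override
    public String getId() {
        return id;
    }
}

// Hypothetical custom definition whose factory hands out identified executions.
class IdentifiedProcessDefinition extends SketchProcessDefinition {
    @Override
    protected SketchExecution newProcessInstance() {
        return new IdentifiedExecution(UUID.randomUUID().toString());
    }
}

class IdentifiedExecutionDemo {
    public static void main(String[] args) {
        SketchExecution plain = new SketchProcessDefinition().createProcessInstance();
        SketchExecution identified = new IdentifiedProcessDefinition().createProcessInstance();
        System.out.println(plain.getId());      // prints null
        System.out.println(identified.getId()); // prints a random UUID string
    }
}
```

The id is assigned in the factory method, so no setter is needed on the execution class itself.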

romanoff · ‎05-10-2011

First of all, it is not yet decided whether we will base it on the PVM or on the Activiti BPMN layer. BPMN may even be a nicer basis. But this layer introduces too many dependencies, which are perfect for human-oriented workflows (e.g. history services, the way persistence is handled, etc.) but are not so nice for soft real-time and more machine-to-machine oriented scenarios (see the last messages of my other thread for more details: http://forums.activiti.org/en/viewtopic.php?f=6&t=1523&start=10 BTW, there has been no reply to the latest messages yet).

Therefore, the PVM is our current basis for experiments, as it introduces fewer dependencies.

Of course, we can completely fork the PVM and do whatever we want. But we'd prefer to use the trunk version of it (and avoid any changes to it) and only define our own behaviours, persistence and other mechanisms in the usual way, by implementing the corresponding PVM interfaces and using the PVM fluent builders and the like to inject them. And if we propose changes to the PVM trunk, then only in cases where the change is not specific to our problems and could be of general interest. This minor change could be such an example, IMHO.
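For the "no changes to trunk" case, one workaround is to keep the ids outside the execution objects in a side table. The following is a sketch under that assumption only: InstanceRegistry and the "pi-" id format are hypothetical and not part of Activiti, and plain Object stands in for a process instance.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical helper: assigns ids to arbitrary instance objects without
// requiring a setId() method on them.
class InstanceRegistry<T> {
    private long sequence = 0;
    // Identity-based, so equals()/hashCode() of T do not matter.
    private final Map<T, String> idsByInstance = new IdentityHashMap<T, String>();
    private final Map<String, T> instancesById = new HashMap<String, T>();

    // Registers the instance on first sight and returns its id.
    public synchronized String idOf(T instance) {
        String id = idsByInstance.get(instance);
        if (id == null) {
            id = "pi-" + (++sequence);
            idsByInstance.put(instance, id);
            instancesById.put(id, instance);
        }
        return id;
    }

    // Looks up an instance for correlation; returns null if the id is unknown.
    public synchronized T find(String id) {
        return instancesById.get(id);
    }

    // Must be called when an instance ends, otherwise both maps keep growing.
    public synchronized void release(T instance) {
        String id = idsByInstance.remove(instance);
        if (id != null) {
            instancesById.remove(id);
        }
    }
}

class InstanceRegistryDemo {
    public static void main(String[] args) {
        InstanceRegistry<Object> registry = new InstanceRegistry<Object>();
        Object first = new Object();
        Object second = new Object();
        System.out.println(registry.idOf(first));                   // prints pi-1
        System.out.println(registry.idOf(second));                  // prints pi-2
        System.out.println(registry.find("pi-1") == first);         // prints true
        registry.release(first);
        System.out.println(registry.find("pi-1"));                  // prints null
    }
}
```

The trade-off is that the registry has to be told when an instance ends; an id stored on the execution itself would not need that bookkeeping.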

jbarrez · ‎05-12-2011

Well, it is true that the PVM is used as the basis, but with Activiti our goal has always been BPMN 2.0 (not multiple languages, as we did in the past).
So I would not consider the PVM on trunk to be 'stable', nor would I base my own language on top of it.

Why would BPMN 2.0 not be good enough for machine-to-machine communication? You could always extract a subset of the constructs or even write a simple language on top of it if you really want to …

romanoff · ‎05-12-2011

> Well, it is true that the PVM is used as basis, but with Activiti our goal has always been BPMN 2.0 (not multi languages, as we did in the past).

OK. Thanks for this clear statement.

> Why would BPMN 2.0 be not good enough for machine-machine communication? You could always extract a subset of the constructs or even write a simple language on top of it if you really want to …

See the end of the other thread that I mentioned previously. It provides an insight about requirements of machine-to-machine domain as compared to human-oriented workflows.

BPMN 2.0 as such may be conceptually OK for machine-to-machine, maybe with a small DSL on top that the tooling would convert into pure BPMN.

But Activiti's implementation of BPMN is not quite ready for it yet, IMHO. To repeat what I wrote in the other thread: the way it handles process persistence, the lack of asynchronous continuations (at least at the moment), and the limited (if not absent) support for truly parallel and asynchronous execution of activities make it a poor fit for the machine-to-machine domain. These features (or the lack thereof) and design decisions introduce too much overhead and/or make it not scalable enough for machine-to-machine applications, which are often soft real-time, have very short lifetimes (on the order of milliseconds or seconds), and carry a much higher load than human-oriented workflows. Disabling certain features or plugging in alternative implementations (e.g. for persistence and for concurrency) is not possible without very big modifications to Activiti's BPMN implementation, at least at the moment.

The PVM, on the other hand, does not mandate any of these features and just provides an in-memory model for process representation and basic building blocks for modeling process execution. Persistence, asynchrony, and async continuations are all left out. So it is very low-level, and you can implement whatever you want, though probably with very significant effort. But according to our measurements and tests, pure PVM (no persistence, custom in-memory async continuations) is very competitive performance-wise for machine-to-machine. And this is where we currently are with our experiments …
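To illustrate what an in-memory async continuation means in general, here is a minimal sketch built only on java.util.concurrent. It is not the implementation mentioned above and uses no PVM types; AsyncContinuationSketch, the step names and the string-based "state" are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncContinuationSketch {
    // Runs the first step on the caller's thread, then hands the remaining
    // steps (the "continuations") to the executor and returns immediately.
    static CompletableFuture<String> run(ExecutorService executor, String input) {
        final String afterFirstStep = input + ">validated"; // synchronous part
        return CompletableFuture
                .supplyAsync(() -> afterFirstStep + ">enriched", executor) // continuation 1
                .thenApplyAsync(state -> state + ">done", executor);       // continuation 2
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            // join() blocks only for the demo; a real caller would attach a callback.
            System.out.println(run(executor, "order-1").join());
            // prints order-1>validated>enriched>done
        } finally {
            executor.shutdown();
        }
    }
}
```

Nothing is persisted between steps: if the JVM stops, the in-flight state is lost, which is the trade-off that buys the low overhead.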

In an ideal world, it would be cool if Activiti's BPMN 2.0 implementation allowed for custom-defined ways to persist process state (both when and how to persist, e.g. disk storage, data grid, etc.), to implement async continuations, to implement parallel or async execution of activities (especially invocations of external services), and so on. Even better would be if Activiti's implementation provided very high-performance and scalable default implementations of these features, so that most machine-to-machine scenarios could be served out of the box.
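The kind of pluggable persistence wished for here could look roughly like the following strategy interface. This is purely a sketch of the idea: ProcessStateStore, InMemoryStateStore, NoOpStateStore and SketchEngine are hypothetical names, not existing Activiti extension points, and process state is reduced to a string.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical extension point: how process state is stored.
interface ProcessStateStore {
    void save(String instanceId, String serializedState);
    Optional<String> load(String instanceId);
    void delete(String instanceId);
}

// Keeps state in a concurrent map; a data grid or SQL store would
// implement the same interface.
class InMemoryStateStore implements ProcessStateStore {
    private final Map<String, String> states = new ConcurrentHashMap<String, String>();

    public void save(String instanceId, String serializedState) {
        states.put(instanceId, serializedState);
    }

    public Optional<String> load(String instanceId) {
        return Optional.ofNullable(states.get(instanceId));
    }

    public void delete(String instanceId) {
        states.remove(instanceId);
    }
}

// Never persists anything; intended for very short-lived instances.
class NoOpStateStore implements ProcessStateStore {
    public void save(String instanceId, String serializedState) { }

    public Optional<String> load(String instanceId) {
        return Optional.empty();
    }

    public void delete(String instanceId) { }
}

// Stand-in engine that delegates persistence to whichever store it is given.
class SketchEngine {
    private final ProcessStateStore store;

    SketchEngine(ProcessStateStore store) {
        this.store = store;
    }

    void reachWaitState(String instanceId, String state) {
        store.save(instanceId, state);
    }

    Optional<String> resume(String instanceId) {
        return store.load(instanceId);
    }
}

class StateStoreDemo {
    public static void main(String[] args) {
        SketchEngine durable = new SketchEngine(new InMemoryStateStore());
        durable.reachWaitState("pi-1", "waiting-for-callback");
        System.out.println(durable.resume("pi-1").isPresent());    // prints true

        SketchEngine transientEngine = new SketchEngine(new NoOpStateStore());
        transientEngine.reachWaitState("pi-2", "waiting-for-callback");
        System.out.println(transientEngine.resume("pi-2").isPresent()); // prints false
    }
}
```

The sketch covers only the "how to persist" half; deciding when to persist would need a second hook that is not modelled here.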

jbarrez · ‎05-13-2011

Your analysis is correct: Activiti's focus is on human involvement, and our persistence model is tuned for that.

If I understand your requirements correctly, it feels like you would be better served by a solution such as Mule, which of course doesn't have BPMN, but in my experience it does machine-to-machine communication excellently. If you want something even lower-level, you might want to look into Akka … but that is really low-level stuff.

ronald_van_kuij · ‎05-13-2011

I will create a new branch shortly with a changed implementation of the job scheduler (syncing my current branch seems to fail). I'll also include a version of async support in there. We already used it locally; it is not that hard to implement if you know the internals, but writing the test cases, documentation and examples is what makes it more work. If someone would volunteer for this, that would be great (I'm currently too busy during the day with paid projects, and with tennis/golf in my free time). Maybe I'll create a different branch for that so I can keep things separated more easily and they can be more easily merged into trunk if the core team likes it.

And as far as I can see, this 'new' job scheduler could be used to make 'multi-threaded' parallel execution possible without using persistence for the jobs, but I'm not fully sure about this. So instead of an async attribute with a value of true or false (or async), it could be something different, like 'execute' with values 'async' (transaction demarcation) or 'parallel', which makes it multi-threaded but running in the same transaction. But I'm no expert in passing transactions between threads etc., so maybe I'm talking nonsense here (or talking out of my neck, as we say in the lowlands).

romanoff · ‎05-16-2011

Thanks for the info! Interesting news.

Are there any plans to improve the persistence layer as well, both in terms of flexibility and configurability and in terms of supported backends, e.g. SQL, NoSQL, and in-memory data grids like Terracotta, Infinispan, Coherence or Hazelcast?

ExecutionImpl.getId() always returns null