<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Synchronization issue in parallel multi-instance call activity in Alfresco Archive</title>
    <link>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211456#M164586</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi guys,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I am working on a task where I need a multi-instance callActivity that iterates over a collection and is executed in parallel. However, I may have found an issue in the synchronization of the threads that execute the started sub-processes: in most of my tests, two threads try to update the values of &lt;/SPAN&gt;&lt;STRONG&gt;nrOfCompletedInstances&lt;/STRONG&gt;&lt;SPAN&gt; and &lt;/SPAN&gt;&lt;STRONG&gt;nrOfActiveInstances&lt;/STRONG&gt;&lt;SPAN&gt; at the same time. When this happens, these two variables are left with wrong values.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;For example, I found the following lines in Activiti's debug logs (they are also attached):&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;11:17:53,623 [pool-1-thread-2] Multi-instance 'Activity(callactivity1)' instance completed. Details: loopCounter=1, nrOrCompletedInstances=1,nrOfActiveInstances=7,nrOfInstances=8&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;11:17:53,626 [pool-1-thread-1] Multi-instance 'Activity(callactivity1)' instance completed. Details: loopCounter=0, nrOrCompletedInstances=1,nrOfActiveInstances=7,nrOfInstances=8&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;As you can see, these two threads try to update the variables at nearly the same time, and the second one sets wrong values for them. 
Instead of setting 2 for &lt;/SPAN&gt;&lt;STRONG&gt;nrOfCompletedInstances&lt;/STRONG&gt;&lt;SPAN&gt; and 6 for &lt;/SPAN&gt;&lt;STRONG&gt;nrOfActiveInstances&lt;/STRONG&gt;&lt;SPAN&gt;, it sets 1 and 7, respectively.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I looked around the Activiti engine's source code and the issue seems to be in the &lt;/SPAN&gt;&lt;STRONG&gt;org.activiti.engine.impl.bpmn.behavior.ParallelMultiInstanceBehavior&lt;/STRONG&gt;&lt;SPAN&gt;'s &lt;/SPAN&gt;&lt;STRONG&gt;leave()&lt;/STRONG&gt;&lt;SPAN&gt; method. There, the values of &lt;/SPAN&gt;&lt;STRONG&gt;nrOfCompletedInstances&lt;/STRONG&gt;&lt;SPAN&gt; and &lt;/SPAN&gt;&lt;STRONG&gt;nrOfActiveInstances&lt;/STRONG&gt;&lt;SPAN&gt; are retrieved from the execution, incremented (or decremented) and then set back into the execution. However, the read and the write are not in a synchronized block, nor are they in a separate DB transaction. This probably leads to the following situation:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;1) Thread 1 retrieves the variables and they have the values: nrOfCompletedInstances = 0, nrOfActiveInstances = 8.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;2) Thread 2 retrieves the variables before Thread 1 has updated them, so it reads the same values: nrOfCompletedInstances = 0, nrOfActiveInstances = 8.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;3) Both threads modify the variables independently, each computing the values: nrOfCompletedInstances = 1, nrOfActiveInstances = 7.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;4) Both threads write these values back into the execution, so one of the two updates is lost.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Is this a known issue and if so, have you made any plans for fixing it soon (I can try to submit a pull request)? 
Should I create an issue in Activiti's JIRA?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Note: The attached Maven project contains my BPMN diagrams (I attached it as a .txt file, since the forum does not allow ZIP files).&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thanks and best regards,&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Alexander&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 23 Nov 2016 15:07:06 GMT</pubDate>
    <dc:creator>alexander_tsvet</dc:creator>
    <dc:date>2016-11-23T15:07:06Z</dc:date>
    <item>
      <title>Synchronization issue in parallel multi-instance call activity</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211456#M164586</link>
      <description>Hi guys, I am working on a task, where I need to have a multi-instance callActivity, which iterates over a collection and is executed in parallel. However, I may have found an issue in the synchronization of the threads that execute the started sub-processes, because in most of my tests, two threads</description>
      <pubDate>Wed, 23 Nov 2016 15:07:06 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211456#M164586</guid>
      <dc:creator>alexander_tsvet</dc:creator>
      <dc:date>2016-11-23T15:07:06Z</dc:date>
    </item>
    <item>
      <title>Re: Synchronization issue in parallel multi-instance call activity</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211457#M164587</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;I forgot to mention that this leads to some weird behavior. For example, in my test project, the parent process sets a "messages" variable containing 8 strings (the numbers from 1 to 8 represented as strings). It then starts a sub-process (via a call activity) for each of the strings. Each sub-process prints the string for which it was started. Normally, this should result in output similar to the following:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;2 1 3 4 5 6 7 8&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;However, it looks like Activiti executes some of the sub-processes a second or even a third time, because each run prints three or four extra messages (which appear to be chosen at random):&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;2 1 3 4 5 6 7 8 &lt;/SPAN&gt;&lt;STRONG&gt;1 4 4&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;This is a problem for me, since in my real project these sub-processes make network calls that must not be repeated.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 26 Nov 2016 08:44:38 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211457#M164587</guid>
      <dc:creator>alexander_tsvet</dc:creator>
      <dc:date>2016-11-26T08:44:38Z</dc:date>
    </item>
    <item>
      <title>Re: Synchronization issue in parallel multi-instance call activity</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211458#M164588</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Try adding an asynchronous exclusive step at the end of the sub-process. That way the work is done in parallel, the results are persisted, and then the executions close one by one.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;The repeated calls are due to the transactional nature of the process engine.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 28 Nov 2016 05:34:24 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211458#M164588</guid>
      <dc:creator>warper</dc:creator>
      <dc:date>2016-11-28T05:34:24Z</dc:date>
    </item>
    <item>
      <title>Re: Synchronization issue in parallel multi-instance call activity</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211459#M164589</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;SPAN&gt;Hi Warper and thanks for the response!&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;I added another asynchronous step at the end of the sub-process. This does work around the issue, and the task that prints the message is now executed only once. However, I noticed that the new task is in turn executed more times than necessary, which increases the time it takes to execute the entire main process. In the test project I attached, it adds 10 seconds for 8 sub-processes. In my real scenario there could be 100 sub-processes, which would mean that more tasks are retried and the job executor needs more time to retry them.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Actually, I just tried to spawn 100 sub-processes and the main process never even finished. This may be due to the fact that Activiti retries a failed job 3 times and then gives up (69 tasks were retried this time). So the workaround you suggested works for a small number of sub-processes but not for 100 or more. Should I open a bug in Activiti's JIRA?&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Also, does "exclusive" have any effect on this issue? As far as I understood from Activiti's User Guide, "exclusive=true" prevents jobs from a &lt;/SPAN&gt;&lt;STRONG&gt;single&lt;/STRONG&gt;&lt;SPAN&gt; process instance from executing concurrently. In my situation, there are &lt;/SPAN&gt;&lt;STRONG&gt;multiple&lt;/STRONG&gt;&lt;SPAN&gt; sub-processes (which I assume are separate process instances) that execute in parallel, but all steps within each of them are executed sequentially.&lt;/SPAN&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 28 Nov 2016 12:07:56 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211459#M164589</guid>
      <dc:creator>alexander_tsvet</dc:creator>
      <dc:date>2016-11-28T12:07:56Z</dc:date>
    </item>
    <item>
      <title>Re: Synchronization issue in parallel multi-instance call activity</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211460#M164590</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Alexander,&lt;BR /&gt;I think you may have actually stumbled across an edge case that is not handled properly.&lt;BR /&gt;The intended behavior of "exclusive" is (as you already understand) to prevent concurrent execution within a single process instance. It's very useful for parallel joins, which can lead to pre-emptive DB lock contention, and can also be useful in multi-instance scenarios.&lt;BR /&gt;&lt;BR /&gt;Looking at your model, I see you are already using Exclusive and Async on the sub-process call. I doubt this will have any effect, since each called activity (sub-process) is its own individual process instance. Async may help a little, since the parallel sub-processes may not get executed simultaneously, but there is really no guarantee of this.&lt;BR /&gt;&lt;BR /&gt;I agree with your analysis that the retrieval and updating of the nrOfCompletedInstances variable in the leave method of the behavior class should be in a synchronized block. In the majority of scenarios, the sub-process instances will not complete at exactly the same time, so there will be little if any performance impact; for your scenario, however, it will resolve the issue.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Can you go ahead and create a Jira and then log the Jira number back into the forum for reference?&lt;BR /&gt;&lt;BR /&gt;Thanks for your patience.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Greg&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 09 Dec 2016 19:27:40 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211460#M164590</guid>
      <dc:creator>gdharley</dc:creator>
      <dc:date>2016-12-09T19:27:40Z</dc:date>
    </item>
    <item>
      <title>Re: Synchronization issue in parallel multi-instance call activity</title>
      <link>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211461#M164591</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Greg,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for the response! I opened a new issue in Activiti's JIRA. Here's the link:&lt;/P&gt;&lt;P&gt;&lt;A class="link-titled" href="https://activiti.atlassian.net/browse/ACT-4255" title="https://activiti.atlassian.net/browse/ACT-4255" rel="nofollow noopener noreferrer"&gt;https://activiti.atlassian.net/browse/ACT-4255&lt;/A&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Alexander&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 04 Jan 2017 10:25:47 GMT</pubDate>
      <guid>https://connect.hyland.com/t5/alfresco-archive/synchronization-issue-in-parallel-multi-instance-call-activity/m-p/211461#M164591</guid>
      <dc:creator>alexander_tsvet</dc:creator>
      <dc:date>2017-01-04T10:25:47Z</dc:date>
    </item>
  </channel>
</rss>

