
What's the best way to progress a stuck flow?

cwhite99
Champ on-the-rise

Hi,

I'm using the free version of Activiti 5.20.0 embedded in my Java application running on WildFly 10. It shares the same database as my application.

At the moment I'm just creating the workflows in BPMN format within IntelliJ (editing the XML directly).

I have a few instances of a workflow in a live system that are stuck and I'm looking for the best way to progress them. For some strange reason the Java delegate that was supposed to fire after a timer never ran, and there are no active tasks. It happened when there were a lot of concurrent requests. I think the root cause can be fixed by upgrading to Activiti 5.22; I have tests that suggest that is the case. This problem isn't my main concern; fixing the broken processes is.

I haven't used Activiti Explorer but would this be a suitable way to interrogate the state of the processes and force a step to happen again or move the process forward? I've been told only the paid-for enterprise edition has these features. I'm not quite sure how this would connect to an embedded instance inside my app.

The alternative plan is to add a couple of extra REST endpoints to the application to allow for this, and run some curl commands to push the processes on.

Any advice would be much appreciated.

Craig

9 REPLIES

cwhite99
Champ on-the-rise

Update: I've got Activiti Explorer running in a separate Tomcat instance, using the same database as my webapp running in WildFly. I can see all of the jobs and processes. Unfortunately Activiti Explorer doesn't allow me to execute any of the timers manually, since it doesn't know what to do with my user-defined variables.

I've also tried to "Replay" the process instances, but nothing happens when I click the button. Ideally I'd like to "execute next event".

FYI below is my basic process flow. The process waits for a document to be added to a job and for a timer to finish (a number of days). Once both are complete, it fires a Java delegate to complete the job. For some reason the timer and documentAdded events occurred but the completeJob Java delegate didn't fire. This only happens about 1 time in 100, and only when there are a lot of concurrent threads. So what I'm trying to do is get the completeJob Java delegate to fire again. Alternatively I'll just use curl requests to manually perform the same action the Java delegate would have, and then attempt to close down the Activiti process.

cjose
Elite Collaborator

Is completeJob or the parallel gateway set to be async? If not, it looks like an error occurs while the completeJob logic is executing after the timer fires, and the transaction then rolls all the way back to the timer. Can you share the BPMN XML?

Ciju

cwhite99
Champ on-the-rise

Hi Ciju,

Thanks for the response. Those components weren't set to be async, and it worked in 99% of cases. We did try setting them to async and the problem still occurred intermittently.

It turned out the root cause of the issue seems to have been a concurrency problem in Activiti 5.20. We upgraded to 5.22 and it has solved the problem and is now working 100% of the time.

The issue I'm trying to solve is how to progress those processes that are "stuck". Without an admin UI I'm not sure what the best way to achieve that is.

FYI here is the BPMN XML:

<?xml version="1.0" encoding="UTF-8"?>
<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:activiti="http://activiti.org/bpmn" xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI" xmlns:omgdc="http://www.omg.org/spec/DD/20100524/DC" xmlns:omgdi="http://www.omg.org/spec/DD/20100524/DI" typeLanguage="http://www.w3.org/2001/XMLSchema" expressionLanguage="http://www.w3.org/1999/XPath" targetNamespace="http://www.activiti.org/test" id="m1479934344474" name="">
  <process id="processjobCreated" name="processjobCreated" isExecutable="true">
    <startEvent id="jobCreated" name="jobCreated"></startEvent>
    <endEvent id="processEnd" name="processEnded"></endEvent>
    <parallelGateway id="parallelGatewayStart" name="parallelGatewayStart"></parallelGateway>
    <parallelGateway id="parallelGatewayEnd" name="parallelGatewayEnd"></parallelGateway>
    <receiveTask id="documentAdded" name="documentAdded"></receiveTask>
    <intermediateCatchEvent id="waitUntilNoticeEndDate" name="intermediateCatchingEvent" activiti:async="true">
      <timerEventDefinition>
        <timeDate>${noticeEndedDate}</timeDate>
      </timerEventDefinition>
    </intermediateCatchEvent>
    <serviceTask id="markForCompletion" name="markForCompletion" activiti:delegateExpression="${markForCompletion}"></serviceTask>
    <sequenceFlow id="jobCreatedToParallelGatewayStart" sourceRef="jobCreated" targetRef="parallelGatewayStart"></sequenceFlow>
    <sequenceFlow id="parallelGatewayStartToDocumentAdded" sourceRef="parallelGatewayStart" targetRef="documentAdded"></sequenceFlow>
    <sequenceFlow id="timerToGatewayEnd" sourceRef="waitUntilNoticeEndDate" targetRef="parallelGatewayEnd"></sequenceFlow>
    <sequenceFlow id="documentAddedToParallelGatewayEnd" sourceRef="documentAdded" targetRef="parallelGatewayEnd"></sequenceFlow>
    <sequenceFlow id="parallelGatewayStartToWaitUntilNoticeEndDate" sourceRef="parallelGatewayStart" targetRef="waitUntilNoticeEndDate"></sequenceFlow>
    <sequenceFlow id="parallelGatewayEndToMarkForCompletion" sourceRef="parallelGatewayEnd" targetRef="markForCompletion"></sequenceFlow>
    <sequenceFlow id="markForCompletionToProcessEnd" sourceRef="markForCompletion" targetRef="processEnd"></sequenceFlow>
  </process>
  <bpmndi:BPMNDiagram id="BPMNDiagram_processjobCreated">
    <bpmndi:BPMNPlane bpmnElement="processjobCreated" id="BPMNPlane_processjobCreated">
      <bpmndi:BPMNShape bpmnElement="jobCreated" id="BPMNShape_jobCreated">
        <omgdc:Bounds height="35.0" width="35.0" x="0.0" y="72.0"></omgdc:Bounds>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape bpmnElement="processEnd" id="BPMNShape_processEnd">
        <omgdc:Bounds height="35.0" width="35.0" x="560.0" y="75.0"></omgdc:Bounds>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape bpmnElement="parallelGatewayStart" id="BPMNShape_parallelGatewayStart">
        <omgdc:Bounds height="40.0" width="40.0" x="80.0" y="67.0"></omgdc:Bounds>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape bpmnElement="parallelGatewayEnd" id="BPMNShape_parallelGatewayEnd">
        <omgdc:Bounds height="40.0" width="40.0" x="320.0" y="69.0"></omgdc:Bounds>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape bpmnElement="documentAdded" id="BPMNShape_documentAdded">
        <omgdc:Bounds height="60.0" width="100.0" x="170.0" y="130.0"></omgdc:Bounds>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape bpmnElement="waitUntilNoticeEndDate" id="BPMNShape_waitUntilNoticeEndDate">
        <omgdc:Bounds height="35.0" width="35.0" x="205.0" y="0.0"></omgdc:Bounds>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNShape bpmnElement="markForCompletion" id="BPMNShape_markForCompletion">
        <omgdc:Bounds height="60.0" width="100.0" x="410.0" y="60.0"></omgdc:Bounds>
      </bpmndi:BPMNShape>
      <bpmndi:BPMNEdge bpmnElement="jobCreatedToParallelGatewayStart" id="BPMNEdge_jobCreatedToParallelGatewayStart">
        <omgdi:waypoint x="35.0" y="89.0"></omgdi:waypoint>
        <omgdi:waypoint x="80.0" y="87.0"></omgdi:waypoint>
      </bpmndi:BPMNEdge>
      <bpmndi:BPMNEdge bpmnElement="parallelGatewayStartToDocumentAdded" id="BPMNEdge_parallelGatewayStartToDocumentAdded">
        <omgdi:waypoint x="120.0" y="87.0"></omgdi:waypoint>
        <omgdi:waypoint x="132.0" y="87.0"></omgdi:waypoint>
        <omgdi:waypoint x="132.0" y="160.0"></omgdi:waypoint>
        <omgdi:waypoint x="170.0" y="160.0"></omgdi:waypoint>
      </bpmndi:BPMNEdge>
      <bpmndi:BPMNEdge bpmnElement="timerToGatewayEnd" id="BPMNEdge_timerToGatewayEnd">
        <omgdi:waypoint x="240.0" y="17.0"></omgdi:waypoint>
        <omgdi:waypoint x="282.0" y="15.0"></omgdi:waypoint>
        <omgdi:waypoint x="282.0" y="89.0"></omgdi:waypoint>
        <omgdi:waypoint x="320.0" y="89.0"></omgdi:waypoint>
      </bpmndi:BPMNEdge>
      <bpmndi:BPMNEdge bpmnElement="documentAddedToParallelGatewayEnd" id="BPMNEdge_documentAddedToParallelGatewayEnd">
        <omgdi:waypoint x="270.0" y="160.0"></omgdi:waypoint>
        <omgdi:waypoint x="282.0" y="160.0"></omgdi:waypoint>
        <omgdi:waypoint x="282.0" y="89.0"></omgdi:waypoint>
        <omgdi:waypoint x="320.0" y="89.0"></omgdi:waypoint>
      </bpmndi:BPMNEdge>
      <bpmndi:BPMNEdge bpmnElement="parallelGatewayStartToWaitUntilNoticeEndDate" id="BPMNEdge_parallelGatewayStartToWaitUntilNoticeEndDate">
        <omgdi:waypoint x="120.0" y="87.0"></omgdi:waypoint>
        <omgdi:waypoint x="132.0" y="87.0"></omgdi:waypoint>
        <omgdi:waypoint x="132.0" y="15.0"></omgdi:waypoint>
        <omgdi:waypoint x="205.0" y="17.0"></omgdi:waypoint>
      </bpmndi:BPMNEdge>
      <bpmndi:BPMNEdge bpmnElement="parallelGatewayEndToMarkForCompletion" id="BPMNEdge_parallelGatewayEndToMarkForCompletion">
        <omgdi:waypoint x="360.0" y="89.0"></omgdi:waypoint>
        <omgdi:waypoint x="372.0" y="89.0"></omgdi:waypoint>
        <omgdi:waypoint x="372.0" y="90.0"></omgdi:waypoint>
        <omgdi:waypoint x="410.0" y="90.0"></omgdi:waypoint>
      </bpmndi:BPMNEdge>
      <bpmndi:BPMNEdge bpmnElement="markForCompletionToProcessEnd" id="BPMNEdge_markForCompletionToProcessEnd">
        <omgdi:waypoint x="510.0" y="90.0"></omgdi:waypoint>
        <omgdi:waypoint x="560.0" y="92.0"></omgdi:waypoint>
      </bpmndi:BPMNEdge>
    </bpmndi:BPMNPlane>
  </bpmndi:BPMNDiagram>
</definitions>

cjose
Elite Collaborator

Without an admin UI, you can do it using either the REST APIs or the Java APIs. If you take the Java API path, the following will give you a list of failed jobs.

managementService.createJobQuery().withException().noRetriesLeft().list();

For each job in the list, managementService.executeJob(job.getId()) will execute the job for you.

Hope this helps.

Ciju
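
The Java API route above can be sketched as a small batch helper. This is a minimal illustration, not part of Activiti: the class name `FailedJobRetrier` and the method `retryAll` are hypothetical, and the helper is deliberately written against plain `java.util` types so it does not assume how the engine is wired into the application. It only shows one design choice: catching the failure of each job separately so that one bad job does not abort the rest of the batch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical helper: runs an action per job id and records failures instead of aborting. */
final class FailedJobRetrier {

    private FailedJobRetrier() {}

    /**
     * @param jobIds   ids of the jobs to re-run
     * @param executor action that executes one job and throws a RuntimeException on failure
     * @return failure messages keyed by job id, in iteration order; empty if every job succeeded
     */
    static Map<String, String> retryAll(List<String> jobIds, Consumer<String> executor) {
        Map<String, String> failures = new LinkedHashMap<>();
        for (String jobId : jobIds) {
            try {
                executor.accept(jobId);
            } catch (RuntimeException e) {
                // Keep going: remember why this job failed and move on to the next one.
                failures.put(jobId, String.valueOf(e.getMessage()));
            }
        }
        return failures;
    }
}
```

In an application that has the engine on its classpath, the ids would presumably come from `Job.getId()` on the result of the job query shown above, and the executor could be `managementService::executeJob`. Note that this only helps when failed jobs actually exist for the stuck instances.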

cwhite99
Champ on-the-rise

Thanks for the advice, Ciju. I'm not entirely sure how to embed the REST service into my application (entirely J2EE EJBs). Is it just a matter of including it as a Maven dependency? From the GitHub download it looks like it's only available as a standalone WAR, which I don't think is going to work, since the workflow relies on Java delegates that are only present in my application.

daisuke-yoshimo
Star Collaborator

To investigate the cause, I would like to get a stack trace of the exception log if there is one.

Or, I would like to get a thread dump when a problem occurs.

cwhite99
Champ on-the-rise

Hi Daisuke,

Unfortunately there was no exception or error in the logs when this occurred. We're not too worried about fixing the issue, since it no longer happens after we upgraded to 5.22.

It was probably due to one of these items from the release notes:

  • Includes an important bug fix for a concurrency bug caused when deleting a variable (task or process variable) and attempting to save the delete variable event. [5.22]
  • It fixes some cases where the end time was not set for activities in a process definition under certain conditions. [5.21]
  • A concurrency bug was discovered when using delegateExpressions together with field injection. Make sure to read the updated documentation section on the use of 'delegateExpression'. [5.21]

warper
Star Contributor

There are three ways for a process to get stuck: the good, the bad and the ugly.

The ugly one: the process definition was upgraded and, due to a bug in the engine, the timer job was deleted. It's a known issue in version 5.18, but as far as I remember it was fixed by 5.20, so you should not encounter it.

1) Here you can recover the state of the ACT_RU_JOB table for the affected processes from DB backups and try to put those records back into the DB.

2) It's also possible to recover the process/form data and feed it to a specially crafted process that simply calls completeJob, since all the data is already there for processing.

The bad one: process is passivated on parallel gateway due to bugs in concurrency equations. As far as I remember, in this case there is some sort of job that can be (re)started manually.

3) You can run the job in a test environment or through a specially crafted REST interface in your application. You can also do 2).

The good one: the process runs out of retries for a timer/async job. This can happen in a heavily loaded environment, since the default number of retries is only 3.

4) The easiest one. Find the jobs for this process in ACT_RU_JOB, make sure they have run out of retries (retries counter 0 or negative), and increase the retries counter directly in the DB. I did this kind of manipulation when testing notifications for out-of-retries conditions.
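
Option 4 could look roughly like the JDBC sketch below. It assumes the Activiti 5 schema, where the runtime job table is ACT_RU_JOB with RETRIES_ and PROCESS_INSTANCE_ID_ columns; the class name `JobRetryReset` and the method `resetRetries` are hypothetical. Treat it as an illustration of the idea rather than a recommended procedure, and try it against a copy of the database first.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical helper: gives exhausted jobs of one process instance new retries. */
final class JobRetryReset {

    // Assumed Activiti 5 table and column names; only rows that are out of retries are touched.
    static final String SQL =
            "UPDATE ACT_RU_JOB SET RETRIES_ = ? "
          + "WHERE PROCESS_INSTANCE_ID_ = ? AND RETRIES_ <= 0";

    private JobRetryReset() {}

    /**
     * @return number of job rows updated (0 means no exhausted jobs were found)
     */
    static int resetRetries(Connection conn, String processInstanceId, int retries)
            throws SQLException {
        if (retries < 1) {
            throw new IllegalArgumentException("retries must be positive");
        }
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setInt(1, retries);
            ps.setString(2, processInstanceId);
            return ps.executeUpdate();
        }
    }
}
```

A return value of 0 is itself useful: it means the instance has no exhausted jobs, so this is not the "good" case. Where the engine API is reachable, `managementService.setJobRetries(jobId, retries)` should achieve the same result without touching the tables directly.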

cwhite99
Champ on-the-rise

Thanks for the insightful response. I think it was a case of "process is passivated on parallel gateway due to bugs in concurrency equations", since we hadn't done an upgrade. There's nothing in the ACT_RU_JOB table for those instances and no errors were reported, so bumping up the retries in the DB wasn't an option.

We ended up writing a script that bypasses Activiti and makes REST calls to get the jobs to complete (essentially what the Java delegate would have done if the timer had fired). We now have some Activiti processes that aren't technically in an end state, so we are working on integrating the REST API into our app and will then call it to terminate those processes.
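
For the final clean-up step, a call to the Activiti REST API might look like the sketch below. It assumes the Activiti 5 REST resource `DELETE runtime/process-instances/{processInstanceId}` is exposed under some base URL and protected by basic authentication; the class name `ProcessInstanceTerminator`, the base URL and the credentials are all placeholders, not values from this thread.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical client: deletes one process instance through the Activiti REST API. */
final class ProcessInstanceTerminator {

    private ProcessInstanceTerminator() {}

    /** Builds the resource URL; rejects ids that are not plain letters, digits or dashes. */
    static String deleteUrl(String baseUrl, String processInstanceId) {
        if (!processInstanceId.matches("[A-Za-z0-9-]+")) {
            throw new IllegalArgumentException("unexpected process instance id: " + processInstanceId);
        }
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        return base + "runtime/process-instances/" + processInstanceId;
    }

    /** Builds the value of an HTTP basic Authorization header. */
    static String basicAuth(String user, String password) {
        byte[] token = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(token);
    }

    /** Sends the DELETE request and returns the HTTP status code. */
    static int terminate(String baseUrl, String processInstanceId, String user, String password)
            throws IOException {
        URL url = new URL(deleteUrl(baseUrl, processInstanceId));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("DELETE");
            conn.setRequestProperty("Authorization", basicAuth(user, password));
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

Inside the application itself, where the engine is embedded, `runtimeService.deleteProcessInstance(processInstanceId, deleteReason)` is the in-process equivalent and avoids the HTTP round trip.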