
Controlling Extremely Large Exception Stack Traces

codetom
Champ in-the-making
Hi all,

We're using Activiti heavily in my organization, and I wanted to ask the community for help with an issue that is currently plaguing us: extremely long stack traces. I know it sounds ridiculous, but it's a serious problem. For a single issue (say, a database insertion error such as an invalid column), we might get 3500 lines in that one stack trace, between the trace itself and the various "Caused by" statements.

Allow me to explain … for our purposes, we might have 5-10 nested process flows within a primary process flow. Some of those process flows are also tagged as "Multi-Instance" flows, which leads to some additional flows. In addition to that, we might have up to 20 individual ServiceTask or ScriptTask elements in a flow. Each ServiceTask may be calling one or more DAO operations to perform database CRUD operations. So, you can imagine that if one of those DAO operations fails with an exception that is then propagated into the BPMN flow … we're in for a world of hurt.

I know there might be a couple easy answers to this:
  1. Why are you propagating the exception into the BPMN flow?  Why not trap the exception before it reaches the BPMN flow?  The answer is that Spring / MyBatis transaction management operates on thrown exceptions if you want to use it out of the box.  Manually managing transaction creation and rollback in MyBatis can be a major pain.  In our case, we want to make sure we can roll back Spring-initiated transactions that might occur in the ServiceTasks.
  2. Ignore the extraneous logs … you can always find the root of the problem.  While it's true that the "Caused by" entries give us the error, the sheer amount of log spam generated by these errors is overwhelming all of our otherwise useful log messages and drastically reducing our ability to debug issues quickly.  One 3000-line trace may not be an issue, but throw 100 users pounding away at the system and you start having some major issues.
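
To make the log-spam point concrete, here is a minimal sketch of one way to condense a deeply nested exception before logging it: walk the cause chain to the innermost exception and keep only its first few frames. It uses only the JDK; the class and method names (CondensedTrace, rootCause, summarize) are made up for illustration and are not part of Activiti, Spring or MyBatis, and the exception is still thrown as usual so transaction rollback is unaffected.

```java
public class CondensedTrace {

    /** Walks the getCause() chain down to the innermost exception. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** Returns the root cause plus at most maxFrames of its own stack frames. */
    public static String summarize(Throwable t, int maxFrames) {
        Throwable root = rootCause(t);
        StackTraceElement[] frames = root.getStackTrace();
        int shown = Math.min(maxFrames, frames.length);

        StringBuilder sb = new StringBuilder();
        sb.append(root.getClass().getName()).append(": ").append(root.getMessage());
        for (int i = 0; i < shown; i++) {
            sb.append(System.lineSeparator()).append("\tat ").append(frames[i]);
        }
        if (frames.length > shown) {
            // Note how many frames were dropped so the reader knows the trace is cut.
            sb.append(System.lineSeparator())
              .append("\t... ").append(frames.length - shown).append(" more");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Simulated nesting: a DAO failure wrapped by two outer layers.
        Exception dao = new IllegalStateException("invalid column");
        Exception wrapped = new RuntimeException("process failed",
                new RuntimeException("service task failed", dao));
        System.out.println(summarize(wrapped, 5));
    }
}
```

A summary like this could go to the main log while the full trace goes to a separate file or a DEBUG-level logger, if the logging setup allows it.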

If you have experience with Activiti, its exception handling strategy or logging strategy, and you can share information / point me in the right direction .. or if you've dealt with similar issues yourself .. I would love to hear about it.

Thanks in advance,
Tom
7 REPLIES

tidetom
Champ in-the-making
I found this post on the issue: http://jira.codehaus.org/browse/ACT-685

Is this issue currently being worked on?  We're getting StackOverflowErrors for processes with large numbers of tasks.

trademak
Star Contributor
Hi,

What's important to understand is that Activiti executes automatic tasks synchronously.
So if you have a really large number of automatic tasks (> 100) executing after each other, this could lead to issues.
A solution is to add async continuations or non-automatic tasks.

Best regards,

tidetom
Champ in-the-making
Hi Tijs,

Thank you for the tip.  I will look into the flows to see if more asynchronous continuations can be designed into them.  The downside to this is naturally that (since no user interaction is required in these flows) the Java code which instantiates the original business process then must become an orchestrator, having to know much about the internal workings of the BPMN flow, rather than being relatively agnostic to the flow (as it is right now).  

Without knowing too much about the Activiti Engine architecture, I was thinking that perhaps an elegant solution to the problem would be to spin up each sub-process in its own thread, but to use the Callable and Future classes (and the Future.get() method) from the Java Concurrency API to ensure the "primary" thread waits for the sub-process (invoked using a CallActivity task) to complete before the primary thread is allowed to continue.  This would give developers the benefit of working with synchronous flows (a big plus!) without having to worry as much about overflowing the stack in complex, hierarchical flows.
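
As a rough illustration of that idea (plain JDK only, not how Activiti actually invokes a CallActivity), the sketch below hands a piece of work to another thread and blocks on Future.get(). The work starts on a fresh stack no matter how deep the caller's stack already is. FreshStackDemo, callOnFreshStack and the other names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FreshStackDemo {

    // Cached pool so nested calls never wait for a free worker; daemon threads
    // so the JVM can exit without an explicit shutdown.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    /** Runs the work on another thread and blocks until it has finished. */
    public static <T> T callOnFreshStack(Callable<T> work) throws Exception {
        Future<T> future = POOL.submit(work);
        try {
            return future.get(); // the caller waits, so the flow stays synchronous
        } catch (ExecutionException e) {
            // Rethrow the original exception so callers see the same failure type.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e; // an Error in the worker stays wrapped
        }
    }

    /** Number of frames currently on the calling thread's stack. */
    public static int stackDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    /** Recurses on the current thread, then compares caller and worker stack depth. */
    public static int[] depthsAfterRecursion(int levels) throws Exception {
        if (levels > 0) {
            return depthsAfterRecursion(levels - 1);
        }
        int callerDepth = stackDepth();
        Callable<Integer> probe = FreshStackDemo::stackDepth;
        int workerDepth = callOnFreshStack(probe);
        return new int[] { callerDepth, workerDepth };
    }

    public static void main(String[] args) throws Exception {
        int[] d = depthsAfterRecursion(200);
        System.out.println("caller depth=" + d[0] + ", worker depth=" + d[1]);
    }
}
```

Two trade-offs are worth keeping in mind with this approach: the waiting thread keeps its own stack while it blocks, so each level of nesting costs a thread, and Spring binds its transaction context to the current thread, so work moved to another thread would not join the caller's transaction by default.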

If implemented like this, the limit of roughly 100 tasks would apply to each individual process flow rather than to a process flow and all of its sub-process flows combined, which would at least reduce the scope of the scalability problem that Activiti currently has.

I think it's definitely reasonable to limit a single process flow to < ~100 tasks, because frankly it's probably a poorly designed flow if it needs more than that.  However, limiting a process flow and all the sub-processes it might call to ~100 tasks in total might be underestimating the power that Activiti can bring to some very complex, non-user-oriented business processes.

Any thoughts on this?

Thanks again for your response,
Tom

tidetom
Champ in-the-making
As an aside for the purposes of helping others, I did notice that increasing the thread stack size with -Xss fixed the larger of the 2 problems in this particular case (the StackOverflowError).

Setting "-Xss8m" in the CATALINA_OPTS has so far allowed the process to complete successfully every time. However, I do feel that I'm only patching the problem, as I don't know how big is big enough for the stack size.  For example, if I use a multi-instance (set on a CallActivity) and send in 100 values for the loop, will it still work?  How about 200? (you get the picture…)

Generally I think the issue would be better addressed in attempting to fix the growth of the stack vs opening up for a larger stack, but in the meantime this has allowed us to continue working on our flows.
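
For anyone who wants a feel for how much headroom a given stack size buys, here is a small sketch that measures how deep a trivial recursion gets before a StackOverflowError on threads created with different requested stack sizes. It uses the four-argument Thread constructor from the JDK; the JVM treats that size as a hint and may ignore it on some platforms. StackSizeProbe and its methods are illustrative names, and the absolute numbers say little about real Activiti frames, which are much larger than this empty recursion.

```java
public class StackSizeProbe {

    /** Recurses until the stack overflows, counting each level in counter[0]. */
    private static void recurse(int[] counter) {
        counter[0]++;
        recurse(counter);
    }

    /** Returns the recursion depth reached on a thread with the requested stack size. */
    public static int maxDepth(long stackSizeBytes) throws InterruptedException {
        int[] counter = new int[1];
        Thread probe = new Thread(null, () -> {
            try {
                recurse(counter);
            } catch (StackOverflowError expected) {
                // counter[0] now holds the depth that was reached
            }
        }, "stack-probe", stackSizeBytes);
        probe.start();
        probe.join(); // join() makes the worker's writes to counter visible here
        return counter[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("512 KB stack: depth " + maxDepth(512L * 1024));
        System.out.println("8 MB stack:   depth " + maxDepth(8L * 1024 * 1024));
    }
}
```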

Thanks,
Tom

ronald_van_kuij
Champ on-the-rise
Tom,

Making services async does not require any changes in the services; just set the attribute that designates them as async. See the docs.

tidetom
Champ in-the-making
Hi Ronald,

Thank you for your response.  I think you may have misunderstood the proposals made above.  I do know that there are configuration parameters to execute services asynchronously … however switching to using asynchronous process invocations does not solve the issue at hand.

The solution I was proposing was to the problem of Activiti's scalability when using synchronous service calls.  Using synchronous services, you will eventually run into stack trace problems if you design larger process flows.  The solution to this problem is not as simple as changing to invoke services asynchronously - doing so would have an enormous impact on the code which invokes the process flows in the first place.  Furthermore, doing so would force the invoking code to understand far more about the contents of the business process flows (thus tightly coupling the invoking code to the business process flows - something that I think we can all agree should not be promoted).

My proposal was to use the "Future" and "Callable" classes in the Java Concurrency API when invoking sub-processes in the CallActivity tasks to allow for the introduction of threading into the flows, while still ensuring that execution remains synchronous.  The introduction of threading would alleviate the problem that Activiti currently has scaling to larger process flow implementations.  Using Future.get() to maintain synchronous execution would ensure that the code invoking the process flows can remain decoupled from the flow itself.

If in fact this is already implemented, and can be configured using some configuration parameters, it would be great if you can point me to those in the documentation directly, because I don't see any that support threading while maintaining synchronously executing process flows.

Thanks again,
Tom

trademak
Star Contributor
Hi,

When you have a lot (> 100) of automatic activities running after each other, the call stack simply grows too large, and you eventually get an exception.
As you said, you can increase the stack size and you'll be able to run more automatic activities.
The solution Activiti provides is to use async continuations. And yes, this also impacts the client logic.

Best regards,