cancel
Showing results for 
Search instead for 
Did you mean: 

Race hazard in DefaultDeploymentCache?

zmb
Champ in-the-making
Champ in-the-making
Hi all,

First of all, sorry for my english, I'm not a native speaker.

While I were playing with Activiti, I noticed the deployment cache is just a HashMap. I digged into the code of Activiti, but I didn't find any synchonization when the code access the cache. My experience, and the javadoc of the HashMap tells me that this will soon or later cause a  stuck thread. Unfortunately I couldn't produce the problem, but I saw system running for years before the simultaneous writing caused stuck thread.

Can somebody (who knows the code of Activiti in detail) confirm me that this is indeed a bug, or show me why won't be a problem?
10 REPLIES 10

trademak
Star Contributor
Star Contributor
When would it create a stuck thread exactly?

Best regards,

zmb
Champ in-the-making
Champ in-the-making
Let's say we have the following two threads:
<code>
"runner" prio=10 tid=0x00007fa5b850b800 nid=0x120c runnable [0x00007fa59befd000]
   java.lang.Thread.State: RUNNABLE
at java.util.HashMap.get(HashMap.java:419)
at org.activiti.engine.impl.persistence.deploy.DefaultDeploymentCache.get(DefaultDeploymentCache.java:57)
at org.activiti.engine.impl.persistence.deploy.DeploymentManager.resolveProcessDefinition(DeploymentManager.java:108)
at org.activiti.engine.impl.persistence.deploy.DeploymentManager.findDeployedLatestProcessDefinitionByKey(DeploymentManager.java:77)
at org.activiti.engine.impl.cmd.StartProcessInstanceCmd.execute(StartProcessInstanceCmd.java:72)
at org.activiti.engine.impl.cmd.StartProcessInstanceCmd.execute(StartProcessInstanceCmd.java:37)
at org.activiti.engine.impl.interceptor.CommandInvoker.execute(CommandInvoker.java:24)
at org.activiti.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:57)
at org.activiti.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:31)
at org.activiti.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:40)
at org.activiti.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:35)
at org.activiti.engine.impl.RuntimeServiceImpl.startProcessInstanceByKey(RuntimeServiceImpl.java:62)
at activiti.Main2$2.run(Main2.java:51)
</code>

<code>
"deployer" prio=10 tid=0x00007fa5b850a800 nid=0x120a runnable [0x00007fa59bffe000]
   java.lang.Thread.State: RUNNABLE
at java.util.HashMap.put(HashMap.java:491)
at org.activiti.engine.impl.persistence.deploy.DefaultDeploymentCache.add(DefaultDeploymentCache.java:61)
at org.activiti.engine.impl.bpmn.deployer.BpmnDeployer.deploy(BpmnDeployer.java:229)
at org.activiti.engine.impl.persistence.deploy.DeploymentManager.deploy(DeploymentManager.java:50)
at org.activiti.engine.impl.cmd.DeployCmd.execute(DeployCmd.java:80)
at org.activiti.engine.impl.cmd.DeployCmd.execute(DeployCmd.java:35)
at org.activiti.engine.impl.interceptor.CommandInvoker.execute(CommandInvoker.java:24)
at org.activiti.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:57)
at org.activiti.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:31)
at org.activiti.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:40)
at org.activiti.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:35)
at org.activiti.engine.impl.RepositoryServiceImpl.deploy(RepositoryServiceImpl.java:78)
at org.activiti.engine.impl.repository.DeploymentBuilderImpl.deploy(DeploymentBuilderImpl.java:156)
at activiti.Main2$1.run(Main2.java:41)
</code>

The runner thread is starting processes, the deployer thread deploy new processes. Both threads access the same DefaultDeploymentCache instance, therfore they simultaneously read and write the same HashMap. If two thread read and write the HashMap at the same time, it could happen the linked list in the HashMap become to a linked cirle, so the one of the Thread would never exit from the linked list traversion loop.
Note that the example above is not an actual thread stuck. I put breakpoints in the HashMap to demonstrate how the call stack would look like.

trademak
Star Contributor
Star Contributor
It sounds like a very edge case. What would be a good solution to solve this in your opinion?

Best regards,

zmb
Champ in-the-making
Champ in-the-making
Agree, it's indeed an edge case. I think replacing the HashMap to a ConcurrentHashMap would be a good solution.

jbarrez
Star Contributor
Star Contributor
That would indeed solve that issue …. but im afraid of the runtime impact on performance. Altough, probably gets from the concurrent hashmap are not synchronized?

zmb
Champ in-the-making
Champ in-the-making
The internet is full with concurrentmap performance tests, you can decide what performance penalty it has.

grzechu
Champ in-the-making
Champ in-the-making
Not synchronized access to HashMap in most cases leads to endless loop (I've observed this problem previously in jbpm).
This time it caused null key in LinkedHashMap so org.activiti.engine.impl.persistence.deploy.DefaultDeploymentCache could not evict last entry and consumed whole heap.

See this:
15:04:31,658 TRACE [org.activiti.engine.impl.persistence.deploy.DefaultDeploymentCache] (pool-12-thread-19177) Cache limit is reached, null will be evicted

trademak
Star Contributor
Star Contributor
How is it possible that a null key is inserted? That should not be possible at all.

Best regards,

tomsko
Champ in-the-making
Champ in-the-making
Hi. There is a problem with cache on multithreaded environment and we're suffering memory leaks because of that.
Following 50 lines of code explain why there should be no discussion about weather to synchronize (or not) access to LinkedHashMap.
<java>
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class TestMapCache implements Runnable {
    private static class CacheMap extends LinkedHashMap<String, String> {
        private final static long serialVersionUID = 1L;
        private final static int MAX_ENTRIES = 20;
        private String name;

        public CacheMap(String name) {
            this.name = name;
        }

        @Override
        protected boolean removeEldestEntry(java.util.Map.Entry<String, String> eldest) {
            return size() > MAX_ENTRIES;
        }
        @Override
        public String toString() {
            return "Cache " + name + " size is " + size();
        }
    };

    private static Map<String, String> cache1 = Collections.synchronizedMap(new CacheMap("sync_cache"));
    private static Map<String, String> cache2 = new CacheMap("not_sync_cache");

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[] {
                new Thread(new TestMapCache()),
                new Thread(new TestMapCache()),
                new Thread(new TestMapCache()) };
               
        System.out.println(cache1);
        System.out.println(cache2);
       
        for (Thread t : threads)
            t.start();
        for (Thread t : threads)
            t.join();
       
        System.out.println(cache1);
        System.out.println(cache2);
    }

    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            cache1.put("time" + System.nanoTime(), "x");
            cache2.put("time" + System.nanoTime(), "x");
        }
    }
}
</java>

My output is:
<code>
Cache sync_cache size is 0
Cache not_sync_cache size is 0
Cache sync_cache size is 20
Cache not_sync_cache size is 18721
</code>
Getting started

Tags


Find what you came for

We want to make your experience in Hyland Connect as valuable as possible, so we put together some helpful links.