Getting more throughput from JobExecutor

jorell
Champ in-the-making
I am currently getting about 2 job completions per second from a clustered environment with multiple job executors running. I am trying to improve this drastically (> 200 jobs per second). From the looks of it, I am thread-bound. Hardware is not a concern for me, so I created my own JobExecutor (basically a copy of the DefaultJobExecutor) and replaced the default values with these:


  protected int queueSize = 50;
  protected int corePoolSize = 75;
  protected int maxPoolSize = 200;

This doesn't seem to be having any effect; I still see only three threads running jobs at any given time. My process engine configuration code is:


    processEngineConfigurationImpl
        .setDatabaseType(DatabaseConfig.getDatabaseType())
        .setJobExecutor(new CustomJobExecutor())
        .setJobExecutorActivate(true)
        .setDatabaseSchemaUpdate(databaseSchemaUpdate);
    ProcessEngine processEngine = processEngineConfigurationImpl.buildProcessEngine();
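For what it's worth, the pool-size settings alone may not be the bottleneck: a `java.util.concurrent.ThreadPoolExecutor` only creates one worker thread per submitted task (up to `corePoolSize`), so if the acquisition thread only hands over a few jobs at a time, the pool stays tiny no matter how large the configured sizes are. A minimal self-contained sketch (plain JDK, no Activiti classes; the numbers mirror the custom executor above) showing this:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizeDemo {
    public static void main(String[] args) throws Exception {
        // Same shape as the custom executor: large pool, bounded queue.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                75, 200, 5, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(50));

        CountDownLatch release = new CountDownLatch(1);
        // Submit only 3 "jobs", as a single slow acquisition loop would.
        for (int i = 0; i < 3; i++) {
            pool.execute(() -> {
                try { release.await(); } catch (InterruptedException e) { }
            });
        }
        // The pool grows one thread per submitted task, never more:
        System.out.println("threads in use: " + pool.getPoolSize()); // 3, not 75
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With only three tasks in flight the pool never grows past three threads, which matches the "only three threads" observation above regardless of `corePoolSize`.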


Is there something I am missing? If anyone else has experience getting more throughput from the job executor, I'd really appreciate it if they could share their experience and any best practices.
Thanks
3 REPLIES

jorell
Champ in-the-making
On further examination, it seems that since each server in the cluster only fetches one job at a time, there is no increase in parallelism; the ThreadPoolExecutor doesn't go beyond one thread. My jobs last about 200-300 ms. I see the comment in the code warning not to increase the acquired-jobs count above one in a clustered environment, so I'm not sure what to do at this point.
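A quick back-of-the-envelope check (plain Java; the numbers come from the observations above) shows why serial acquisition caps throughput. By Little's law, concurrency needed = throughput × latency, so sustaining 200 jobs/s at ~250 ms per job needs about 50 jobs in flight, while one acquisition loop handing over one 250 ms job at a time tops out around 4 jobs/s per node:

```java
public class ThroughputMath {
    public static void main(String[] args) {
        double jobSeconds = 0.25;      // observed 200-300 ms per job
        double targetPerSecond = 200;  // desired throughput

        // Little's law: required concurrency = throughput x latency
        double jobsInFlight = targetPerSecond * jobSeconds;
        System.out.println("concurrent jobs needed: " + jobsInFlight); // 50.0

        // One acquisition thread feeding one 250 ms job at a time
        // can never keep more than ~1/0.25 jobs per second moving:
        System.out.println("per-node cap with serial handoff: " + (1 / jobSeconds)); // 4.0
    }
}
```

So even with 75 core threads configured, the execution side is starved: the acquisition side has to keep roughly 50 jobs in flight for the target to be reachable.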

jorell
Champ in-the-making
Just bouncing ideas while I wait for a response :)
I was thinking of modifying my custom job executor to have a list of AcquireJobsRunnable threads instead of just one. Since each thread will still attempt to acquire a single job, we should still avoid the deadlock mentioned in http://jira.codehaus.org/browse/ACT-1879, but having multiple threads fetching jobs should allow us to increase throughput. Does this sound reasonable?

jorell
Champ in-the-making
I noticed we are fetching jobs twice. AcquireJobsRunnable fetches the job(s) first but only stores the ids; an id is then used to fetch the job again in ExecuteJobsCmd, so we run the same query twice. This seems redundant. Why not pass the whole job to the execute command instead of just the id?
By the way, I noticed I haven't gotten a single response so far. Please let me know if I'm not posting enough information or if there is something else wrong with what I'm doing.