08-11-2020 11:11 AM
The error we are seeing in the logs is:
1465547 [pool-2-thread-1] WARN o.h.e.jdbc.spi.SqlExceptionHelper - SQL Error: 1205, SQLState: 41000
1465547 [pool-2-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - Lock wait timeout exceeded; try restarting transaction
1465549 [pool-2-thread-1] ERROR c.a.s.DefaultScheduledTaskRunner - failed to execute scheduled task 'event-processing
javax.persistence.LockTimeoutException: could not execute statement
Caused by: org.hibernate.exception.LockTimeoutException: could not execute statement
Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
APS is enacting a series of parallel processes for the management, tracking, and delivery of assets from the ACS application. During this there are several periods where the content is handed off to another service for activities to take place, and then because of the asynchronous nature of that service, the process polls for completion.
The processes to date have been written such that the polling took place as a service task, with the status handling as an exclusive gateway in the process definition, to determine if the other service has completed its activities, with a loop.
Due to an outage, the remote service failed, and the loop continued running for several thousand processes over a weekend.
The result that we have identified is that the ACT_EVT_LOG table now contains 43 million rows, and the APS services attempting to work with it are grinding to a halt.
This was initially identified as two servers in a deadlock battle over which could get the lock over process events, but even after resolving this by taking all servers out of action apart from one we’re still seeing that the attempt to get lock over process events is timing out.
Is there is a way to remove unnecessary entries from the ACT_EVT_LOG table so that the activities using it take less time, and the system can attempt to resume functionality?
There are other things we need to do as a result of this; changing the polling task which waits from being a set of workflow events to being a piece of code seems initially to be a good way to prevent this, but are there others we should be careful of and look to ensure we don’t make mistakes that will cripple performance like this in the future.
Results of a couple of queries are:
mysql> show full processlist;
+-----+----------+---------------------+---------+---------+------+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+----------+---------------------+---------+---------+------+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | rdsadmin | localhost | NULL | Sleep | 0 | delayed send ok done | NULL |
| 2 | rdsadmin | localhost | NULL | Sleep | 0 | cleaned up | NULL |
| 3 | rdsadmin | localhost | NULL | Sleep | 4 | cleaned up | NULL |
| 4 | rdsadmin | localhost | NULL | Sleep | 9 | delayed send ok done | NULL |
| 985 | activiti | 10.160.35.138:34026 | process | Sleep | 2 | delayed send ok done | NULL |
| 988 | activiti | 10.160.35.138:34153 | process | Sleep | 371 | delayed send ok done | NULL |
| 989 | activiti | 10.160.35.138:34156 | process | Sleep | 371 | delayed send ok done | NULL |
| 990 | activiti | 10.160.35.138:34157 | process | Query | 334 | updating | update ACT_EVT_LOG set LOCK_OWNER_='5035a137-1388-4db5-ac3c-594d12b4fe0f_expired', LOCK_TIME_='2020-08-11 14:00:00.007' where IS_PROCESSED_=0 and (LOCK_TIME_ is not null) and LOCK_TIME_<'2020-08-11 13:50:00.007' |
| 991 | activiti | 10.160.35.138:34197 | process | Sleep | 371 | delayed send ok done | NULL |
| 992 | activiti | 10.160.35.138:34205 | process | Sleep | 371 | delayed send ok done | NULL |
| 993 | activiti | 10.160.35.138:34234 | process | Sleep | 371 | delayed send ok done | NULL |
| 994 | activiti | 10.160.35.138:34233 | process | Sleep | 345 | delayed send ok done | NULL |
| 995 | activiti | 10.160.35.138:34257 | process | Sleep | 2 | delayed send ok done | NULL |
| 996 | activiti | 10.160.35.138:34258 | process | Query | 8 | updating | update ACT_EVT_LOG set LOCK_OWNER_='5035a137-1388-4db5-ac3c-594d12b4fe0f', LOCK_TIME_='2020-08-11 14:05:26.529' where LOG_NR_>=3996255 and LOG_NR_<3996355 and IS_PROCESSED_=0 and (LOCK_OWNER_ is null) and (LOCK_TIME_ is null) |
| 997 | activiti | 10.160.35.138:34279 | NULL | Query | 0 | init | show full processlist |
+-----+----------+---------------------+---------+---------+------+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
15 rows in set (0.00 sec)
mysql> SELECT * FROM `information_schema`.`innodb_trx` ORDER BY `trx_started`;
+-----------+-----------+---------------------+-------------------------+---------------------+------------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-------------------+-------------------+------------------+-----------------------+-----------------+-------------------+-------------------------+---------------------+-------------------+------------------------+----------------------------+---------------------------+---------------------------+------------------+----------------------------+
| trx_id | trx_state | trx_started | trx_requested_lock_id | trx_wait_started | trx_weight | trx_mysql_thread_id | trx_query | trx_operation_state | trx_tables_in_use | trx_tables_locked | trx_lock_structs | trx_lock_memory_bytes | trx_rows_locked | trx_rows_modified | trx_concurrency_tickets | trx_isolation_level | trx_unique_checks | trx_foreign_key_checks | trx_last_foreign_key_error | trx_adaptive_hash_latched | trx_adaptive_hash_timeout | trx_is_read_only | trx_autocommit_non_locking |
+-----------+-----------+---------------------+-------------------------+---------------------+------------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-------------------+-------------------+------------------+-----------------------+-----------------+-------------------+-------------------------+---------------------+-------------------+------------------------+----------------------------+---------------------------+---------------------------+------------------+----------------------------+
| 427836316 | RUNNING | 2020-08-11 14:00:00 | NULL | NULL | 5354375 | 990 | update ACT_EVT_LOG set LOCK_OWNER_='5035a137-1388-4db5-ac3c-594d12b4fe0f_expired', LOCK_TIME_='2020-08-11 14:00:00.007' where IS_PROCESSED_=0 and (LOCK_TIME_ is not null) and LOCK_TIME_<'2020-08-11 13:50:00.007' | fetching rows | 1 | 1 | 1367422 | 376 | 35135454 | 3986953 | 0 | REPEATABLE READ | 1 | 1 | NULL | 0 | 0 | 0 | 0 |
| 427838524 | LOCK WAIT | 2020-08-11 14:06:31 | 427838524:706:204356:22 | 2020-08-11 14:06:31 | 2 | 985 | update ACT_EVT_LOG set LOCK_OWNER_='5035a137-1388-4db5-ac3c-594d12b4fe0f', LOCK_TIME_='2020-08-11 14:06:31.099' where LOG_NR_>=3996255 and LOG_NR_<3996355 and IS_PROCESSED_=0 and (LOCK_OWNER_ is null) and (LOCK_TIME_ is null) | starting index read | 1 | 1 | 2 | 376 | 1 | 0 | 0 | REPEATABLE READ | 1 | 1 | NULL | 0 | 0 | 0 | 0 |
+-----------+-----------+---------------------+-------------------------+---------------------+------------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-------------------+-------------------+------------------+-----------------------+-----------------+-------------------+-------------------------+---------------------+-------------------+------------------------+----------------------------+---------------------------+---------------------------+------------------+----------------------------+
2 rows in set (0.03 sec)
08-11-2020 12:02 PM
Hi @dayc,
Have you raised a support request?
08-13-2020 05:20 AM
We have raised a support ticket.
08-13-2020 04:57 AM
Can you provide some kind of stacktraces from the Java side of things when these issues occur, e.g. using the jstack utility? I have been trying to find instances in APS code where event log entries are locked, but so far have not been successful using either IDE call hierarchy tracing or call analysis on decompiled code. To my knowledge, these act_evt_log entries should only be used for the Elastic Search based indexing / query capabality for end-user search support in APS. I have never been able to identify the classes that handle this or whether it would be possible to simply clear the table and have APS re-create the search index from the pure DB state. I also have encountered various issues with event logging and processing in general (different ones, but regular and annoying as log spam) at a customer of mine, but they never prioritised / approved the efforts it would have required to tear APS apart to actually find the responsible code.
Edit: Strike my request for the stack traces - I actually found the code in APS that handles locking which, as is unfortunately typical for APS, circumvents the low level entity managing code in the core Activiti engine and uses some bolted-on JPA repository.
08-13-2020 05:20 AM
This is the trace taken from Cloudwatch today:
Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1094) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4208) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4140) at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2597) at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2758) at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2826) at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2082) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2334) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2262) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2246) at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeUpdate(NewProxyPreparedStatement.java:384) at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:186) ... 66 common frames omitted" 68060341 [pool-3-thread-1] WARN c.a.s.r.PublishAdminAppEventsService - Error posting events to Activiti Admin Application: uatworkflow.content.oup.com:443 failed to respond 68098445 [pool-1-thread-1] WARN o.a.h.c.p.ResponseProcessCookies - Invalid cookie header: "Set-Cookie: AWSALB=1ibUE2ESbZkLBZs4GHAFlqu16/atpNpLLE0NysMt0e6EoXH1ZkqNJNdHX82axi0HkQ5AomHaimnTBPcQv16O5SG2qnjWZLFYdIRSXTQ054MYWDZ9iVc5koxKmzxH; Expires=Thu, 20 Aug 2020 08:55:38 GMT; Path=/". Invalid 'expires' attribute: Thu, 20 Aug 2020 08:55:38 GMT 68098445 [pool-1-thread-1] WARN o.a.h.c.p.ResponseProcessCookies - Invalid cookie header: "Set-Cookie: AWSALBCORS=1ibUE2ESbZkLBZs4GHAFlqu16/atpNpLLE0NysMt0e6EoXH1ZkqNJNdHX82axi0HkQ5AomHaimnTBPcQv16O5SG2qnjWZLFYdIRSXTQ054MYWDZ9iVc5koxKmzxH; Expires=Thu, 20 Aug 2020 08:55:38 GMT; Path=/; SameSite=None". Invalid 'expires' attribute: Thu, 20 Aug 2020 08:55:38 GMT 68109881 [pool-3-thread-5] WARN o.h.e.jdbc.spi.SqlExceptionHelper - SQL Error: 1205, SQLState: 41000 68109881 [pool-3-thread-5] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - Lock wait timeout exceeded; try restarting transaction 68109883 [pool-3-thread-5] ERROR c.a.s.DefaultScheduledTaskRunner - failed to execute scheduled task 'event-processing "javax.persistence.LockTimeoutException: could not execute statement at org.hibernate.ejb.AbstractEntityManagerImpl.wrapLockException(AbstractEntityManagerImpl.java:1440) at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1339) at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1310) at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1397) at org.hibernate.ejb.AbstractQueryImpl.executeUpdate(AbstractQueryImpl.java:111) at org.springframework.data.jpa.repository.query.JpaQueryExecution$ModifyingExecution.doExecute(JpaQueryExecution.java:240) at org.springframework.data.jpa.repository.query.JpaQueryExecution.execute(JpaQueryExecution.java:87) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.doExecute(AbstractJpaQuery.java:116) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.execute(AbstractJpaQuery.java:106) at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:492) at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:475) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:56) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99) at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:282) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:136) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:133) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.data.repository.core.support.SurroundingTransactionDetectorMethodInterceptor.invoke(SurroundingTransactionDetectorMethodInterceptor.java:57) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213) at com.sun.proxy.$Proxy255.lockBlockOfEvents(Unknown Source) at sun.reflect.GeneratedMethodAccessor812.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:136) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213) at com.sun.proxy.$Proxy256.lockBlockOfEvents(Unknown Source) at com.activiti.service.runtime.ActivitiEventProcessingService$2.doInTransaction(ActivitiEventProcessingService.java:268) at com.activiti.service.runtime.ActivitiEventProcessingService$2.doInTransaction(ActivitiEventProcessingService.java:265) at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133) at com.activiti.service.runtime.ActivitiEventProcessingService.lockBlockOfEvents(ActivitiEventProcessingService.java:265) at com.activiti.service.runtime.ActivitiEventProcessingService.internalExecuteProcessing(ActivitiEventProcessingService.java:219) at com.activiti.service.runtime.ActivitiEventProcessingService.findAndProcessEvents(ActivitiEventProcessingService.java:157) at com.activiti.service.runtime.ActivitiEventProcessingService.executeEventProcessing(ActivitiEventProcessingService.java:151) at com.activiti.service.runtime.ActivitiEventProcessingServiceSchedulers$1.executeTask(ActivitiEventProcessingServiceSchedulers.java:48) at com.activiti.scheduling.DefaultScheduledTaskRunner.runTask(DefaultScheduledTaskRunner.java:19) at com.activiti.service.runtime.ActivitiEventProcessingServiceSchedulers.executeEventProcessing(ActivitiEventProcessingServiceSchedulers.java:79) at sun.reflect.GeneratedMethodAccessor1017.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65) at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:81) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748)" "Caused by: org.hibernate.exception.LockTimeoutException: could not execute statement at org.hibernate.dialect.MySQLDialect$1.convert(MySQLDialect.java:408) at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49) at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:124) at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:109) at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:189) at org.hibernate.hql.internal.ast.exec.BasicExecutor.doExecute(BasicExecutor.java:108) at org.hibernate.hql.internal.ast.exec.BasicExecutor.execute(BasicExecutor.java:77) at org.hibernate.hql.internal.ast.QueryTranslatorImpl.executeUpdate(QueryTranslatorImpl.java:415) at org.hibernate.engine.query.spi.HQLQueryPlan.performExecuteUpdate(HQLQueryPlan.java:321) at org.hibernate.internal.SessionImpl.executeUpdate(SessionImpl.java:1215) at org.hibernate.internal.QueryImpl.executeUpdate(QueryImpl.java:116) at org.hibernate.ejb.QueryImpl.internalExecuteUpdate(QueryImpl.java:198) at org.hibernate.ejb.AbstractQueryImpl.executeUpdate(AbstractQueryImpl.java:102) ... 58 common frames omitted" "Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1094) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4208) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4140) at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2597) at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2758) at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2826) at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2082) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2334) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2262) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2246) at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeUpdate(NewProxyPreparedStatement.java:384) at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:186) ... 66 common frames omitted" 68120317 [pool-3-thread-1] WARN c.a.s.r.PublishAdminAppEventsService - Error posting events to Activiti Admin Application: uatworkflow.content.oup.com:443 failed to respond 68158402 [pool-1-thread-1] WARN o.a.h.c.p.ResponseProcessCookies - Invalid cookie header: "Set-Cookie: AWSALB=2O06zr3mkVZm7dhrsmEyZa4qDZPfZ9722xYO8mCEmyFhMM5//WwMoePyE90Z/3ahiSn7gOrsffr5uEOobVytN0yNgUTWpR/IVQ/hogccBPxB9CedBV8jFkj7W680; Expires=Thu, 20 Aug 2020 08:56:38 GMT; Path=/". Invalid 'expires' attribute: Thu, 20 Aug 2020 08:56:38 GMT 68158402 [pool-1-thread-1] WARN o.a.h.c.p.ResponseProcessCookies - Invalid cookie header: "Set-Cookie: AWSALBCORS=2O06zr3mkVZm7dhrsmEyZa4qDZPfZ9722xYO8mCEmyFhMM5//WwMoePyE90Z/3ahiSn7gOrsffr5uEOobVytN0yNgUTWpR/IVQ/hogccBPxB9CedBV8jFkj7W680; Expires=Thu, 20 Aug 2020 08:56:38 GMT; Path=/; SameSite=None". Invalid 'expires' attribute: Thu, 20 Aug 2020 08:56:38 GMT 68174053 [pool-3-thread-5] WARN o.h.e.jdbc.spi.SqlExceptionHelper - SQL Error: 1205, SQLState: 41000 68174053 [pool-3-thread-5] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - Lock wait timeout exceeded; try restarting transaction 68174055 [pool-3-thread-5] ERROR c.a.s.DefaultScheduledTaskRunner - failed to execute scheduled task 'event-processing "javax.persistence.LockTimeoutException: could not execute statement at org.hibernate.ejb.AbstractEntityManagerImpl.wrapLockException(AbstractEntityManagerImpl.java:1440) at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1339) at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1310) at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:1397) at org.hibernate.ejb.AbstractQueryImpl.executeUpdate(AbstractQueryImpl.java:111) at org.springframework.data.jpa.repository.query.JpaQueryExecution$ModifyingExecution.doExecute(JpaQueryExecution.java:240) at org.springframework.data.jpa.repository.query.JpaQueryExecution.execute(JpaQueryExecution.java:87) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.doExecute(AbstractJpaQuery.java:116) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.execute(AbstractJpaQuery.java:106) at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:492) at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:475) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:56) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99) at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:282) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:136) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:133) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.data.repository.core.support.SurroundingTransactionDetectorMethodInterceptor.invoke(SurroundingTransactionDetectorMethodInterceptor.java:57) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213) at com.sun.proxy.$Proxy255.lockBlockOfEvents(Unknown Source) at sun.reflect.GeneratedMethodAccessor812.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:136) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213) at com.sun.proxy.$Proxy256.lockBlockOfEvents(Unknown Source) at com.activiti.service.runtime.ActivitiEventProcessingService$2.doInTransaction(ActivitiEventProcessingService.java:268) at com.activiti.service.runtime.ActivitiEventProcessingService$2.doInTransaction(ActivitiEventProcessingService.java:265) at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133) at com.activiti.service.runtime.ActivitiEventProcessingService.lockBlockOfEvents(ActivitiEventProcessingService.java:265) at com.activiti.service.runtime.ActivitiEventProcessingService.internalExecuteProcessing(ActivitiEventProcessingService.java:219) at com.activiti.service.runtime.ActivitiEventProcessingService.findAndProcessEvents(ActivitiEventProcessingService.java:157) at com.activiti.service.runtime.ActivitiEventProcessingService.executeEventProcessing(ActivitiEventProcessingService.java:151) at com.activiti.service.runtime.ActivitiEventProcessingServiceSchedulers$1.executeTask(ActivitiEventProcessingServiceSchedulers.java:48) at com.activiti.scheduling.DefaultScheduledTaskRunner.runTask(DefaultScheduledTaskRunner.java:19) at com.activiti.service.runtime.ActivitiEventProcessingServiceSchedulers.executeEventProcessing(ActivitiEventProcessingServiceSchedulers.java:79) at sun.reflect.GeneratedMethodAccessor1017.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65) at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:81) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748)"
08-13-2020 06:17 AM
Ok, the core issue is simply that APS does not perform proper pagination or lock refreshing or even concurrent execution protection during event processing. Whenever the scheduled task to process expired events is triggered, it locks ALL events older than a specific time, loads ALL in one SELECT into Java memory to be processed one-by-one, and never refreshes the locks it has applied when the operation takes longer than expected. Given that you have millions of event rows, the lock is only valid for 10 minutes, and the scheduled task runs every 30 minutes by default - without checking if the previous task is still running - you can run into DB concurrency / locking issues even when running APS on a single server.
A temporary workaround without any code changes or manual DB accesses may be to explicitly configure the "event.processing.expired.cronExpression" property in activiti-app.properties to a single, explicit point in time after the next restart, restart one APS server and let it run until the all / majority of entries in the act_evt_log table have the IS_PROCESSED_ = 1 state.
It would also be good to reconfigure the "event.processing.processed.action.cronExpression" and "event.processing.cronExpression". I would set the CRON for "event.processing.cronExpression" and "event.processing.processed.action.cronExpression" to never run (e.g. use the year 2099) for the first execution, then, when all act_evt_log have IS_PROCESSED_ = 1, restart with "event.processing.processed.action.cronExpression" set to a single, explicit point in time after the next restart ("event.processing.expired.cronExpression" can be removed / set to default value), and wait until the majority of the millions of rows have been deleted. Then, in a third and final restart, I would reset all CRONs back to the default. This would be the time that the system should be back to a normal mode of operation in which it should also be safe to start up clustered nodes.
08-13-2020 09:12 AM
Thank you for the information and temporary workaround. The workaround was attempted (first restart step) with the following crons:
event.processing.expired.cronExpression=0 45 12 * * *
event.processing.cronExpression=0 0 12 * * *
event.processing.processed.action.cronExpression=0 0 12 * * *
It failed with the error starting:
692862 [pool-2-thread-1] WARN o.h.e.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 08S01 692862 [pool-2-thread-1] ERROR o.h.e.jdbc.spi.SqlExceptionHelper - Communications link failure The last packet successfully received from the server was 240,051 milliseconds ago. The last packet sent successfully to the server was 240,051 milliseconds ago. 692874 [pool-2-thread-1] WARN c.m.v.c3p0.impl.NewPooledConnection - [c3p0] A PooledConnection that has already signalled a Connection error is still in use! 692875 [pool-2-thread-1] WARN c.m.v.c3p0.impl.NewPooledConnection - [c3p0] Another error has occurred [ com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed. ] which will not be reported to listeners! com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed. at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.mysql.jdbc.Util.handleNewInstance(Util.java:408) at com.mysql.jdbc.Util.getInstance(Util.java:383) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1023) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:997) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:983) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:928) at com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1314) at com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1306) at com.mysql.jdbc.ConnectionImpl.versionMeetsMinimum(ConnectionImpl.java:5798) at com.mysql.jdbc.StatementImpl.getWarnings(StatementImpl.java:2469) at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.getWarnings(NewProxyPreparedStatement.java:1806) at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.handleAndClearWarnings(SqlExceptionHelper.java:291) at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.logAndClearWarnings(SqlExceptionHelper.java:250) at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.close(JdbcCoordinatorImpl.java:489) at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.release(JdbcCoordinatorImpl.java:381) at org.hibernate.hql.internal.ast.exec.BasicExecutor.doExecute(BasicExecutor.java:112) at org.hibernate.hql.internal.ast.exec.BasicExecutor.execute(BasicExecutor.java:77) at org.hibernate.hql.internal.ast.QueryTranslatorImpl.executeUpdate(QueryTranslatorImpl.java:415) at org.hibernate.engine.query.spi.HQLQueryPlan.performExecuteUpdate(HQLQueryPlan.java:321) at org.hibernate.internal.SessionImpl.executeUpdate(SessionImpl.java:1215) at org.hibernate.internal.QueryImpl.executeUpdate(QueryImpl.java:116) at org.hibernate.ejb.QueryImpl.internalExecuteUpdate(QueryImpl.java:198) at org.hibernate.ejb.AbstractQueryImpl.executeUpdate(AbstractQueryImpl.java:102) at org.springframework.data.jpa.repository.query.JpaQueryExecution$ModifyingExecution.doExecute(JpaQueryExecution.java:240) at org.springframework.data.jpa.repository.query.JpaQueryExecution.execute(JpaQueryExecution.java:87) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.doExecute(AbstractJpaQuery.java:116) at org.springframework.data.jpa.repository.query.AbstractJpaQuery.execute(AbstractJpaQuery.java:106) at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:492) at
08-13-2020 10:16 AM
Since the stack trace is cut short a bit too much, I don't know about the specific context / operation in which the exception occurs. But I can imagine that the DB driver may need to be configured to not kill connections that are waiting too long for response messages from the server.
Explore our Alfresco products with the links below. Use labels to filter content by product module.