Hyland Connect

marcog · ‎09-25-2023

Hello everybody.

I currently run Alfresco with Docker Compose (docker-compose.yml), using the following versions:

Docker Image versions

ALFRESCO_CE_TAG=7.2.0
SEARCH_CE_TAG=2.0.3
SHARE_TAG=7.2.0
ACA_TAG=2.9.0
TRANSFORM_ENGINE_TAG=2.5.7
ACTIVEMQ_TAG=5.16.1

PostgreSQL 13.3 runs on a separate server. db_pool_max is set to 300 in docker-compose.yml, and the corresponding limit in the Postgres server's postgresql.conf is set to 400.
The Alfresco server runs Ubuntu 22.04 with 98 GB RAM and 24 cores, while the Postgres server also runs Ubuntu 22.04, with 26 GB RAM and 16 cores.
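
As a rough sanity check on those two limits, the sketch below compares the repository pool size with the Postgres-side limit. It assumes the 400 in postgresql.conf is the server's connection limit and that a few connections are held back for administrators; the function name and the reserved default of 3 are illustrative assumptions, not values taken from this setup.

```python
def pool_headroom(pool_max: int, server_limit: int, reserved: int = 3) -> int:
    """Return how many server connections remain once the pool is fully used.

    pool_max     -- repository pool size (db_pool_max, 300 in this post)
    server_limit -- connection limit on the Postgres server (400 in this post)
    reserved     -- connections held back for administrators (assumed value)
    """
    return server_limit - reserved - pool_max


if __name__ == "__main__":
    # With the values from this post: 400 - 3 - 300 leaves 97 spare connections.
    print(pool_headroom(300, 400))
```

A positive result only shows that the pool alone cannot exhaust the server; any other client connecting to the same Postgres instance also draws on that headroom.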

A few times a day, at irregular intervals, the alfresco container reaches 1600% CPU usage (as reported by docker stats). The container's memory limit in docker-compose.yml is 68 GB, and it still reaches these high CPU numbers.
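
To put the 1600% figure in context: docker stats counts 100% as one fully used core, so the value can be converted into cores and into a share of the host. This is a minimal sketch of that arithmetic; the function names are made up for illustration.

```python
def cores_in_use(cpu_percent: float) -> float:
    """Convert a docker stats CPU percentage into an equivalent number of cores."""
    return cpu_percent / 100.0


def host_share(cpu_percent: float, host_cores: int) -> float:
    """Return the fraction of the host's cores that the container is using."""
    return cores_in_use(cpu_percent) / host_cores


if __name__ == "__main__":
    # Values from this post: 1600% on a 24-core host.
    print(cores_in_use(1600))              # 16.0 cores
    print(round(host_share(1600, 24), 2))  # 0.67 of the host
```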

At these moments, PostgreSQL starts to time out on active processes, and the entire environment (PostgreSQL and Alfresco) has to be restarted.

We have already advised the development teams to stop using CMIS and to use only the REST API to send, search, update and delete nodes in Alfresco. Alfresco receives thousands of GET/PUT requests. All applications connect to Alfresco with a single user.

In PostgreSQL, one query in particular stands out in our monitoring:

select
assoc.id as id,
parentNode.id as parentNodeId,
parentNode.version as parentNodeVersion,
parentStore.protocol as parentNodeProtocol,
parentStore.identifier as parentNodeIdentifier,
parentNode.uuid as parentNodeUuid,
childNode.id as childNodeId,
childNode.version as childNodeVersion,
childStore.protocol as childNodeProtocol,
childStore.identifier as childNodeIdentifier,
childNode.uuid as childNodeUuid,
assoc.type_qname_id as type_qname_id,
assoc.child_node_name_crc as child_node_name_crc,
assoc.child_node_name as child_node_name,
assoc.qname_ns_id as qname_ns_id,
assoc.qname_localname as qname_localname,
assoc.is_primary as is_primary,
assoc.assoc_index as assoc_index
from
alf_child_assoc assoc
join alf_node parentNode on (parentNode.id = assoc.parent_node_id)
join alf_store parentStore on (parentStore.id = parentNode.store_id)
join alf_node childNode on (childNode.id = assoc.child_node_id)
left join alf_store childStore on (childStore.id = childNode.store_id)
where
parentNode.id = 988

Nearly all processes in PostgreSQL are running this query, and it returns thousands of documents at a time (sometimes millions). The query runs for a long time in PostgreSQL, with many deadlocks, and when the processes start to turn red the following error appears:
ERROR: relation "alf_bootstrap_lock" does not exist at character 15
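
Since the query above fetches every child association of a single parent node, one way to find which parents are expensive is to count rows in alf_child_assoc per parent_node_id. The sketch below runs that count against an in-memory SQLite table as a stand-in, with made-up demo rows; only the table and column names come from the query above. Against the real Postgres database the same SELECT would be run through a Postgres client, where the placeholder syntax usually differs (for example %s instead of ?).

```python
import sqlite3

# Number of child associations per parent node, largest first.
LARGEST_PARENTS_SQL = """
    SELECT parent_node_id, COUNT(*) AS child_count
    FROM alf_child_assoc
    GROUP BY parent_node_id
    ORDER BY child_count DESC
    LIMIT ?
"""


def largest_parents(conn: sqlite3.Connection, limit: int = 10) -> list:
    """Return (parent_node_id, child_count) pairs for the biggest parents."""
    return conn.execute(LARGEST_PARENTS_SQL, (limit,)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory stand-in table with hypothetical demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE alf_child_assoc ("
        "id INTEGER PRIMARY KEY, parent_node_id INTEGER, child_node_id INTEGER)"
    )
    # Hypothetical data: parent 988 has five children, parent 12 has two.
    rows = [(988, child) for child in range(1000, 1005)] + [(12, 2000), (12, 2001)]
    conn.executemany(
        "INSERT INTO alf_child_assoc (parent_node_id, child_node_id) VALUES (?, ?)",
        rows,
    )
    return conn


if __name__ == "__main__":
    print(largest_parents(demo_connection()))
```

A parent with a very large child count would be a candidate explanation for a single query returning thousands or millions of rows.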

Has anyone come across this type of scenario? If you need more data, I will gladly provide it.

There is a ticket, #1018, at github.com/Alfresco/acs-deployment/issues/1018, where CHUNT replied suggesting that I post the question here.

I would like to add that, using APM Search for monitoring, we discovered that the query above is triggered by Solr (through the Solr API / GET). Could anyone help me with this, or give me some direction?

Thanks

gait · ‎02-13-2024

Hi Marco

Did you solve this issue?

We appear to be battling the same issue.

We have random CPU spikes which lock up the whole server and require a restart.

Similarly, plenty of server power, dockerised Alfresco.

alfresco-content-repository-community:23.1.0

alfresco-transform-core-aio:5.0.1

alfresco-share:23.1.1

alfresco-content-app:4.3.0

alfresco-search-services:2.0.8.2

alfresco-activemq:5.18-jre17-rockylinux8

alfresco-control-center:8.3.0

Running Postgres 16 undockerised on the host.

Regards

Marc


Alfresco Engine Container High Throughput
