This post is a practical guide for Alfresco administrators and platform engineers who need to move beyond the default single-node deployment of the Elasticsearch connector and push more indexing throughput out of it. It covers Docker Compose first and then Kubernetes via the official Helm charts in acs-deployment.
Before scaling anything, it helps to understand the data flow. Repository events land on an ActiveMQ topic (alfresco.repo.event2). The Mediation service is the only subscriber on that topic; it fans each event out to dedicated queues, one per downstream concern:
```text
ACS repo
       │  (ActiveMQ topic: alfresco.repo.event2)
       ▼
┌─────────────┐
│  Mediation  │  ← durable topic subscriber (single instance)
└──────┬──────┘
       │ fan-out to queues
       ├──────────────────────────────┬──────────────────────────┐
       │                              │                          │
       ▼                              ▼                          ▼
org.alfresco.search          org.alfresco.search        org.alfresco.search
.metadata.event              .content.event             .path.event
       │                              │                          │
       ▼                              ▼                          ▼
┌──────────────┐           ┌─────────────────────┐        ┌─────────────┐
│   Metadata   │           │       Content       │        │    Path     │
│  (scalable)  │           │     (scalable)      │        │  (single)   │
└──────────────┘           └─────────────────────┘        └─────────────┘
       │                              │                          │
       └──────────────────────────────┴──────────────────────────┘
                                      │
                                Elasticsearch
```
| Component | Can scale out? | Why |
|---|---|---|
| Mediation | No | The channel is configured as consumer-sjms:topic:alfresco.repo.event2?durableSubscriptionName=LiveIndexingSubscription&clientId=LiveIndexing. The clientId is hard-coded, creating a single named durable JMS subscription. A second instance with a different clientId would create a second independent subscription—every event would be delivered to both instances and duplicated into all downstream queues. |
| Metadata | Yes | Consumes a plain queue with concurrentConsumers=10. Multiple instances act as competing consumers. ES update scripts use a metadataIndexingLastUpdate timestamp guard that silently no-ops stale writes, so out-of-order delivery across instances is safe. |
| Content | Yes | All three inbound channels (content events, transform replies, refresh events) use concurrentConsumers=10. The transform request embeds the shared reply-queue name in the message payload itself, and replies are decoded entirely from the message body and clientData field—there is no instance-local state that would break under parallel consumers. |
| Path | No | The inbound channel has no concurrentConsumers. The project’s own reliability tests explicitly document that path is “single instance with single consume” and that path events must be processed in creation order. The path processor also reads current index state and synchronously rewrites all descendants when a folder is moved, which is not safe under parallel consumers. |
The short rule: keep mediation and path at exactly one replica; scale metadata and content.
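To see why a timestamp guard makes out-of-order delivery harmless for metadata, here is a minimal shell sketch of the idea. It is not the connector's actual Elasticsearch update script; the function name guarded_update is hypothetical, and the strict "newer than" comparison is an assumption.

```shell
#!/bin/sh
# Sketch of a timestamp guard: an update is applied only if it is newer than
# what is already stored, so a late-arriving stale event becomes a no-op.
# Prints the timestamp that ends up stored.
# Assumption: timestamps are plain integers (e.g. epoch milliseconds).
guarded_update() {
  stored=$1     # timestamp currently held in the index
  incoming=$2   # timestamp carried by the incoming event
  if [ "$incoming" -gt "$stored" ]; then
    echo "$incoming"   # newer event: write goes through
  else
    echo "$stored"     # stale event: silently ignored
  fi
}
```

Whichever order two events arrive in, the stored value converges on the newer timestamp, which is the property that lets several metadata instances compete on the same queue.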
.env file

All image versions are kept in a single .env file. Place it next to your compose files before running any command, or pass it explicitly with --env-file. Check quay.io/alfresco for the latest stable tags before deploying.
```shell
# .env
LIVE_INDEXING_TAG=5.4.0
LIVE_REINDEXING_TAG=5.4.0
ALFRESCO_TAG=26.1.0-A.22
SHARE_TAG=26.1.0-A.22
POSTGRES_TAG=16.6
TRANSFORM_ROUTER_TAG=4.4.0
TRANSFORM_CORE_AIO_TAG=5.4.0
SHARED_FILE_STORE_TAG=4.4.0
ACTIVE_MQ_TAG=6.2-jre17-rockylinux8
DIGITAL_WORKSPACE_TAG=4.4.1
ACS_NGINX_TAG=3.4.2
ELASTICSEARCH_TAG=8.17.0
KIBANA_TAG=8.17.0
ELASTICSEARCH_INDEX_NAME=alfresco
```
The default docker-compose.yml from the distribution ships the alfresco-elasticsearch-live-indexing image that bundles all four components. You cannot scale individual components out of a bundled image. Create the following override file alongside the main compose file:
```yaml
# docker-compose.live-indexing-split.yml
services:
  live-indexing-mediation:
    image: quay.io/alfresco/alfresco-elasticsearch-live-indexing-mediation:${LIVE_INDEXING_TAG}
    depends_on:
      - activemq
      - elasticsearch
      - alfresco
      - transform-core-aio
    environment:
      ELASTICSEARCH_INDEXNAME: ${ELASTICSEARCH_INDEX_NAME:-alfresco}
      SPRING_ELASTICSEARCH_REST_URIS: http://elasticsearch:9200
      SPRING_ACTIVEMQ_BROKERURL: nio://activemq:61616
      SPRING_ACTIVEMQ_USER: admin
      SPRING_ACTIVEMQ_PASSWORD: admin
      ALFRESCO_ACCEPTEDCONTENTMEDIATYPESCACHE_BASEURL: http://transform-core-aio:8090/transform/config

  live-indexing-path:
    image: quay.io/alfresco/alfresco-elasticsearch-live-indexing-path:${LIVE_INDEXING_TAG}
    depends_on:
      - activemq
      - elasticsearch
    environment:
      ELASTICSEARCH_INDEXNAME: ${ELASTICSEARCH_INDEX_NAME:-alfresco}
      SPRING_ELASTICSEARCH_REST_URIS: http://elasticsearch:9200
      SPRING_ACTIVEMQ_BROKERURL: nio://activemq:61616
      SPRING_ACTIVEMQ_USER: admin
      SPRING_ACTIVEMQ_PASSWORD: admin

  live-indexing-metadata:
    image: quay.io/alfresco/alfresco-elasticsearch-live-indexing-metadata:${LIVE_INDEXING_TAG}
    depends_on:
      - activemq
      - elasticsearch
    environment:
      ELASTICSEARCH_INDEXNAME: ${ELASTICSEARCH_INDEX_NAME:-alfresco}
      SPRING_ELASTICSEARCH_REST_URIS: http://elasticsearch:9200
      SPRING_ACTIVEMQ_BROKERURL: nio://activemq:61616
      SPRING_ACTIVEMQ_USER: admin
      SPRING_ACTIVEMQ_PASSWORD: admin

  live-indexing-content:
    image: quay.io/alfresco/alfresco-elasticsearch-live-indexing-content:${LIVE_INDEXING_TAG}
    depends_on:
      - activemq
      - elasticsearch
      - shared-file-store
      - transform-core-aio
    environment:
      ELASTICSEARCH_INDEXNAME: ${ELASTICSEARCH_INDEX_NAME:-alfresco}
      SPRING_ELASTICSEARCH_REST_URIS: http://elasticsearch:9200
      SPRING_ACTIVEMQ_BROKERURL: nio://activemq:61616
      SPRING_ACTIVEMQ_USER: admin
      SPRING_ACTIVEMQ_PASSWORD: admin
      ALFRESCO_SHAREDFILESTORE_BASEURL: http://shared-file-store:8099/alfresco/api/-default-/private/sfs/versions/1/file/
      ALFRESCO_ACCEPTEDCONTENTMEDIATYPESCACHE_BASEURL: http://transform-core-aio:8090/transform/config
```
Run both files together and suppress the bundled all-in-one service by scaling it to zero:
```shell
docker compose \
  -f docker-compose.yml \
  -f docker-compose.live-indexing-split.yml \
  --env-file .env \
  up -d \
  --scale live-indexing=0
```
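To scale the metadata and content services, add a --scale flag for each one to the same command:

```shell
docker compose \
  -f docker-compose.yml \
  -f docker-compose.live-indexing-split.yml \
  --env-file .env \
  up -d \
  --scale live-indexing=0 \
  --scale live-indexing-metadata=2 \
  --scale live-indexing-content=2
```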
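Because mediation and path must stay at one replica, it can help to generate the --scale flags through a small guard instead of typing them by hand. The following is a sketch only; scale_flags is a hypothetical helper, not part of Docker Compose or the Alfresco distribution.

```shell
#!/bin/sh
# Sketch: build the --scale flags for docker compose while refusing to scale
# the single-instance services (mediation and path) beyond one replica.
# Arguments are service=count pairs; the bundled service is always set to 0.
scale_flags() {
  flags="--scale live-indexing=0"
  for pair in "$@"; do
    service=${pair%%=*}   # text before the first '='
    count=${pair#*=}      # text after the first '='
    case "$service" in
      live-indexing-mediation|live-indexing-path)
        if [ "$count" -ne 1 ]; then
          echo "refusing to scale $service: it must stay at 1 replica" >&2
          return 1
        fi
        ;;
    esac
    flags="$flags --scale $service=$count"
  done
  echo "$flags"
}
```

The output can be appended to the compose command, for example `docker compose ... up -d $(scale_flags live-indexing-metadata=2 live-indexing-content=2)`; a request such as `scale_flags live-indexing-path=2` fails before anything is started.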
A sensible starting configuration for a busy repository:
| Service | Replicas |
|---|---|
| live-indexing-mediation | 1 (fixed) |
| live-indexing-path | 1 (fixed) |
| live-indexing-metadata | 2 |
| live-indexing-content | 2 |
Each metadata and content instance opens up to 10 consumer threads against its queue by default. That is the concurrentConsumers=10 parameter baked into the Camel channel URI:
```properties
# metadata.properties
in.alfresco.metadata.event.channel=consumer-sjms:org.alfresco.search.metadata.event?concurrentConsumers=10

# content.properties
in.alfresco.content.event.channel=consumer-sjms:org.alfresco.search.content.event?concurrentConsumers=10
in.alfresco.content.availability.channel=consumer-sjms:org.alfresco.search.contentstore.event?concurrentConsumers=10
in.alfresco.content.refresh.event.channel=consumer-sjms:org.alfresco.search.contentrefresh.event?concurrentConsumers=10
```
Because concurrentConsumers is embedded in the URI string rather than a standalone property, you override it by redefining the whole channel via a Spring environment variable. Add this to the relevant service environment block:
```yaml
# Raise metadata consumers to 20 threads per instance
live-indexing-metadata:
  environment:
    IN_ALFRESCO_METADATA_EVENT_CHANNEL: >-
      consumer-sjms:org.alfresco.search.metadata.event?concurrentConsumers=20

# Raise content consumers to 20 threads per instance
live-indexing-content:
  environment:
    IN_ALFRESCO_CONTENT_EVENT_CHANNEL: >-
      consumer-sjms:org.alfresco.search.content.event?concurrentConsumers=20
    IN_ALFRESCO_CONTENT_AVAILABILITY_CHANNEL: >-
      consumer-sjms:org.alfresco.search.contentstore.event?concurrentConsumers=20
    IN_ALFRESCO_CONTENT_REFRESH_EVENT_CHANNEL: >-
      consumer-sjms:org.alfresco.search.contentrefresh.event?concurrentConsumers=20
```
Add replicas first; only raise concurrentConsumers when extra replicas alone stop reducing queue depth.
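A rough way to reason about the two knobs: the number of competing consumers on a queue is approximately the replica count multiplied by concurrentConsumers. The sketch below only does that arithmetic; total_consumers is a hypothetical helper name, and it assumes every replica runs with the same concurrentConsumers value.

```shell
#!/bin/sh
# Sketch: total competing consumers on one queue, assuming every replica
# is configured with the same concurrentConsumers value.
total_consumers() {
  replicas=$1
  per_instance=$2
  echo $(( replicas * per_instance ))
}
```

Both `total_consumers 2 10` and `total_consumers 1 20` print 20. The thread count is identical, but the first layout spreads those threads over two containers, each with its own CPU share and heap, which is why replicas are the first lever to pull.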
More content consumers only move the bottleneck downstream to the Transform Service (ATS) and the Shared File Store (SFS). When the acs-repo-transform-request queue starts growing, scale transform-core-aio alongside content:
```shell
docker compose \
  -f docker-compose.yml \
  -f docker-compose.live-indexing-split.yml \
  --env-file .env \
  up -d \
  --scale live-indexing=0 \
  --scale live-indexing-metadata=2 \
  --scale live-indexing-content=3 \
  --scale transform-core-aio=2
```
Note on SFS in Compose: The distribution mounts SFS on a named tmpfs volume on a single Docker host. Multiple SFS containers sharing that volume on the same host works, but this is not a distributed filesystem; it is only viable for single-host Compose deployments. For multi-host setups, use Kubernetes with a ReadWriteMany storage class.
Open the ActiveMQ web console at http://localhost:8161 (default credentials admin/admin) and navigate to Queues. Scale the component whose queue is growing:
| Queue | Growing means | Scale this |
|---|---|---|
| org.alfresco.search.metadata.event | Metadata consumers are behind | --scale live-indexing-metadata=N and/or raise concurrentConsumers |
| org.alfresco.search.content.event | Content consumers are behind | --scale live-indexing-content=N and/or raise concurrentConsumers |
| acs-repo-transform-request | Transform Service is the bottleneck | --scale transform-core-aio=N |
| org.alfresco.search.contentstore.event | SFS reads or content indexing backlogged | Scale content replicas; review SFS if latency is high |
| org.alfresco.search.contentrefresh.event | Transform retries piling up | Scale transform-core-aio and check SFS throughput |
| org.alfresco.search.path.event | Path processor is behind | Cannot scale. Check CPU/memory headroom on the path container, or investigate whether a bulk folder-move triggered the spike. |
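If you would rather poll queue depth from a terminal than click through the console, ActiveMQ Classic also exposes its JMX attributes over the Jolokia REST endpoint. The sketch below rests on several assumptions that are not stated in this post: the broker name is localhost, Jolokia is reachable under /api/jolokia on the web console port, and the default admin/admin credentials are still in place. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: read a queue's depth from the ActiveMQ Jolokia REST endpoint.
# Assumptions: broker name "localhost", Jolokia at /api/jolokia on the web
# console port, default admin/admin credentials (override via AMQ_USER/AMQ_PASSWORD).

# Build the Jolokia read URL for the QueueSize attribute of one queue.
queue_size_url() {
  base=$1                 # e.g. http://localhost:8161
  queue=$2                # e.g. org.alfresco.search.metadata.event
  broker=${3:-localhost}  # JMX broker name (assumed default)
  printf '%s/api/jolokia/read/org.apache.activemq:type=Broker,brokerName=%s,destinationType=Queue,destinationName=%s/QueueSize\n' \
    "$base" "$broker" "$queue"
}

# Extract the numeric "value" field from a Jolokia JSON response on stdin.
jolokia_value() {
  sed -n 's/.*"value":\([0-9][0-9]*\).*/\1/p'
}

# Fetch and print the current depth of a queue (needs curl and a reachable broker).
# Some broker configurations reject requests without an Origin header; adjust as needed.
queue_size() {
  curl -s -u "${AMQ_USER:-admin}:${AMQ_PASSWORD:-admin}" \
    -H "Origin: $1" \
    "$(queue_size_url "$1" "$2")" | jolokia_value
}
```

With the console reachable on localhost, `queue_size http://localhost:8161 org.alfresco.search.metadata.event` should print a single number that can be sampled in a loop while you adjust replica counts.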
The acs-deployment repo provides a production-ready Helm chart (alfresco-content-services) that already ships all four split live-indexing services via the alfresco-search-enterprise subchart. There is no all-in-one image in the Helm deployment; the split is already the default.
```text
alfresco-content-services/
├── values.yaml
└── subcharts
    ├── alfresco-search-enterprise   ← mediation, metadata, content, path
    ├── alfresco-transform-service   ← transform router, renderers, SFS
    ├── elastic                      ← Elasticsearch
    └── activemq                     ← ActiveMQ broker
```
Within alfresco-search-enterprise, each live-indexing service maps to a Kubernetes resource:
| Component | Kind | Default replicas |
|---|---|---|
| Mediation | StatefulSet | 1 |
| Metadata | Deployment | 1 |
| Content | Deployment | 1 |
| Path | Deployment | 1 |
Mediation is a StatefulSet (not a Deployment) because its durable JMS subscription is tied to a stable pod identity. Do not change its replica count.
Override replicaCount for the scalable services in your own values file:
```yaml
# my-values.yaml
alfresco-search-enterprise:
  liveIndexing:
    metadata:
      replicaCount: 3
    content:
      replicaCount: 3
    # mediation and path intentionally omitted; keep at 1
```
Apply with:
```shell
helm upgrade --install acs alfresco/alfresco-content-services \
  -f my-values.yaml \
  --namespace alfresco
```
No HPA for live-indexing services: The upstream chart does not define a HorizontalPodAutoscaler for any live-indexing component. Scaling is manual via replicaCount. If you want to automate it based on ActiveMQ queue depth, KEDA with an ActiveMQ trigger is the right tool, but that is outside the scope of the upstream chart.
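For readers who do want to experiment with KEDA, the sketch below prints a ScaledObject manifest that uses KEDA's activemq trigger. Treat every concrete name as an assumption: the Deployment name passed in, the broker name localhost, the alfresco-activemq:8161 management endpoint, the replica bounds, the queue-size target, and the activemq-trigger-auth TriggerAuthentication (which you would have to create separately to supply the broker credentials). The helper name keda_scaledobject is hypothetical, and none of this is provided by the upstream chart.

```shell
#!/bin/sh
# Sketch: print a KEDA ScaledObject that scales one Deployment on ActiveMQ
# queue depth. All names and numbers below are illustrative assumptions.
keda_scaledobject() {
  deployment=$1   # hypothetical Deployment name of the service to scale
  queue=$2        # queue whose depth drives scaling
  cat <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ${deployment}-scaler
spec:
  scaleTargetRef:
    name: ${deployment}
  minReplicaCount: 1
  maxReplicaCount: 4
  triggers:
    - type: activemq
      metadata:
        managementEndpoint: "alfresco-activemq:8161"
        destinationName: "${queue}"
        brokerName: "localhost"
        targetQueueSize: "1000"
      authenticationRef:
        name: activemq-trigger-auth
EOF
}
```

The output could be reviewed and then piped to `kubectl apply -n alfresco -f -`. Only ever target the metadata or content Deployments this way; mediation and path must stay at one replica.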
Same mechanism as Compose: override the full channel URI via the service environment block in values:
```yaml
alfresco-search-enterprise:
  liveIndexing:
    metadata:
      replicaCount: 3
      environment:
        IN_ALFRESCO_METADATA_EVENT_CHANNEL: >-
          consumer-sjms:org.alfresco.search.metadata.event?concurrentConsumers=20
    content:
      replicaCount: 3
      environment:
        IN_ALFRESCO_CONTENT_EVENT_CHANNEL: >-
          consumer-sjms:org.alfresco.search.content.event?concurrentConsumers=20
        IN_ALFRESCO_CONTENT_AVAILABILITY_CHANNEL: >-
          consumer-sjms:org.alfresco.search.contentstore.event?concurrentConsumers=20
        IN_ALFRESCO_CONTENT_REFRESH_EVENT_CHANNEL: >-
          consumer-sjms:org.alfresco.search.contentrefresh.event?concurrentConsumers=20
```
The chart ships conservative defaults. Review these against your node capacity before raising replica counts:
```yaml
# alfresco-search-enterprise defaults (applies to all four live-indexing pods)
resources:
  requests:
    cpu: "0.5"
    memory: 256Mi
  limits:
    cpu: "2"
    memory: 2048Mi
```
With three metadata replicas and three content replicas you are requesting 3 CPU and 1.5 Gi just for those six pods. Tune requests to match observed usage before committing to a replica count.
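The figure above is simple multiplication, and the same arithmetic is worth repeating for whatever replica count you are considering. A minimal sketch follows; total_requests is a hypothetical helper, and it works in millicores and Mi so that the shell can stay with integer arithmetic.

```shell
#!/bin/sh
# Sketch: aggregate resource requests for N identical pods.
# CPU is given in millicores and memory in Mi to keep the maths in integers.
total_requests() {
  pods=$1
  cpu_millicores=$2
  memory_mi=$3
  echo "$(( pods * cpu_millicores ))m $(( pods * memory_mi ))Mi"
}
```

`total_requests 6 500 256` prints `3000m 1536Mi`, which is the 3 CPU and 1.5 Gi quoted for three metadata plus three content replicas at the default requests.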
The upstream chart already defines HPA for every transform worker (pdfrenderer, imagemagick, libreoffice, tika, transformmisc) with this default policy:
```yaml
# defaults in alfresco-transform-service
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 3
  targetCPUUtilizationPercentage: 75
```
The Transform Router defaults to 2 replicas. If you are scaling content consumers aggressively, raise the HPA ceiling or the router replica count:
```yaml
alfresco-transform-service:
  transformrouter:
    replicaCount: 3
  pdfrenderer:
    autoscaling:
      maxReplicas: 6
  tika:
    autoscaling:
      maxReplicas: 6
  libreoffice:
    autoscaling:
      maxReplicas: 6
```
SFS defaults to 1 replica with a Recreate deployment strategy and a ReadWriteOnce PVC. To run more than one replica you must provide a storage class that supports ReadWriteMany (NFS, Azure Files, Amazon EFS, etc.):
```yaml
alfresco-transform-service:
  filestore:
    replicaCount: 2
    persistence:
      accessModes:
        - ReadWriteMany
      storageClass: "your-rwx-storage-class"
```
Without ReadWriteMany, a second SFS pod will fail to mount the same PVC. In practice, a single well-resourced SFS pod is rarely the bottleneck; start by scaling transform workers and revisit SFS only if org.alfresco.search.contentstore.event continues to grow after ATS is adequately scaled.
Reach the ActiveMQ web console with a port-forward:
```shell
kubectl port-forward svc/alfresco-activemq 8161:8161 -n alfresco
# then open http://localhost:8161 → Queues
```
Use the same decision table as for Compose:
| Queue | Growing → scale this |
|---|---|
| org.alfresco.search.metadata.event | liveIndexing.metadata.replicaCount |
| org.alfresco.search.content.event | liveIndexing.content.replicaCount |
| acs-repo-transform-request | ATS worker autoscaling.maxReplicas |
| org.alfresco.search.contentstore.event | Content replicas and/or SFS |
| org.alfresco.search.contentrefresh.event | ATS and SFS |
| org.alfresco.search.path.event | Cannot scale. Check pod resources. |
For automated alerting, scrape queue-depth metrics through the ActiveMQ Prometheus exporter and alert when any non-path queue depth exceeds your SLA threshold for a sustained period (e.g., >1,000 messages for >5 minutes).
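The "sustained period" part matters: a single spike above the threshold should not page anyone. In Prometheus this is what the `for` field of an alerting rule expresses. The same idea in plain shell, as a sketch with a hypothetical helper name, is to require that every sample in the window exceeds the threshold:

```shell
#!/bin/sh
# Sketch: succeed (exit 0) only if every queue-depth sample in the window
# exceeds the threshold; a single dip below it resets the condition.
# Usage: sustained_over THRESHOLD SAMPLE [SAMPLE...]
sustained_over() {
  threshold=$1
  shift
  [ "$#" -gt 0 ] || return 1   # no samples: nothing has been sustained
  for depth in "$@"; do
    [ "$depth" -gt "$threshold" ] || return 1
  done
  return 0
}
```

With one sample per minute, `sustained_over 1000 1200 1500 1100 1300 1250` succeeds, while `sustained_over 1000 1200 900 1100 1300 1250` fails because the queue briefly drained.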
```text
mediation → always 1 replica (StatefulSet in k8s)
path      → always 1 replica
metadata  → start at 2; watch org.alfresco.search.metadata.event
content   → start at 2; watch org.alfresco.search.content.event
ATS       → HPA already handles this in Helm; raise maxReplicas if acs-repo-transform-request grows
SFS       → start at 1; needs ReadWriteMany PVC to go beyond 1 replica
```
Scale in this order when throughput is insufficient:
1. Add replicas of the service whose queue is growing (metadata or content).
2. If extra replicas stop reducing queue depth, raise concurrentConsumers on that service.
3. If acs-repo-transform-request or org.alfresco.search.contentstore.event then grows, scale ATS workers.
4. If org.alfresco.search.path.event is growing, scaling is not the answer: investigate whether the path processor has enough CPU/memory, or whether a bulk folder operation triggered a temporary spike that will self-resolve.