Overview
Alfresco Content Services supports the Elasticsearch (ES) platform for searching within the repository using Alfresco Search Enterprise 3.x. While Elasticsearch is a robust technology, it has a number of limitations, and in some situations a new index must be created and the repository re-indexed.
Given the re-indexing speed, this can be achieved within hours and without downtime.
This document describes the steps to follow to build the indexes offline by connecting to the same metadata store (database).
Steps.
Elasticsearch
Create the Elasticsearch index under a new name (for example, ‘alfresco-new’) with the desired number of shards, number of replicas, total fields limit, etc.
curl -XPUT 'http://ESHOST:ESPORT/alfresco-new?pretty' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 0,
    "index.mapping.total_fields.limit": 2000
  }
}'
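Before moving on, it can help to confirm that the index was created with the intended settings. The following is a minimal sketch of one way to do that, assuming a POSIX shell with curl available. ES_URL, INDEX and the function names are illustrative placeholders, not part of any Alfresco tooling. It builds the same settings body as the command above from parameters and reads the stored settings back through the Elasticsearch _settings endpoint.

```shell
#!/bin/sh
# Illustrative helpers; ES_URL and INDEX are placeholders to adjust.
ES_URL="${ES_URL:-http://ESHOST:ESPORT}"
INDEX="${INDEX:-alfresco-new}"

# Print the JSON settings body for the given shard count, replica count
# and total fields limit (arguments 1, 2 and 3).
index_settings_body() {
  printf '{"settings":{"number_of_shards":%d,"number_of_replicas":%d,"index.mapping.total_fields.limit":%d}}\n' "$1" "$2" "$3"
}

# Create the index (same request as the curl command above).
create_index() {
  curl -sS -XPUT "$ES_URL/$INDEX?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(index_settings_body "$1" "$2" "$3")"
}

# Read the settings back to confirm what Elasticsearch actually stored.
show_index_settings() {
  curl -sS -XGET "$ES_URL/$INDEX/_settings?pretty"
}
```

With the functions loaded into the current shell, `create_index 5 0 2000 && show_index_settings` would create the index and print its settings in one go.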
Alfresco Content Services (ACS)
1. Start a new ACS instance connected to the same database, with the following properties set:
elasticsearch.indexName=alfresco-new
server.allowWrite=false
Note: If content model changes are being made, make sure to start the repository with the new changes deployed.
2. Perform a search to start up the search subsystem, which then automatically creates the relevant mappings in the newly created index.
Verification steps
a. Use a curl command against the Elasticsearch index and validate a few of the property mappings, and check that "dynamic" : "false" is set:
curl -XGET 'http://ESHOST:ESPORT/alfresco-new?pretty'
b. The following log entries appear in catalina.out:
2023-06-23 12:21:47,403 INFO [elasticsearch.contentmodelsync.ContentModelSynchronizer] [elasticsearch-initializer] Successfully loaded analysers.
2023-06-23 12:21:47,543 INFO [elasticsearch.contentmodelsync.ContentModelSynchronizer] [elasticsearch-initializer] Successfully loaded basic mappings.
2023-06-23 12:21:47,553 INFO [elasticsearch.contentmodelsync.ElasticsearchInitialiser] [elasticsearch-initializer] Successfully connected to Elasticsearch index.
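The two verification checks above can also be scripted. The following is a minimal sketch, assuming a POSIX shell with grep and curl; the function names are illustrative placeholders, and the expected messages are taken from the log lines shown above.

```shell
#!/bin/sh
# Illustrative helpers; the function names are placeholders.

# Succeeds if the index JSON on stdin contains "dynamic" : "false".
# The pattern tolerates optional whitespace around the colon.
mapping_is_not_dynamic() {
  grep -Eq '"dynamic"[[:space:]]*:[[:space:]]*"false"'
}

# Succeeds only if every expected start-up message is present in the
# log file given as the first argument.
init_logged() {
  log_file="$1"
  for msg in \
    'Successfully loaded analysers.' \
    'Successfully loaded basic mappings.' \
    'Successfully connected to Elasticsearch index.'
  do
    grep -Fq "$msg" "$log_file" || { echo "missing: $msg" >&2; return 1; }
  done
}
```

A possible invocation, with the path to catalina.out left as a placeholder: `curl -sS 'http://ESHOST:ESPORT/alfresco-new?pretty' | mapping_is_not_dynamic && init_logged /path/to/catalina.out`. This only checks for the presence of the strings; the property mappings themselves still need a manual look.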
3. Generate the reindex.prefixes-file.json file.
4. Run the re-indexing command, passing the additional parameter ‘elasticsearch.indexName’ set to the new index. By default, this parameter is set to alfresco.
java -Xmx4G -jar alfresco-elasticsearch-reindexing-3.3.0.1-app.jar \
  --server.port=9090 \
  --alfresco.reindex.jobName=reindexByIds \
  --spring.elasticsearch.rest.uris=http://localhost:9200/alfresco-new \
  --spring.elasticsearch.rest.username=username \
  --spring.elasticsearch.rest.password=password \
  --alfresco.accepted-content-media-types-cache.enabled=false \
  --spring.activemq.broker-url=nio://localhost:61616 \
  --alfresco.reindex.fromId=0 \
  --alfresco.reindex.toId=5000000 \
  --alfresco.reindex.multithreadedStepEnabled=true \
  --alfresco.reindex.concurrentProcessors=30 \
  --alfresco.reindex.metadataIndexingEnabled=true \
  --alfresco.reindex.contentIndexingEnabled=false \
  --alfresco.reindex.pathIndexingEnabled=false \
  --alfresco.reindex.prefixes-file=file:reindex.prefixes-file.json \
  --alfresco.reindex.pagesize=10000 \
  --alfresco.reindex.batchSize=1000 \
  --elasticsearch.indexName=alfresco-new > reindexing.log &
5. Validate the document count.
curl -XGET 'http://ESHOST:ESPORT/alfresco-new/_count?pretty'
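To make the count check repeatable, the number can be pulled out of the _count response and compared with the count of the existing index. The following is a minimal sketch, assuming a POSIX shell with curl, tr and sed; ES_URL and the function names are illustrative placeholders, and the old index is assumed to use the default name alfresco.

```shell
#!/bin/sh
# Illustrative helpers; ES_URL and the function names are placeholders.
ES_URL="${ES_URL:-http://ESHOST:ESPORT}"

# Extract the numeric "count" field from a _count response on stdin.
# Whitespace is stripped first so pretty-printed output also works.
extract_count() {
  tr -d '[:space:]' | sed -n 's/.*"count":\([0-9][0-9]*\).*/\1/p'
}

# Print the document count of the index named in the first argument.
index_count() {
  curl -sS -XGET "$ES_URL/$1/_count" | extract_count
}

# Report whether two counts match; returns non-zero on a mismatch.
compare_counts() {
  if [ "$1" -eq "$2" ]; then
    echo "counts match: $1"
  else
    echo "counts differ: old=$1 new=$2" >&2
    return 1
  fi
}
```

For example, `compare_counts "$(index_count alfresco)" "$(index_count alfresco-new)"`. Small differences may be expected if content is still being added to the repository while the re-indexing runs, so a mismatch is a prompt to investigate rather than proof of a failed re-index.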
6. Point the ACS cluster nodes to the new index by setting elasticsearch.indexName=alfresco-new.
7. Optional: Destroy the newly created ACS instance and the old Elasticsearch index.
8. Happy searching.