11-29-2022 02:19 AM
We are using the following Community Edition setup:
Alfresco 7.1
Search Services 2.0.2
solr-spec 6.6.5
The two cores have the following document counts:
alfresco
numDocs:31212220
maxDoc:37005238
archive
numDocs:21257292
maxDoc:21445273
Our questions are:
1. A full index of the above takes about 21 hours to complete. How can we speed it up? We are using nearly default solr6 settings, which may not be intended for production.
2. Is there a good way to check whether indexing has completed? We arrived at the 21 hours by checking regularly until numDocs stopped changing.
3. We found that snapshot folders are generated in solr6 and keep accumulating. Is there a way to clean them up?
Thank you
11-29-2022 07:52 AM
Hi, AFAIK:
1. IMO it is not the worst reindexing time I've seen. The time depends on many things (resources, the number and mimetypes of the documents, parametrization). To speed it up, you can adjust several things, including the JVM resources of the SOLR machine (if you have one dedicated to indexing), the number of documents indexed per batch, and the cron for querying the database for transactions (solrcore.properties). In addition, Alfresco 7.x has some additional database indexes that may improve the indexing process. Other options reduce the index size, and possibly the time for full processing: indexing only metadata, disabling automatic metadata extraction, and not using cross-locale / exact term queries, the suggest feature, or fingerprints.
2. Indexing is complete when the Alfresco Admin Console (available in Community through the OOTB Support Tools) shows 0 transactions left to index for the Search Service, in both cores.
3. The snapshot folder you refer to is probably the backup that runs daily during the night. It is useful for restoring the indexes as part of a backup procedure. It may take up several GB (keep an eye on the disk), so it is a good idea to point it to a suitable directory. In 7.1 the path is configured in solrcore.properties. You may also want to keep a smaller number of backups (the default is 3). To disable this backup, you can set the cron property for SOLR backups (in alfresco-global.properties) to a date in the future, 2029 for example.
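For point 2, the check can also be scripted rather than read off the Admin Console. The following is only a minimal sketch: it assumes the Solr admin endpoint of Search Services answers `action=SUMMARY` with a JSON report that contains the keys "Approx transactions remaining" and "Approx change sets remaining" per core, and that Solr is reachable at the hypothetical URL `http://localhost:8983/solr`. Depending on how the Solr communication is secured in your installation, you may need to pass extra headers or use a client certificate, so verify the URL and key names against your own SUMMARY output first.

```python
import json
import urllib.request

# Assumed base URL of the Solr instance; adjust host and port to your setup.
SOLR_BASE_URL = "http://localhost:8983/solr"


def fetch_summary(base_url=SOLR_BASE_URL, headers=None):
    """Fetch the SUMMARY report from the Solr admin endpoint as a dict."""
    url = f"{base_url}/admin/cores?action=SUMMARY&wt=json"
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def remaining_work(summary, cores=("alfresco", "archive")):
    """Return {core: (transactions_remaining, change_sets_remaining)}.

    The key names are assumptions taken from a typical SUMMARY report;
    a missing core or key is reported as None.
    """
    per_core = summary.get("Summary", {})
    result = {}
    for core in cores:
        stats = per_core.get(core, {})
        result[core] = (
            stats.get("Approx transactions remaining"),
            stats.get("Approx change sets remaining"),
        )
    return result


def indexing_complete(summary, cores=("alfresco", "archive")):
    """True only if every core reports zero remaining transactions and change sets."""
    return all(
        counts == (0, 0) for counts in remaining_work(summary, cores).values()
    )
```

Calling `indexing_complete(fetch_summary())` from a scheduled task every few minutes would replace the manual numDocs comparison. A core that is missing from the report counts as not complete, so a misconfigured URL does not look like a finished index.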
Kind regards.
--C.
11-29-2022 12:06 PM
Thank you @cesarista
For #2, is there a link that explains how to install it? I am using Windows.
For #3, I can't find any related setting in alfresco-global.properties. Can you point me to it? Also, in my previous testing the snapshots lasted at least 6 days (after which I deleted all of them manually to rebuild), so I assumed the snapshots are never cleaned up. Is there a setting to limit the number of backups (such as the 3 you mentioned)?
12-05-2022 12:18 PM
Hi:
For #2: You can get the OOTB package (AMP) from docker-installer and install it like any other standard AMP.
For #3:
In alfresco-global.properties you can set something like the following to keep a given number of SOLR backups:
solr.backup.alfresco.numberToKeep=3
solr.backup.archive.numberToKeep=3
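If old snapshot folders have already piled up, they can be pruned by hand in the meantime. This is a hedged sketch, not part of Alfresco: the function `prune_snapshots` is a hypothetical helper, and it assumes the backups are directories named `snapshot.*` directly under the backup directory. Check the actual folder names on your disk before running it with `dry_run=False`.

```python
import shutil
from pathlib import Path


def prune_snapshots(backup_dir, keep=3, dry_run=True):
    """Remove all but the `keep` newest snapshot.* directories.

    Returns the list of stale paths. With dry_run=True (the default)
    nothing is deleted, so the result can be reviewed first.
    """
    if keep < 0:
        raise ValueError("keep must be zero or greater")
    # Newest first, ordered by last modification time of each directory.
    snapshots = sorted(
        (p for p in Path(backup_dir).glob("snapshot.*") if p.is_dir()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    stale = snapshots[keep:]
    if not dry_run:
        for path in stale:
            shutil.rmtree(path)
    return stale
```

Running `prune_snapshots(r"D:\solr6Backup\alfresco")` (a made-up Windows path) only lists what would be removed; pass `dry_run=False` once the list looks right. It uses modification time rather than the folder name to decide which snapshots are newest, since the naming scheme is an assumption here.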
Regards.
--C.