04-01-2022 10:43 AM
Goal:
Configure the ASS backup location in our Alfresco setup so that the SOLR backup is written to persistent storage.
Environment:
ACS 7.0.1
ASS 2.0.2
AKS (Azure Kubernetes)
Approach:
I am following the ASS 2.0 documentation at https://docs.alfresco.com/search-services/latest/admin/ to configure the backup location of the alfresco and archive cores.
Results:
The first problem is that I cannot see the backup location input fields in the admin console of the Alfresco repository.
According to the documentation, the built-in remote backup location should be set via the admin console, but these fields are no longer available there.
The documentation shows the following screenshot.
So applying these settings via the admin console does not work.
The second problem is that setting these values via alfresco-global.properties, as described in the guide, does not work either. This is my preferred approach, as I keep all configuration outside the repository.
According to the guide, the remoteBackupLocation of both cores can be set in the alfresco-global.properties file, which I have done as follows:
solr.backup.alfresco.remoteBackupLocation /opt/alfresco-search-services/data/backup/alfresco
solr.backup.archive.remoteBackupLocation /opt/alfresco-search-services/data/backup/archive
The JMX dump of the running Alfresco instance shows the following entries:
Attribute Name                                               Attribute Value
$type                                                        solr
Object Name                                                  Alfresco:Type=Configuration,Category=Search,id1=managed,id2=solr
Object Type                                                  Search$managed$solr
instancePath                                                 [managed, solr]
search.solrShardRegistry.maxAllowedReplicaTxCountDifference  1000
search.solrShardRegistry.purgeOnInit                         false
search.solrShardRegistry.shardInstanceTimeoutInSeconds       300
search.solrTrackingSupport.enabled                           true
search.solrTrackingSupport.ignorePathsForSpecificAspects     false
search.solrTrackingSupport.ignorePathsForSpecificTypes       false
solr.backup.alfresco.cronExpression                          0 0 2 * * ?
solr.backup.alfresco.numberToKeep                            3
solr.backup.alfresco.remoteBackupLocation                    /opt/alfresco-search-services/data/backup/alfresco
solr.backup.archive.cronExpression                           0 0 4 * * ?
solr.backup.archive.numberToKeep                             3
solr.backup.archive.remoteBackupLocation                     /opt/alfresco-search-services/data/backup/archive
When I use a different cron expression (every 5 minutes) to trigger the remote backup, the backup is not created in the configured location.
We have set the log level for org.apache.solr.handler.SnapShooter to DEBUG via the SOLR Admin console to see the log output.
When the backup is triggered via ACS, the following entry appears in the ASS log file.
Output ASS:
│ alfresco-search 2022-04-01 14:16:08.943 INFO (Thread-16) [ x:alfresco] o.a.s.h.SnapShooter Creating backup snapshot: <not named> at file:///opt/alfresco-search-services/solrhome/alfresco/
Result:
The location is wrong. It should be the remoteBackupLocation "/opt/alfresco-search-services/data/backup/alfresco".
The next attempt was to use the documented backup command via URL. I triggered the backup with a GET request to this URL:
http://localhost:8983/solr/alfresco/replication?command=backup&location=&numberToKeep=4&wt=xml
Output ASS:
│ alfresco-search 2022-04-01 14:16:35.936 INFO (Thread-17) [ x:alfresco] o.a.s.h.SnapShooter Creating backup snapshot <not named> at file:///opt/alfresco-search-services/solrhome/alfresco/
Result:
Triggering the backup works, but it is still written to the default location. It should use the value of remoteBackupLocation for the alfresco core, "/opt/alfresco-search-services/data/backup/alfresco".
Next I set the location via the HTTP parameter with an absolute path, as described in the documentation:
http://localhost:8983/solr/alfresco/replication?command=backup&location=/opt/alfresco-search-services/data/backup/alfresco&numberToKeep=4&wt=xml
Output ASS:
│ alfresco-search 2022-04-01 14:17:48.047 INFO (Thread-18) [ x:alfresco] o.a.s.h.SnapShooter Creating backup snapshot <not named> at file:///opt/alfresco-search-services/solrhome/alfresco/
Result:
Triggering the backup works, but it still writes to the default location.
It should be the given location parameter value “/opt/alfresco-search-services/data/backup/alfresco”
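The two replication requests above can be reproduced from a script. The following is a minimal sketch that only rebuilds the URLs quoted in this post; the helper names backup_url and trigger_backup are made up for illustration and are not part of ASS or SOLR.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the requests quoted in the post.
SOLR_BASE = "http://localhost:8983/solr"


def backup_url(core, location="", number_to_keep=4, base=SOLR_BASE):
    """Build the replication-handler backup URL for one core (hypothetical helper)."""
    params = {
        "command": "backup",
        "location": location,
        "numberToKeep": number_to_keep,
        "wt": "xml",
    }
    # safe="/" keeps the path unescaped, matching the URLs in the post.
    return f"{base}/{core}/replication?{urlencode(params, safe='/')}"


def trigger_backup(core, location="", number_to_keep=4, timeout=30):
    """Send the GET request and return the raw response body as text."""
    url = backup_url(core, location, number_to_keep)
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling trigger_backup("alfresco", "/opt/alfresco-search-services/data/backup/alfresco") sends the same request as the second URL above. As the log lines in this post show, ASS 2.0.2 still writes to the default location in both cases.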
Deeper analysis:
My analysis shows that setting the location parameter no longer has any effect.
In earlier versions, this endpoint could be used to trigger a backup to a specified location.
The reason is simple: Hyland fixed a CVE and in doing so broke the backup location functionality.
If you look into the ASS 2.0.2 container and inspect the file solrconfig.xml in /opt/alfresco-search-services/solrhome/templates/rerank/conf/,
you will notice that the location parameter can no longer be used because of the CVE mentioned there. The location can now only be set via the solr.backup.dir variable in ASS.
<!--
  This invariant is needed to prevent the usage of location parameter in the replication handler APIs.
  There is no validation for location parameter. This results in a vulnerability described in https://nvd.nist.gov/vuln/detail/CVE-2020-13941
-->
<lst name="invariants">
  <str name="location">${solr.backup.dir:.}</str>
</lst>
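To illustrate why the location request parameter is ignored, here is a simplified model of the two mechanisms involved: invariants override whatever the client sends, and ${name:default} placeholders are expanded from properties. This is a sketch for reasoning only, not SOLR's actual implementation; resolve and effective_params are hypothetical names.

```python
import re

# Matches ${name} and ${name:default} placeholders.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(value, props):
    """Expand ${name:default} placeholders from a dict of properties."""
    def _sub(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        # Sketch behaviour: fail loudly when nothing can be substituted.
        raise KeyError(f"no value and no default for {name}")
    return _PLACEHOLDER.sub(_sub, value)


def effective_params(request_params, invariants, props):
    """Merge request parameters with invariants; invariants always win."""
    merged = dict(request_params)
    merged.update({key: resolve(val, props) for key, val in invariants.items()})
    return merged


# The invariant from the solrconfig.xml snippet quoted above.
INVARIANTS = {"location": "${solr.backup.dir:.}"}
```

With no solr.backup.dir property set, the location resolves to "." regardless of what the request contains, which is consistent with the backups landing in the default location.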
So I set the solr.backup.dir value in the file solrcore.properties in /opt/alfresco-search-services/solrhome/templates/rerank/conf/
Triggering a backup works now, but wait... We have two cores, alfresco and archive. Because this solrcore.properties file is copied into both cores when the container starts, I can no longer have two different locations for them.
As a result, the snapshots of both cores are created in the same directory and I cannot tell them apart.
As we have immutable containers, the alfresco and archive cores are recreated after each restart of the container, so we cannot change solr.backup.dir directly in each core.
Can anybody confirm this behaviour? Is there any other workaround?
Thanks
Jens
04-02-2022 01:09 PM
Did you review https://hub.alfresco.com/t5/alfresco-content-services-blog/search-services-2-0-2-release/ba-p/308070?
I guess this part may help you:
SEARCH-2995: Remove backup location in alfresco search service admin screen and SOLR REST API
04-04-2022 05:11 AM
Hi Angel,
thanks for the pointer.
It is good that this is part of the release notes, but my main source of truth is the official documentation, which still describes the old configuration and has not been updated! IMHO, updating the documentation should be part of the same commit.
I have also tried the mentioned configuration of setting solr.backup.dir in solrcore.properties. It works technically, but you can no longer set the backup dir independently for the alfresco and archive cores when you use something like Kubernetes. The cores (alfresco and archive) need to be created each time the Kubernetes pod starts, so I have to set the backup dir in solrcore.properties in the templates directory. This template file is copied into both cores, so we cannot tell the resulting backups apart.
I guess one way to restore the old configuration while keeping solid security would be to allow multiple permitted backup directories to be configured in SOLR (solr.backup.dir.1 = ..., solr.backup.dir.2=...), so that ACS and the REST API can refer to these preconfigured directories.
Another way would be to include the SOLR core name (alfresco or archive) in the backup snapshot name:
"snapshot.20220401112857454" would become "snapshot-alfresco.20220401112857454".
Or you simply create subfolders in the backup routine for each core - when using solr.backup.dir = /backup the snapshot for alfresco will be created in /backup/alfresco and for archive in /backup/archive.
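The last two suggestions could look roughly like this. It is only an illustration of the proposed naming, not existing ASS behaviour; core_backup_dir and snapshot_name are hypothetical helpers.

```python
from pathlib import PurePosixPath


def core_backup_dir(backup_root, core):
    """Per-core subfolder below the configured backup root."""
    return str(PurePosixPath(backup_root) / core)


def snapshot_name(core, timestamp):
    """Snapshot name that carries the core name, as proposed above."""
    return f"snapshot-{core}.{timestamp}"
```

Either variant would keep the alfresco and archive snapshots distinguishable while still using a single solr.backup.dir value.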
In my view, ASS 2.0.2 currently breaks this functionality in this situation.
Thanks
Jens
02-08-2023 07:45 AM
For me the following worked:
1) Set solr.backup.dir in solr.in.sh.
2) In solrconfig.xml, change the location invariant from
<str name="location">${solr.backup.dir:.}</str>
to
<str name="location">${solr.backup.dir:.}/${data.dir.store:}</str>
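With immutable containers, this change could be applied to the template solrconfig.xml at container start, before the cores are created from it. Below is a minimal sketch under these assumptions: the file is the one under /opt/alfresco-search-services/solrhome/templates/rerank/conf/ mentioned earlier in the thread, data.dir.store has a different value in each core (not confirmed in this thread), and the function names are made up for illustration.

```python
# Original and modified invariant lines, exactly as quoted in this reply.
OLD = '<str name="location">${solr.backup.dir:.}</str>'
NEW = '<str name="location">${solr.backup.dir:.}/${data.dir.store:}</str>'


def patch_location_invariant(xml_text):
    """Return xml_text with the per-core location invariant; safe to run twice."""
    if NEW in xml_text:
        return xml_text  # already patched
    if OLD not in xml_text:
        raise ValueError("location invariant not found in solrconfig.xml")
    return xml_text.replace(OLD, NEW)


def patch_file(path):
    """Rewrite the file at path in place, only if a change is needed."""
    with open(path, encoding="utf-8") as fh:
        original = fh.read()
    patched = patch_location_invariant(original)
    if patched != original:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(patched)
```

Running patch_file on the template before SOLR starts would mean both cores pick up the modified invariant when they are created.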