ACS 7.0.1 / ASS 2.0.2: Setting SOLR backup location is broken

jego
Star Contributor

Goal:

Configure the ASS backup location in our Alfresco setup so that the SOLR backup is written to persistent storage.

 

Environment:

ACS 7.0.1

ASS 2.0.2

AKS (Azure Kubernetes)

Approach:

I am following the ASS 2.0 documentation at https://docs.alfresco.com/search-services/latest/admin/ to configure the backup location of the alfresco and archive cores.

Results:

The first problem is that I cannot see the backup location input fields in the admin console of the Alfresco repository.

image

The built-in remote backup location should be set via the admin console, but these fields are no longer available there.

The documentation shows the following screenshot.

Image

So setting these values via the admin console does not work.

 

The second problem is that setting these values via alfresco-global.properties, as described in the guide, does not work either. This is my preferred approach because I keep all configuration outside the repository.

 

According to the guide, the remoteBackupLocation of both cores can be set in the alfresco-global.properties file, which I did as follows:

 

solr.backup.alfresco.remoteBackupLocation                   /opt/alfresco-search-services/data/backup/alfresco
solr.backup.archive.remoteBackupLocation                    /opt/alfresco-search-services/data/backup/archive

 

 

The JMX dump of the running Alfresco shows the following entries:

 

Attribute Name                                              Attribute Value

$type                                                       solr

** Object Name                                              Alfresco:Type=Configuration,Category=Search,id1=managed,id2=solr

** Object Type                                              Search$managed$solr

instancePath                                                [managed, solr]
search.solrShardRegistry.maxAllowedReplicaTxCountDifference 1000
search.solrShardRegistry.purgeOnInit                        false
search.solrShardRegistry.shardInstanceTimeoutInSeconds      300
search.solrTrackingSupport.enabled                          true
search.solrTrackingSupport.ignorePathsForSpecificAspects    false
search.solrTrackingSupport.ignorePathsForSpecificTypes      false
solr.backup.alfresco.cronExpression                         0 0 2 * * ?
solr.backup.alfresco.numberToKeep                           3
solr.backup.alfresco.remoteBackupLocation                   /opt/alfresco-search-services/data/backup/alfresco
solr.backup.archive.cronExpression                          0 0 4 * * ?
solr.backup.archive.numberToKeep                            3
solr.backup.archive.remoteBackupLocation                    /opt/alfresco-search-services/data/backup/archive

When I use another cron expression (every 5 minutes) to trigger the remote backup, the backup is not created in the configured location.

We set the log level of org.apache.solr.handler.SnapShooter to DEBUG via the SOLR admin console to see the log output.

When the backup is triggered via ACS, the following entry appears in the ASS log file.

Output ASS:

│ alfresco-search 2022-04-01 14:16:08.943 INFO  (Thread-16) [   x:alfresco] o.a.s.h.SnapShooter Creating backup snapshot: <not named> at file:///opt/alfresco-search-services/solrhome/alfresco/

Result:

The location is wrong: it should be the remoteBackupLocation "/opt/alfresco-search-services/data/backup/alfresco".

 

The next attempt was to use the backup command via URL, so I triggered the backup with a GET request to this URL:

http://localhost:8983/solr/alfresco/replication?command=backup&location=&numberToKeep=4&wt=xml

 

Output ASS:

│ alfresco-search 2022-04-01 14:16:35.936 INFO  (Thread-17) [   x:alfresco] o.a.s.h.SnapShooter Creating backup snapshot <not named> at file:///opt/alfresco-search-services/solrhome/alfresco/       

                                      

Result:

Triggering works, but it still writes to the default location. It should use the remoteBackupLocation value for the alfresco core, "/opt/alfresco-search-services/data/backup/alfresco".

 

Next I tried to set the location via the HTTP parameter with an absolute path, as described in the documentation:

 

http://localhost:8983/solr/alfresco/replication?command=backup&location=/opt/alfresco-search-services/data/backup/alfresco&numberToKeep=4&wt=xml

 

Output ASS:

│ alfresco-search 2022-04-01 14:17:48.047 INFO  (Thread-18) [   x:alfresco] o.a.s.h.SnapShooter Creating backup snapshot <not named> at file:///opt/alfresco-search-services/solrhome/alfresco/   

Result:

Triggering the backup works, but it still writes to the default location.
It should use the given location parameter value "/opt/alfresco-search-services/data/backup/alfresco".
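
For anyone who wants to repeat the two replication calls above, here is a minimal sketch that only builds the URLs. The host, port, core name and parameters are the ones from my requests; the function name backup_url is made up for this example. It does not send anything, you can pass the result to curl or urllib.request.urlopen.

```python
from urllib.parse import urlencode

# Host and port as used in the requests above.
SOLR_BASE = "http://localhost:8983/solr"


def backup_url(core: str, location: str = "", number_to_keep: int = 4) -> str:
    """Build the replication-handler backup URL for one core (hypothetical helper)."""
    params = {
        "command": "backup",
        "location": location,
        "numberToKeep": number_to_keep,
        "wt": "xml",
    }
    # safe="/" keeps the slashes of the path unescaped, as in the URLs above.
    return f"{SOLR_BASE}/{core}/replication?{urlencode(params, safe='/')}"


if __name__ == "__main__":
    # Empty location, then an explicit absolute path.
    print(backup_url("alfresco"))
    print(backup_url("alfresco", "/opt/alfresco-search-services/data/backup/alfresco"))
```

In ASS 2.0.2 both URLs end up writing to the same default directory, as the log output shows.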

 

Deeper analysis:

My analysis shows that setting the location parameter no longer has any effect.
In earlier versions, this endpoint could be used to trigger a backup to a specified location.

The reason is simple: Hyland fixed a CVE and, in doing so, broke the backup location functionality.

If you look into the ASS 2.0.2 container and inspect the file solrconfig.xml in /opt/alfresco-search-services/solrhome/templates/rerank/conf/, you will notice that the location parameter can no longer be used because of the CVE mentioned there. The location can now only be set via the solr.backup.dir variable in ASS.

 

<!--
      This invariant is needed to prevent the usage of location parameter in the replication handler APIs.
      There is no validation for location parameter. This results in a vulnerability described in https://nvd.nist.gov/vuln/detail/CVE-2020-13941
      -->
      <lst name="invariants">
          <str name="location">${solr.backup.dir:.}</str>
      </lst>
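
To make clear why the location I send is ignored, here is a rough Python model of the two mechanisms in that snippet: the ${name:default} property substitution and the fact that invariants override request parameters. This is only an illustration of the behaviour, not SOLR code; the function names resolve and effective_params are made up.

```python
import re

# Matches ${name} and ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(template: str, props: dict) -> str:
    """Substitute ${name:default} placeholders using props (illustrative model)."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(f"no value and no default for {name}")
    return _PLACEHOLDER.sub(repl, template)


def effective_params(request: dict, invariants: dict) -> dict:
    """Invariants win over anything sent with the request."""
    return {**request, **invariants}


if __name__ == "__main__":
    request = {
        "command": "backup",
        "location": "/opt/alfresco-search-services/data/backup/alfresco",
    }
    # solr.backup.dir is not set, so the default "." is used.
    invariants = {"location": resolve("${solr.backup.dir:.}", {})}
    print(effective_params(request, invariants))
```

With solr.backup.dir unset, the effective location is ".", which fits the log lines above where the snapshot lands in the core directory under solrhome.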

 

So I set the solr.backup.dir value in the file solrcore.properties in /opt/alfresco-search-services/solrhome/templates/rerank/conf/.

Triggering a backup now works, but there is a catch: we have two cores, alfresco and archive. Because this solrcore.properties is copied into both cores when the container starts, I can no longer have a different location for each core.

The result is that the snapshots of both cores are created in the same directory and I cannot tell them apart.

 

image

As we have immutable containers, the alfresco and archive cores are recreated after each restart of the container, so we cannot change solr.backup.dir directly in each core.

Can anybody confirm this behaviour? Is there any other workaround?

 

Thanks

Jens

3 REPLIES

angelborroy
Community Manager

Did you review https://hub.alfresco.com/t5/alfresco-content-services-blog/search-services-2-0-2-release/ba-p/308070?

I guess this part may help you:

SEARCH-2995: Remove backup location in alfresco search service admin screen and SOLR REST API

Hyland Developer Evangelist

jego
Star Contributor

Hi Angel,

thanks for the pointer.

It is good that this is part of the release notes, but my main source of truth is the official documentation. The official documentation still describes the old configuration and has not been updated. IMHO, updating the documentation should be part of the same commit.

I have also tried the mentioned configuration, setting solr.backup.dir in solrcore.properties. It works technically, but you can no longer set the backup dir independently for the alfresco and archive cores when you use something like Kubernetes. The cores (alfresco and archive) need to be created each time the Kubernetes pod starts, so I have to set the backup dir in solrcore.properties in the templates directory. This template file gets copied into both cores, so we cannot tell the resulting backups apart.

I guess one way to restore the old configuration and keep solid security would be to allow configuring multiple allowed backup directories in SOLR (solr.backup.dir.1 = ..., solr.backup.dir.2 = ...), so that ACS and the REST API can refer to these preconfigured directories.

The other way would be to include the SOLR core name (alfresco or archive) in the backup snapshot name:

"snapshot.20220401112857454" would become "snapshot-alfresco.20220401112857454".

Or the backup routine could simply create a subfolder for each core: with solr.backup.dir = /backup, the snapshot for alfresco would be created in /backup/alfresco and the one for archive in /backup/archive.
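
To show what the last two ideas would look like on disk, here is a small sketch. Nothing here exists in ASS; the function name and both layouts are hypothetical, and I am assuming the digits in the snapshot name are a yyyyMMddHHmmssSSS timestamp.

```python
from datetime import datetime
from pathlib import PurePosixPath


def per_core_snapshot_paths(backup_dir: str, core: str, when: datetime) -> dict:
    """Return the two proposed (hypothetical) layouts for one snapshot."""
    # Timestamp shaped like snapshot.20220401112857454 (assumed yyyyMMddHHmmssSSS).
    stamp = when.strftime("%Y%m%d%H%M%S") + f"{when.microsecond // 1000:03d}"
    root = PurePosixPath(backup_dir)
    return {
        # Proposal 1: core name inside the snapshot name.
        "core_in_name": str(root / f"snapshot-{core}.{stamp}"),
        # Proposal 2: one subfolder per core.
        "core_subfolder": str(root / core / f"snapshot.{stamp}"),
    }


if __name__ == "__main__":
    when = datetime(2022, 4, 1, 11, 28, 57, 454000)
    for core in ("alfresco", "archive"):
        print(per_core_snapshot_paths("/backup", core, when))
```

Either layout would let me tell the alfresco and archive snapshots apart even with a single solr.backup.dir.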

In my eyes, ASS 2.0.2 currently breaks this functionality.

Thanks 
Jens

swagner
Confirmed Champ

For me the following worked:

1) set solr.backup.dir in solr.in.sh

2) update solrhome/templates/rerank/conf/solrconfig.xml
change
<str name="location">${solr.backup.dir:.}</str>
to
<str name="location">${solr.backup.dir:.}/${data.dir.store:}</str>
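
With immutable containers, step 2 has to be applied to the template at image build or container start. A minimal sketch of that one-line change as a Python helper follows; the function name is made up and it works on the file content as a string. Step 1 is presumably something like adding -Dsolr.backup.dir=... to SOLR_OPTS in solr.in.sh, which is an assumption on my side. The workaround also relies on data.dir.store having a different value in each core's solrcore.properties, so please verify that in your setup.

```python
OLD = '<str name="location">${solr.backup.dir:.}</str>'
NEW = '<str name="location">${solr.backup.dir:.}/${data.dir.store:}</str>'


def patch_location_invariant(solrconfig_text: str) -> str:
    """Apply the one-line change from step 2 to the text of solrconfig.xml."""
    if NEW in solrconfig_text:
        return solrconfig_text  # already patched, nothing to do
    if OLD not in solrconfig_text:
        raise ValueError("location invariant not found; the template may differ")
    return solrconfig_text.replace(OLD, NEW, 1)


if __name__ == "__main__":
    sample = (
        '<lst name="invariants">\n'
        '    <str name="location">${solr.backup.dir:.}</str>\n'
        '</lst>\n'
    )
    print(patch_location_invariant(sample))
```

To use it for real, read solrhome/templates/rerank/conf/solrconfig.xml, pass the text through the function and write it back before the cores are created.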