Hyland Connect

joe_l3 · ‎02-23-2026

Hello,

Does high availability work for Solr sharding with the DB_ID_RANGE method?

I have 3 shards, distributed manually across 3 Solr nodes with replication factor 2. Here is my setup:

Alfresco Content Services 7.4.0.1
Alfresco Search Services 2.0 with Solr 6
Sharding Method: DB_ID_RANGE

Solr node 1 (port 8181)

alfresco-0
alfresco-1

Solr node 2 (port 8282)

alfresco-1
alfresco-2

Solr node 3 (port 8383)

alfresco-0
alfresco-2

Alfresco Content Service (alfresco-global.properties)

solr6.store.mappings.value.solrMappingAlfresco.nodeString=solr-host-1:8181/solr/alfresco,solr-host-2:8282/solr/alfresco,solr-host-3:8383/solr/alfresco
solr6.store.mappings.value.solrMappingAlfresco.numShards=3
solr6.store.mappings.value.solrMappingAlfresco.replicationFactor=2

With 1000+ node IDs in my DB, I created the 3 shards (and their replicas) with the following ID ranges:
alfresco-0: 0-400
alfresco-1: 401-800
alfresco-2: 801-1200
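
To make the intended routing explicit, here is a minimal sketch of how a DBID would map to a shard under these ranges. It is not Alfresco's router code: the class and method names are hypothetical, and it assumes both ends of each range are inclusive, exactly as written above. How the real DB_ID_RANGE router treats the upper bound should be checked against the Search Services documentation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not part of Alfresco or Solr.
class DbIdRangeSketch {

    /** One shard core and the DBID interval it is assumed to cover (both ends inclusive). */
    static final class ShardRange {
        final String core;
        final long start;
        final long end;

        ShardRange(String core, long start, long end) {
            this.core = core;
            this.start = start;
            this.end = end;
        }

        boolean contains(long dbId) {
            return dbId >= start && dbId <= end;
        }
    }

    // Ranges copied from the setup described in this post.
    static final List<ShardRange> RANGES = Arrays.asList(
            new ShardRange("alfresco-0", 0, 400),
            new ShardRange("alfresco-1", 401, 800),
            new ShardRange("alfresco-2", 801, 1200));

    /** Returns the core whose range contains the DBID, or empty if no range covers it. */
    static Optional<String> shardFor(long dbId) {
        return RANGES.stream()
                .filter(r -> r.contains(dbId))
                .map(r -> r.core)
                .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(shardFor(250));   // inside the first range
        System.out.println(shardFor(1500));  // beyond the last range, so no shard covers it
    }
}
```

A DBID above 1200 falls outside every range in this sketch, which is why the last range has to be sized ahead of repository growth.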

All indexes work fine: I tested each Solr node individually by running FTS queries directly against the Solr REST API (through each node's query web console). Alfresco Share works fine as well, and the repository looks as expected.

The issue comes after intentionally stopping a Solr node to test the high availability. Alfresco shows this error:

2026-02-23T16:17:34,849 [] ERROR [extensions.webscripts.AbstractRuntime] Failed to execute search: PATH:"/app:company_home//*" AND ASPECT:"{http://www.alfresco.org/model/content/1.0}taggable" 
Caused by: org.alfresco.repo.search.QueryParserException: 01230759
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
...
01230037 Solr request failed with 500 /solr/alfresco-2/afts?wt=json&fl=DBID,score&rows=100&shards=http://solr-1:8181/solr/alfresco-0,http://solr-2:8282/solr/alfresco-1,http://solr-3:8383/solr/alfresco-2&df=TEXT&start=0&locale=it&alternativeDic=DEFAULT_DICTIONARY&fq={!afts}AUTHORITY_FILTER_FROM_JSON&fq={!afts}TENANT_FILTER_FROM_JSON

Note the 500 response and look at the "shards" URL parameter:

shards=http://localhost:8181/solr/alfresco-0,http://localhost:8282/solr/alfresco-1,http://localhost:8383/so...

It looks like Alfresco builds the shard chain sequentially: alfresco-0, alfresco-1, alfresco-2. Is that expected behaviour for high availability?

angelborroy · ‎02-24-2026

Yes, what you are seeing is expected.

In your configuration Alfresco only knows one Solr endpoint per logical shard, even though you created replicas. When it builds the shards= parameter, it selects one URL for each shard (alfresco-0, alfresco-1, alfresco-2). If the node hosting one of those shard instances is down, there is no alternative endpoint for that shard, so the query fails with Connection refused.

Sharding itself does not automatically provide high availability. HA only works if Alfresco can see and use multiple shard instances per shard (either via dynamic shard registration or by explicitly configuring all shard-instance endpoints).

In your current setup, there is no failover for a shard, so the behavior is correct.
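
To illustrate the difference, here is a minimal sketch of what per-shard failover would look like when building the shards= parameter. This is not Alfresco's implementation: the class, the method names, and the liveness check are hypothetical, and the host names simply follow the nodeString from the original post. The point is only that a fallback is possible when more than one endpoint per shard is known to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration only; not Alfresco code.
class ShardChainSketch {

    /**
     * Picks the first live instance for every shard and joins the picks into a
     * comma-separated value suitable for a "shards" parameter.
     * Throws if some shard has no live instance at all.
     */
    static String buildShardsParam(Map<String, List<String>> instancesByShard,
                                   Predicate<String> isAlive) {
        List<String> chosen = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : instancesByShard.entrySet()) {
            String pick = null;
            for (String url : entry.getValue()) {
                if (isAlive.test(url)) {
                    pick = url;
                    break;
                }
            }
            if (pick == null) {
                throw new IllegalStateException("No live instance for shard " + entry.getKey());
            }
            chosen.add(pick);
        }
        return String.join(",", chosen);
    }

    /** The 3-shard, replication-factor-2 layout described in the original post. */
    static Map<String, List<String>> threadLayout() {
        Map<String, List<String>> layout = new LinkedHashMap<>();
        layout.put("alfresco-0", Arrays.asList(
                "http://solr-host-1:8181/solr/alfresco-0",
                "http://solr-host-3:8383/solr/alfresco-0"));
        layout.put("alfresco-1", Arrays.asList(
                "http://solr-host-1:8181/solr/alfresco-1",
                "http://solr-host-2:8282/solr/alfresco-1"));
        layout.put("alfresco-2", Arrays.asList(
                "http://solr-host-2:8282/solr/alfresco-2",
                "http://solr-host-3:8383/solr/alfresco-2"));
        return layout;
    }

    public static void main(String[] args) {
        // Simulate solr-host-1 being stopped: each shard falls back to its other instance.
        System.out.println(buildShardsParam(threadLayout(), url -> !url.contains("solr-host-1")));
    }
}
```

With only one URL per shard in the lists, the same function throws as soon as that node is down, which corresponds to the Connection refused you observed.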

Hyland Developer Evangelist

joe_l3 · ‎02-24-2026

Hello Angel,
thanks for your answer.
What you said makes sense. In other words, there is no guarantee of out-of-the-box HA with the DB_ID_RANGE shard method.
I have never used explicit shard endpoint configuration, and dynamic shard registration didn't help with Alfresco CE + DB_ID_RANGE.
I tried a minimal dynamic registration (no HA), but it seems that only the first Solr node specified gets registered:

## alfresco-global.properties
## 
index.subsystem.name=solr6
solr.host=host-1
solr.port=8181

## this should be the DSR trigger on alfresco-side
solr.useDynamicShardRegistration=true
search.solrShardRegistry.purgeOnInit=true
search.solrShardRegistry.shardInstanceTimeoutInSeconds=300
search.solrShardRegistry.dbidRangeRefreshTimeoutInSeconds=30
search.solrShardRegistry.maxAllowedReplicaTxCountDifference=1000

## Solr (host-1:8181)
##
shard.method=DB_ID_RANGE
numShards=3
numNodes=3
nodeInstance=1
shard.range=0-400
shard.instance=0

## Solr (host-2:8282)
##
shard.method=DB_ID_RANGE
numShards=3
numNodes=3
nodeInstance=2
shard.range=401-800
shard.instance=1

## Solr (host-3:8383)
##
shard.method=DB_ID_RANGE
numShards=3
numNodes=3
nodeInstance=3
shard.range=801-1200
shard.instance=2

All 3 shards are created successfully, but Alfresco runs queries on Solr host-1 only.
In the end, it seems the only way to avoid a SPOF in DB_ID_RANGE mode is a classic external HA solution such as HTTP load balancing, clustering and so on.

joe_l3 · ‎02-25-2026

After debugging the source code, I think the dynamic shard registry has been removed from Alfresco Community Edition. At query time, Alfresco apparently always instantiates the ExplicitSolrStoreMappingWrapper instead of the DynamicSolrStoreMappingWrapper, at least with DB_ID_RANGE sharding. That happens even if you have explicitly enabled the dynamic registry (solr.useDynamicShardRegistration=true).

That is because the shardRegistry object is null when Alfresco tries to extract and build the Solr mapping in SolrClientUtil:

// org.alfresco.repo.search.impl.solr.SolrClientUtil
//
public static SolrStoreMappingWrapper extractMapping(StoreRef store,
        HashMap<StoreRef, SolrStoreMappingWrapper> mappingLookup, ShardRegistry shardRegistry,
        boolean useDynamicShardRegistration, BeanFactory beanFactory)
{
    // shardRegistry is null here !!!
    if ((shardRegistry != null) && useDynamicShardRegistration)
    {
        SearchParameters sp = new SearchParameters();
        ...
        return DynamicSolrStoreMappingWrapperFactory.wrap(slice, beanFactory);
    }
    else
    {
        // ok there's another wrapper for you,
        // this is the concrete ExplicitSolrStoreMappingWrapper here
        SolrStoreMappingWrapper mappings = mappingLookup.get(store);
        ....
        return mappings;
    }
}

I think the devs have intentionally removed part of the shard registry implementation from Community Edition. I can see a specific search-enterprise-context.xml for Enterprise Edition that is missing from Community Edition.

<bean id="search.solrQueryHTTPCLient" class="org.alfresco.repo.search.impl.solr.SolrQueryHTTPClient" init-method="init">
...
 <property name="shardRegistry">
   <ref bean="search.SolrShardRegistry"/>
 </property>

And this is the implementation in common-search-enterprise-context.xml:

<bean id="search.solrShardRegistry" class="org.alfresco.repo.index.shard.ShardRegistryImpl" init-method="init">
 <property name="purgeOnInit">
   <value>${search.solrShardRegistry.purgeOnInit}</value>
</property>

I think the official documentation should be updated to reflect that:

https://support.hyland.com/r/Alfresco/Alfresco-Search-and-Insight-Engine/2.0/Alfresco-Search-and-Ins...

joe_l3 · ‎03-02-2026

Coming back to this...

Regarding HA, which was the initial point of this topic: I didn't find any out-of-the-box solution that prevents a single point of failure with the DB_ID_RANGE sharding method. The only valid distribution of shards seems to be creating an instance of every shard on each Solr node. That means:

Solr node 1: shard 0, 1, 2
Solr node 2: shard 0, 1, 2
Solr node 3: shard 0, 1, 2

In that case, Alfresco is able to build the correct Solr chain at query time. Unfortunately, to refresh the shard registration you have to restart Alfresco every time a Solr node goes down.
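
As a sanity check on layouts, here is a minimal sketch that tests whether every shard is still hosted somewhere after any single node is removed. The class and method names are hypothetical. It only checks where the indexes physically live, and it says nothing about whether Alfresco will actually route a query to the surviving instance, which is the part that failed in my tests.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not Alfresco code.
class LayoutCheckSketch {

    /** True if, with any one node removed, the remaining nodes still host every shard. */
    static boolean survivesAnySingleNodeFailure(Map<String, Set<String>> shardsByNode,
                                                Set<String> allShards) {
        for (String downNode : shardsByNode.keySet()) {
            Set<String> reachable = new HashSet<>();
            for (Map.Entry<String, Set<String>> entry : shardsByNode.entrySet()) {
                if (!entry.getKey().equals(downNode)) {
                    reachable.addAll(entry.getValue());
                }
            }
            if (!reachable.containsAll(allShards)) {
                return false;
            }
        }
        return true;
    }

    static Set<String> shards(String... ids) {
        return new HashSet<>(Arrays.asList(ids));
    }

    /** Every shard on every node, as listed above. */
    static Map<String, Set<String>> fullLayout() {
        Map<String, Set<String>> layout = new LinkedHashMap<>();
        layout.put("node-1", shards("0", "1", "2"));
        layout.put("node-2", shards("0", "1", "2"));
        layout.put("node-3", shards("0", "1", "2"));
        return layout;
    }

    /** One shard per node, as in the minimal dynamic registration test earlier in this thread. */
    static Map<String, Set<String>> oneShardPerNode() {
        Map<String, Set<String>> layout = new LinkedHashMap<>();
        layout.put("node-1", shards("0"));
        layout.put("node-2", shards("1"));
        layout.put("node-3", shards("2"));
        return layout;
    }

    public static void main(String[] args) {
        Set<String> all = shards("0", "1", "2");
        System.out.println(survivesAnySingleNodeFailure(fullLayout(), all));
        System.out.println(survivesAnySingleNodeFailure(oneShardPerNode(), all));
    }
}
```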

None of the examples in the High Availability part of the official documentation works as expected with the DB_ID_RANGE shard method.


Solr Sharding DB_ID_RANGE and High Availability