angelborroy
Community Manager

This blog post describes the procedure to remove a number of transactions from the Alfresco SOLR index so they can be re-indexed from scratch. This procedure can help in upgrade and re-indexing scenarios.

Alfresco SOLR cores (alfresco, archive) include different DOC_TYPE documents:

  • Node: documents with properties, content and permissions information
  • Acl: details of document permissions
  • Tx: details of transactions
  • AclTx: details of permission changes
  • State: current status of indexing process

The number of documents of each type can be obtained from the following URL:

http://localhost:8983/solr/#/alfresco/schema?field=DOC_TYPE
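The same counts can probably also be read without the Admin UI by running a standard Solr facet query on DOC_TYPE. This is a different technique from the schema browser URL above, shown only as a minimal sketch: it assumes the core's /select handler accepts the usual facet parameters, and the helper names are made up for illustration.

```python
import json
from urllib.parse import urlencode

# Assumption: default local Search Services address and core name, as used in this post.
SOLR_CORE_URL = "http://localhost:8983/solr/alfresco"


def doc_type_facet_url(core_url=SOLR_CORE_URL):
    """Build a select URL that facets on DOC_TYPE and returns no documents."""
    params = {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": "DOC_TYPE",
        "wt": "json",
    }
    return f"{core_url}/select?{urlencode(params)}"


def doc_type_counts(response_text):
    """Turn Solr's flat [value, count, value, count, ...] facet list into a dict."""
    body = json.loads(response_text)
    flat = body["facet_counts"]["facet_fields"]["DOC_TYPE"]
    return dict(zip(flat[0::2], flat[1::2]))


print(doc_type_facet_url())
```

The printed URL can be opened in a browser or passed to curl; doc_type_counts() then expects the JSON body of that response.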


The following steps are required to remove and re-index the latest transactions in a SOLR Core:

  • PURGE desired TX Ids from SOLR Core using SOLR Admin REST API
  • Get transaction properties for desired transaction (Id 40 in the sample below)
  • Disable SOLR tracking
  • Update (technically an "add") the TRACKER!STATE!TX Solr Document with the properties of the desired transaction
  • Enable SOLR tracking
  • Verify pending transactions are indexed

Documents of type "State" record the latest indexed transaction and the latest indexed permission change:

http://localhost:8983/solr/alfresco/select?fl=*,[cached]&indent=on&q={!term%20f=DOC_TYPE}State&wt=js...

[
      {
        "id":"TRACKER!STATE!ACLTX",
        "_version_":1736233537222213632,
        "S_ACLTXID":9,
        "S_INACLTXID":9,
        "S_ACLTXCOMMITTIME":1655799514574,
        "DOC_TYPE":"State",
        "LAST_INCOMING_CONTENT_VERSION_ID":-10},
      {
        "id":"TRACKER!STATE!TX",
        "_version_":1736234575631220736,
        "S_TXID":43,
        "S_INTXID":43,
        "S_TXCOMMITTIME":1655799968144,
        "DOC_TYPE":"State",
        "LAST_INCOMING_CONTENT_VERSION_ID":-10
     }
]

In the sample above, for the alfresco core, the latest indexed transaction is 43, with commit time 1655799968144.
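As a small illustration, the State documents can be read programmatically to find the latest indexed transaction. The sketch below assumes the commit time is expressed in epoch milliseconds (the post does not say so explicitly), and both function names are hypothetical.

```python
from datetime import datetime, timezone


def latest_indexed_tx(state_docs):
    """Return (S_TXID, S_TXCOMMITTIME) from the TRACKER!STATE!TX document."""
    for doc in state_docs:
        if doc.get("id") == "TRACKER!STATE!TX":
            return doc["S_TXID"], doc["S_TXCOMMITTIME"]
    raise LookupError("TRACKER!STATE!TX document not found")


def commit_time_utc(millis):
    """Convert a commit time to an aware UTC datetime (assumes epoch milliseconds)."""
    seconds, ms = divmod(millis, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=ms * 1000)


# The two State documents from the sample response, trimmed to the fields used here.
state_docs = [
    {"id": "TRACKER!STATE!ACLTX", "S_ACLTXID": 9, "S_ACLTXCOMMITTIME": 1655799514574},
    {"id": "TRACKER!STATE!TX", "S_TXID": 43, "S_TXCOMMITTIME": 1655799968144},
]

txid, commit_ms = latest_indexed_tx(state_docs)
print(txid, commit_time_utc(commit_ms).isoformat())
```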

In order to remove transactions 41, 42 and 43 from SOLR Core, get properties for transaction 40.

http://localhost:8983/solr/alfresco/select?fl=*,[cached]&indent=on&q=DOC_TYPE:%22Tx%22%20AND%20TXID:...

{
    "id":"TRACKER!TX!8000000000000028",
    "_version_":1736233557160886272,
    "TXID":40,
    "INTXID":40,
    "TXCOMMITTIME":1655799960307,
    "DOC_TYPE":"Tx",
    "int@s_@cascade":0,
    "LAST_INCOMING_CONTENT_VERSION_ID":-10
}

Remove the transactions from SOLR Core using the SOLR Admin REST API:

$ curl --location --request GET 'http://localhost:8983/solr/admin/cores?action=purge&txid=43'

$ curl --location --request GET 'http://localhost:8983/solr/admin/cores?action=purge&txid=42'

$ curl --location --request GET 'http://localhost:8983/solr/admin/cores?action=purge&txid=41'
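When more than a handful of transactions have to be purged, the curl calls above can be generated in a loop. The following is a minimal sketch that issues the same GET requests, newest transaction first, following the order used in this post; the function names are hypothetical and the base URL is the local default used throughout.

```python
from urllib.request import urlopen

# Assumption: default local Search Services address, as used in this post.
SOLR_URL = "http://localhost:8983/solr"


def purge_urls(first_txid, last_txid, solr_url=SOLR_URL):
    """Build one purge URL per transaction ID, newest first (43, 42, 41)."""
    return [
        f"{solr_url}/admin/cores?action=purge&txid={txid}"
        for txid in range(last_txid, first_txid - 1, -1)
    ]


def purge_range(first_txid, last_txid, solr_url=SOLR_URL):
    """Send the purge requests one by one; urlopen raises on an HTTP error status."""
    for url in purge_urls(first_txid, last_txid, solr_url):
        with urlopen(url) as response:  # plain GET, like the curl calls
            print(url, response.status)


# Dry run: list the requests without sending anything.
for url in purge_urls(41, 43):
    print(url)

# purge_range(41, 43)  # uncomment to run against a live server
```

Printing the URLs first makes it easy to double-check the range before anything is removed from the index.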

This operation may take a while. Verify that the latest transaction in the SOLR Core is the expected one (40 in this example) before moving forward.

http://localhost:8983/solr/alfresco/select?fl=*,[cached]&indent=on&q=DOC_TYPE:%22Tx%22&rows=1&sort=T...

{
        "id":"TRACKER!TX!8000000000000028",
        "_version_":1736233557160886272,
        "TXID":40,
        "INTXID":40,
        "TXCOMMITTIME":1655799960307,
        "DOC_TYPE":"Tx",
        "int@s_@cascade":0,
        "LAST_INCOMING_CONTENT_VERSION_ID":-10
}
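Because the purge is not instantaneous, this check may need to be repeated. A possible way to automate the wait is a small polling loop like the one below. It is only a sketch: fetch_latest_txid is a hypothetical caller-supplied function that runs the verification query and returns the TXID of the first document, and the attempt count and delay are arbitrary defaults.

```python
import time


def wait_for_latest_txid(fetch_latest_txid, expected_txid, attempts=30, delay_seconds=2.0):
    """Poll until the newest Tx document in the core has the expected TXID.

    Returns True once the expected TXID is seen, False if it never shows up
    within the given number of attempts.
    """
    for attempt in range(attempts):
        latest = fetch_latest_txid()
        if latest == expected_txid:
            return True
        if latest < expected_txid:
            # More was removed than intended; stop rather than keep waiting.
            raise RuntimeError(f"latest TXID {latest} is below {expected_txid}")
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Injecting the fetch function keeps the loop independent of how the query is sent, so the same helper works with any HTTP client.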

Once the transactions have been removed from the SOLR Core, the status document TRACKER!STATE!TX needs to be modified. Before performing this update, stop Alfresco Search Services and add the following configuration to solrcore.properties to disable the tracking process. This property must be set in both cores: alfresco and archive.

enable.alfresco.tracking=false
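Since the property has to be changed in two files and later reverted, the edit can be scripted. The helper below is a minimal sketch with hypothetical function names; the location of each solrcore.properties file depends on the installation and is left to the caller. It should only be run while Alfresco Search Services is stopped.

```python
from pathlib import Path


def set_property(properties_text, key, value):
    """Return properties_text with key set to value, appending it if absent."""
    lines = properties_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # keep comments and blank lines untouched
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


def set_property_in_file(path, key, value):
    """Apply set_property to one solrcore.properties file in place."""
    target = Path(path)
    target.write_text(set_property(target.read_text(), key, value))
```

Calling set_property_in_file once per core with "enable.alfresco.tracking" and "false" disables tracking; the same call with "true" reverts it.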

Once Alfresco Search Services is up and running again, use the following command to update the status with the properties of transaction 40:

$ curl --location --request POST \
'http://localhost:8983/solr/alfresco/update?commitWithin=1000&overwrite=true&wt=json' \
--header 'Content-Type: application/json' \
--data-raw '[
    {
        "id":"TRACKER!STATE!TX",
        "_version_":1,
        "S_TXID":40,
        "S_INTXID":40,
        "S_TXCOMMITTIME":1655799960307,
        "DOC_TYPE":"State",
        "LAST_INCOMING_CONTENT_VERSION_ID":-10
    }
]'
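The payload above mirrors the properties of the Tx document for transaction 40, with each field renamed to its S_ counterpart. To avoid copying values by hand, the body can be derived from the Tx document, as in the sketch below. The field mapping is inferred from the samples in this post rather than from a documented contract, and state_doc_from_tx is a hypothetical name.

```python
import json


def state_doc_from_tx(tx_doc):
    """Build the TRACKER!STATE!TX document from a Tx document."""
    return {
        "id": "TRACKER!STATE!TX",
        "_version_": 1,  # value taken from the curl payload in this post
        "S_TXID": tx_doc["TXID"],
        "S_INTXID": tx_doc["INTXID"],
        "S_TXCOMMITTIME": tx_doc["TXCOMMITTIME"],
        "DOC_TYPE": "State",
        "LAST_INCOMING_CONTENT_VERSION_ID": tx_doc["LAST_INCOMING_CONTENT_VERSION_ID"],
    }


# Tx document for transaction 40, as returned by the query earlier in the post.
tx_40 = {
    "id": "TRACKER!TX!8000000000000028",
    "TXID": 40,
    "INTXID": 40,
    "TXCOMMITTIME": 1655799960307,
    "DOC_TYPE": "Tx",
    "LAST_INCOMING_CONTENT_VERSION_ID": -10,
}

# The update handler expects a JSON array of documents.
print(json.dumps([state_doc_from_tx(tx_40)], indent=4))
```

The printed JSON can be passed to curl with --data-raw, exactly as in the command above.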

Stop Alfresco Search Services again and revert the previous configuration in the solrcore.properties files:

enable.alfresco.tracking=true

Once Alfresco Search Services is up and running, transactions from Id 40 onwards will be indexed by the regular tracking process. After a while, the latest transaction can be verified as 43 in both the TX and TRACKER!STATE!TX documents.

http://localhost:8983/solr/alfresco/select?fl=*,[cached]&indent=on&q={!term%20f=DOC_TYPE}State&wt=js...

{
        "id":"TRACKER!STATE!TX",
        "_version_":1736237510552453120,
        "S_TXID":43,
        "S_INTXID":43,
        "S_TXCOMMITTIME":1655799968144,
        "DOC_TYPE":"State",
        "LAST_INCOMING_CONTENT_VERSION_ID":-10
}

http://localhost:8983/solr/alfresco/select?fl=*,[cached]&indent=on&q=DOC_TYPE:%22Tx%22&rows=1&sort=T...

{
        "id":"TRACKER!TX!800000000000002b",
        "_version_":1736237530581303296,
        "TXID":43,
        "INTXID":43,
        "TXCOMMITTIME":1655799968144,
        "DOC_TYPE":"Tx",
        "int@s_@cascade":0,
        "LAST_INCOMING_CONTENT_VERSION_ID":-10
}

Additional notes

An alternative approach to disable indexing, contributed by @morganp1, is to use the SOLR REST API "disable indexing" action:

http://localhost:8983/solr/admin/cores?action=disable-indexing

<response>
<lst name="action">
  <lst name="alfresco">
   <bool name="CASCADE">false</bool>
   <bool name="CONTENT">false</bool>
   <bool name="ACL">false</bool>
   <bool name="METADATA">false</bool>
 </lst>
 <lst name="archive">
   <bool name="CASCADE">false</bool>
   <bool name="CONTENT">false</bool>
   <bool name="ACL">false</bool>
   <bool name="METADATA">false</bool>
 </lst>
</lst>
</response>

This operation doesn't require restarting the SOLR Server, which may be preferable for some use cases.

In order to restore the indexing process again, use the action in the opposite way:

http://localhost:8983/solr/admin/cores?action=enable-indexing

<response>
<lst name="action">
  <lst name="alfresco">
   <bool name="CASCADE">true</bool>
   <bool name="CONTENT">true</bool>
   <bool name="ACL">true</bool>
   <bool name="METADATA">true</bool>
 </lst>
 <lst name="archive">
   <bool name="CASCADE">true</bool>
   <bool name="CONTENT">true</bool>
   <bool name="ACL">true</bool>
   <bool name="METADATA">true</bool>
 </lst>
</lst>
</response>
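The XML returned by both actions can be checked programmatically before continuing with the procedure. The sketch below parses a response shaped like the samples above and reports the flags per core; it assumes the response is well-formed XML with the same element names, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def indexing_flags(response_xml):
    """Map each core name to its {tracker name: enabled} flags."""
    root = ET.fromstring(response_xml)
    action = root.find("lst[@name='action']")
    if action is None:
        raise ValueError('no <lst name="action"> element in response')
    return {
        core.get("name"): {
            flag.get("name"): (flag.text or "").strip() == "true"
            for flag in core.findall("bool")
        }
        for core in action.findall("lst")
    }


def all_disabled(flags):
    """True when every tracker of every core reports false."""
    return all(not enabled for core in flags.values() for enabled in core.values())


def all_enabled(flags):
    """True when every tracker of every core reports true."""
    return all(enabled for core in flags.values() for enabled in core.values())
```

After calling disable-indexing, all_disabled(indexing_flags(body)) should hold for the response body; after enable-indexing, all_enabled should hold instead.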