Search service is calling transaction API 100 times per second

cajova_houba
Champ on-the-rise

Hello,

we're running Alfresco Content Services Community 7.3 and Alfresco Search Services 2.0.5.1, each on its own separate server. Lately, we've noticed quite high traffic between the two, and it turns out the search service keeps calling the following endpoints:

  • /alfresco/service/api/solr/nextTransaction
  • /alfresco/service/api/solr/transactions

Each endpoint is called about 50 times per second. I've tried to make some sense of this. Using the search service report API, I was able to get the timestamp of the last successful transaction: 20.9.2024. I then used packet-capture software and found that the search service is trying to fetch new transactions in hourly intervals from 20.9.2024 up to the current date, again and again. The repository node always responds with the last transaction time being "20.9. 2024".

The only possibly related configuration I found is the alfresco.cron property, which is currently set to 0/10 * * * * ? * meaning the search services should poll Alfresco for new changes every 10 seconds.

I'm not quite sure how to approach this or where the problem is. Does anyone have any idea where I should look or what the problem might be?

 

Thanks

1 REPLY 1

afaust
Legendary Innovator

The number of calls to these endpoints is independent of the cron. The cron only specifies when an indexing run is started, but each run, depending on the index/DB data, may make an arbitrary number of calls to get the job done. In systems that don't have (m)any changes in data, I have observed the number of calls grow as the time without changes increases. This follows from how the metadata tracker looks for new transactions to index:

  • take the last index commit time as a starting point (startTime)
  • use the current time as the end point (endTime)
  • then repeat the following until startTime >= endTime or some (new) transactions were found
    • try to find transactions in interval [startTime, startTime + timeStep] via the transactions endpoint
    • increase startTime by adding timeStep as an offset
    • if no transactions were found, call the nextTransaction endpoint to look for a later commit time after startTime; if one exists, use it for a new attempt to find transactions in interval [nextTransactionTime, nextTransactionTime + timeStep]

In your case, you have no later commit time, so the nextTransaction endpoint always yields the value -1. The metadata tracker also does not find any transactions, so it repeats the loop with a new "startTime = startTime + timeStep" value.

You can significantly decrease the number of calls to these endpoints by configuring a higher timeStep in the solrcore.properties via the property alfresco.metadata.tracker.timestep. This property does not appear in the default solrcore.properties file because it uses a programmatic default value of 3600000 (1 hour).
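To get a feel for the effect, here is a back-of-the-envelope estimate in Python. It assumes one loop iteration (one call per endpoint) per timeStep-sized window, runs that start every 10 seconds and do not overlap, and a gap of 500 hours since the last transaction. That gap is an assumption, chosen because it would produce roughly the 50 calls per second per endpoint reported in the question; the helper name iterations_per_run is made up.

```python
import math

HOUR_MS = 3_600_000
CRON_INTERVAL_S = 10  # alfresco.cron = 0/10 * * * * ? * starts a run every 10 seconds


def iterations_per_run(gap_ms, time_step_ms):
    """Number of timeStep-sized windows needed to walk across the gap."""
    return math.ceil(gap_ms / time_step_ms)


gap_ms = 500 * HOUR_MS  # assumed time since the last indexed transaction

for label, step in (
    ("1 hour (default)", HOUR_MS),
    ("1 day", 24 * HOUR_MS),
    ("1 week", 7 * 24 * HOUR_MS),
):
    n = iterations_per_run(gap_ms, step)
    print(f"{label:>16}: timestep={step} -> {n} calls per endpoint per run, "
          f"about {n / CRON_INTERVAL_S:.1f} per second")
```

Under these assumptions a one-day step corresponds to alfresco.metadata.tracker.timestep=86400000 and cuts the calls per run from 500 to 21. The gap itself keeps growing while nothing changes in the repository, so the numbers are a snapshot rather than a fixed figure.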

You could say that it is a bug in the indexer that it does not stop immediately when the nextTransaction endpoint yields -1, meaning that no newer transaction exists. The fix is as simple as adding the missing break statement. I created a pull request for this issue on GitHub. But since the SOLR-based SearchServices are essentially close to their deprecation / end-of-life declaration, I doubt this will be included by Alfresco/Hyland.
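The effect of such an early exit can be illustrated with a small Python sketch. It is not the actual patch (the real tracker is Java code); the function name quiet_repo_iterations and the stop_when_no_next flag are hypothetical, and the model only covers the case where the repository has no newer transaction at all.

```python
NO_NEXT_TRANSACTION = -1
HOUR_MS = 3_600_000


def quiet_repo_iterations(last_indexed, now, time_step, stop_when_no_next):
    """Count loop iterations when nothing newer than last_indexed exists."""
    iterations = 0
    start = last_indexed
    while start < now:
        iterations += 1                   # one "transactions" call, finds nothing
        start += time_step
        next_time = NO_NEXT_TRANSACTION   # what nextTransaction yields in this case
        if next_time == NO_NEXT_TRANSACTION and stop_when_no_next:
            break                         # the early exit: nothing newer, stop the run
    return iterations


gap = 500 * HOUR_MS  # assumed gap since the last transaction
print(quiet_repo_iterations(0, gap, HOUR_MS, stop_when_no_next=False))  # 500
print(quiet_repo_iterations(0, gap, HOUR_MS, stop_when_no_next=True))   # 1
```

With the early exit, a run on an unchanged repository costs a single round of calls regardless of how long ago the last transaction was committed.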