Hyland Connect

simonsimonson · ‎05-26-2015

Hi,

We are developing a team calendar using Alfresco Share as the backend. There is a significant lag (up to 10 seconds) when creating, updating, or deleting events. The application posts events to Share, then submits a query to get the updated list of events and refreshes the front-end components with the new list. I use CMIS Workbench to check whether the content is already in the repository immediately after the creation or change (it is), but it takes some time for the query to return the new result set. From my investigation so far, it looks like search indexing might be lagging. The alfresco.lag parameter is set to 1000 milliseconds, which seems fine. We tried this on a completely new instance of Alfresco and the problem still persists. Any ideas where to look next?

mike38 · ‎07-08-2016

A little late, but if I am right this might be useful to people having the same problem.
The lag indicated by alfresco.lag is NOT the maximum lag of Solr, but the minimum one.

Solr looks for new content on the schedule defined in alfresco.cron, which is usually every 15 seconds.
On each run it asks for everything in an interval that starts alfresco.hole.retention ms before its LAST INDEXED data and ends at the current time LESS alfresco.lag.

All times are in milliseconds.
Default alfresco.hole.retention is one hour.
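
To make the interval concrete, here is a minimal Python sketch of the window one tracker run would ask for, as I understand it from the description above. The function and parameter names are made up for illustration and are not Alfresco APIs; the defaults are the one-hour hole retention and the 1000 ms lag mentioned in this thread.

```python
def tracking_window(last_indexed_ms: int, now_ms: int,
                    hole_retention_ms: int = 3_600_000,
                    lag_ms: int = 1_000) -> tuple[int, int]:
    """Return (start_ms, end_ms) of the range a single tracker run asks for.

    Hypothetical helper: it only mirrors the arithmetic described in the post,
    it is not how Alfresco names or structures this internally.
    """
    start_ms = last_indexed_ms - hole_retention_ms  # reach back before the last indexed data
    end_ms = now_ms - lag_ms                        # stop short of "now" by alfresco.lag
    return start_ms, end_ms
```

Anything committed less than alfresco.lag ms before a run falls outside the window and has to wait for the next run.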

Thus, with the default settings, Solr runs every 15 seconds and indexes from 1 hour before the last known data up to the current time less 1 second. This gives an average lag of about 10 seconds when something new happens. <strong>If you want to reduce it, you must change the alfresco.cron property</strong>.
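
The arithmetic behind that figure can be sketched in a few lines of Python. This is a simplified model, not Alfresco code: it assumes tracker runs fire exactly at multiples of the cron interval, that the end of the window is inclusive, and it ignores the time Solr needs to actually index the content. The function name is hypothetical.

```python
def pickup_delay_ms(commit_ms: int,
                    cron_interval_ms: int = 15_000,
                    lag_ms: int = 1_000) -> int:
    """Delay between a commit and the first tracker run whose window covers it.

    Assumptions (not taken from Alfresco source): runs happen at exact multiples
    of cron_interval_ms, and a run at time t covers commits up to t - lag_ms.
    """
    earliest_run_ms = commit_ms + lag_ms
    # Round up to the next scheduled run using integer ceiling division.
    run_ms = -(-earliest_run_ms // cron_interval_ms) * cron_interval_ms
    return run_ms - commit_ms


# Average the delay over every millisecond of one 15-second cycle.
delays = [pickup_delay_ms(c) for c in range(15_000)]
average_ms = sum(delays) / len(delays)
```

In this model the delay ranges from 1 second to just under 16 seconds and averages about 8.5 seconds before indexing time is added, which is in line with the lag reported above. Shortening the cron interval shrinks both numbers; lowering alfresco.lag alone barely changes them.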

I absolutely do not understand what the 1 second lag is for, nor why Solr repeatedly and blindly asks for everything from 1 hour before the last indexed data even if nothing has been added for days. The result is that 4 times a minute the system runs by itself, produces huge access logs in Tomcat, and still lags significantly whatever the load on the indexer. It looks to me as if the goal were to generate as much workload and as many logs as possible while still carefully avoiding being up to date at any time. A kind of demonstration ab absurdo of why a pull design is flawed?


Lag when searching alfresco share